No, Nutch does not have such functionality.
The quick and dirty way to implement this would be to extract the
term vector from the original document, calculate (there are
different algorithms for this) the most important terms for the
document, and then just run a query with these terms.
HTH
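As a rough illustration of the approach described above, here is a minimal Python sketch. It is not Nutch or Lucene code. It assumes TF-IDF as the term weighting (only one of the possible algorithms), and the names `top_terms`, `build_query`, `doc_freq`, and `num_docs` are hypothetical, as is the toy data.

```python
import math
from collections import Counter


def top_terms(term_vector, doc_freq, num_docs, k=5):
    """Return the k highest-weighted terms of one document.

    term_vector: {term: count in this document}
    doc_freq:    {term: number of documents containing the term}
    num_docs:    total number of documents in the index
    TF-IDF is an assumed weighting; other schemes would work too.
    """
    scores = {}
    for term, tf in term_vector.items():
        df = doc_freq.get(term, 0)
        # Smoothed IDF so terms missing from doc_freq do not divide by zero.
        idf = math.log((num_docs + 1) / (df + 1)) + 1.0
        scores[term] = tf * idf
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def build_query(terms):
    """Join the selected terms into a simple OR query string."""
    return " OR ".join(terms)


# Toy data (hypothetical): one document's term counts and the
# document frequencies of those terms in a 1000-document index.
doc = Counter("nutch crawler nutch index the the search".split())
df = {"the": 1000, "nutch": 12, "crawler": 40, "index": 150, "search": 300}
selected = top_terms(doc, df, num_docs=1000, k=3)
query = build_query(selected)
```

The resulting query string would then be run against the index like any other search, with the original document filtered out of the hits.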
Getting the term vector should be easy, but when you say calculation, is it a
simple comparison of all term vectors, or is it a whole other beast?
Stefan Groschupf [EMAIL PROTECTED] wrote: No, nutch has not such a
functionality.
The quick and dirty solution to implement this would extracting
I think clustering the documents would be a solution: just recommend
other documents in the same cluster. Is there a clustering algorithm in
Nutch? It may be very expensive to calculate.
Daqing Zhao
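To make the clustering idea concrete, here is a small Python sketch, assuming cosine similarity over term vectors and a simple single-pass ("leader") clustering. This is an illustrative technique chosen for brevity, not an algorithm shipped with Nutch. The names `cosine`, `leader_cluster`, `similar_pages`, and the `threshold` value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def leader_cluster(vectors, threshold=0.5):
    """Single-pass clustering: a document joins the first cluster whose
    leader (its first member) is similar enough, else starts a new one."""
    clusters = []  # each cluster is a list of doc ids; index 0 is the leader
    for doc_id, vec in vectors.items():
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(doc_id)
                break
        else:
            # No existing leader was close enough.
            clusters.append([doc_id])
    return clusters


def similar_pages(doc_id, clusters):
    """Recommend the other documents in the same cluster."""
    for members in clusters:
        if doc_id in members:
            return [d for d in members if d != doc_id]
    return []


# Toy term vectors (hypothetical data).
docs = {
    "a": {"nutch": 2, "crawler": 1},
    "b": {"nutch": 1, "crawler": 2},
    "c": {"yahoo": 3, "mail": 1},
}
clusters = leader_cluster(docs)
```

Even this simple variant compares every document against every cluster leader, which hints at why clustering a whole web-scale index is costly.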
On 12/20/05, Victor Lee [EMAIL PROTECTED] wrote:
Getting the term vector should be easy, but
Real clustering is impossible for a web search engine unless you
have unlimited hardware resources.
However, as Jerome suggested, there is a search result clustering plugin.
If you are more familiar with math and algorithms, you will find this
article interesting:
Hi,
Do Nutch's search results show similar pages, as Google's do? I went to
Modzex.com, which uses Nutch, but I don't see similar pages in its search
results.
Many thanks.