> For example, can the input file (urls.txt) contain something like:
>
> http://msn.com|1000
> http://yahoo.com|5000
>
> Those weights (1000, 5000) would then be used by the searcher to cause
> the results to come up in a specific order.
>
> Is this possible with current nutch?

I don't think this is possible with current Nutch, but it should be easy
to add.

Nutch uses Lucene to index HTML documents. Lucene provides a document
boost, which is basically a factor that is multiplied into the score of
each document. You should be able to hack Nutch/Lucene to store your
'website multiplicative factor' in the document boost.

Should be around 100 lines of code.
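
A rough sketch of the input-file half of this, to show what I mean. It
only parses the "url|weight" lines from your example and looks up a boost
for a URL. The class and method names (UrlBoosts, parse, boostFor) are
made up for illustration and are not part of Nutch; treating a line with
no weight as boost 1.0 and matching on the exact URL string are my
assumptions. The step of handing the value to Lucene as the document
boost at index time is not shown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch: reads "url|weight" lines
// and returns a per-URL boost factor.
final class UrlBoosts {
    // Assumption: a URL with no explicit weight gets a neutral boost.
    static final float DEFAULT_BOOST = 1.0f;

    // Parse lines such as "http://msn.com|1000" into a url -> boost map.
    // Blank lines are skipped; a malformed weight throws NumberFormatException.
    static Map<String, Float> parse(List<String> lines) {
        Map<String, Float> boosts = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            // Split on the last '|' so a '|' inside the URL does not break parsing.
            int bar = line.lastIndexOf('|');
            if (bar < 0) {
                boosts.put(line, DEFAULT_BOOST);
            } else {
                String url = line.substring(0, bar).trim();
                float weight = Float.parseFloat(line.substring(bar + 1).trim());
                boosts.put(url, weight);
            }
        }
        return boosts;
    }

    // Look up the boost for a URL (exact string match), falling back to the default.
    static float boostFor(Map<String, Float> boosts, String url) {
        return boosts.getOrDefault(url, DEFAULT_BOOST);
    }

    public static void main(String[] args) {
        Map<String, Float> boosts =
            parse(List.of("http://msn.com|1000", "http://yahoo.com|5000"));
        // This is the value you would store as the document boost
        // when the page is indexed (that hook is not shown here).
        System.out.println(boostFor(boosts, "http://yahoo.com"));
    }
}
```

If you want the weight to apply to a whole site rather than a single
page, you would match on the host part of the URL instead of the full
string.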

Are you using intranet crawling or whole-web crawling?

-Vikas

____________________________________________________________________
Vikas Gupta
Masters Student,
Dept. of Computer Sciences,   http://www.cs.utexas.edu/users/vgupta
Univ. of Texas at Austin, USA
____________________________________________________________________



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
