> For example, can the input file (urls.txt) contain something like:
>
> http://msn.com|1000
> http://yahoo.com|5000
>
> Those weights (1000, 5000) would then be used by the searcher to cause
> the results to come up in a specific order.
>
> Is this possible with current nutch?
I don't think it is possible with current Nutch, but it should be easy
to add this feature. Nutch uses Lucene to index HTML documents. Lucene
provides document boosts: a boost is a factor that is multiplied into
the score of each document. You should be able to hack Nutch/Lucene to
store your 'website multiplicative factor' in the document boost. It
should take around 100 lines of code.

Are you using intranet crawling or extranet crawling?

-Vikas

____________________________________________________________________
Vikas Gupta
Masters Student, Dept. of Computer Sciences,
Univ. of Texas at Austin, USA
http://www.cs.utexas.edu/users/vgupta
____________________________________________________________________

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
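To make the idea above concrete, here is a rough sketch of the two pieces involved: reading the per-site factor from the pipe-separated urls.txt format proposed in the question, and applying it as a multiplicative factor on a score. This is not existing Nutch code. The class and method names (SeedWeights, parse, boostedScore) are made up for illustration, and treating a line without a weight as a neutral factor of 1.0 is my assumption. In a real patch the parsed value would presumably be handed to Lucene as the document boost at indexing time (Lucene's Document class has had a setBoost(float) method for this), rather than multiplied by hand as shown here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch or Lucene.
class SeedWeights {

    /**
     * Parses lines of the form "url|weight" into a url -> weight map.
     * Blank lines are skipped. A line with no '|' gets weight 1.0
     * (assumed to mean "no boost"). A non-numeric weight makes
     * Float.parseFloat throw NumberFormatException.
     */
    static Map<String, Float> parse(List<String> lines) {
        Map<String, Float> weights = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Use the last '|' as the separator so the URL part is kept whole.
            int bar = trimmed.lastIndexOf('|');
            if (bar < 0) {
                weights.put(trimmed, 1.0f);
            } else {
                String url = trimmed.substring(0, bar);
                float weight = Float.parseFloat(trimmed.substring(bar + 1).trim());
                weights.put(url, weight);
            }
        }
        return weights;
    }

    /** Applies the boost the way a multiplicative document boost acts on a score. */
    static float boostedScore(float rawScore, float boost) {
        return rawScore * boost;
    }
}
```

With the two example lines from the question, parse would map http://msn.com to 1000 and http://yahoo.com to 5000, so for equal raw scores the yahoo.com pages would rank above the msn.com pages.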
