Hi! We're using Nutch (1.7) and Solr 3.6 to index about 80k pages on several hundred different hosts.
This works quite well, but there is still room for improvement in search result ranking and "relevancy".

When using Nutch and Solr, there are basically two values that influence the score of a query result (correct me if I'm wrong): the score from Nutch, which becomes the "boost" value in Solr, and the boost value from Solr, which is e.g. calculated at query time.

The score in Nutch is calculated either by the "scoring-opic" plugin or with the "webgraph" toolchain described here: http://wiki.apache.org/nutch/NewScoringIndexingExample which gives the PageRank/LinkRank. (By the way, what about the "scoring-link" plugin? Does it do anything at all? What is its role in this?)

We've been playing around with PageRank lately and its scores look a little better than with OPIC, but on the downside, the calculation takes very long and is very CPU-intensive.

Well, to cut a long story short, what is your opinion on this? Which ranking do you use? Is PageRank worth the trouble? How do you boost Solr queries (if you use Solr at all)?

BR,
-- 
Tobias Marx
Zentrum für Informations- und Medienverarbeitung - ZIM
Bergische Universität Wuppertal
Büro: T.11.08
+49 202 439 2237
[email protected]
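P.S. To make the "two values" question concrete, here is a minimal sketch of how a query-time boost could be combined with the Nutch score on the Solr side. It is only an illustration under assumptions, not a description of our setup: it assumes the edismax query parser is used, that the index has text fields named `title` and `content`, and that the Nutch score is available in an indexed numeric field named `boost` (function queries cannot read a field that is only stored). The host, the field weights and the helper name `build_boosted_query` are all made up for the example.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, solr_select_url="http://localhost:8983/solr/select"):
    """Build a Solr select URL that mixes text relevance with a per-document score.

    Assumptions (hypothetical, adjust to the real schema):
      - 'title' and 'content' are searchable text fields
      - 'boost' is an indexed numeric field holding the Nutch score
    """
    params = {
        "q": user_query,
        "defType": "edismax",           # extended dismax query parser
        "qf": "title^2.0 content^1.0",  # query-time field weights (example values)
        "boost": "boost",               # multiply the text score by the 'boost' field value
        "fl": "url,title,score",        # return the final score for inspection
        "rows": 10,
    }
    return solr_select_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Only prints the URL; no request is sent.
    print(build_boosted_query("campus map"))
```

Comparing the returned `score` values with and without the `boost` parameter would show how much the link-based score actually shifts the ranking.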

