NG-Marketing, M.Schneider wrote:
> I figured it out. I used in my nutch-site.xml the following config
>
> <property>
>   <name>searcher.max.hits</name>
>   <value>2048</value>
>       ....
>
> If I change the value to an empty string ("") it all works fine.  It took me
> a couple of hours to figure this out. This might be a bug.
>   

This was a bug in LuceneQueryOptimizer.LimitedCollector. I have now fixed 
it in rev. 447359 (branch-0.8) and in rev. 447363 (trunk).

NOTE: you should NEVER use "searcher.max.hits" unless you have run 
IndexSorter on your index! If you use this setting on an unsorted 
index, you will get results of random quality, because the searcher 
stops collecting as soon as it has that many hits, no matter whether 
they are high- or low-scoring. In a sorted index you are more or less 
guaranteed to get the highest-scoring hits first, so the cutoff makes 
sense; in a regular unsorted index, scores are distributed randomly 
with respect to document order.
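
To illustrate the effect of the cutoff, here is a minimal simulation 
sketch in Python. It is not Nutch or Lucene code: collect_limited and 
top_k are hypothetical helpers, the scores are random numbers, and the 
"sorted index" assumes that index order matches query score exactly, 
which is stronger than the "more or less" guarantee a real sorted index 
gives.

```python
import random

def collect_limited(scores_in_index_order, max_hits):
    """Hypothetical stand-in for a hit collector that stops after max_hits.

    It keeps the first max_hits hits in index order, regardless of score.
    """
    return scores_in_index_order[:max_hits]

def top_k(collected, k):
    """Return the k best scores among the collected hits, best first."""
    return sorted(collected, reverse=True)[:k]

rng = random.Random(42)

# Unsorted index: scores are unrelated to document order (assumption).
unsorted_index = [rng.random() for _ in range(100_000)]

# Sorted index: idealized here as ordered by descending score (assumption).
sorted_index = sorted(unsorted_index, reverse=True)

max_hits = 2048  # the value from the quoted nutch-site.xml
k = 10

best_possible = top_k(unsorted_index, k)
unsorted_result = top_k(collect_limited(unsorted_index, max_hits), k)
sorted_result = top_k(collect_limited(sorted_index, max_hits), k)

print("best possible :", [round(s, 5) for s in best_possible[:3]])
print("unsorted index:", [round(s, 5) for s in unsorted_result[:3]])
print("sorted index  :", [round(s, 5) for s in sorted_result[:3]])
```

With the cutoff, the sorted index still returns the best possible top 
hits, while the unsorted index can only return the best of whichever 
2048 documents happened to come first.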

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general