[ http://issues.apache.org/jira/browse/LUCENE-456?page=comments#action_12332225 ]
Luc Vanlerberghe commented on LUCENE-456:
-----------------------------------------

This could be related to issue 453, which I submitted a few days ago.

(Parallel-)MultiSearcher uses a FieldDocSortedHitQueue to merge the results from the underlying IndexSearchers into the final result list (even if there's only one).

There's a bug when two documents are compared and neither has the field being compared. If one document doesn't have the field, it should come first, but if both don't have it, they should be considered equal and the next SortField tried.
In the current implementation (1.4.3 and head), when both fields are null, one of the two is always chosen as being 'lessThan' the other. Since this is not consistent (the result of lessThan depends on the order of the parameters in that case), it changes the sort order (as I observed) and can probably 'confuse' the queue implementation, so that you get dropped and duplicate hits.

Could you try applying the patch I submitted for FieldDocSortedHitQueue and see if that helps in your case too?

> Duplicate hits and missing hits in sorted search
> ------------------------------------------------
>
>          Key: LUCENE-456
>          URL: http://issues.apache.org/jira/browse/LUCENE-456
>      Project: Lucene - Java
>         Type: Bug
>   Components: Search
>     Versions: 1.4
>  Environment: JDK 1.4.2_06, probably OS independent, tested on Solaris 8 and Win2000
>     Reporter: Martin Seitz
>     Priority: Minor
>  Attachments: TestCustomSearcherSort_1_4_3.java, TestCustomSearcherSort_HEAD.java
>
> If using a searcher that subclasses IndexSearcher, I get different result sets from sorted and unsorted searches (besides the ordering, of course). The problem only occurs if the searcher is wrapped by (Parallel)MultiSearcher and the index is not too small. The number of hits returned by an unsorted and a sorted search are identical, but the hits reference different documents. A closer look at the result sets revealed that the sorted search returns duplicate hits.
>
> I created test cases for Lucene 1.4.3 as well as for the head release. The problem showed up for both, the number of duplicates being bigger for the head release. The test cases are written for package org.apache.lucene.search. Messages describing the problem are written to the console. In order to see all those hints, the asserts are commented out, so don't be confused if JUnit reports no errors.
> (Sorry, being a novice user of the bug tracker, I don't see any means to attach the test cases on this screen. Let's see.)
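
The comparison rule described in the comment (a document missing the field sorts first; two documents that both miss it tie, so the next SortField decides) can be sketched roughly as follows. This is only an illustration and not the patch attached to issue 453: the class and method names are hypothetical, fields are modelled as plain strings, and the final tie-break on document number is an assumption added so that the ordering is total.

```java
// Hypothetical illustration only; this is not Lucene's FieldDocSortedHitQueue code.
final class NullSafeFieldCompare {

    /** Compares one sort field. A missing (null) value sorts first; two missing values tie. */
    static int compareField(String a, String b) {
        if (a == null && b == null) return 0; // both missing: equal, let the next field decide
        if (a == null) return -1;             // only a is missing: a comes first
        if (b == null) return 1;              // only b is missing: b comes first
        return a.compareTo(b);
    }

    /**
     * Compares two documents field by field. The fallback to the document
     * number is an assumption of this sketch, used to keep the order stable.
     */
    static int compareDocs(String[] fieldsA, int docA, String[] fieldsB, int docB) {
        for (int i = 0; i < fieldsA.length; i++) {
            int c = compareField(fieldsA[i], fieldsB[i]);
            if (c != 0) return c;
        }
        return Integer.compare(docA, docB);
    }

    /**
     * The inconsistent behaviour described in the comment: when both values
     * are null, the first argument is always reported as smaller, so the
     * result depends on the argument order.
     */
    static int brokenCompareField(String a, String b) {
        if (a == null) return -1;
        if (b == null) return 1;
        return a.compareTo(b);
    }
}
```

With the broken variant, comparing two null values yields -1 regardless of which document is passed first, so each document is reported as smaller than the other. The null-safe variant returns 0 in that case and the decision moves on to the next field.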
