[ http://issues.apache.org/jira/browse/LUCENE-736?page=comments#action_12454971 ]

Paul Elschot commented on LUCENE-736:
-------------------------------------

I have had similar concerns when I implemented NearSpansOrdered.java and NearSpansUnordered.java, which are in the trunk now. These match somewhat different phrases, but it would be good to ensure that the same matches score the same for spans and phrases.

> Sloppy Phrase Scoring Misbehavior
> ---------------------------------
>
> Key: LUCENE-736
> URL: http://issues.apache.org/jira/browse/LUCENE-736
> Project: Lucene - Java
> Issue Type: Bug
> Components: Search
> Reporter: Doron Cohen
> Assigned To: Doron Cohen
> Priority: Minor
> Attachments: perf-search-new.log, perf-search-orig.log, sloppy_phrase_java.patch.txt, sloppy_phrase_tests.patch.txt
>
>
> This is an extension of https://issues.apache.org/jira/browse/LUCENE-697
> In addition to abnormalities Yonik pointed out in 697, there seem to be other issues with sloppy phrase search and scoring.
> 1) A phrase with a repeated word would be detected in a document although it is not there.
> I.e. document = A B D C E , query = "B C B" would not find this document (as expected), but query "B C B"~2 would find it.
> I think that no matter how large the slop is, this document should not be a match.
> 2) A document containing both orders of a query, symmetrically, would score differently for the query and for its reversed form.
> I.e. document = A B C B A would score differently for queries "B C"~2 and "C B"~2, although it is symmetric to both.
> I will attach test cases that show both these problems and the one reported by Yonik in 697.
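
The two expectations in the issue description can be written down as a small brute-force reference check. This is only a sketch and not Lucene code: the class SloppyPhraseSketch and its methods minSpread and matches are hypothetical names. It assumes that slop is measured as the spread of (document position - query position) over the chosen positions, which simplifies whatever the real sloppy phrase scorer computes. The one rule it does enforce is that each query term, including a repeated one, must be bound to its own document position.

```java
import java.util.Arrays;
import java.util.List;

/** Brute-force reference for the expected matching behaviour; hypothetical, not Lucene code. */
final class SloppyPhraseSketch {

    /**
     * Returns the smallest spread of (docPosition - queryPosition) over all ways of
     * binding every query term to a distinct document position, or -1 if no binding exists.
     * Assumption: this spread stands in for the slop a match requires.
     */
    static int minSpread(List<String> doc, List<String> query) {
        if (query.isEmpty()) {
            return -1; // nothing to match
        }
        boolean[] used = new boolean[doc.size()];
        int best = search(doc, query, 0, used, Integer.MAX_VALUE, Integer.MIN_VALUE);
        return best == Integer.MAX_VALUE ? -1 : best;
    }

    private static int search(List<String> doc, List<String> query, int i,
                              boolean[] used, int minOff, int maxOff) {
        if (i == query.size()) {
            return maxOff - minOff; // all terms bound: report the spread of this binding
        }
        int best = Integer.MAX_VALUE;
        for (int p = 0; p < doc.size(); p++) {
            // A document position may serve only one query term, so a repeated
            // query word needs a second occurrence in the document.
            if (used[p] || !doc.get(p).equals(query.get(i))) {
                continue;
            }
            used[p] = true;
            int off = p - i;
            best = Math.min(best, search(doc, query, i + 1, used,
                    Math.min(minOff, off), Math.max(maxOff, off)));
            used[p] = false;
        }
        return best;
    }

    /** True if some binding of distinct positions has a spread within the given slop. */
    static boolean matches(List<String> doc, List<String> query, int slop) {
        int spread = minSpread(doc, query);
        return spread >= 0 && spread <= slop;
    }
}
```

Under these assumptions, the document A B D C E never matches "B C B" at any slop, because it contains only one B. For the document A B C B A, minSpread gives the same value for "B C" and for "C B", which is the symmetry that item 2 asks the scoring to respect.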
