[jira] [Updated] (LUCENE-3068) The repeats mechanism in SloppyPhraseScorer is broken when doc has tokens at same position
[ https://issues.apache.org/jira/browse/LUCENE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-3068:

    Attachment: LUCENE-3068.patch

Attached patch fixes this bug by excluding from the repeats check those PPs (PhrasePositions) that originated from the same offset in the query. This allows stricter phrase queries: strict on terms in the same position (AND logic) but still sloppy.

All tests pass; this is ready to go in (unless there are reservations).

The repeats mechanism in SloppyPhraseScorer is broken when doc has tokens at same position

    Key: LUCENE-3068
    URL: https://issues.apache.org/jira/browse/LUCENE-3068
    Project: Lucene - Java
    Issue Type: Bug
    Components: Search
    Affects Versions: 3.0.3, 3.1, 4.0
    Reporter: Michael McCandless
    Assignee: Doron Cohen
    Priority: Minor
    Fix For: 3.2, 4.0
    Attachments: LUCENE-3068.patch, LUCENE-3068.patch, LUCENE-3068.patch

In LUCENE-736 we made fixes to SloppyPhraseScorer, because it was matching docs that it shouldn't; but I think those changes caused it to fail to match docs that it should, specifically when the doc itself has tokens at the same position.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
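The idea in the comment above (skip the repeats check for phrase positions that come from the same query offset) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in the attached patch and not Lucene's SloppyPhraseScorer: the class name `SloppyRepeatsSketch`, the method `matches`, the flag `excludeSameOffset`, and the brute-force search are all hypothetical and exist only for illustration. A document is modeled as a map from term to its positions, and a match requires the spread of (document position minus query offset) to be at most the slop.

```java
import java.util.Map;

/** Toy model of sloppy phrase matching. This is NOT Lucene's SloppyPhraseScorer. */
final class SloppyRepeatsSketch {

    /**
     * Returns true if each query term can be assigned a document position such that
     * max(pos - offset) - min(pos - offset) <= slop. Two query terms may not land on
     * the same document position (the "repeats" check), except that terms sharing a
     * query offset are exempt when excludeSameOffset is true.
     */
    static boolean matches(Map<String, int[]> doc, String[] terms, int[] offsets,
                           int slop, boolean excludeSameOffset) {
        if (terms.length == 0) {
            return false; // an empty phrase matches nothing in this sketch
        }
        return search(doc, terms, offsets, slop, excludeSameOffset, new int[terms.length], 0);
    }

    // Brute-force search over all assignments of document positions to query terms.
    private static boolean search(Map<String, int[]> doc, String[] terms, int[] offsets,
                                  int slop, boolean excludeSameOffset, int[] chosen, int i) {
        if (i == terms.length) {
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int k = 0; k < chosen.length; k++) {
                int shifted = chosen[k] - offsets[k];
                min = Math.min(min, shifted);
                max = Math.max(max, shifted);
            }
            return max - min <= slop;
        }
        for (int pos : doc.getOrDefault(terms[i], new int[0])) {
            if (collides(offsets, chosen, i, pos, excludeSameOffset)) {
                continue;
            }
            chosen[i] = pos;
            if (search(doc, terms, offsets, slop, excludeSameOffset, chosen, i + 1)) {
                return true;
            }
        }
        return false;
    }

    // The repeats check: an earlier query term already sits on this document position.
    private static boolean collides(int[] offsets, int[] chosen, int i, int pos,
                                    boolean excludeSameOffset) {
        for (int k = 0; k < i; k++) {
            boolean exempt = excludeSameOffset && offsets[k] == offsets[i];
            if (chosen[k] == pos && !exempt) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Document: "a" at position 0; "b" and "c" both at position 1.
        Map<String, int[]> doc = Map.of("a", new int[]{0}, "b", new int[]{1}, "c", new int[]{1});
        String[] terms = {"a", "b", "c"};
        int[] offsets = {0, 1, 1}; // "b" and "c" share query offset 1
        System.out.println(matches(doc, terms, offsets, 1, false)); // false
        System.out.println(matches(doc, terms, offsets, 1, true));  // true
    }
}
```

In this model the repeats check still rejects a query such as "a a" with slop 1 against a document containing a single "a", because the two query terms have different offsets and so are not exempt.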
[jira] [Updated] (LUCENE-3068) The repeats mechanism in SloppyPhraseScorer is broken when doc has tokens at same position
[ https://issues.apache.org/jira/browse/LUCENE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-3068:

    Attachment: LUCENE-3068.patch

Patch with more test cases: the AND/OR logic for MPQ (MultiPhraseQuery) is combined, and the test code is made simpler.
[jira] [Updated] (LUCENE-3068) The repeats mechanism in SloppyPhraseScorer is broken when doc has tokens at same position
[ https://issues.apache.org/jira/browse/LUCENE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3068:

    Attachment: LUCENE-3068.patch

Patch w/ test case showing the problem. If you set slop to 0 for the PhraseQuery, the test passes. The MultiPhraseQuery passes with or without slop because it handles the same-position case itself (Union*Enum).

That got me thinking... maybe any time a *PhraseQuery has overlapping positions, we should rewrite to a MultiPhraseQuery and let it handle the same positions...? Is there any downside to that?
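To make the "handles the same-position case itself" remark concrete, here is a rough sketch of the union idea: the positions of all terms that share one query position are merged into a single sorted stream (OR logic), so the phrase matcher only ever sees one entry per query position. This is an illustrative toy, not Lucene's Union*Enum implementation; the names `UnionPositionsSketch`, `union`, and `exactPhrase` are hypothetical, and only the slop-free case is modeled.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Toy model of merging positions for terms at one query position. NOT Lucene code. */
final class UnionPositionsSketch {

    /** Merges several per-term position lists into one sorted list without duplicates. */
    static int[] union(int[]... perTermPositions) {
        TreeSet<Integer> merged = new TreeSet<>();
        for (int[] positions : perTermPositions) {
            for (int p : positions) {
                merged.add(p);
            }
        }
        return merged.stream().mapToInt(Integer::intValue).toArray();
    }

    /**
     * Exact (slop 0) phrase check. slots[k] holds the sorted document positions
     * available for query position k; a match needs slots[k] to contain start + k
     * for some start taken from slots[0].
     */
    static boolean exactPhrase(int[][] slots) {
        if (slots.length == 0) {
            return false;
        }
        for (int start : slots[0]) {
            boolean ok = true;
            for (int k = 1; k < slots.length && ok; k++) {
                ok = Arrays.binarySearch(slots[k], start + k) >= 0;
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Document: "a" at position 0; "b" and "c" both at position 1.
        int[] slot0 = union(new int[]{0});                // query position 0: "a"
        int[] slot1 = union(new int[]{1}, new int[]{1});  // query position 1: "b" OR "c"
        System.out.println(exactPhrase(new int[][]{slot0, slot1})); // true
    }
}
```

Because the union is OR logic, this model would also match a document that contains only one of "b" and "c" at position 1, which is one difference to weigh when considering a rewrite to MultiPhraseQuery.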
[jira] [Updated] (LUCENE-3068) The repeats mechanism in SloppyPhraseScorer is broken when doc has tokens at same position
[ https://issues.apache.org/jira/browse/LUCENE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-3068:

    Attachment: LUCENE-3068.patch

Attached a modified version of the test, one that invokes the query parser to create an MPQ (MultiPhraseQuery). The test passes.