[ https://issues.apache.org/jira/browse/LUCENE-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650813#comment-16650813 ]
Jim Ferenczi commented on LUCENE-8531:
--------------------------------------

(Multi)PhraseQuery-s allow some reordering, but the semantics are different
from those of an unordered span near query. I don't think we can respect the
slop correctly if we continue to use span queries here. We switched to span
queries to avoid searching duplicate terms in multiple phrase queries, but I
agree that the behavior is not consistent when a slop is used. Maybe we could
switch back to the old method of building one phrase query per path if a slop
is used? This way we could apply the slop to each phrase query independently.
This is more costly than the span method, but it would be semantically
correct.

> QueryBuilder hard-codes inOrder=true for generated sloppy span near queries
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-8531
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8531
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Major
>
> QueryBuilder.analyzeGraphPhrase() generates SpanNearQuery-s with passed-in
> phraseSlop, but hard-codes inOrder ctor param as true.
>
> Before multi-term synonym support and graph token streams introduced the
> possibility of generating SpanNearQuery-s, QueryBuilder generated
> (Multi)PhraseQuery-s, which always interpret slop as allowing reordering
> edits. Solr's eDismax query parser generates phrase queries when its
> pf/pf2/pf3 params are specified, and when multi-term synonyms are used with a
> graph-aware synonym filter, SpanNearQuery-s are generated that require
> clauses to be in order; unlike with (Multi)PhraseQuery-s, reordering edits
> are not allowed, so this is a kind of regression. See SOLR-12243 for edismax
> pf/pf2/pf3 context. (Note that the patch on SOLR-12243 also addresses
> another problem that blocks eDismax from generating queries *at all* under
> the above-described circumstances.)
>
> I propose adding a new analyzeGraphPhrase() method that allows configuration
> of inOrder, which would allow eDismax to specify inOrder=false. The existing
> analyzeGraphPhrase() method would remain with its hard-coded inOrder=true, so
> existing client behavior would remain unchanged.
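
To make the "one phrase query per path" idea from the comment more concrete, here is a minimal sketch in plain Java. It is not Lucene code and uses no Lucene classes: the GraphPhrasePaths class, its Edge type, the example graph ("guinea pig" with the synonym "cavy", followed by "food") and the "..."~N string rendering are all hypothetical, chosen only for illustration. The sketch enumerates every path through a small acyclic token graph and renders each path as its own sloppy phrase, so the slop applies to each phrase independently. A real implementation would presumably build a PhraseQuery per path instead of a string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: this is not Lucene code and uses no Lucene classes.
final class GraphPhrasePaths {

    // One edge of a token graph: a term leading to another state.
    static final class Edge {
        final String term;
        final int to;
        Edge(String term, int to) { this.term = term; this.to = to; }
    }

    // Enumerates every term sequence from state `from` to state `end`.
    // Assumes the graph is acyclic.
    static List<List<String>> enumeratePaths(Map<Integer, List<Edge>> graph, int from, int end) {
        List<List<String>> out = new ArrayList<>();
        walk(graph, from, end, new ArrayList<>(), out);
        return out;
    }

    // Depth-first walk that records the terms seen so far in `prefix`.
    private static void walk(Map<Integer, List<Edge>> graph, int state, int end,
                             List<String> prefix, List<List<String>> out) {
        if (state == end) {
            out.add(new ArrayList<>(prefix));
            return;
        }
        for (Edge e : graph.getOrDefault(state, Collections.emptyList())) {
            prefix.add(e.term);
            walk(graph, e.to, end, prefix, out);
            prefix.remove(prefix.size() - 1); // backtrack
        }
    }

    // Renders each path as its own sloppy phrase, so the slop applies per path.
    static List<String> toSloppyPhrases(List<List<String>> paths, int slop) {
        List<String> phrases = new ArrayList<>();
        for (List<String> path : paths) {
            phrases.add("\"" + String.join(" ", path) + "\"~" + slop);
        }
        return phrases;
    }

    // Example graph: "guinea pig" and its synonym "cavy", followed by "food".
    static Map<Integer, List<Edge>> exampleGraph() {
        Map<Integer, List<Edge>> g = new HashMap<>();
        g.put(0, Arrays.asList(new Edge("guinea", 1), new Edge("cavy", 2)));
        g.put(1, Arrays.asList(new Edge("pig", 2)));
        g.put(2, Arrays.asList(new Edge("food", 3)));
        return g;
    }

    public static void main(String[] args) {
        List<List<String>> paths = enumeratePaths(exampleGraph(), 0, 3);
        for (String phrase : toSloppyPhrases(paths, 2)) {
            System.out.println(phrase);
        }
    }
}
```

In this example the shared term "food" appears in both generated phrases, which is the duplicate-term cost the comment mentions as the reason for the earlier switch to span queries.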