[ https://issues.apache.org/jira/browse/LUCENE-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661087#comment-16661087 ]
Michael Gibney edited comment on LUCENE-8531 at 10/23/18 8:35 PM:
------------------------------------------------------------------

> I think we should keep the default behavior as is. You can still override
> QueryBuilder#analyzeGraphPhrase to apply a different logic on your side if
> you want.

Certainly agreed the default behavior should be left as-is. I'm content with the flexibility to override, but my suggestion was based on a sense that the desire to support {{inOrder=true}} could be a pretty common use case.

The API does specify "phrase", but with a lower-case "p"; does this necessarily imply that exclusively {{PhraseQuery}} semantics _should_ be supported? It's the de facto case that {{PhraseQuery}} semantics _have been_ supported, so it definitely makes sense for that to continue to be the default – but I don't think it'd be unreasonable to add configurable stock support for {{inOrder=true}}.

If such support were to be added, {{QueryBuilder}} would seem like a logical place to do it, and since the logic necessary to implement it is already here (in {{analyzeGraphPhrase}}), it should be a trivial addition. I'm thinking something along the lines of splitting the {{SpanNearQuery}} part of {{analyzeGraphPhrase()}} (everything after the "{{if (phraseSlop > 0)}}" short-circuit) into its own method. Even if split into a protected method, this would allow any override of {{analyzeGraphPhrase()}} to more cleanly leverage the existing logic for building {{SpanNearQuery}}.

I'm just explaining my thinking here; I guess the decision ultimately depends on how general a use case folks consider {{inOrder=true}} to be.

> QueryBuilder hard-codes inOrder=true for generated sloppy span near queries
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-8531
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8531
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Major
>             Fix For: 7.6, master (8.0)
>
>         Attachments: LUCENE-8531.patch
>
>
> QueryBuilder.analyzeGraphPhrase() generates SpanNearQuery-s with passed-in
> phraseSlop, but hard-codes inOrder ctor param as true.
>
> Before multi-term synonym support and graph token streams introduced the
> possibility of generating SpanNearQuery-s, QueryBuilder generated
> (Multi)PhraseQuery-s, which always interpret slop as allowing reordering
> edits. Solr's eDismax query parser generates phrase queries when its
> pf/pf2/pf3 params are specified, and when multi-term synonyms are used with a
> graph-aware synonym filter, SpanNearQuery-s are generated that require
> clauses to be in order; unlike with (Multi)PhraseQuery-s, reordering edits
> are not allowed, so this is a kind of regression. See SOLR-12243 for edismax
> pf/pf2/pf3 context. (Note that the patch on SOLR-12243 also addresses
> another problem that blocks eDismax from generating queries *at all* under
> the above-described circumstances.)
>
> I propose adding a new analyzeGraphPhrase() method that allows configuration
> of inOrder, which would allow eDismax to specify inOrder=false. The existing
> analyzeGraphPhrase() method would remain with its hard-coded inOrder=true, so
> existing client behavior would remain unchanged.
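
For readers following along, the overload proposed in the issue description can be pictured with a small self-contained sketch. This is a toy model, not Lucene code: the class `InOrderSketch`, the `NearQuery` type, and the `matches` helper are all hypothetical names invented for illustration, and the two `analyzeGraphPhrase` signatures here only mimic the shape of the proposal (they take a plain list of terms rather than a token stream). The matcher assumes each query term is distinct and occurs at most once in the document, and it counts slop as the number of positions inside the matched window that are not covered by a query term, which is only loosely modeled on how span-near slop behaves.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are Lucene classes.
class InOrderSketch {

    /** Hypothetical stand-in for a sloppy near query over single terms. */
    static final class NearQuery {
        final List<String> terms;
        final int slop;
        final boolean inOrder;

        NearQuery(List<String> terms, int slop, boolean inOrder) {
            this.terms = terms;
            this.slop = slop;
            this.inOrder = inOrder;
        }
    }

    /** Mirrors the existing entry point: inOrder stays hard-coded to true. */
    static NearQuery analyzeGraphPhrase(List<String> terms, int phraseSlop) {
        return analyzeGraphPhrase(terms, phraseSlop, true);
    }

    /** Hypothetical overload that lets the caller choose inOrder. */
    static NearQuery analyzeGraphPhrase(List<String> terms, int phraseSlop, boolean inOrder) {
        return new NearQuery(terms, phraseSlop, inOrder);
    }

    /**
     * Simplified matcher. Assumes the query terms are distinct and that each
     * occurs at most once in doc; real span matching is more involved.
     */
    static boolean matches(NearQuery q, List<String> doc) {
        if (q.terms.isEmpty()) {
            return false;
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        int previous = -1;
        for (String term : q.terms) {
            int pos = doc.indexOf(term);
            if (pos < 0) {
                return false; // a required term is missing
            }
            if (q.inOrder && pos <= previous) {
                return false; // terms appear out of query order
            }
            previous = pos;
            min = Math.min(min, pos);
            max = Math.max(max, pos);
        }
        // Positions inside the matched window not covered by a query term.
        int gaps = (max - min + 1) - q.terms.size();
        return gaps <= q.slop;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("quick", "brown", "fox");
        List<String> reversed = Arrays.asList("fox", "quick");
        // Existing behavior (inOrder=true): reversed terms do not match.
        System.out.println(matches(analyzeGraphPhrase(reversed, 2), doc));        // false
        // Proposed overload with inOrder=false: reordering is tolerated.
        System.out.println(matches(analyzeGraphPhrase(reversed, 2, false), doc)); // true
    }
}
```

The point of the delegation is the one made in the description: callers of the two-argument method see no change, while a caller such as eDismax could opt into inOrder=false through the three-argument form.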