[jira] [Comment Edited] (LUCENE-8531) QueryBuilder hard-codes inOrder=true for generated sloppy span near queries

Michael Gibney (JIRA) Tue, 23 Oct 2018 13:36:24 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661087#comment-16661087
 ]


Michael Gibney edited comment on LUCENE-8531 at 10/23/18 8:35 PM:
------------------------------------------------------------------

> I think we should keep the default behavior as is. You can still override 
> QueryBuilder#analyzeGraphPhrase to apply a different logic on your side if 
> you want.

Certainly agreed the default behavior should be left as-is. I'm content with 
the flexibility to override, but my suggestion was based on a sense that the 
desire to support {{inOrder=true}} could be a pretty common use case.

The API does specify "phrase", but with a lower-case "p"; does this necessarily 
imply that exclusively {{PhraseQuery}} semantics _should_ be supported? It's 
the de facto case that {{PhraseQuery}} semantics _have been_ supported, so it 
definitely makes sense for that to continue to be the default – but I don't 
think it'd be unreasonable to add configurable stock support for 
{{inOrder=true}}. If such support were to be added, {{QueryBuilder}} would seem 
like a logical place to do it, and since the logic necessary to implement it is 
already here (in {{analyzeGraphPhrase}}), it should be a trivial addition.

I'm thinking something along the lines of splitting the {{SpanNearQuery}} part 
of {{analyzeGraphPhrase()}} (everything after the "{{if (phraseSlop > 0)}}" 
short-circuit) into its own method. Even if split into a protected method, this 
would allow any override of {{analyzeGraphPhrase()}} to more cleanly leverage 
the existing logic for building {{SpanNearQuery}}.
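
As a rough illustration of that split, here is a minimal sketch. It does not use Lucene: {{SpanNearStub}}, {{QueryBuilderSketch}}, {{InOrderQueryBuilderSketch}} and the method name {{analyzeGraphSpanNear}} are all hypothetical stand-ins, terms are plain strings rather than a token stream, and the body of the short-circuit branch is only a placeholder since its real contents are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for Lucene's SpanNearQuery: it only records what it was built with. */
final class SpanNearStub {
    final List<String> clauses;
    final int slop;
    final boolean inOrder;

    SpanNearStub(List<String> clauses, int slop, boolean inOrder) {
        this.clauses = new ArrayList<>(clauses);
        this.slop = slop;
        this.inOrder = inOrder;
    }

    @Override
    public String toString() {
        return "spanNear(" + clauses + ", slop=" + slop + ", inOrder=" + inOrder + ")";
    }
}

/** Hypothetical, heavily simplified stand-in for QueryBuilder. */
class QueryBuilderSketch {

    /** Existing entry point; its default behavior is left untouched. */
    protected Object analyzeGraphPhrase(List<String> terms, int phraseSlop) {
        if (phraseSlop > 0) {
            // Placeholder for the short-circuit branch (assumption: real body not shown here).
            return "defaultSloppyPhrase(" + terms + ", slop=" + phraseSlop + ")";
        }
        return analyzeGraphSpanNear(terms, phraseSlop, true);
    }

    /** The extracted piece (hypothetical name): everything after the short-circuit. */
    protected SpanNearStub analyzeGraphSpanNear(List<String> terms, int slop, boolean inOrder) {
        return new SpanNearStub(terms, slop, inOrder);
    }
}

/** An override that opts in to ordered span matching even when slop > 0. */
class InOrderQueryBuilderSketch extends QueryBuilderSketch {
    @Override
    protected Object analyzeGraphPhrase(List<String> terms, int phraseSlop) {
        // Reuses the extracted span-building logic instead of copying it.
        return analyzeGraphSpanNear(terms, phraseSlop, true);
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("new", "york");
        System.out.println(new QueryBuilderSketch().analyzeGraphPhrase(terms, 2));
        System.out.println(new InOrderQueryBuilderSketch().analyzeGraphPhrase(terms, 2));
    }
}
```

The point of the sketch is only that a subclass can get {{inOrder=true}} behavior with a one-line override once the span-building code is reachable on its own, while the default path stays as it is.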

I'm just explaining my thinking here; I guess the decision ultimately depends 
on how general a use case folks consider {{inOrder=true}} to be.


> QueryBuilder hard-codes inOrder=true for generated sloppy span near queries
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-8531
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8531
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Major
>             Fix For: 7.6, master (8.0)
>
>         Attachments: LUCENE-8531.patch
>
>
> QueryBuilder.analyzeGraphPhrase() generates SpanNearQuery-s with passed-in 
> phraseSlop, but hard-codes inOrder ctor param as true.
> Before multi-term synonym support and graph token streams introduced the 
> possibility of generating SpanNearQuery-s, QueryBuilder generated 
> (Multi)PhraseQuery-s, which always interpret slop as allowing reordering 
> edits.  Solr's eDismax query parser generates phrase queries when its 
> pf/pf2/pf3 params are specified, and when multi-term synonyms are used with a 
> graph-aware synonym filter, SpanNearQuery-s are generated that require 
> clauses to be in order; unlike with (Multi)PhraseQuery-s, reordering edits 
> are not allowed, so this is a kind of regression.  See SOLR-12243 for edismax 
> pf/pf2/pf3 context.  (Note that the patch on SOLR-12243 also addresses 
> another problem that blocks eDismax from generating queries *at all* under 
> the above-described circumstances.)
> I propose adding a new analyzeGraphPhrase() method that allows configuration 
> of inOrder, which would allow eDismax to specify inOrder=false.  The existing 
> analyzeGraphPhrase() method would remain with its hard-coded inOrder=true, so 
> existing client behavior would remain unchanged.
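
To make the ordering difference described above concrete, here is a toy model, not Lucene code. It assumes two single-term clauses and treats a document as a list of tokens; {{NearToy}} and {{near}} are hypothetical names, and the model covers only span-near style proximity (the gap between the two terms), not {{PhraseQuery}} slop handling.

```java
import java.util.Arrays;
import java.util.List;

final class NearToy {

    /**
     * Returns true if some occurrence of `first` and some occurrence of `second`
     * have at most `slop` tokens between them. With inOrder=true, `second` must
     * also come after `first`.
     */
    static boolean near(List<String> doc, String first, String second, int slop, boolean inOrder) {
        for (int i = 0; i < doc.size(); i++) {
            if (!doc.get(i).equals(first)) {
                continue;
            }
            for (int j = 0; j < doc.size(); j++) {
                if (i == j || !doc.get(j).equals(second)) {
                    continue;
                }
                if (inOrder && j < i) {
                    continue; // reordered occurrence is rejected when order is required
                }
                int gap = Math.abs(j - i) - 1; // tokens strictly between the two terms
                if (gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> reversed = Arrays.asList("fox", "quick");
        System.out.println(near(reversed, "quick", "fox", 1, true));
        System.out.println(near(reversed, "quick", "fox", 1, false));
    }
}
```

In this model a document containing the terms in reverse order is rejected whenever inOrder is true, whatever the slop, which is the behavior the description calls a regression for callers that expect reordering to be allowed.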



