[
https://issues.apache.org/jira/browse/LUCENE-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erik Hatcher updated LUCENE-7638:
---------------------------------
Fix Version/s: 6.5
> Optimize graph query produced by QueryBuilder
> ---------------------------------------------
>
> Key: LUCENE-7638
> URL: https://issues.apache.org/jira/browse/LUCENE-7638
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Jim Ferenczi
> Fix For: 6.5
>
> Attachments: LUCENE-7638.patch, LUCENE-7638.patch
>
>
> The QueryBuilder creates a graph query when the underlying TokenStream
> contains a token whose PositionLengthAttribute is greater than 1.
> These TokenStreams are in fact graphs (lattices, to be more precise) in
> which synonyms can span multiple terms.
> Currently the graph query is built by visiting all the paths of the graph
> TokenStream. For instance, if you have a synonym like "ny, new york" and you
> search for "new york city", the query builder would produce two paths:
> "new york city", "ny city"
> This can quickly explode when the number of multi-term synonyms increases.
> The query "ny ny", for instance, would produce 4 paths, and so on.
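The growth described above can be reproduced with a small standalone sketch.
It does not use Lucene: the Arc class, the node numbering and the helper
names are assumptions made for illustration, and each arc stands for one
token whose position length is (to - from).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GraphPathSketch {
    /** Hypothetical token arc: term text plus start and end node in the graph. */
    static final class Arc {
        final String term;
        final int from;
        final int to;
        Arc(String term, int from, int to) {
            this.term = term;
            this.from = from;
            this.to = to;
        }
    }

    /** Enumerates every path from start to end as a space-separated term string. */
    static List<String> allPaths(List<Arc> arcs, int start, int end) {
        List<String> out = new ArrayList<>();
        walk(arcs, start, end, "", out);
        return out;
    }

    private static void walk(List<Arc> arcs, int node, int end, String prefix, List<String> out) {
        if (node == end) {
            out.add(prefix.trim());
            return;
        }
        for (Arc a : arcs) {
            if (a.from == node) {
                walk(arcs, a.to, end, prefix + " " + a.term, out);
            }
        }
    }

    /** "new york city" with the synonym "ny" spanning "new york". */
    static List<Arc> newYorkCity() {
        return Arrays.asList(
            new Arc("new", 0, 1), new Arc("york", 1, 2),
            new Arc("ny", 0, 2), new Arc("city", 2, 3));
    }

    /** "ny ny", assuming each "ny" is also expanded to "new york". */
    static List<Arc> nyNy() {
        return Arrays.asList(
            new Arc("ny", 0, 2), new Arc("new", 0, 1), new Arc("york", 1, 2),
            new Arc("ny", 2, 4), new Arc("new", 2, 3), new Arc("york", 3, 4));
    }

    public static void main(String[] args) {
        System.out.println(allPaths(newYorkCity(), 0, 3)); // 2 paths
        System.out.println(allPaths(nyNy(), 0, 4).size()); // 4 paths
    }
}
```

Each additional multi-term synonym in the query multiplies the number of
paths by the number of alternatives at that point, which is where the
explosion comes from.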
> For boolean queries with should or must clauses it should be more efficient
> to build a boolean query that merges all the intersections in the graph. So
> instead of "new york city", "ny city" we could produce:
> "+((+new +york) ny) +city"
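One way to picture the merged form is to split the graph at every node that
all paths pass through and to expand paths only inside each segment. The
sketch below is an illustration under that assumption, not the code from the
attached patch; it builds a query string rather than Lucene Query objects,
and the Arc class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergedGraphQuerySketch {
    /** Hypothetical token arc: term text plus start and end node in the graph. */
    static final class Arc {
        final String term;
        final int from;
        final int to;
        Arc(String term, int from, int to) {
            this.term = term;
            this.from = from;
            this.to = to;
        }
    }

    /** A node is a cut point when no arc jumps over it, so every path visits it. */
    static boolean isCutNode(List<Arc> arcs, int node) {
        for (Arc a : arcs) {
            if (a.from < node && node < a.to) {
                return false;
            }
        }
        return true;
    }

    /** Collects the paths of one segment; multi-term paths become "(+a +b)". */
    static void paths(List<Arc> arcs, int node, int end, List<String> terms, List<String> out) {
        if (node == end) {
            out.add(terms.size() == 1 ? terms.get(0) : "(+" + String.join(" +", terms) + ")");
            return;
        }
        for (Arc a : arcs) {
            if (a.from == node && a.to <= end) {
                terms.add(a.term);
                paths(arcs, a.to, end, terms, out);
                terms.remove(terms.size() - 1); // backtrack
            }
        }
    }

    /** Builds one required clause per segment between consecutive cut nodes. */
    static String build(List<Arc> arcs, int end) {
        List<String> clauses = new ArrayList<>();
        int segStart = 0;
        for (int node = 1; node <= end; node++) {
            if (!isCutNode(arcs, node)) {
                continue;
            }
            List<String> alts = new ArrayList<>();
            paths(arcs, segStart, node, new ArrayList<>(), alts);
            clauses.add(alts.size() == 1 ? "+" + alts.get(0) : "+(" + String.join(" ", alts) + ")");
            segStart = node;
        }
        return String.join(" ", clauses);
    }

    /** "new york city" with the synonym "ny" spanning "new york". */
    static List<Arc> newYorkCity() {
        return Arrays.asList(
            new Arc("new", 0, 1), new Arc("york", 1, 2),
            new Arc("ny", 0, 2), new Arc("city", 2, 3));
    }

    /** "ny ny", assuming each "ny" is also expanded to "new york". */
    static List<Arc> nyNy() {
        return Arrays.asList(
            new Arc("ny", 0, 2), new Arc("new", 0, 1), new Arc("york", 1, 2),
            new Arc("ny", 2, 4), new Arc("new", 2, 3), new Arc("york", 3, 4));
    }

    public static void main(String[] args) {
        System.out.println(build(newYorkCity(), 3)); // +((+new +york) ny) +city
        System.out.println(build(nyNy(), 4));        // +(ny (+new +york)) +(ny (+new +york))
    }
}
```

With this shape the "ny ny" query yields two required clauses with two
alternatives each, instead of four full paths.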
> The attached patch is a proposal to do that instead of the all-paths solution.
> The patch transforms multi-term synonyms into a graph query for each
> intersection in the graph. This is not done in this patch, but we could also
> create a specialized query that gives equivalent scores to multi-term
> synonyms, like the SynonymQuery does for single-term synonyms.
> For phrase queries this patch does not change the current behavior, but we
> could also use the new method to create an optimized graph SpanQuery.
> [~mattweber] I think this patch could optimize a lot of cases where multiple
> multi-term synonyms are present in a single request. Could you take a look?