[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830965#comment-16830965 ]

Atri Sharma commented on LUCENE-8708:
-------------------------------------

ToStringInterface is a necessary evil required to create new PointRangeQuery instances after the intervals are merged. In the current state of affairs, BooleanQuery has no visibility into the type of the range query that it is dealing with (which is a great thing). However, that limits the ability to create new range queries of the parent type directly. ToStringInterface provides a polymorphic way to create new range queries in BooleanQuery.rewrite.

Regarding testInvalidPointLength, that test needs to be refactored to work with ToStringInterface. However, since that is not a blocker to the actual functionality, I chose not to spend time on it until we have more clarity on the direction in which we wish to head.

> Can we simplify conjunctions of range queries automatically?
> ------------------------------------------------------------
>
>                 Key: LUCENE-8708
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8708
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: interval_range_clauses_merging0704.patch
>
>
> BooleanQuery#rewrite already has some logic to make queries more efficient,
> such as deduplicating filters or rewriting boolean queries that wrap a single
> positive clause to that clause.
> It would be nice to also simplify conjunctions of range queries, so that eg.
> {{foo: [5 TO *] AND foo:[* TO 20]}} would be rewritten to {{foo:[5 TO 20]}}.
> When constructing queries manually or via the classic query parser, it feels
> unnecessary as this is something that the user can fix easily. However if you
> want to implement a query parser that only allows specifying one bound at
> once, such as Gmail ({{after:2018-12-31}}
> https://support.google.com/mail/answer/7190?hl=en) or GitHub
> ({{updated:>=2018-12-31}}
> https://help.github.com/en/articles/searching-issues-and-pull-requests#search-by-when-an-issue-or-pull-request-was-created-or-last-updated)
> then you might end up with inefficient queries if the end user specifies
> both an upper and a lower bound. It would be nice if we optimized those
> automatically.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
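The rewrite asked for in the issue description can be illustrated without any Lucene types. The following is only a minimal sketch, not the attached patch: the `Range` class, the `simplify` method, and the use of `Long.MIN_VALUE` / `Long.MAX_VALUE` to model an open bound (`*`) are all assumptions made for illustration. It intersects every AND-ed range on the same field, so that `foo:[5 TO *] AND foo:[* TO 20]` collapses to `foo:[5 TO 20]`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeConjunction {

    /** Hypothetical stand-in for a closed range clause on a single field. */
    static final class Range {
        final String field;
        final long lower; // Long.MIN_VALUE models an open lower bound ("*")
        final long upper; // Long.MAX_VALUE models an open upper bound ("*")

        Range(String field, long lower, long upper) {
            this.field = field;
            this.lower = lower;
            this.upper = upper;
        }

        /** True when the bounds have crossed, i.e. no value can match. */
        boolean isEmpty() {
            return lower > upper;
        }

        @Override
        public String toString() {
            String lo = lower == Long.MIN_VALUE ? "*" : Long.toString(lower);
            String hi = upper == Long.MAX_VALUE ? "*" : Long.toString(upper);
            return field + ":[" + lo + " TO " + hi + "]";
        }
    }

    /** Intersects all AND-ed ranges per field, leaving one range per field. */
    static List<Range> simplify(List<Range> conjunction) {
        Map<String, Range> byField = new LinkedHashMap<>();
        for (Range r : conjunction) {
            // A conjunction keeps the tightest bounds: max of lowers, min of uppers.
            byField.merge(r.field, r, (a, b) ->
                new Range(a.field, Math.max(a.lower, b.lower), Math.min(a.upper, b.upper)));
        }
        return new ArrayList<>(byField.values());
    }

    public static void main(String[] args) {
        List<Range> clauses = List.of(
            new Range("foo", 5, Long.MAX_VALUE),
            new Range("foo", Long.MIN_VALUE, 20));
        System.out.println(simplify(clauses)); // prints [foo:[5 TO 20]]
    }
}
```

If the merged range is empty (`isEmpty()` returns true), no document can satisfy the conjunction, so a real rewrite could presumably replace the whole clause set with a match-nothing query.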
[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830184#comment-16830184 ]

Michael McCandless commented on LUCENE-8708:
--------------------------------------------

Hmm why do we need the {{PointRangeQuery.ToStringInterface}}?

Also, why did we need to comment out that one test case, {{testInvalidPointLength}}?
[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821001#comment-16821001 ]

Atri Sharma commented on LUCENE-8708:
-------------------------------------

[~ivera] Thanks, that makes sense. I have created an issue for the same:

https://issues.apache.org/jira/browse/LUCENE-8769

However, I think that we should still optimize overlapping ranges as this issue proposes so that existing users also get the performance advantage.

[~jpountz] Any thoughts on how we could simplify the patch?
[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813239#comment-16813239 ]

Ignacio Vera commented on LUCENE-8708:
--------------------------------------

Just an idea, maybe biased by my background. One of the issues here is that we visit the tree once for each range, and this is what we are trying to improve. Maybe adding a query that can accept more than one range with a logical relationship ('AND', 'OR', ...) might be less invasive, and it would encapsulate the logic.
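To make the suggestion above concrete, here is a rough sketch of a single matcher that takes several ranges plus a logical relationship and evaluates them in one pass over the values, instead of one pass per range. Everything in it is hypothetical (`MultiRangeSketch`, `Relation`, `matcher`, `countMatches`); it operates on a plain `long[]` rather than on any index structure and says nothing about how such a query would actually be implemented in Lucene.

```java
import java.util.List;
import java.util.function.LongPredicate;

public class MultiRangeSketch {

    /** Hypothetical logical relationship between the ranges. */
    enum Relation { AND, OR }

    /** Hypothetical closed range [lower, upper]. */
    static final class Range {
        final long lower;
        final long upper;

        Range(long lower, long upper) {
            this.lower = lower;
            this.upper = upper;
        }

        boolean contains(long value) {
            return value >= lower && value <= upper;
        }
    }

    /** Builds one predicate that checks every range for a given value. */
    static LongPredicate matcher(Relation relation, List<Range> ranges) {
        return value -> {
            for (Range r : ranges) {
                boolean hit = r.contains(value);
                if (relation == Relation.AND && !hit) {
                    return false; // one miss fails a conjunction
                }
                if (relation == Relation.OR && hit) {
                    return true;  // one hit satisfies a disjunction
                }
            }
            // AND with no misses matches; OR with no hits does not.
            return relation == Relation.AND;
        };
    }

    /** Single pass over the values, regardless of how many ranges there are. */
    static long countMatches(long[] values, LongPredicate matcher) {
        long count = 0;
        for (long v : values) {
            if (matcher.test(v)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long[] values = {1, 7, 12, 18, 25, 40};
        List<Range> ranges = List.of(new Range(5, 20), new Range(10, 30));
        System.out.println(countMatches(values, matcher(Relation.AND, ranges))); // prints 2
        System.out.println(countMatches(values, matcher(Relation.OR, ranges)));  // prints 4
    }
}
```

The point of the design is that the range-combination logic lives inside one query object, so BooleanQuery would not need to know anything about range types.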
[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813199#comment-16813199 ]

Adrien Grand commented on LUCENE-8708:
--------------------------------------

Thanks Atri for giving it a try! This change is a bit too invasive for my taste given that this is only a nice-to-have feature. That said, I don't really have ideas for how to make it better...
[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811799#comment-16811799 ]

Atri Sharma commented on LUCENE-8708:
-------------------------------------

Attached is a WIP patch for the same. There is one existing test which needs to be refactored to comply with the new API, which I will do before the final commit. The intent of this patch is to get early feedback and identify potential blockers.

This commit introduces the concept of a ToString interface. While a bit of a controversial change, the ToString interface is necessary to allow creation of new range queries of a given type after the merge. I am happy to replace it with any alternative that seems saner.

[^interval_range_clauses_merging0704.patch]
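The underlying problem described above, creating a new range of the same concrete type from code that does not know that type, can be sketched generically. This is not the ToString interface from the patch, whose details are not shown in this thread; `MergeableRange`, `withBounds`, `DateRange`, and `PriceRange` are hypothetical names, and the sketch only illustrates one polymorphic way a caller could rebuild a merged range.

```java
public class RangeFactorySketch {

    /** Hypothetical contract: a range that can clone itself with new bounds. */
    interface MergeableRange {
        String field();
        long lower();
        long upper();
        MergeableRange withBounds(long lower, long upper);
    }

    /** Shared state for the hypothetical concrete range types below. */
    static abstract class BaseRange implements MergeableRange {
        final String field;
        final long lower;
        final long upper;

        BaseRange(String field, long lower, long upper) {
            this.field = field;
            this.lower = lower;
            this.upper = upper;
        }

        public String field() { return field; }
        public long lower() { return lower; }
        public long upper() { return upper; }
    }

    static final class DateRange extends BaseRange {
        DateRange(String field, long lower, long upper) { super(field, lower, upper); }

        public MergeableRange withBounds(long lower, long upper) {
            return new DateRange(field, lower, upper);
        }
    }

    static final class PriceRange extends BaseRange {
        PriceRange(String field, long lower, long upper) { super(field, lower, upper); }

        public MergeableRange withBounds(long lower, long upper) {
            return new PriceRange(field, lower, upper);
        }
    }

    /** The caller never names a concrete type; the range rebuilds itself. */
    static MergeableRange intersect(MergeableRange a, MergeableRange b) {
        if (!a.field().equals(b.field()) || a.getClass() != b.getClass()) {
            throw new IllegalArgumentException("ranges must share field and type");
        }
        return a.withBounds(Math.max(a.lower(), b.lower()), Math.min(a.upper(), b.upper()));
    }

    public static void main(String[] args) {
        MergeableRange merged = intersect(
            new DateRange("updated", 100, Long.MAX_VALUE),
            new DateRange("updated", Long.MIN_VALUE, 200));
        // prints DateRange updated:[100 TO 200]
        System.out.println(merged.getClass().getSimpleName() + " "
            + merged.field() + ":[" + merged.lower() + " TO " + merged.upper() + "]");
    }
}
```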
[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777169#comment-16777169 ]

Adrien Grand commented on LUCENE-8708:
--------------------------------------

Agreed, it'd be nice to have this sort of case optimized as well. I have seen it happen with automatically generated queries sometimes. I'm not going to work actively on it in the near future. Feel free to give it a try to see what this could look like.
[jira] [Commented] (LUCENE-8708) Can we simplify conjunctions of range queries automatically?
    [ https://issues.apache.org/jira/browse/LUCENE-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777138#comment-16777138 ]

Atri Sharma commented on LUCENE-8708:
-------------------------------------

We could extend this approach to identify overlapping ranges ([5, 20], [15, 35] can be converted to 5 to 35). I can take a crack at this one, if you are not planning to actively work on it.
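For the overlapping example above, it may help to spell out both ways two ranges can be combined. Merging [5, 20] and [15, 35] into 5 to 35 is the union, which is the right result when the clauses are OR-ed; when they are AND-ed, as in the conjunctions this issue is about, the result is the intersection [15, 20]. The sketch below uses hypothetical names (`OverlappingRanges`, `Range`, `union`, `intersection`) and is not taken from any patch.

```java
import java.util.Optional;

public class OverlappingRanges {

    /** Hypothetical closed range [lower, upper]. */
    static final class Range {
        final long lower;
        final long upper;

        Range(long lower, long upper) {
            this.lower = lower;
            this.upper = upper;
        }

        @Override
        public String toString() {
            return "[" + lower + " TO " + upper + "]";
        }
    }

    /** Two closed ranges overlap when each starts before the other ends. */
    static boolean overlaps(Range a, Range b) {
        return a.lower <= b.upper && b.lower <= a.upper;
    }

    /** OR semantics: overlapping ranges merge into one covering range. */
    static Optional<Range> union(Range a, Range b) {
        if (!overlaps(a, b)) {
            return Optional.empty(); // disjoint ranges cannot become a single range
        }
        return Optional.of(new Range(Math.min(a.lower, b.lower), Math.max(a.upper, b.upper)));
    }

    /** AND semantics: only the shared part of the two ranges remains. */
    static Optional<Range> intersection(Range a, Range b) {
        if (!overlaps(a, b)) {
            return Optional.empty(); // nothing can match both ranges
        }
        return Optional.of(new Range(Math.max(a.lower, b.lower), Math.min(a.upper, b.upper)));
    }

    public static void main(String[] args) {
        Range a = new Range(5, 20);
        Range b = new Range(15, 35);
        System.out.println(union(a, b).get());        // prints [5 TO 35]
        System.out.println(intersection(a, b).get()); // prints [15 TO 20]
    }
}
```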