[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860936#comment-16860936 ] ASF subversion and git services commented on LUCENE-8845: - Commit eee1bc72a4c95bbc3e148712d404299a90c0c2f9 in lucene-solr's branch refs/heads/branch_8x from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=eee1bc7 ] LUCENE-8845: Add additional max boolean clause cap on expansion > Allow maxExpansions to be set on multi-term Intervals > - > > Key: LUCENE-8845 > URL: https://issues.apache.org/jira/browse/LUCENE-8845 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Major > Fix For: 8.2 > > Attachments: LUCENE-8845.patch > > > MultiTermIntervalsSource has a maxExpansions parameter which is always set to > 128 by the factory methods Intervals.prefix() and Intervals.wildcard(). We > should keep 128 as the default, but also add additional methods that take a > configurable maximum. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860935#comment-16860935 ] ASF subversion and git services commented on LUCENE-8845: Commit 7a2b96510621e3da57f9a9c85847141ba3042559 in lucene-solr's branch refs/heads/master from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7a2b965 ] LUCENE-8845: Add additional max boolean clause cap on expansion
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860693#comment-16860693 ] Jim Ferenczi commented on LUCENE-8845: +1
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860689#comment-16860689 ] Alan Woodward commented on LUCENE-8845: I can add the following:
{code}
diff --git a/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java b/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java
index 213ef730476..9c9b5f95c28 100644
--- a/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java
+++ b/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java
@@ -27,6 +27,7 @@ import java.util.Objects;
 import org.apache.lucene.index.LeafReaderContext;
 import org.apache.lucene.index.Terms;
 import org.apache.lucene.index.TermsEnum;
+import org.apache.lucene.search.BooleanQuery;
 import org.apache.lucene.search.MatchesIterator;
 import org.apache.lucene.search.MatchesUtils;
 import org.apache.lucene.search.QueryVisitor;
@@ -41,6 +42,10 @@ class MultiTermIntervalsSource extends IntervalsSource {
   MultiTermIntervalsSource(CompiledAutomaton automaton, int maxExpansions, String pattern) {
     this.automaton = automaton;
+    if (maxExpansions > BooleanQuery.getMaxClauseCount()) {
+      throw new IllegalArgumentException("maxExpansions [" + maxExpansions
+          + "] cannot be greater than BooleanQuery.getMaxClauseCount [" + BooleanQuery.getMaxClauseCount() + "]");
+    }
     this.maxExpansions = maxExpansions;
     this.pattern = pattern;
   }
{code}
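For readers without a Lucene checkout, the guard proposed in the diff above can be tried in isolation. This is only a sketch: the class name, the helper name `checkMaxExpansions`, and the constant `MAX_CLAUSE_COUNT` are hypothetical stand-ins, and the value 1024 is assumed here in place of whatever `BooleanQuery.getMaxClauseCount()` returns at runtime.

```java
public class ExpansionCapSketch {
    // Assumed stand-in for BooleanQuery.getMaxClauseCount(); the real value is configurable.
    static final int MAX_CLAUSE_COUNT = 1024;

    // Hypothetical helper: rejects an expansion limit above the clause cap,
    // using the same message format as the constructor check in the diff.
    static int checkMaxExpansions(int maxExpansions) {
        if (maxExpansions > MAX_CLAUSE_COUNT) {
            throw new IllegalArgumentException("maxExpansions [" + maxExpansions
                + "] cannot be greater than BooleanQuery.getMaxClauseCount [" + MAX_CLAUSE_COUNT + "]");
        }
        return maxExpansions;
    }

    public static void main(String[] args) {
        // A limit at or below the cap is returned unchanged.
        System.out.println(checkMaxExpansions(128));
        try {
            // A limit above the cap fails fast, at construction time rather than at query time.
            checkMaxExpansions(5000);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing in the constructor means a misconfigured limit is reported when the source is built, not when an expensive query is already running.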
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860252#comment-16860252 ] Jim Ferenczi commented on LUCENE-8845: {quote} 2) I think this is covered by the javadocs and the 'expert' marking. Some users really do need to see all expansions, and if they're aware of the trade-offs involved then I don't think we need any further hard caps. {quote} I think we should try to prevent users from shooting themselves in the foot. IMO this is more important here than for other queries, because reaching the limit throws an error, so I expect users to keep raising the limit until they find a number that works for all of their queries. Can we add a hard limit equal to max_boolean_clause? This would be consistent with the discussion in https://issues.apache.org/jira/browse/LUCENE-8811, which should also check the number of sources in an interval query.
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860071#comment-16860071 ] ASF subversion and git services commented on LUCENE-8845: Commit 74a695ee444d195421c9e5cc28a813b42cea in lucene-solr's branch refs/heads/branch_8x from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=74a695e ] LUCENE-8845: Allow configurable maxExpansions for prefix/wildcard intervals
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860072#comment-16860072 ] ASF subversion and git services commented on LUCENE-8845: Commit e8950f4a528605f9be17c644eef4f47d0659317b in lucene-solr's branch refs/heads/master from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e8950f4 ] LUCENE-8845: Allow configurable maxExpansions for prefix/wildcard intervals
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860030#comment-16860030 ] Atri Sharma commented on LUCENE-8845: ++, LGTM
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860008#comment-16860008 ] Alan Woodward commented on LUCENE-8845: Thanks for the review! 1) Good idea 2) I think this is covered by the javadocs and the 'expert' marking. Some users really do need to see all expansions, and if they're aware of the trade-offs involved then I don't think we need any further hard caps. 3) This is a general accuracy improvement, which was only picked up when I added specific tests for the expansion cap.
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860006#comment-16860006 ] Atri Sharma commented on LUCENE-8845: Patch looks good and is a useful change! Here are some minor comments: 1) Should we mark the new method as expert? 2) Should we still have a "hard" cap on the number of expansions we can support? It would be prudent to have a limit so that users do not specify too high a value and then see unexpected and drastic changes in latency. 3) There is a change in the check of the number of terms against the maximum number of terms allowed (the increment now happens before the check rather than after). Is that unrelated to this change, and simply a general accuracy improvement?
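Point 3 is easier to see with a small counter experiment. The patch itself is not reproduced in this thread, so the sketch below is a hypothetical illustration rather than the actual Lucene code: the class and method names are invented, and it only shows why the order of increment and comparison changes how many terms get through.

```java
public class ExpansionCountSketch {
    // Hypothetical: compare first, then increment (count++ > max).
    // Returns the number of terms accepted before the cap trips.
    static int acceptedWithPostIncrement(int totalTerms, int maxExpansions) {
        int count = 0;
        int accepted = 0;
        for (int i = 0; i < totalTerms; i++) {
            if (count++ > maxExpansions) {
                break; // cap tripped
            }
            accepted++;
        }
        return accepted;
    }

    // Hypothetical: increment first, then compare (++count > max).
    static int acceptedWithPreIncrement(int totalTerms, int maxExpansions) {
        int count = 0;
        int accepted = 0;
        for (int i = 0; i < totalTerms; i++) {
            if (++count > maxExpansions) {
                break; // cap tripped
            }
            accepted++;
        }
        return accepted;
    }

    public static void main(String[] args) {
        // With 10 candidate terms and a cap of 3, the two checks differ by one term.
        System.out.println(acceptedWithPostIncrement(10, 3));
        System.out.println(acceptedWithPreIncrement(10, 3));
    }
}
```

In this sketch, comparing before incrementing lets one term more than the stated maximum through, while incrementing first stops at exactly the maximum, which is the kind of off-by-one a dedicated test for the cap would expose.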
[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals
[ https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860003#comment-16860003 ] Alan Woodward commented on LUCENE-8845: Here is a patch that adds factory methods for both prefix() and wildcard(), along with tests and some suitably apocalyptic warnings in the associated javadocs.
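The shape of the change described in the issue (keep 128 as the default, add overloads that take a configurable maximum) can be sketched without Lucene on the classpath. Everything below is a hypothetical stand-in: `MultiTermSource`, the `String` parameters, and the class name are invented for illustration and are not the signatures of the real `Intervals.prefix()` and `Intervals.wildcard()` methods.

```java
public class IntervalFactorySketch {
    // The default named in the issue description.
    static final int DEFAULT_MAX_EXPANSIONS = 128;

    // Hypothetical stand-in for a multi-term intervals source.
    static final class MultiTermSource {
        final String pattern;
        final int maxExpansions;

        MultiTermSource(String pattern, int maxExpansions) {
            this.pattern = pattern;
            this.maxExpansions = maxExpansions;
        }
    }

    // Existing-style factory: callers who pass no limit keep the default of 128.
    static MultiTermSource prefix(String prefix) {
        return prefix(prefix, DEFAULT_MAX_EXPANSIONS);
    }

    // New-style "expert" overload: the caller chooses the limit and accepts the cost.
    static MultiTermSource prefix(String prefix, int maxExpansions) {
        return new MultiTermSource(prefix + "*", maxExpansions);
    }

    // The same pair of overloads for wildcard patterns.
    static MultiTermSource wildcard(String wildcard) {
        return wildcard(wildcard, DEFAULT_MAX_EXPANSIONS);
    }

    static MultiTermSource wildcard(String wildcard, int maxExpansions) {
        return new MultiTermSource(wildcard, maxExpansions);
    }

    public static void main(String[] args) {
        System.out.println(prefix("luc").maxExpansions);
        System.out.println(wildcard("l?c*ne", 512).maxExpansions);
    }
}
```

Delegating the one-argument overloads to the two-argument ones keeps the default in a single place, so existing callers see no behaviour change.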