[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860936#comment-16860936
 ] 

ASF subversion and git services commented on LUCENE-8845:
-

Commit eee1bc72a4c95bbc3e148712d404299a90c0c2f9 in lucene-solr's branch 
refs/heads/branch_8x from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=eee1bc7 ]

LUCENE-8845: Add additional max boolean clause cap on expansion


> Allow maxExpansions to be set on multi-term Intervals
> -
>
>                 Key: LUCENE-8845
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8845
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>             Fix For: 8.2
>
>         Attachments: LUCENE-8845.patch
>
>
> MultiTermIntervalsSource has a maxExpansions parameter which is always set to 
> 128 by the factory methods Intervals.prefix() and Intervals.wildcard().  We 
> should keep 128 as the default, but also add additional methods that take a 
> configurable maximum.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860935#comment-16860935
 ] 

ASF subversion and git services commented on LUCENE-8845:
-

Commit 7a2b96510621e3da57f9a9c85847141ba3042559 in lucene-solr's branch 
refs/heads/master from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7a2b965 ]

LUCENE-8845: Add additional max boolean clause cap on expansion





[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-11 Thread Jim Ferenczi (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860693#comment-16860693
 ] 

Jim Ferenczi commented on LUCENE-8845:
--

+1




[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-11 Thread Alan Woodward (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860689#comment-16860689
 ] 

Alan Woodward commented on LUCENE-8845:
---

I can add the following:
{code}
diff --git a/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java b/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java
index 213ef730476..9c9b5f95c28 100644
--- a/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java
+++ b/lucene/sandbox/src/java/org/apache/lucene/search/intervals/MultiTermIntervalsSource.java
@@ -27,6 +27,7 @@ import java.util.Objects;
 import org.apache.lucene.index.LeafReaderContext;
 import org.apache.lucene.index.Terms;
 import org.apache.lucene.index.TermsEnum;
+import org.apache.lucene.search.BooleanQuery;
 import org.apache.lucene.search.MatchesIterator;
 import org.apache.lucene.search.MatchesUtils;
 import org.apache.lucene.search.QueryVisitor;
@@ -41,6 +42,10 @@ class MultiTermIntervalsSource extends IntervalsSource {
 
   MultiTermIntervalsSource(CompiledAutomaton automaton, int maxExpansions, String pattern) {
     this.automaton = automaton;
+    if (maxExpansions > BooleanQuery.getMaxClauseCount()) {
+      throw new IllegalArgumentException("maxExpansions [" + maxExpansions
+          + "] cannot be greater than BooleanQuery.getMaxClauseCount [" + BooleanQuery.getMaxClauseCount() + "]");
+    }
     this.maxExpansions = maxExpansions;
     this.pattern = pattern;
   }
{code}
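
For readers without a Lucene checkout, the check in the diff above can be tried in isolation. The following is only a minimal sketch: the class name ExpansionCapSketch, the method validateMaxExpansions and the constant ASSUMED_MAX_CLAUSE_COUNT are hypothetical, and the cap is passed in as a plain int rather than read from BooleanQuery.getMaxClauseCount().

```java
public class ExpansionCapSketch {

  // Hypothetical stand-in for the value returned by BooleanQuery.getMaxClauseCount().
  // The number itself is an assumption made for this sketch.
  static final int ASSUMED_MAX_CLAUSE_COUNT = 1024;

  /**
   * Mirrors the constructor check from the diff: a maxExpansions value above the
   * clause cap is rejected up front, anything at or below the cap is returned unchanged.
   */
  static int validateMaxExpansions(int maxExpansions, int maxClauseCount) {
    if (maxExpansions > maxClauseCount) {
      throw new IllegalArgumentException("maxExpansions [" + maxExpansions
          + "] cannot be greater than BooleanQuery.getMaxClauseCount [" + maxClauseCount + "]");
    }
    return maxExpansions;
  }

  public static void main(String[] args) {
    // The default of 128 from the issue description is well within the assumed cap.
    System.out.println(validateMaxExpansions(128, ASSUMED_MAX_CLAUSE_COUNT));
    try {
      validateMaxExpansions(5000, ASSUMED_MAX_CLAUSE_COUNT);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Note that the comparison is strict, so a value exactly equal to the cap is still accepted.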




[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread Jim Ferenczi (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860252#comment-16860252
 ] 

Jim Ferenczi commented on LUCENE-8845:
--

{quote}
2) I think this is covered by the javadocs and the 'expert' marking. Some users 
really do need to see all expansions, and if they're aware of the trade-offs 
involved then I don't think we need any further hard caps.
{quote}

I think we should try to prevent users from shooting themselves in the foot. 
IMO this matters more here than for other queries, because reaching the limit 
throws an error, so I expect users to keep raising the limit until they find a 
number that works for all of their queries. Can we add a hard limit equal to 
max_boolean_clause? This would be consistent with the discussion in 
https://issues.apache.org/jira/browse/LUCENE-8811, which should also check the 
number of sources in an interval query?




[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860071#comment-16860071
 ] 

ASF subversion and git services commented on LUCENE-8845:
-

Commit 74a695ee444d195421c9e5cc28a813b42cea in lucene-solr's branch 
refs/heads/branch_8x from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=74a695e ]

LUCENE-8845: Allow configurable maxExpansions for prefix/wildcard intervals





[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860072#comment-16860072
 ] 

ASF subversion and git services commented on LUCENE-8845:
-

Commit e8950f4a528605f9be17c644eef4f47d0659317b in lucene-solr's branch 
refs/heads/master from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e8950f4 ]

LUCENE-8845: Allow configurable maxExpansions for prefix/wildcard intervals





[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread Atri Sharma (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860030#comment-16860030
 ] 

Atri Sharma commented on LUCENE-8845:
-

++, LGTM




[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread Alan Woodward (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860008#comment-16860008
 ] 

Alan Woodward commented on LUCENE-8845:
---

Thanks for the review!

1) Good idea
2) I think this is covered by the javadocs and the 'expert' marking.  Some 
users really do need to see all expansions, and if they're aware of the 
trade-offs involved then I don't think we need any further hard caps.
3) This is a general accuracy improvement, which was only picked up when I 
added specific tests for the expansion cap.




[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread Atri Sharma (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860006#comment-16860006
 ] 

Atri Sharma commented on LUCENE-8845:
-

Patch looks good and a useful change!

Here are some minor comments:

1) Should we mark the new method as expert?

2) Should we still have a "hard" cap on the number of expansions we can 
support? It would be prudent to have one, so that users do not specify too 
high a limit and then see unexpected and drastic changes in latency.

3) There is a change in the check of the number of terms against the maximum 
number of terms allowed (the increment now happens before the check rather 
than after it). Is that unrelated to this change, i.e. a general accuracy 
improvement?
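
To illustrate why the ordering in point 3 matters, here is a small self-contained sketch. It is not the code from the patch (which is not quoted in this thread); the class and method names are hypothetical, and the loop over an int counter merely stands in for iterating over expanded terms.

```java
public class ExpansionCountSketch {

  /**
   * Post-increment variant: the comparison sees the count from before the
   * increment, so one term more than maxExpansions is accepted before the cap trips.
   * Returns the number of terms accepted.
   */
  static int acceptedWithPostIncrement(int availableTerms, int maxExpansions) {
    int count = 0;
    for (int i = 0; i < availableTerms; i++) {
      if (count++ > maxExpansions) {
        return i; // cap tripped; i terms were accepted before this one
      }
    }
    return availableTerms; // cap never tripped
  }

  /**
   * Pre-increment variant: the comparison sees the count including the current
   * term, so the cap trips on term number maxExpansions + 1.
   * Returns the number of terms accepted.
   */
  static int acceptedWithPreIncrement(int availableTerms, int maxExpansions) {
    int count = 0;
    for (int i = 0; i < availableTerms; i++) {
      if (++count > maxExpansions) {
        return i; // cap tripped; i terms were accepted before this one
      }
    }
    return availableTerms; // cap never tripped
  }

  public static void main(String[] args) {
    System.out.println("post-increment accepted: " + acceptedWithPostIncrement(10, 3));
    System.out.println("pre-increment accepted:  " + acceptedWithPreIncrement(10, 3));
  }
}
```

With a cap of 3 and ten candidate terms, the post-increment form lets a fourth term through, while the pre-increment form stops at exactly three.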




[jira] [Commented] (LUCENE-8845) Allow maxExpansions to be set on multi-term Intervals

2019-06-10 Thread Alan Woodward (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860003#comment-16860003
 ] 

Alan Woodward commented on LUCENE-8845:
---

Here is a patch that adds factory methods for both prefix() and wildcard(), 
along with tests and some suitably apocalyptic warnings in the associated 
javadocs.
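
As a rough illustration of the shape described here (a default of 128 kept for existing callers, plus overloads that accept a configurable maximum), a self-contained sketch might look like the following. Everything in it is hypothetical: the class FactoryOverloadSketch, the Source holder and the String-typed arguments are assumptions made for the example and are not the signatures from the attached patch.

```java
public class FactoryOverloadSketch {

  // Default taken from the issue description.
  static final int DEFAULT_MAX_EXPANSIONS = 128;

  /** Hypothetical stand-in for a multi-term intervals source; holds only what the sketch needs. */
  static final class Source {
    final String pattern;
    final int maxExpansions;

    Source(String pattern, int maxExpansions) {
      this.pattern = pattern;
      this.maxExpansions = maxExpansions;
    }
  }

  /** Existing-style factory: keeps the default expansion limit. */
  static Source prefix(String prefix) {
    return prefix(prefix, DEFAULT_MAX_EXPANSIONS);
  }

  /** Expert-style factory: the caller chooses the expansion limit. */
  static Source prefix(String prefix, int maxExpansions) {
    return new Source(prefix + "*", maxExpansions);
  }

  /** Existing-style factory: keeps the default expansion limit. */
  static Source wildcard(String wildcard) {
    return wildcard(wildcard, DEFAULT_MAX_EXPANSIONS);
  }

  /** Expert-style factory: the caller chooses the expansion limit. */
  static Source wildcard(String wildcard, int maxExpansions) {
    return new Source(wildcard, maxExpansions);
  }

  public static void main(String[] args) {
    System.out.println(prefix("luc").maxExpansions);
    System.out.println(wildcard("l?cene", 512).maxExpansions);
  }
}
```

Having the one-argument methods delegate to the two-argument ones keeps the default in a single place, so existing callers see no change in behaviour.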
