[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2019-07-26 Thread Michael Gibney (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893958#comment-16893958
 ] 

Michael Gibney commented on SOLR-7798:
--

[~joel.bernstein], I circled back to this, and squash-rebased [PR 
325|https://github.com/apache/lucene-solr/pull/325] on current master. The 
patch applies cleanly and passes precommit and all tests, so it should be 
solid. I'm sorry for the false start (in Feb. 2018); if you'd be willing to 
take another look at this, I think this will now _actually_ be as 
straightforward as it initially should have been!

> Improve robustness of ExpandComponent
> -
>
> Key: SOLR-7798
> URL: https://issues.apache.org/jira/browse/SOLR-7798
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Reporter: Jörg Rathlev
>Assignee: Joel Bernstein
>Priority: Minor
> Attachments: expand-component.patch, expand-npe.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{ExpandComponent}} causes a {{NullPointerException}} if accidentally 
> used without prior collapsing of results.
> If there are multiple documents in the result which have the same term value 
> in the expand field, the size of the {{ordBytes}}/{{groupSet}} differs from 
> the {{count}} value, and the {{getGroupQuery}} method creates an incompletely 
> filled {{bytesRef}} array, which later causes a {{NullPointerException}} when 
> trying to sort the terms.
> The attached patch extends the test to demonstrate the error, and modifies 
> the {{getGroupQuery}} methods to create the array based on the size of the 
> input maps.
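
The failure mode described above can be reproduced in isolation. The following is a minimal sketch in plain Java, not the actual ExpandComponent code: the class and method names are hypothetical, and {{String}} stands in for Lucene's {{BytesRef}}. It assumes only what the description states, namely that {{count}} is incremented per document while the map holds one entry per distinct term.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the logic described in the issue; not Solr code.
class GroupArraySizing {

    // Buggy variant: the array length comes from a separately tracked count.
    static String[] termsSizedByCount(Map<Integer, String> ordTerms, int count) {
        String[] terms = new String[count];
        int i = 0;
        for (String term : ordTerms.values()) {
            terms[i++] = term;
        }
        return terms; // trailing slots stay null when count > ordTerms.size()
    }

    // Fixed variant: the array length comes from the map itself.
    static String[] termsSizedByMap(Map<Integer, String> ordTerms) {
        return ordTerms.values().toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Three result documents, two of which share the same term (ordinal 7).
        int[] docOrds = {7, 3, 7};
        String[] docTerms = {"b", "a", "b"};
        Map<Integer, String> ordTerms = new LinkedHashMap<>();
        int count = 0;
        for (int d = 0; d < docOrds.length; d++) {
            ordTerms.put(docOrds[d], docTerms[d]);
            count++; // counts documents, not distinct terms
        }

        String[] fixed = termsSizedByMap(ordTerms);
        Arrays.sort(fixed);
        System.out.println(Arrays.toString(fixed));

        String[] buggy = termsSizedByCount(ordTerms, count);
        try {
            Arrays.sort(buggy); // compares against the null slot
        } catch (NullPointerException e) {
            System.out.println("NullPointerException while sorting");
        }
    }
}
```

Sizing the array from the map removes the need to keep the two numbers in sync, which is the approach the description attributes to the attached patch.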



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-23 Thread Michael Gibney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374861#comment-16374861
 ] 

Michael Gibney commented on SOLR-7798:
--

Sorry, yes! I see. The test case was from [~joergr]'s patch (July 2015). I 
incorporated an updated version of the test case (along with a new commit 
message), and pushed a new commit to [PR 
325|https://github.com/apache/lucene-solr/pull/325]. Feel free to use this as 
you see fit – I'm happy to squash-rebase against master if you like. Thanks!




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-23 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374513#comment-16374513
 ] 

Joel Bernstein commented on SOLR-7798:
--

There were a couple of issues I ran into. First I worked with the pull request, 
but I found the commit message wasn't formatted quite right and the test wasn't 
included. 

Next I applied the patch, which didn't apply cleanly on master. 

So, I decided to hand integrate the changes and the test from the patch. The 
test case in the patch looked like it might have been written for an older 
version and was failing every time.

I wouldn't worry about it though. This is a small enough change that I can just 
fix things up on my own and commit it. It will just take a little longer to get 
committed because I need to carve out a little more time to work with it. But I 
will get this committed unless I run into a blocker.




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Michael Gibney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373938#comment-16373938
 ] 

Michael Gibney commented on SOLR-7798:
--

It looks like the randomness comes from [Line 58 of 
TestExpandComponent|https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/handler/component/TestExpandComponent.java#L58],
 when "hint=top_fc" is randomly specified; the problem arises when it's 
specified for a field with no SortedDocValues (the {{null}} comes from 
[here|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/uninverting/UninvertingReader.java#L349]).




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Michael Gibney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373889#comment-16373889
 ] 

Michael Gibney commented on SOLR-7798:
--

I was able to reproduce an NPE with the above command; but the test only threw 
this NPE intermittently, and I was able to reproduce it on master 
(364b680afaf9) as well. I've included the stack trace to make sure we're 
talking about the same issue.
{code:java}
[junit4] 2> 2778 ERROR (searcherExecutor-7-thread-1) [ ] o.a.s.s.LRUCache Error during auto-warming of key:org.apache.solr.search.QueryResultKey@7ce6ad2e:java.lang.RuntimeException: java.lang.NullPointerException
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$CollapsingPostFilter.getFilterCollector(CollapsingQParserPlugin.java:378)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1084)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1540)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1416)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:90)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher$3.regenerateItem(SolrIndexSearcher.java:575)
[junit4] 2> at org.apache.solr.search.LRUCache.warm(LRUCache.java:297)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:2146)
[junit4] 2> at org.apache.solr.core.SolrCore.lambda$getSearcher$16(SolrCore.java:2258)
[junit4] 2> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[junit4] 2> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
[junit4] 2> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[junit4] 2> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[junit4] 2> at java.lang.Thread.run(Thread.java:745)
[junit4] 2> Caused by: java.lang.NullPointerException
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.<init>(CollapsingQParserPlugin.java:514)
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$CollectorFactory.getCollector(CollapsingQParserPlugin.java:1331)
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$CollapsingPostFilter.getFilterCollector(CollapsingQParserPlugin.java:367)
[junit4] 2> ... 13 more
{code}




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373618#comment-16373618
 ] 

Joel Bernstein commented on SOLR-7798:
--

try:

ant test -Dtestcase=TestExpandComponent




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Michael Gibney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373489#comment-16373489
 ] 

Michael Gibney commented on SOLR-7798:
--

I was indeed developing on master. Which test case is failing for you? I'm 
getting a couple of failed tests on master (364b680afaf9) that seem to be 
unrelated to the ExpandComponent changes:
{code}
 [junit4] Tests with failures [seed: 2723B1A9FC179033]:
 [junit4]   - 
org.apache.solr.handler.component.DummyCustomParamSpellChecker.initializationError
 [junit4]   - 
org.apache.solr.handler.component.ResourceSharingTestComponent.initializationError
{code}

but when I try to selectively run the ExpandComponent test I'm getting:

{code}
ant -Dtests.class="org.apache.solr.handler.component.TestExpandComponent" test
...
 [junit4] Tests summary: 0 suites, 0 tests
{code}

No errors ... but I guess something is still amiss.





[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373370#comment-16373370
 ] 

Joel Bernstein commented on SOLR-7798:
--

The test case is failing for me. What version are you working with? 

The commit workflow is to commit to the master branch and backport to the 
release branch.




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373246#comment-16373246
 ] 

Joel Bernstein commented on SOLR-7798:
--

I've got the branch locally. I'll do a little more reviewing, run tests etc... 
and then most likely commit it without change.

Thanks!




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Michael Gibney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373197#comment-16373197
 ] 

Michael Gibney commented on SOLR-7798:
--

Right, sounds good! Thanks for the explanation of the 200 ceiling. Just 
submitted [PR 325|https://github.com/apache/lucene-solr/pull/325].




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373096#comment-16373096
 ] 

Joel Bernstein commented on SOLR-7798:
--

Actually there can't be duplicate values in the ordBytes Map, so that's not 
even an issue.

This patch looks pretty safe to me.




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373072#comment-16373072
 ] 

Joel Bernstein commented on SOLR-7798:
--

I believe the patch will indeed make things more robust. It looks to me like, if 
there are duplicate values in the ordBytes map, then we'll just have duplicate 
values in the disjunction query, which will make things slower, but still work 
fine.

So, I see no problem with the patch.

 




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373055#comment-16373055
 ] 

Joel Bernstein commented on SOLR-7798:
--

Just getting reacquainted with the code. I believe the 200 magic number is an 
inflection point for when it makes sense to build a disjunction query for 
retrieving the group records. I'll need to do a little more digging to fully 
understand how the patch affects things.
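
As a rough illustration of the kind of threshold being discussed, here is a sketch in plain Java. It is an assumption-laden stand-in, not the ExpandComponent implementation: the class name, the method name, the query string syntax, and the exact boundary (fewer than 200 terms) are all hypothetical. The idea is only that a disjunction over the group terms is built when the number of terms is small, and skipped otherwise.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of a term-count ceiling; not Solr code.
class GroupQueryThreshold {

    // Assumed ceiling, taken from the "200" mentioned in this thread.
    static final int MAX_TERMS = 200;

    // Returns a disjunction over the terms, or empty if a query is not worthwhile.
    static Optional<String> groupQuery(String field, List<String> terms) {
        if (terms.isEmpty() || terms.size() >= MAX_TERMS) {
            return Optional.empty(); // caller would fall back to its non-query path
        }
        return Optional.of(field + ":(" + String.join(" OR ", terms) + ")");
    }

    public static void main(String[] args) {
        Optional<String> q = groupQuery("group_s", Arrays.asList("a", "b", "c"));
        System.out.println(q.orElse("<no group query>"));
    }
}
```

Whatever number is passed in as the term count decides which path is taken, so it matters whether that number means distinct collapse values or matching documents.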




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-22 Thread Michael Gibney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372962#comment-16372962
 ] 

Michael Gibney commented on SOLR-7798:
--

Thanks, [~joel.bernstein]. I'm happy to prep a PR, but would you mind first 
confirming that {{count}} (and its associated ceiling of 200) is intended to 
represent the number of matching collapse _values_, as opposed to the number of 
result documents associated with those values? Assuming that's the case, is 
there any reason to continue tracking {{count}} externally (as opposed to 
simply relying on {{ordBytes.size()}}, as in [~joergr]'s 
[^expand-component.patch] patch)?




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-21 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372251#comment-16372251
 ] 

Joel Bernstein commented on SOLR-7798:
--

Thanks for the patch. If you can set up a pull request I should have time to 
review and commit when it's ready.




[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent

2018-02-21 Thread Michael Gibney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371599#comment-16371599
 ] 

Michael Gibney commented on SOLR-7798:
--

Although [~joergr]'s initial description mentions an NPE in ExpandComponent "if 
_accidentally_ used without prior collapsing of results" (italics mine), there 
are applications of ExpandComponent that _intentionally_ do not involve prior 
collapsing of results on the expand field. For example, I'm using cached Join 
queries to implement tiered deduplication of the search domain across multiple 
document sources, but do not wish to deduplicate documents against other 
documents from the same source (and specifically wish to deduplicate the search 
domain, as opposed to the set of results). The approach is described in a bit 
more detail [here|https://github.com/upenn-libraries/solr-source-deduplication] 
(bullet points 3, 4, and 7 are particularly relevant).

[^expand-component.patch] looks good to me, as I can't see a reason why 
{{count}} is being tracked separately, rather than relying on 
{{ordBytes.size()}}. The only potential issue I see with it is that where 
{{count}} is used to determine whether {{groupQuery}} is initialized, {{count}} 
now represents a different concept than {{ordBytes.size()}}. I'm not sure what 
the desired behavior would be (or for that matter, what the explanation is for 
the magic "200" ceiling on {{count}}).

I've uploaded an alternative, [^expand-npe.patch] , which differs only in that 
it leaves the separate tracking of {{count}} in place (though I don't think it 
should have to), and also in that it checks for duplication on addition of ord 
to groupBits/groupSet, thereby avoiding unnecessary {{BytesRef.deepCopyOf()}} 
in the (normally rare) case where duplicate terms are encountered.
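
The duplicate check described in the last paragraph can be sketched as follows. This is an illustration in plain Java rather than an excerpt from [^expand-npe.patch]: the class and method names are hypothetical, {{byte[]}} with {{clone()}} stands in for {{BytesRef}} and {{BytesRef.deepCopyOf()}}, and a {{HashSet}} stands in for groupBits/groupSet.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "check for duplication on addition"; not Solr code.
class DedupOnAdd {

    static int copies = 0; // how many defensive copies were made

    // Stand-in for BytesRef.deepCopyOf(): copies the term bytes.
    static byte[] deepCopy(byte[] term) {
        copies++;
        return term.clone();
    }

    static Map<Integer, byte[]> collect(int[] docOrds, byte[][] docTerms) {
        Set<Integer> groupSet = new HashSet<>();
        Map<Integer, byte[]> ordBytes = new LinkedHashMap<>();
        for (int d = 0; d < docOrds.length; d++) {
            // Set.add returns false if the ordinal was already present,
            // so the copy is skipped for duplicate terms.
            if (groupSet.add(docOrds[d])) {
                ordBytes.put(docOrds[d], deepCopy(docTerms[d]));
            }
        }
        return ordBytes;
    }

    public static void main(String[] args) {
        int[] docOrds = {7, 3, 7};
        byte[][] docTerms = {{'b'}, {'a'}, {'b'}};
        Map<Integer, byte[]> ordBytes = collect(docOrds, docTerms);
        System.out.println(ordBytes.size() + " distinct terms, " + copies + " copies");
    }
}
```

With this shape the map size and the number of copies always agree, regardless of how many documents share a term.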
