[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893958#comment-16893958 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

[~joel.bernstein], I circled back to this, and squash-rebased [PR 325|https://github.com/apache/lucene-solr/pull/325] on current master. The patch applies cleanly and passes precommit and all tests, so it should be solid. I'm sorry for the false start (in Feb. 2018); if you'd be willing to take another look at this, I think this will now _actually_ be as straightforward as it initially should have been!

> Improve robustness of ExpandComponent
> -------------------------------------
>
>                 Key: SOLR-7798
>                 URL: https://issues.apache.org/jira/browse/SOLR-7798
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>            Reporter: Jörg Rathlev
>            Assignee: Joel Bernstein
>            Priority: Minor
>         Attachments: expand-component.patch, expand-npe.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{ExpandComponent}} causes a {{NullPointerException}} if accidentally
> used without prior collapsing of results.
> If there are multiple documents in the result which have the same term value
> in the expand field, the size of the {{ordBytes}}/{{groupSet}} differs from
> the {{count}} value, and the {{getGroupQuery}} method creates an incompletely
> filled {{bytesRef}} array, which later causes a {{NullPointerException}} when
> trying to sort the terms.
> The attached patch extends the test to demonstrate the error, and modifies
> the {{getGroupQuery}} methods to create the array based on the size of the
> input maps.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
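The failure described in the issue text can be illustrated with a minimal, self-contained sketch. It is not the Solr code: {{String}} stands in for Lucene's {{BytesRef}}, and the class and method names ({{GroupTermsSketch}}, {{termsSizedByCount}}, {{termsSizedByMap}}) are hypothetical. It only shows why an array sized by a per-document count, but filled once per distinct term, fails when sorted, and why sizing the array from the map avoids that.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the getGroupQuery array handling; not Solr's actual code.
final class GroupTermsSketch {

    // Failure mode: the array is sized by a separately tracked count
    // (assumed here to grow once per document), but only one slot per
    // distinct term is filled, so trailing slots stay null.
    static String[] termsSizedByCount(Map<Integer, String> ordTerms, int count) {
        String[] terms = new String[count];
        int i = 0;
        for (String term : ordTerms.values()) {
            terms[i++] = term;
        }
        Arrays.sort(terms); // throws NullPointerException if null slots remain
        return terms;
    }

    // Approach described in the issue: size the array from the input map itself.
    static String[] termsSizedByMap(Map<Integer, String> ordTerms) {
        String[] terms = ordTerms.values().toArray(new String[0]);
        Arrays.sort(terms);
        return terms;
    }

    public static void main(String[] args) {
        Map<Integer, String> ordTerms = new LinkedHashMap<>();
        ordTerms.put(7, "b");
        ordTerms.put(3, "a");
        int count = 3; // three documents, but only two distinct terms

        try {
            termsSizedByCount(ordTerms, count);
        } catch (NullPointerException e) {
            System.out.println("sized by count: NullPointerException while sorting");
        }
        System.out.println("sized by map: " + Arrays.toString(termsSizedByMap(ordTerms)));
    }
}
```

In the sketch the two variants differ only in where the array length comes from, which mirrors the change the description attributes to the attached patch.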
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374861#comment-16374861 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

Sorry, yes! I see. The test case was from [~joergr]'s patch (July 2015). I incorporated an updated version of the test case (along with a new commit message), and pushed a new commit to [PR 325|https://github.com/apache/lucene-solr/pull/325]. Feel free to use this as you see fit; I'm happy to squash-rebase against master if you like. Thanks!
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374513#comment-16374513 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

There were a couple of issues I ran into. First I worked with the pull request, but I found the commit message wasn't formatted quite right and the test wasn't included. Next I applied the patch, which didn't apply cleanly on master. So, I decided to hand-integrate the changes and the test from the patch. The test case in the patch looked like it might have been written for an older version and was failing every time.

I wouldn't worry about it though. This is a small enough change that I can just fix things up on my own and commit it. It will just take a little longer to get committed because I need to carve out a little more time to work with it. But I will get this committed unless I run into a blocker.
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373938#comment-16373938 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

It looks like the randomness comes from [Line 58 of TestExpandComponent|https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/handler/component/TestExpandComponent.java#L58], where "hint=top_fc" is randomly specified; the problem arises when it's specified for a field with no SortedDocValues (the {{null}} comes from [here|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/uninverting/UninvertingReader.java#L349]).
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373889#comment-16373889 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

I was able to reproduce an NPE with the above command, but the test only threw this NPE intermittently, and I was able to reproduce it on master (364b680afaf9) as well. I've included the stack trace to make sure we're talking about the same issue.

{code:java}
[junit4] 2> 2778 ERROR (searcherExecutor-7-thread-1) [ ] o.a.s.s.LRUCache Error during auto-warming of key:org.apache.solr.search.QueryResultKey@7ce6ad2e:java.lang.RuntimeException: java.lang.NullPointerException
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$CollapsingPostFilter.getFilterCollector(CollapsingQParserPlugin.java:378)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1084)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1540)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1416)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:90)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher$3.regenerateItem(SolrIndexSearcher.java:575)
[junit4] 2> at org.apache.solr.search.LRUCache.warm(LRUCache.java:297)
[junit4] 2> at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:2146)
[junit4] 2> at org.apache.solr.core.SolrCore.lambda$getSearcher$16(SolrCore.java:2258)
[junit4] 2> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[junit4] 2> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
[junit4] 2> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[junit4] 2> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[junit4] 2> at java.lang.Thread.run(Thread.java:745)
[junit4] 2> Caused by: java.lang.NullPointerException
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.<init>(CollapsingQParserPlugin.java:514)
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$CollectorFactory.getCollector(CollapsingQParserPlugin.java:1331)
[junit4] 2> at org.apache.solr.search.CollapsingQParserPlugin$CollapsingPostFilter.getFilterCollector(CollapsingQParserPlugin.java:367)
[junit4] 2> ... 13 more
{code}
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373618#comment-16373618 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

try:

ant test -Dtestcase=TestExpandComponent
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373489#comment-16373489 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

I was indeed developing on master. Which test case is failing for you? I'm getting a couple of failed tests on master (364b680afaf9) that seem to be unrelated to the ExpandComponent changes:

{code}
[junit4] Tests with failures [seed: 2723B1A9FC179033]:
[junit4] - org.apache.solr.handler.component.DummyCustomParamSpellChecker.initializationError
[junit4] - org.apache.solr.handler.component.ResourceSharingTestComponent.initializationError
{code}

but when I try to selectively run the ExpandComponent test I'm getting:

{code}
ant -Dtests.class="org.apache.solr.handler.component.TestExpandComponent" test
...
[junit4] Tests summary: 0 suites, 0 tests
{code}

No errors ... but I guess something is still amiss.
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373370#comment-16373370 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

The test case is failing for me. What version are you working with? The commit workflow is to commit to the master branch and backport to the release branch.
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373246#comment-16373246 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

I've got the branch locally. I'll do a little more reviewing, run tests, etc., and then most likely commit it without change. Thanks!
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373197#comment-16373197 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

Right, sounds good! Thanks for the explanation of the 200 ceiling. Just submitted [PR 325|https://github.com/apache/lucene-solr/pull/325].
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373096#comment-16373096 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

Actually there can't be duplicate values in the ordBytes Map, so that's not even an issue. This patch looks pretty safe to me.
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373072#comment-16373072 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

I believe the patch will indeed make things more robust. It looks to me like, if there are duplicate values in the ordBytes map, we'll just have duplicate values in the disjunction query, which will make things slower but still work fine. So, I see no problem with the patch.
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373055#comment-16373055 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

Just getting reacquainted with the code. I believe the 200 magic number is an inflection point for when it makes sense to build a disjunction query for retrieving the group records. I'll need to do a little more digging to fully understand how the patch affects things.
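The role of the 200 ceiling can be pictured with a small sketch. Everything here is an assumption made for illustration: the names ({{ExpandStrategySketch}}, {{Strategy}}, {{choose}}) are hypothetical, and both the direction of the comparison (a disjunction query is built only below the ceiling) and the existence of a separate fallback path are guesses based on the comment above, not confirmed behaviour of {{ExpandComponent}}.

```java
// Hypothetical illustration of a size-based switch; not Solr's actual code.
final class ExpandStrategySketch {

    // The "magic number" discussed in the thread.
    static final int DISJUNCTION_CEILING = 200;

    enum Strategy {
        NONE,              // nothing to expand
        DISJUNCTION_QUERY, // few distinct group values: query for exactly those terms
        FALLBACK           // many distinct group values: assumed alternative path
    }

    // Assumption: the decision is driven by the number of distinct group values.
    static Strategy choose(int distinctGroupValues) {
        if (distinctGroupValues <= 0) {
            return Strategy.NONE;
        }
        if (distinctGroupValues < DISJUNCTION_CEILING) {
            return Strategy.DISJUNCTION_QUERY;
        }
        return Strategy.FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(choose(5));
        System.out.println(choose(500));
    }
}
```

Under these assumptions, whether the input is a count of documents or a count of distinct values changes which branch is taken, which is the question raised elsewhere in this thread about what {{count}} is meant to represent.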
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372962#comment-16372962 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

Thanks, [~joel.bernstein]. I'm happy to prep a PR, but would you mind first confirming that {{count}} (and its associated ceiling of 200) is intended to represent the number of matching collapse _values_, as opposed to the number of result documents associated with those values? Assuming that's the case, is there any reason to continue tracking {{count}} externally (as opposed to simply relying on {{ordBytes.size()}}, as in [~joergr]'s [^expand-component.patch] patch)?
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372251#comment-16372251 ]

Joel Bernstein commented on SOLR-7798:
--------------------------------------

Thanks for the patch. If you can set up a pull request I should have time to review and commit when it's ready.
[jira] [Commented] (SOLR-7798) Improve robustness of ExpandComponent
[ https://issues.apache.org/jira/browse/SOLR-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371599#comment-16371599 ]

Michael Gibney commented on SOLR-7798:
--------------------------------------

Although [~joergr]'s initial description mentions an NPE in ExpandComponent "if _accidentally_ used without prior collapsing of results" (italics mine), there are applications of ExpandComponent that _intentionally_ do not involve prior collapsing of results on the expand field. For example, I'm using cached Join queries to implement tiered deduplication of the search domain across multiple document sources, but do not wish to deduplicate documents against other documents from the same source (and specifically wish to deduplicate the search domain, as opposed to the set of results). The approach is described in a bit more detail [here|https://github.com/upenn-libraries/solr-source-deduplication] (bullet points 3, 4, and 7 are particularly relevant).

[^expand-component.patch] looks good to me, as I can't see a reason why {{count}} is being tracked separately, rather than relying on {{ordBytes.size()}}. The only potential issue I see with it is that where {{count}} is used to determine whether {{groupQuery}} is initialized, {{count}} now represents a different concept than {{ordBytes.size()}}. I'm not sure what the desired behavior would be (or for that matter, what the explanation is for the magic "200" ceiling on {{count}}).

I've uploaded an alternative, [^expand-npe.patch], which differs only in that it leaves the separate tracking of {{count}} in place (though I don't think it should have to), and also in that it checks for duplication on addition of ord to groupBits/groupSet, thereby avoiding unnecessary {{BytesRef.deepCopyOf()}} in the (normally rare) case where duplicate terms are encountered.
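The duplicate check described for [^expand-npe.patch] can be sketched as follows. This is an illustration under stated assumptions rather than the patch itself: {{java.util.BitSet}} stands in for groupBits/groupSet, a cloned {{byte[]}} stands in for {{BytesRef.deepCopyOf()}}, and the class and member names ({{OrdDedupSketch}}, {{collect}}, {{copies}}) are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "check for duplication on addition"; not the actual patch.
final class OrdDedupSketch {

    private final BitSet groupBits = new BitSet();
    private final Map<Integer, byte[]> ordBytes = new HashMap<>();
    private int copies = 0; // how many term copies were actually made

    // Records the term for an ordinal, copying the bytes only the first time
    // that ordinal is seen.
    void collect(int ord, byte[] sharedTermBuffer) {
        if (groupBits.get(ord)) {
            return; // duplicate term: skip the copy entirely
        }
        groupBits.set(ord);
        ordBytes.put(ord, sharedTermBuffer.clone()); // stand-in for a deep copy
        copies++;
    }

    int copies() {
        return copies;
    }

    int distinctTerms() {
        return ordBytes.size();
    }

    public static void main(String[] args) {
        OrdDedupSketch sketch = new OrdDedupSketch();
        byte[] buffer = {1, 2, 3};
        sketch.collect(3, buffer);
        sketch.collect(3, buffer); // same ordinal again: no second copy
        sketch.collect(5, buffer);
        System.out.println(sketch.copies() + " copies for " + sketch.distinctTerms() + " distinct terms");
    }
}
```

With a check like this, the number of copies always equals the number of distinct ordinals, so a counter incremented alongside the copy would agree with the size of the map.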