[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated SOLR-2894:
---------------------------
    Attachment: SOLR-2894.patch

I've been focusing on more tests using facet.offset...

bq. I haven't looked into this closely, but i noticed the refinement code seems to only refine things starting at the "facetFieldOffset" of the current collection -- don't we need to refine all the values, starting from the beginning of the list?

There was in fact a bug with refinement when using facet.offset -- but i was looking in the wrong place.

The code i was referring to before is involved in deciding which values to drill down into when recursively refining the sub-pivots. That logic was already (mostly) correct, because by that point we've already refined the _current_ level completely, so we can skip past the offset when doing the recursion (the only glitch was a boundary check causing an IOOBE, see details below).

Earlier on in the code, however, there was a mistake where only the limit (not the limit+offset) was being used to decide the threshold value for refinement.

----

New improvements in this patch...

* TestCloudPivotFacet
** increase the odds of overrequest==0
** randomly include a facet.offset param to sanity check refinement in that case
* PivotFacetField
** fix refineNextLevelOfFacets not to ask for a sublist with a start offset bigger than the size of the collection
*** this was causing an IndexOutOfBoundsException pretty quickly when offset was mixed into the random test
** fix queuePivotRefinementRequests to respect offset when picking the "indexOfCountThreshold"
*** before, it was only looking at limit; with offset in the randomized test this was causing failures even when pivots only had one field in them!

----

A few more things to consider in the future...
* PivotFacetFieldValueCollection.refinableSubList is only used to deal with offset+limit sublisting from PivotFacetField.refineNextLevelOfFacets -- but PivotFacetFieldValueCollection already knows the offset & limit, so maybe it should be a smarter special purpose method with 0 args: {{getNextLevelValuesToRefine()}}
* trim earlier?
** the way refinement currently works in PivotFacetField, after we've refined our values, we mark that we no longer need refinement, and then on the next call we recursively refine the subpivots of each value -- and in both cases we do the offset+limit calculations and hang on to all of the values (both below offset and above limit) as we keep iterating down the pivots -- they don't get thrown away until the final trim() call just before building up the final result.
** i previously suggested folding the trim() logic into the NamedList response logic -- but now i'm wondering if the trim() logic should instead be folded into refinement? so once we're sure a level is fully refined, we go ahead and trim that level before drilling down and refining its kids?

----

Unfortunately, with this new patch, i did uncover a new random failure i can't easily explain (it doesn't seem related to the offset changes, since facet.offset isn't even used in these random params -- but it's possible i broke something while fixing that) ...
{noformat}
[junit4]   2> NOTE: reproduce with: ant test -Dtestcase=TestCloudPivotFacet -Dtests.method=testDistribSearch -Dtests.seed=775F7BCA685BBC22 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=da_DK -Dtests.timezone=America/Montserrat -Dtests.file.encoding=UTF-8
[junit4] FAILURE 65.9s | TestCloudPivotFacet.testDistribSearch <<<
[junit4]    > Throwable #1: java.lang.AssertionError: {main(facet=true&facet.pivot=pivot_tl%2Cpivot_tl%2Cpivot_y_s&facet.pivot=bogus_not_in_any_doc_s%2Cpivot_l1%2Cpivot_td&facet.limit=13&facet.missing=true&facet.sort=count&facet.overrequest.count=2),extra(rows=0&q=*%3A*&fq=id%3A%5B*+TO+383%5D&_test_miss=true&_test_sort=count)} ==> bogus_not_in_any_doc_s,pivot_l1,pivot_td: {params(rows=0),defaults({main({main(rows=0&q=*%3A*&fq=id%3A%5B*+TO+383%5D&_test_miss=true&_test_sort=count),extra(fq=-bogus_not_in_any_doc_s%3A%5B*+TO+*%5D)}),extra(fq=%7B%21term+f%3Dpivot_l1%7D5098)})} expected:<7> but was:<9>
[junit4]    > 	at __randomizedtesting.SeedInfo.seed([775F7BCA685BBC22:F6B9F5D21F04DC1E]:0)
[junit4]    > 	at org.apache.solr.cloud.TestCloudPivotFacet.assertPivotCountsAreCorrect(TestCloudPivotFacet.java:239)
[junit4]    > 	at org.apache.solr.cloud.TestCloudPivotFacet.doTest(TestCloudPivotFacet.java:187)
[junit4]    > 	at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:865)
[junit4]    > 	at java.lang.Thread.run(Thread.java:744)
[junit4]    > Caused by: java.lang.AssertionError: bogus_not_in_any_doc_s,pivot_l1,pivot_td: {params(rows=0),defaults({main({main(rows=0&q=*%3A*&fq=id%3A%5B*+TO+383%5D&_test_miss=true&_test_sort=count),extra(fq=-bogus_not_in_any_doc_s%3A%5B*+TO+*%5D)}),extra(fq=%7B%21term+f%3Dpivot_l1%7D5098)})} expected:<7> but was:<9>
[junit4]    > 	at org.apache.solr.cloud.TestCloudPivotFacet.assertNumFound(TestCloudPivotFacet.java:507)
[junit4]    > 	at org.apache.solr.cloud.TestCloudPivotFacet.assertPivotCountsAreCorrect(TestCloudPivotFacet.java:257)
[junit4]    > 	at org.apache.solr.cloud.TestCloudPivotFacet.assertPivotCountsAreCorrect(TestCloudPivotFacet.java:268)
[junit4]    > 	at org.apache.solr.cloud.TestCloudPivotFacet.assertPivotCountsAreCorrect(TestCloudPivotFacet.java:229)
{noformat}

...i need to dig into this a bit more tomorrow.

> Implement distributed pivot faceting
> ------------------------------------
>
>                 Key: SOLR-2894
>                 URL: https://issues.apache.org/jira/browse/SOLR-2894
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Erik Hatcher
>            Assignee: Hoss Man
>             Fix For: 4.9, 5.0
>
>         Attachments: SOLR-2894-mincount-minification.patch,
> SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894_cloud_test.patch, dateToObject.patch, pivot_mincount_problem.sh
>
>
> Following up on SOLR-792, pivot faceting currently only supports
> undistributed mode. Distributed pivot faceting needs to be implemented.

--
This message was sent by Atlassian JIRA
(v6.2#6252)