[ https://issues.apache.org/jira/browse/SOLR-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540576#comment-16540576 ]
Lucene/Solr QA commented on SOLR-12343:
---------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 17m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 16m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 16m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate ref guide {color} | {color:green} 16m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}153m 34s{color} | {color:red} core in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}213m 21s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.cloud.api.collections.TestCollectionsAPIViaSolrCloudCluster |
| | solr.cloud.cdcr.CdcrBidirectionalTest |
| | solr.cloud.autoscaling.sim.TestExecutePlanAction |
| | solr.cloud.autoscaling.SearchRateTriggerIntegrationTest |
| | solr.cloud.api.collections.ShardSplitTest |
| | solr.cloud.autoscaling.sim.TestLargeCluster |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12343 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12930878/SOLR-12343.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns validaterefguide |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / fe180bb |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 8 2015 |
| Default Java | 1.8.0_172 |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/142/artifact/out/patch-unit-solr_core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/142/testReport/ |
| modules | C: solr/core solr/solr-ref-guide U: solr |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/142/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |


This message was automatically generated.

> JSON Field Facet refinement can return incorrect counts/stats for sorted
> buckets
> --------------------------------------------------------------------------------
>
> Key: SOLR-12343
> URL: https://issues.apache.org/jira/browse/SOLR-12343
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level.
> Issues are Public)
> Reporter: Hoss Man
> Assignee: Yonik Seeley
> Priority: Major
> Attachments: SOLR-12343.patch, SOLR-12343.patch, SOLR-12343.patch,
> SOLR-12343.patch, SOLR-12343.patch, SOLR-12343.patch,
> __incomplete_processEmpty_microfix.patch
>
>
> The way JSON Facet's simple refinement "re-sorts" buckets after refinement
> can cause _refined_ buckets to be "bumped out" of the topN based on the
> refined counts/stats depending on the sort - causing _unrefined_ buckets
> originally discounted in phase#2 to bubble up into the topN and be returned
> to clients *with inaccurate counts/stats*.
> The simplest way to demonstrate this bug (in some data sets) is with a
> {{sort: 'count asc'}} facet:
> * Assume shard1 returns termX & termY in phase#1 because they have very low
> shard1 counts
> ** but they are *not* returned at all by shard2, because these terms both
> have very high shard2 counts.
> * Assume termX has a slightly lower shard1 count than termY, such that:
> ** termX "makes the cut" for the limit=N topN buckets
> ** termY does not make the cut, and is the "N+1" known bucket at the end of
> phase#1
> * termX then gets included in the phase#2 refinement request against shard2
> ** termX now has a much higher _known_ total count than termY
> ** the coordinator now sorts termX "worse" in the sorted list of buckets
> than termY
> ** which causes termY to bubble up into the topN
> * termY is ultimately included in the final result _with incomplete
> count/stat/sub-facet data_ instead of termX
> ** this is all independent of the possibility that termY may actually have a
> significantly higher total count than termX across the entire collection
> ** the key problem is that all/most of the other terms returned to the
> client have counts/stats that are the cumulation of all shards, but termY
> only has the contributions from shard1
> Important Notes:
> * This scenario can happen regardless of the amount of overrequest used.
> Additional overrequest just increases the number of "extra" terms needed in
> the index with "better" sort values than termX & termY in shard2
> * {{sort: 'count asc'}} is not just an exceptional/pathological case:
> ** any function sort where additional data provided by shards during
> refinement can cause a bucket to "sort worse" can also cause this problem.
> ** Examples: {{sum(price_i) asc}} , {{min(price_i) desc}} , {{avg(price_i)
> asc|desc}} , etc...
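
The termX/termY scenario from the issue description can be reproduced with a small toy model. This is only a sketch of the described behaviour, not Solr's actual code: the helper names (`top_asc`, `simulate`), the shard contents, and the choice of limit=1 with a per-shard limit of 2 are all assumptions made for illustration.

```python
from collections import Counter


def top_asc(counts, n):
    """Return the n (term, count) pairs with the lowest counts (sort: 'count asc')."""
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))[:n]


def simulate(shards, limit, shard_limit):
    """Toy two-phase facet merge that re-sorts ALL known buckets after refinement."""
    # Phase 1: each shard reports only its own best `shard_limit` buckets.
    phase1 = [dict(top_asc(shard, shard_limit)) for shard in shards]
    merged = Counter()
    for response in phase1:
        merged.update(response)  # adds the per-shard counts together

    # Phase 2: refine only the current topN, asking shards that did not report them.
    to_refine = [term for term, _ in top_asc(merged, limit)]
    for term in to_refine:
        for shard, response in zip(shards, phase1):
            if term not in response:
                merged[term] += shard.get(term, 0)

    # Re-sort over every known bucket, refined or not (the problematic step).
    return top_asc(merged, limit), set(to_refine)


# Hypothetical data: termX and termY are rare on shard1 but common on shard2.
shard1 = {"termX": 1, "termY": 2, "termP": 50, "termQ": 60}
shard2 = {"termX": 100, "termY": 200, "termP": 50, "termQ": 60}

result, refined = simulate([shard1, shard2], limit=1, shard_limit=2)
true_totals = Counter(shard1) + Counter(shard2)

for term, count in result:
    status = "refined" if term in refined else "NOT refined"
    print(f"{term}: returned count={count}, true count={true_totals[term]} ({status})")
```

In this model termX is refined to a total of 101, which pushes it behind termY (still at its shard1-only count of 2), so termY is returned even though its collection-wide count is 202.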