[
https://issues.apache.org/jira/browse/SOLR-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087984#comment-14087984
]
Hoss Man commented on SOLR-6329:
--------------------------------
Notes from SOLR-2894 about the root of the issue...
{panel}
>From what I can tell, the gist of the issue is that when dealing with
>sub-fields of the pivot, the coordination code doesn't know about some of the
>"0" values if no shard which has the value for the parent field even knows
>about the existence of the term.
The simplest example of this discrepancy (compared to single node pivots) is to
consider an index with only 2 docs...
{noformat}
[{"id":1,"top_s":"foo","sub_s":"bar"},
{"id":2,"top_s":"xxx","sub_s":"yyy"}]
{noformat}
If those two docs exist in a single node index, and you pivot on
{{top_s,sub_s}} using mincount=0 you get a response like this...
{noformat}
$ curl -sS 'http://localhost:8881/solr/select?q=*:*&rows=0&facet=true&facet.pivot.mincount=0&facet.pivot=top_s,sub_s&omitHeader=true&wt=json&indent=true'
{
"response":{"numFound":2,"start":0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{},
"facet_dates":{},
"facet_ranges":{},
"facet_intervals":{},
"facet_pivot":{
"top_s,sub_s":[{
"field":"top_s",
"value":"foo",
"count":1,
"pivot":[{
"field":"sub_s",
"value":"bar",
"count":1},
{
"field":"sub_s",
"value":"yyy",
"count":0}]},
{
"field":"top_s",
"value":"xxx",
"count":1,
"pivot":[{
"field":"sub_s",
"value":"yyy",
"count":1},
{
"field":"sub_s",
"value":"bar",
"count":0}]}]}}}
{noformat}
If however you index each of those docs on a separate shard, the response comes
back like this...
{noformat}
$ curl -sS 'http://localhost:8881/solr/select?q=*:*&rows=0&facet=true&facet.pivot.mincount=0&facet.pivot=top_s,sub_s&omitHeader=true&wt=json&indent=true&shards=localhost:8881/solr,localhost:8882/solr'
{
"response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{},
"facet_dates":{},
"facet_ranges":{},
"facet_intervals":{},
"facet_pivot":{
"top_s,sub_s":[{
"field":"top_s",
"value":"foo",
"count":1,
"pivot":[{
"field":"sub_s",
"value":"bar",
"count":1}]},
{
"field":"top_s",
"value":"xxx",
"count":1,
"pivot":[{
"field":"sub_s",
"value":"yyy",
"count":1}]}]}}}
{noformat}
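To make the difference between the two responses concrete, here is a toy Python model (not Solr code). It assumes each index can only report "0" counts for sub-field terms that exist in its own term dictionary, and that the coordinator simply sums per-shard counts. The function names {{pivot}} and {{merge}} are hypothetical and exist only for this illustration.

```python
def pivot(docs, top, sub):
    """Toy pivot with mincount=0 over ONE index.

    Every sub-field term this index knows about gets an entry under every
    top-field value, with a count of 0 if no doc matches.
    """
    sub_terms = {d[sub] for d in docs}
    result = {}
    for d in docs:
        counts = result.setdefault(d[top], {t: 0 for t in sub_terms})
        counts[d[sub]] += 1
    return result


def merge(shard_results):
    """Toy coordinator: sum per-shard counts, nothing else (assumption)."""
    merged = {}
    for res in shard_results:
        for top_val, subs in res.items():
            target = merged.setdefault(top_val, {})
            for sub_val, n in subs.items():
                target[sub_val] = target.get(sub_val, 0) + n
    return merged


docs = [
    {"id": 1, "top_s": "foo", "sub_s": "bar"},
    {"id": 2, "top_s": "xxx", "sub_s": "yyy"},
]

# Both docs in one index: each top_s value sees both sub_s terms.
single = pivot(docs, "top_s", "sub_s")

# One doc per shard: no shard that has "foo" has ever seen "yyy", and
# no shard that has "xxx" has ever seen "bar", so the zeros never appear.
sharded = merge([pivot([d], "top_s", "sub_s") for d in docs])

print(single)
print(sharded)
```

In this model the single-index result carries the "0" entries while the merged per-shard result does not, which mirrors the two responses above.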
The only solution i can think of, would be an extra (special to mincount=0)
stage of logic, after each PivotFacetField is refined, that would:
* iterate over all the values of the current pivot
* build up a Set of all the known values for the child-pivots of those
values
* iterate over all the values again, merging in a "0"-count child value for
every value in the set
...ie: "At least one shard knows about value 'v_x' in field 'sub_field', so add
a count of '0' for 'v_x' in every 'sub_field' collection nested under the
'top_field' in our 'top_field,sub_field' pivot"
I haven't thought this idea through enough to be confident it would work, or
that it's worth doing ... i'm certainly not convinced that mincount=0 makes
enough sense in a facet.pivot usecase to think getting this test working should
hold up getting this committed -- probably something that should just be
committed as is, with an open Jira that it's a known bug.
{panel}
SOLR-2894 includes a commented out test case related to using mincount=0 in
distributed pivot faceting in DistributedFacetPivotLargeTest (annotated with
"SOLR-6329")
> facet.pivot.mincount=0 doesn't work well in distributed pivot faceting
> ----------------------------------------------------------------------
>
> Key: SOLR-6329
> URL: https://issues.apache.org/jira/browse/SOLR-6329
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
> Priority: Minor
>
> Using facet.pivot.mincount=0 in conjunction with the distributed pivot
> faceting support being added in SOLR-2894 doesn't work as folks would expect
> if they are used to using facet.pivot.mincount=0 in a single node setup.
> Filing this issue to track this as a known defect, because it may not have a
> viable solution.