[ 
https://issues.apache.org/jira/browse/SOLR-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-7631:
---------------------------
    Attachment: log.tgz
                SOLR-7631_test.patch

The attached patch contains a simple test case that failed 7 out of 100 times 
when I ran it with...

{noformat}
for i in {1..100}; do ant test -Dtestcase=TestTrieFacet -Dtests.verbose=true; 
done | tee log.txt
{noformat}
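
A quick way to tally the outcomes from a combined log like this is sketched below. It assumes each ant invocation ends with the usual Ant summary line ("BUILD SUCCESSFUL" or "BUILD FAILED") at the start of a line; the function name {{count_builds}} is just illustrative and not part of the patch.

```shell
# Tally ant outcomes in a combined log.
# Assumption: every run prints a standard Ant "BUILD SUCCESSFUL" or
# "BUILD FAILED" summary line; count_builds is a hypothetical helper name.
count_builds() {
  log="$1"
  # grep -c prints 0 (and exits non-zero) when nothing matches, which is fine here
  failed=$(grep -c '^BUILD FAILED' "$log")
  passed=$(grep -c '^BUILD SUCCESSFUL' "$log")
  echo "failed=$failed passed=$passed"
}

# Usage: count_builds log.txt
```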

Also attached is the log.txt showing all the successes and failures.  Note that 
all failures come from testMultiValuedTrie -- apparently this bug doesn't affect 
single valued Trie fields.

As you can see, I used "-Dtests.verbose=true" when running these tests, hoping 
I could find some pattern in both the successes and failures related to the 
codecs -- but nothing jumps out at me.
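
For spotting the symptom described in the issue (the same facet value listed more than once), a rough check along these lines may help. It is only a sketch: it assumes the XML for a single facet field from a single response has been saved to its own file, since running it over a whole log would mix values from different responses. The function name {{find_duplicate_facet_values}} is hypothetical.

```shell
# Print any facet value that appears more than once in a saved response excerpt.
# Assumption: the file holds the <int name="...">count</int> entries for ONE
# facet field from ONE response; find_duplicate_facet_values is a made-up name.
find_duplicate_facet_values() {
  grep -o '<int name="[^"]*">' "$1" \
    | sed 's/<int name="\([^"]*\)">/\1/' \
    | sort \
    | uniq -d
}

# Usage: find_duplicate_facet_values facet_excerpt.xml
```

An empty result means every value in the excerpt was listed exactly once.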



> Faceting on multivalued Trie fields with precisionStep != 0 can produce bogus 
> value="0" in some situations
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7631
>                 URL: https://issues.apache.org/jira/browse/SOLR-7631
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>         Attachments: SOLR-7631_test.patch, log.tgz
>
>
> Working through SOLR-7605, I've confirmed that the underlying problem exists 
> for regular {{facet.field}} situations, regardless of distrib mode, for Trie 
> fields that have a non-zero precisionStep -- there's still some other missing 
> piece of the puzzle I haven't figured out, but it relates in some way to some 
> of the randomized factors we use in our tests (Codec? PostingsFormat? ... no idea)
> The problem, when it manifests, is that faceting on a TrieIntField, using 
> {{facet.mincount=0}}, causes the facet results to include three instances of 
> the facet value "0" listed with a count of "0" -- even though no document in 
> the index contains this value at all...
> {noformat}
>    [junit4]    >   <lst name="facet_fields">
>    [junit4]    >     <lst name="foo_ti">
>    [junit4]    >       <int name="20">32</int>
> ...
>    [junit4]    >       <int name="50">21</int>
>    [junit4]    >       <int name="0">0</int>
>    [junit4]    >       <int name="0">0</int>
>    [junit4]    >       <int name="0">0</int>
> {noformat}
> This is concerning for a few reasons:
> * In the case of PivotFaceting, getting duplicate values back from a single 
> shard like this triggers an assert in distributed queries and the request 
> fails -- even if asserts aren't enabled, the bogus "0" value can be 
> propagated to clients if they ask for facet.pivot.mincount=0
> * Client code expecting a single (value,count) pair for each value may 
> equally be confused/broken by this response where the same "value" is 
> returned multiple times
> * w/o knowing the root cause, it seems very possible that other nonsense 
> values may be getting returned -- ie: if the error only happens with fields 
> utilizing precisionStep, then it's likely related to the synthetic values 
> used for faster range queries, and other synthetic values may be getting 
> included with bogus counts
> A patch with a simple test that can demonstrate the bug fairly easily will be 
> attached shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
