[
https://issues.apache.org/jira/browse/SOLR-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-7631:
---------------------------
Attachment: log.tgz
SOLR-7631_test.patch
The attached patch contains a simple test case that failed 7 out of 100 times
when I ran it with...
{noformat}
for i in {1..100}; do ant test -Dtestcase=TestTrieFacet -Dtests.verbose=true;
done | tee log.txt
{noformat}
Also attached is the log.txt showing all the successes and failures. Note that
all failures come from testMultiValuedTrie -- apparently this bug doesn't affect
single-valued Trie fields.
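For anyone repeating the run, a failure rate like the one above can also be tallied from exit statuses instead of by reading the log. This is only a minimal sketch: {{count_failures}} and the {{LOG}} variable are hypothetical names, and it assumes the command being repeated (here the ant invocation) exits non-zero when a test fails.

```shell
# Hypothetical helper: run a command N times and report how many runs failed.
# Assumption: the command exits non-zero on failure.
count_failures() {
  runs=$1; shift
  failed=0
  i=1
  while [ "$i" -le "$runs" ]; do
    # Append each run's output to the log file (log.txt unless LOG is set);
    # a non-zero exit status counts as one failed run.
    if ! "$@" >> "${LOG:-log.txt}" 2>&1; then
      failed=$((failed + 1))
    fi
    i=$((i + 1))
  done
  echo "$failed/$runs runs failed"
}

# Usage, with the same ant invocation as in the loop above:
# count_failures 100 ant test -Dtestcase=TestTrieFacet -Dtests.verbose=true
```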
As you can see, I used "-Dtests.verbose=true" when running these tests, hoping
I could find some pattern in both the successes and failures related to the
codecs -- but nothing jumps out at me.
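The duplicate "0" entries described in the issue can be spotted mechanically in a saved response or test log. The following is a rough sketch, not part of the attached patch: {{duplicate_facet_values}} is a hypothetical helper, and it assumes one {{<int name="...">}} entry per line and input covering a single facet field (values repeated across different fields would otherwise be reported as well).

```shell
# Hypothetical helper: read facet output on stdin and print every facet value
# that appears more than once among the <int name="..."> entries.
# Assumptions: one <int> entry per line, a single facet field in the input.
duplicate_facet_values() {
  # Extract the value of the name attribute, then keep only repeated values.
  sed -n 's/.*<int name="\([^"]*\)">[0-9]*<\/int>.*/\1/p' | sort | uniq -d
}

# Usage:
# duplicate_facet_values < log.txt
```

On the facet_fields excerpt quoted in the issue description, this would print the single value 0.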
> Faceting on multivalued Trie fields with precisionStep != 0 can produce bogus
> value="0" in some situations
> ----------------------------------------------------------------------------------------------------------
>
> Key: SOLR-7631
> URL: https://issues.apache.org/jira/browse/SOLR-7631
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
> Attachments: SOLR-7631_test.patch, log.tgz
>
>
> Working through SOLR-7605, I've confirmed that the underlying problem exists
> for regular {{facet.field}} situations, regardless of distrib mode, for Trie
> fields that have a non-zero precisionStep -- there's still some other missing
> piece of the puzzle I haven't figured out, but it relates in some way to some
> of the randomized factors we use in our tests (Codec? PostingFormat? ... no idea)
> The problem, when it manifests, is that faceting on a TrieIntField, using
> {{facet.mincount=0}}, causes the facet results to include three instances of
> the facet value "0" listed with a count of "0" -- even though no document in
> the index contains this value at all...
> {noformat}
> [junit4] > <lst name="facet_fields">
> [junit4] > <lst name="foo_ti">
> [junit4] > <int name="20">32</int>
> ...
> [junit4] > <int name="50">21</int>
> [junit4] > <int name="0">0</int>
> [junit4] > <int name="0">0</int>
> [junit4] > <int name="0">0</int>
> {noformat}
> This is concerning for a few reasons:
> * In the case of PivotFaceting, getting duplicate values back from a single
> shard like this triggers an assert in distributed queries and the request
> fails -- even if asserts aren't enabled, the bogus "0" value can be
> propagated to clients if they ask for facet.pivot.mincount=0
> * Client code expecting a single (value,count) pair for each value may
> equally be confused/broken by this response where the same "value" is
> returned multiple times
> * w/o knowing the root cause, it seems very possible that other nonsense
> values may be getting returned -- ie: if the error only happens with fields
> utilizing precisionStep, then it's likely related to the synthetic values
> used for faster range queries, and other synthetic values may be getting
> included with bogus counts
> A Patch with a simple test that can demonstrate the bug fairly easily will be
> attached shortly