[ 
https://issues.apache.org/jira/browse/SOLR-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986230#comment-14986230
 ] 

Yonik Seeley commented on SOLR-7802:
------------------------------------

{quote}
if you build 2 HLL instances, with different log2m settings, and add the exact 
same set of (raw) values to both, then the HLL with the larger log2m will give 
you more accurate results than the HLL with a smaller log2m setting.
{quote}

Is that really true for any given set of raw values, or is it just true on 
average?
These are just estimates after all, and a guarantee that the larger log2m is 
more accurate for every input set (which is seemingly what is claimed) would 
be a very difficult (and interesting) property to achieve.  At first blush, 
it seems false.

> TestDistributedStatsComponentCardinality failure
> ------------------------------------------------
>
>                 Key: SOLR-7802
>                 URL: https://issues.apache.org/jira/browse/SOLR-7802
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.3, Trunk
>            Reporter: Steve Rowe
>            Priority: Minor
>         Attachments: 
> TestDistributedStatsComponentCardinality.tests-failures.txt
>
>
> Original trunk failure on Linux: 
> [http://jenkins.sarowe.net/job/Lucene-Solr-tests-trunk/773/].  Reproduced 
> with the repro line on OS X, both with trunk/Java8 and branch_5x/java7:
> {noformat}
>   [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=TestDistributedStatsComponentCardinality -Dtests.method=test 
> -Dtests.seed=87100DE827E75E41 -Dtests.slow=true -Dtests.locale=sr_RS 
> -Dtests.timezone=Zulu -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
> {noformat}
> {noformat}
> Stack Trace:
> java.lang.AssertionError: int_i: goodEst=13957, poorEst=13970, real=13980, 
> p=q=id%3A%5B88+TO+14067%5D&rows=0&stats=true&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_int_i%7Dint_i&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_int_i_prehashed_l+hllPreHashed%3Dtrue%7Dint_i_prehashed_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_int_i%7Dint_i&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_int_i_prehashed_l+hllPreHashed%3Dtrue%7Dint_i_prehashed_l&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_long_l%7Dlong_l&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_long_l_prehashed_l+hllPreHashed%3Dtrue%7Dlong_l_prehashed_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_long_l%7Dlong_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_long_l_prehashed_l+hllPreHashed%3Dtrue%7Dlong_l_prehashed_l&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_string_s%7Dstring_s&stats.field=%7B%21cardinality%3D0.008936367747461982+key%3Dlow_string_s_prehashed_l+hllPreHashed%3Dtrue%7Dstring_s_prehashed_l&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_string_s%7Dstring_s&stats.field=%7B%21cardinality%3D0.508936367747462+key%3Dhigh_string_s_prehashed_l+hllPreHashed%3Dtrue%7Dstring_s_prehashed_l
>       at 
> __randomizedtesting.SeedInfo.seed([87100DE827E75E41:F443232891B33B9]:0)
>       at org.junit.Assert.fail(Assert.java:93)
>       at org.junit.Assert.assertTrue(Assert.java:43)
>       at 
> org.apache.solr.handler.component.TestDistributedStatsComponentCardinality.test(TestDistributedStatsComponentCardinality.java:216)
> [...]
> {noformat}



