[ https://issues.apache.org/jira/browse/SOLR-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-6968:
---------------------------
    Attachment: SOLR-6968.patch

more updates...

* Added heuristics for regwidth
** slightly reduced default for stats over value sets with 32bit max 
cardinality (5 vs 6)
** scaled default further based on {{cardinality=N}} float option
*** less aggressive scaling on the low end than we have with log2m: 
{{cardinality=0.0}} gives "regwidth default - 1"
*** we don't want the heuristic to even generate single bit registers 
(regwidth==1); instead we lean on the log2m heuristic to be where most of the 
RAM vs accuracy tuning happens when a float option is used
* license files and misc precommit cleanup
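
For anyone skimming, here is a rough sketch of how the regwidth heuristic described above could look. It is not the code in the patch: the class and method names are hypothetical, the linear scaling and rounding are assumptions, and it assumes the {{cardinality=N}} float is in the range 0.0 to 1.0. Only the defaults (5 vs 6) and the "regwidth default - 1" floor come from the notes above.

```java
// Illustrative sketch only; names and the scaling formula are assumptions,
// not the implementation in SOLR-6968.patch.
class RegwidthSketch {

    // Defaults from the notes above: 5 when the value set has a
    // 32bit max cardinality, 6 otherwise.
    static int defaultRegwidth(boolean is32BitMaxCardinality) {
        return is32BitMaxCardinality ? 5 : 6;
    }

    // Assumed linear scaling between (default - 1) and default.
    // cardinality=0.0 gives "regwidth default - 1", so the result
    // can never drop to a single bit register (regwidth==1).
    static int regwidthFor(double cardinality, boolean is32BitMaxCardinality) {
        if (cardinality < 0.0 || cardinality > 1.0) {
            throw new IllegalArgumentException(
                "cardinality must be between 0.0 and 1.0: " + cardinality);
        }
        int max = defaultRegwidth(is32BitMaxCardinality);
        int min = max - 1;
        return min + (int) Math.round(cardinality * (max - min));
    }

    public static void main(String[] args) {
        System.out.println(regwidthFor(0.0, true));
        System.out.println(regwidthFor(1.0, false));
    }
}
```

With such a narrow regwidth range, most of the RAM vs accuracy trade-off still has to come from log2m, which matches the intent described above.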

...i'm working on more involved randomized & distributed tests next.

> add hyperloglog in statscomponent as an approximate count
> ---------------------------------------------------------
>
>                 Key: SOLR-6968
>                 URL: https://issues.apache.org/jira/browse/SOLR-6968
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch, 
> SOLR-6968.patch, SOLR-6968.patch
>
>
> stats component currently supports "calcDistinct" but it's terribly 
> inefficient -- especially in distrib mode.
> we should add support for using hyperloglog to compute an approximate count 
> of distinct values (using localparams via SOLR-6349 to control the precision 
> of the approximation)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
