[
https://issues.apache.org/jira/browse/SOLR-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-6968:
---------------------------
Attachment: SOLR-6968.patch
more updates...
* Added heuristics for regwidth
** slightly reduced default for stats over value sets with 32bit max
cardinality (5 vs 6)
** scaled default further based on {{cardinality=N}} float option
*** less aggressive scaling on the low end than we have with log2m:
{{cardinality=0.0}} gives "regwidth default - 1"
*** we don't want the heuristic to even generate single bit registers
(regwidth==1), instead we lean on the log2m heuristic to be where most of the
RAM vs accuracy tuning happens when a float option is used
* license files and misc precommit cleanup
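To make the regwidth notes above a bit more concrete, here is a rough sketch of that kind of heuristic. It is not the code from the patch: the class and method names are hypothetical, the linear scaling is just one plausible choice, and it assumes the float option ranges over 0.0 to 1.0 and that 8 is the largest regwidth the HLL library accepts. The only parts taken from the notes are the defaults (5 vs 6), the "default - 1" result for {{cardinality=0.0}}, and never producing regwidth==1.

```java
// Illustrative sketch only; names and the scaling formula are assumptions,
// not the implementation in SOLR-6968.patch.
final class RegwidthHeuristicSketch {

    // Assumed upper bound on regwidth (the HLL library is assumed to cap it at 8).
    static final int MAX_REGWIDTH = 8;

    // Default from the notes: 5 for value sets with a 32bit max cardinality, else 6.
    static int defaultRegwidth(boolean is32BitValueSet) {
        return is32BitValueSet ? 5 : 6;
    }

    // Scale regwidth from a float option, assumed to be in the range [0.0, 1.0].
    static int chooseRegwidth(boolean is32BitValueSet, double cardinality) {
        if (cardinality < 0.0 || cardinality > 1.0) {
            throw new IllegalArgumentException("cardinality must be in [0.0, 1.0]");
        }
        // Low end: cardinality=0.0 gives "regwidth default - 1". The max(2, ...)
        // guard spells out that single bit registers are never produced.
        int floor = Math.max(2, defaultRegwidth(is32BitValueSet) - 1);
        // Assumed linear scaling from the floor up to the maximum.
        return floor + (int) Math.round(cardinality * (MAX_REGWIDTH - floor));
    }

    public static void main(String[] args) {
        // Print the chosen regwidth at the two ends of the option range.
        for (boolean is32Bit : new boolean[] {true, false}) {
            for (double c : new double[] {0.0, 1.0}) {
                System.out.println("32bit=" + is32Bit + " cardinality=" + c
                        + " regwidth=" + chooseRegwidth(is32Bit, c));
            }
        }
    }
}
```

Because the floor stays close to the default, the float option only moves regwidth across a narrow range, so most of the RAM vs accuracy trade-off is still left to log2m.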
...i'm working on more involved randomized & distributed tests next.
> add hyperloglog in statscomponent as an approximate count
> ---------------------------------------------------------
>
> Key: SOLR-6968
> URL: https://issues.apache.org/jira/browse/SOLR-6968
> Project: Solr
> Issue Type: Sub-task
> Reporter: Hoss Man
> Attachments: SOLR-6968.patch, SOLR-6968.patch, SOLR-6968.patch,
> SOLR-6968.patch, SOLR-6968.patch
>
>
> stats component currently supports "calcDistinct" but it's terribly
> inefficient -- especially in distrib mode.
> we should add support for using hyperloglog to compute an approximate count
> of distinct values (using localparams via SOLR-6349 to control the precision
> of the approximation)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)