To support range partitioning for native parallel batch indexing, I’m 
considering moving DataSketches from extensions to core (see 
https://github.com/apache/incubator-druid/issues/8769 for details). Having 
DataSketches in core would also allow us to switch usages of 
HyperLogLogCollector to the better HLL implementation available in 
DataSketches. One drawback is that moving DataSketches to core could 
block the work to upgrade DataSketches to the latest version: 
https://github.com/apache/incubator-druid/pull/8647

Any other thoughts on the pros/cons?

Thanks,
Chi
