pdeva opened a new issue #7337: DataSketches HLL is not a replacement for Cardinality aggregator
URL: https://github.com/apache/incubator-druid/issues/7337

### Description

The `Cardinality` aggregator is deprecated in 0.14 in favor of `DataSketches HLL`. However, `DataSketches HLL` requires you to add the aggregation at ingestion time, which severely limits its usage. What if, after you have already been ingesting data, you decide you want to calculate the cardinality of some columns? You can do that with the `Cardinality` aggregator.

### Motivation

`DataSketches HLL` requires you to specify, at ingestion time, the columns to calculate cardinality over. This limits its usage if you later decide to calculate cardinality over a different column in your application.

Since the `Cardinality` aggregator has no such limitation, it should _not_ be deprecated and should be kept alongside `DataSketches HLL`. Those who want ultra-fast cardinality queries over specific columns can pre-specify them during ingestion. For everyone else, the `Cardinality` aggregator would provide a good fallback.
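To illustrate the query-time usage described above, here is a minimal sketch that builds a native timeseries query using the `cardinality` aggregator on a column that was ingested as a plain dimension. The helper name `cardinality_query`, the datasource `events`, the column `user_id`, and the interval are hypothetical placeholders, not taken from this issue.

```python
import json


def cardinality_query(data_source, column, interval):
    """Build a Druid native timeseries query that estimates the number of
    distinct values in `column` at query time (no ingestion-time setup)."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,   # hypothetical datasource name
        "granularity": "all",        # one result row for the whole interval
        "intervals": [interval],
        "aggregations": [
            {
                "type": "cardinality",
                "name": f"distinct_{column}",
                "fields": [column],  # any already-ingested dimension
                "byRow": False,      # count distinct values, not distinct rows
            }
        ],
    }


# Hypothetical example values, chosen only for illustration.
query = cardinality_query("events", "user_id", "2019-03-01/2019-04-01")
print(json.dumps(query, indent=2))
```

The resulting JSON would be sent as the body of a native query request. Nothing about `user_id` needs to be declared in the ingestion spec beyond it being a dimension, which is the flexibility this issue asks to preserve.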
