[GitHub] [incubator-druid] gianm commented on issue #7337: DataSketches HLL is not a replacement for Cardinality aggregator

2019-04-04 Thread GitBox
gianm commented on issue #7337: DataSketches HLL is not a replacement for 
Cardinality aggregator
URL: 
https://github.com/apache/incubator-druid/issues/7337#issuecomment-480022692
 
 
   @jon-wei added some of this stuff to the docs in #7407.
   
   Btw, if you are doing a cardinality count without grouping by anything, the 
memory requirements are probably irrelevant. Either aggregator is small enough 
that, unless you need to create a lot of them, the memory use is trivial.
   
   I'm not sure whether folks with more experience will be able to chime in; 
if not, you might need to determine some of these answers yourself through 
performance testing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [incubator-druid] gianm commented on issue #7337: DataSketches HLL is not a replacement for Cardinality aggregator

2019-03-25 Thread GitBox
gianm commented on issue #7337: DataSketches HLL is not a replacement for 
Cardinality aggregator
URL: 
https://github.com/apache/incubator-druid/issues/7337#issuecomment-476359336
 
 
   > Datasketch HLL requires you to specify columns to calculate cardinality 
over during ingestion time. 
   
   Is this right? IIRC, DataSketches HLL can also work on primitive 
(number/string) columns at query time. I think the only feature that's missing 
is the `byRow` calculation.
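
   For illustration, here is a rough sketch of how the two aggregators could be 
specified at query time in a native query. The datasource name, column names, 
interval, and helper functions are hypothetical. It assumes `HLLSketchBuild` 
accepts a raw column through `fieldName` at query time, as suggested above, and 
it shows `byRow` only on the `cardinality` aggregator.

```python
import json

# Hypothetical datasource and columns, used only for illustration.
DATASOURCE = "events"
COLUMNS = ["user_id", "country"]


def hll_sketch_build(name, column):
    """DataSketches HLL aggregator spec over a single raw column."""
    return {"type": "HLLSketchBuild", "name": name, "fieldName": column}


def cardinality(name, columns, by_row=False):
    """Built-in cardinality aggregator spec; byRow counts distinct
    combinations of the listed columns instead of distinct values."""
    return {
        "type": "cardinality",
        "name": name,
        "fields": list(columns),
        "byRow": by_row,
    }


def timeseries_query(aggregations):
    """Wrap aggregator specs in a timeseries query with no grouping."""
    return {
        "queryType": "timeseries",
        "dataSource": DATASOURCE,
        "granularity": "all",
        "intervals": ["2019-03-01/2019-04-01"],
        "aggregations": aggregations,
    }


query = timeseries_query([
    hll_sketch_build("distinct_users_hll", "user_id"),
    cardinality("distinct_users_card", ["user_id"]),
    # No HLLSketchBuild counterpart is shown for this byRow case.
    cardinality("distinct_user_country_pairs", COLUMNS, by_row=True),
])

print(json.dumps(query, indent=2))
```

   The printed JSON could then be posted to a Druid broker. Only the third 
aggregator relies on `byRow`, which is the feature described above as missing 
from the DataSketches HLL aggregators.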

