gianm commented on pull request #11201:
URL: https://github.com/apache/druid/pull/11201#issuecomment-836992606


   > This is not correct, at least for the HLL in datasketches-java (I'm not 
sure what the Druid adaptor does). Strings are encoded using UTF-8 and have 
been for as long as I can remember. If you wish to use UTF-16, you just convert 
your string to char[] and the HLL sketch will accept that as well.
   
   @leerho Understood, but it is true as far as Druid is concerned: the 
HllSketch-based aggregator implementation in Druid calls 
`update(s.toCharArray())`, not `update(s)`. See 
https://github.com/apache/druid/blob/8296123d895db7d06bc4517db5e767afb7862b83/extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/hll/HllSketchBuildAggregator.java#L103
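   
   To illustrate why the two calls are not interchangeable, here is a minimal stdlib-only sketch. It does not use datasketches at all: `EncodingDemo`, `utf8Bytes`, and `utf16Bytes` are hypothetical names, and the little-endian layout of the UTF-16 code units is an assumption made only for the demo. It just shows that the same string produces different input bytes depending on whether it is encoded as UTF-8 or passed as a `char[]`, so anything that hashes those bytes would presumably land on different values.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {
    // Bytes a hash function would see if the string is first encoded as UTF-8.
    public static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes a hash function would see if the raw UTF-16 code units from
    // toCharArray() are fed in directly (byte order chosen for illustration only).
    public static byte[] utf16Bytes(String s) {
        char[] chars = s.toCharArray();
        ByteBuffer buf = ByteBuffer.allocate(chars.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (char c : chars) {
            buf.putChar(c);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        String s = "druid";
        System.out.println(Arrays.toString(utf8Bytes(s)));   // 5 bytes, one per ASCII character
        System.out.println(Arrays.toString(utf16Bytes(s)));  // 10 bytes, two per character
        System.out.println(Arrays.equals(utf8Bytes(s), utf16Bytes(s))); // false
    }
}
```

   Even for plain ASCII the byte sequences differ, so sketches built one way would not be expected to merge meaningfully with sketches built the other way.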
   
   >  Nonetheless, whatever you decide, you will always need to stick with your 
choice.
   
   Yep, that's why this must be an option and the choice needs to be made in a 
consistent way.
   
   > I have some comments about PR 353 but I want to make these in the actual 
PR.
   
   Thanks!




