leerho commented on issue #6814: [Discuss] Replacing hyperUnique as 'default' distinct count sketch
URL: https://github.com/apache/incubator-druid/issues/6814#issuecomment-454639073

Back to the main topic of this issue thread: we have discussed internally many alternatives to see if there is some way to transition "smoothly" from historically generated Druid-HLL sketches to the DataSketches-HLL (DS-HLL) sketch. There just isn't one. It is very ugly no matter how you try to do it, and you still end up with erroneous results. I would recommend that we document that the Druid-HLL sketch has serious problems and encourage users to move to the DS-HLL sketch. Users who have Druid-HLL history will just have to live with that history and either re-sketch it from the raw data (if it still exists!) or somehow create a _clean-line-in-the-sand_ and move forward with the DS-HLL sketch.

Note that the DS-HLL sketch is actively maintained and is used extensively here at Yahoo, with history that goes back several years. Not only do we have strong backward compatibility, but we also have versions of the DS-HLL sketch in C++ (with Python on the way) that are all binary compatible, using the same stored images. So you can generate DS-HLL sketches in C++ and interpret and merge them in Java (and soon Python), or vice versa. There is no other sketch library that offers that!
