milleruntime commented on issue #2371:
URL: https://github.com/apache/accumulo/issues/2371#issuecomment-982974602


   I was talking with @ctubbsii about the many different use cases that can 
come up when splitting data evenly, and about the work I was doing for #2368. 
Instead of trying to write our own algorithms, maybe we can use a library. He 
suggested taking a look at the Apache DataSketches project for help with 
mathematical algorithms such as this. One type in particular that might be 
useful is the quantiles ItemsSketch, specifically its getQuantiles method: 
https://datasketches.apache.org/api/java/snapshot/apidocs/org/apache/datasketches/quantiles/ItemsSketch.html#getQuantiles-int-
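
To illustrate why evenly spaced quantiles map onto split points, here is a minimal sketch that computes them exactly from a sorted, in-memory list of rows. It deliberately does not call DataSketches. A quantiles sketch would give approximate answers from a stream in bounded memory, whereas this only shows the idea. The class `SplitPointSketch` and the method `evenSplits` are hypothetical names, not part of Accumulo or DataSketches, and the code assumes a split point is the inclusive end row of a tablet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Accumulo or DataSketches code.
class SplitPointSketch {

  // Returns up to numTablets - 1 split points that divide sortedRows into
  // roughly equal groups. Assumes sortedRows is already sorted and that each
  // split point is the last row (inclusive) of its tablet.
  static List<String> evenSplits(List<String> sortedRows, int numTablets) {
    if (numTablets < 2 || sortedRows.size() < numTablets) {
      throw new IllegalArgumentException(
          "need numTablets >= 2 and at least numTablets rows");
    }
    int n = sortedRows.size();
    List<String> splits = new ArrayList<>();
    for (int i = 1; i < numTablets; i++) {
      // Index of the last row belonging to the i-th group.
      int idx = (int) ((long) i * n / numTablets) - 1;
      String candidate = sortedRows.get(idx);
      // Skip repeats so heavily duplicated rows do not yield identical splits.
      if (splits.isEmpty() || !splits.get(splits.size() - 1).equals(candidate)) {
        splits.add(candidate);
      }
    }
    return splits;
  }

  public static void main(String[] args) {
    List<String> rows = List.of("a", "b", "c", "d", "e", "f", "g", "h");
    System.out.println(evenSplits(rows, 4)); // prints [b, d, f]
  }
}
```

With eight rows and four tablets, the groups are (a, b), (c, d), (e, f), and (g, h). When many rows are identical, fewer split points come back than requested, which is one of the uneven-data cases a library would also need to handle.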


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

