milleruntime commented on issue #2371: URL: https://github.com/apache/accumulo/issues/2371#issuecomment-982974602
I was talking with @ctubbsii about the many different use cases that can come up when splitting data evenly, and about the work I was doing for #2368. Instead of trying to write our own algorithms, maybe we can use a library. He suggested taking a look at the Apache DataSketches project for help with mathematical algorithms such as this. One type in particular that might be useful is the quantiles ItemsSketch and its getQuantiles method: https://datasketches.apache.org/api/java/snapshot/apidocs/org/apache/datasketches/quantiles/ItemsSketch.html#getQuantiles-int-
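
To make the idea concrete, here is a minimal sketch of picking evenly spaced split points from quantiles. It deliberately does not call the DataSketches API: it computes exact quantiles over a small in-memory list of row IDs, which is the result a quantile sketch would approximate for data too large to sort. The class name `EvenSplits` and the method `splitPoints` are hypothetical names used only for illustration, and the code assumes a split point is the inclusive end row of a tablet.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Accumulo or DataSketches.
final class EvenSplits {

  // Returns up to numSplits split points that divide the rows into
  // numSplits + 1 ranges of roughly equal size. Each split point is
  // treated as the inclusive end row of its range (an assumption).
  static List<String> splitPoints(List<String> rows, int numSplits) {
    List<String> result = new ArrayList<>();
    if (rows.isEmpty() || numSplits <= 0) {
      return result;
    }

    // Exact quantiles need the data sorted; a sketch would avoid this step.
    List<String> sorted = new ArrayList<>(rows);
    Collections.sort(sorted);

    int n = sorted.size();
    for (int i = 1; i <= numSplits; i++) {
      // Index of the last row that belongs to the i-th range.
      int index = (int) ((long) i * n / (numSplits + 1)) - 1;
      String candidate = sorted.get(Math.max(0, index));

      // Skip repeated split points, which occur when the data is skewed
      // or when there are fewer distinct rows than requested splits.
      if (result.isEmpty() || !result.get(result.size() - 1).equals(candidate)) {
        result.add(candidate);
      }
    }
    return result;
  }
}
```

With nine distinct rows and two requested splits, this picks the third and sixth rows in sorted order, leaving three rows in each range. Swapping the sort for a quantile sketch would trade exactness for bounded memory.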
