keith-turner commented on pull request #2368:
URL: https://github.com/apache/accumulo/pull/2368#issuecomment-983993160


   >I am going to take a look at the Apache Datasketches project, starting with 
the quantiles in ItemsSketch: 
https://datasketches.apache.org/api/java/snapshot/apidocs/org/apache/datasketches/quantiles/ItemsSketch.html#getQuantiles-int-
   
   @milleruntime, the javadoc for the above says:
   
   > evenlySpaced - an integer that specifies the number of evenly spaced 
fractional ranks. This must be a positive integer greater than 1. A value of 2 
will return the min and the max value. A value of 3 will return the min, the 
median and the max value, etc.
   
   The above behavior is not what we want. If a user requests one split, we 
probably want the source row that is around 50% of the way through (a value of 
1 is unsupported by the sketch). If a user requests 2 splits, we probably want 
the source rows that are around 33% and 66% of the way through (not the min and 
max returned by the sketch). If a user requests 3 splits, then we probably want 
the source rows that are around 25%, 50%, and 75% of the way through (not the 
min, median, and max returned by the sketch).
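   
   To make the difference concrete, here is a rough sketch of the fractional 
ranks we would want for N splits: rank i/(N+1) for i = 1..N, which skips the min 
(0.0) and the max (1.0). The class and method names (`SplitRanks`, 
`splitRanks`) are made up for illustration and are not part of Accumulo or 
DataSketches. I am assuming the resulting array could then be handed to the 
sketch's `getQuantiles(double[])` overload instead of `getQuantiles(int)`, but I 
have not verified that against the version we would depend on.
   
```java
import java.util.Arrays;

// Hypothetical helper, not part of Accumulo or DataSketches.
public class SplitRanks {

  // Returns numSplits fractional ranks strictly between 0 and 1:
  // 1/(N+1), 2/(N+1), ..., N/(N+1).
  static double[] splitRanks(int numSplits) {
    if (numSplits < 1) {
      throw new IllegalArgumentException("numSplits must be >= 1");
    }
    double[] ranks = new double[numSplits];
    for (int i = 0; i < numSplits; i++) {
      ranks[i] = (i + 1) / (double) (numSplits + 1);
    }
    return ranks;
  }

  public static void main(String[] args) {
    // 1 split -> the median; 2 splits -> about 33% and 66%;
    // 3 splits -> 25%, 50% and 75%.
    for (int n = 1; n <= 3; n++) {
      System.out.println(n + " splits -> " + Arrays.toString(splitRanks(n)));
    }
    // Assumption: these ranks would then be passed to something like
    // sketch.getQuantiles(ranks) rather than sketch.getQuantiles(n).
  }
}
```
   
   Unlike the evenly spaced variant, this handles a request for a single split 
and never returns the first or last row as a split point.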
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


