stevenzwu commented on code in PR #5348:
URL: https://github.com/apache/iceberg/pull/5348#discussion_r938434611


##########
api/src/main/java/org/apache/iceberg/metrics/Histogram.java:
##########
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.metrics;
+
+public interface Histogram {
+  /** Update the histogram with a new value observed. */
+  void update(long value);
+
+  /** Return the number of observations. */
+  long count();
+
+  /** Calculate the statistics of the observed values. */
+  Statistics statistics();
+
+  interface Statistics {
+    /**
+     * Return the number of values that the statistics computation is based on.
+     *
+     * <p>If the number of sampling is less than the reservoir size, the 
sampling count should be
+     * returned. Otherwise, the reservoir size is returned.
+     */
+    long size();
+
+    /** Returns the mean value of the histogram observations. */
+    double mean();
+
+    /** Returns the standard deviation of the histogram distribution. */
+    double stdDev();
+
+    /** Returns the maximum value of the histogram observations. */
+    long max();
+
+    /** Returns the minimum value of the histogram observations. */
+    long min();
+
+    /**
+     * Returns the percentile value based on the histogram statistics.
+     *
+     * @param percentile percentile point in double. E.g., 0.75 means 75 
percentile. It is up to the
+     *     implementation to decide what valid percentile points are supported.
+     * @return Value for the given percentile
+     */
+    double percentile(double percentile);

Review Comment:
   because we do `interpolation` when position (`double`) is btw indexes, the 
return value needs to be `double`.
   
   ```
   return lower + (position - Math.floor(position)) * (upper - lower);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to