GitHub user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r38619843
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/utils/DataSetUtils.java ---
@@ -248,6 +251,58 @@ public void mapPartition(Iterable<T> values, Collector<Tuple2<Long, T>> out) thr
             input.getType(), sampleInCoordinator, callLocation);
     }
+    /**
+     * Creates a {@link org.apache.flink.api.common.accumulators.DiscreteHistogram} from the data set
+     *
+     * @param data Discrete valued data set
+     * @return A histogram over data
+     */
+    public static DataSet<DiscreteHistogram> createDiscreteHistogram(DataSet<Double> data) {
+        return data.mapPartition(new RichMapPartitionFunction<Double, DiscreteHistogram>() {
+            @Override
+            public void mapPartition(Iterable<Double> values, Collector<DiscreteHistogram> out)
+                    throws Exception {
+                DiscreteHistogram histogram = new DiscreteHistogram();
+                for (double value : values) {
+                    histogram.add(value);
+                }
+                out.collect(histogram);
+            }
+        }).reduce(new ReduceFunction<DiscreteHistogram>() {
+            @Override
+            public DiscreteHistogram reduce(DiscreteHistogram value1, DiscreteHistogram value2) throws Exception {
+                value1.merge(value2);
+                return value1;
+            }
+        });
+    }
+
+    /**
+     * Creates a {@link org.apache.flink.api.common.accumulators.DiscreteHistogram} from the data set
+     *
+     * @param data Discrete valued data set
+     * @param bins Number of bins in the histogram
+     * @return A histogram over data
+     */
+    public static DataSet<ContinuousHistogram> createContinuousHistogram(DataSet<Double> data, final int bins) {
+        return data.mapPartition(new RichMapPartitionFunction<Double, ContinuousHistogram>() {
+            @Override
+            public void mapPartition(Iterable<Double> values, Collector<ContinuousHistogram> out)
+                    throws Exception {
--- End diff --
Same here (unnecessary line break before `throws Exception`).