GitHub user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/1581#discussion_r52496326
--- Diff:
flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java
---
@@ -105,119 +110,127 @@ public static void main(String[] args) throws Exception {
.groupBy(0).reduce(new CentroidAccumulator())
// compute new centroids from point counts and coordinate sums
.map(new CentroidAverager());
-
+
// feed new centroids back into next iteration
DataSet<Centroid> finalCentroids = loop.closeWith(newCentroids);
-
+
DataSet<Tuple2<Integer, Point>> clusteredPoints = points
- // assign points to final clusters
- .map(new SelectNearestCenter()).withBroadcastSet(finalCentroids, "centroids");
-
+ // assign points to final clusters
+ .map(new SelectNearestCenter()).withBroadcastSet(finalCentroids, "centroids");
+
// emit result
- if (fileOutput) {
- clusteredPoints.writeAsCsv(outputPath, "\n", " ");
+ if (params.has("output")) {
+ clusteredPoints.writeAsCsv(params.get("output"), "\n", " ");
// since file sinks are lazy, we trigger the execution explicitly
env.execute("KMeans Example");
- }
- else {
+ } else {
clusteredPoints.print();
--- End diff --
Add a message saying something like "Printing result to std-out. Use
--output to specify output path."
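
For illustration, the suggestion could look roughly like the sketch below. It is self-contained rather than a patch against KMeans.java: `OutputHintSketch`, `parseArgs`, and `outputHint` are hypothetical names, and the `Map` is only a stand-in for the `params` object used in the diff. The message text is the one proposed above.

```java
import java.util.HashMap;
import java.util.Map;

class OutputHintSketch {

    // Hypothetical stand-in for the example's parameter parsing:
    // collects "--key value" pairs from the command line.
    static Map<String, String> parseArgs(String[] args) {
        Map<String, String> params = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].startsWith("--")) {
                String key = args[i].substring(2);
                boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("--");
                params.put(key, hasValue ? args[++i] : "");
            }
        }
        return params;
    }

    // Returns the hint to show before printing to std-out,
    // or null when an output path was given.
    static String outputHint(Map<String, String> params) {
        if (params.containsKey("output")) {
            return null;
        }
        return "Printing result to std-out. Use --output to specify output path.";
    }

    public static void main(String[] args) {
        Map<String, String> params = parseArgs(args);
        String hint = outputHint(params);
        if (hint != null) {
            // In KMeans.java this line would go right before clusteredPoints.print().
            System.out.println(hint);
        } else {
            // In KMeans.java this branch calls writeAsCsv(...) and env.execute(...).
            System.out.println("Writing result to " + params.get("output"));
        }
    }
}
```

Run without arguments, the sketch prints the hint; with `--output <path>` it takes the file branch instead. In the actual example, only the single `System.out.println` in the else branch would be needed.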
---