johnyangk commented on a change in pull request #153: [NEMO-245,247] Handle watermark in OutputWriter and Implement unbounded word count example
URL: https://github.com/apache/incubator-nemo/pull/153#discussion_r232547414
 
 

 ##########
 File path: examples/beam/src/main/java/org/apache/nemo/examples/beam/WindowedWordCount.java
 ##########
 @@ -41,19 +43,64 @@
   private WindowedWordCount() {
   }
 
+  public static final String INPUT_TYPE_BOUNDED = "bounded";
+  public static final String INPUT_TYPE_UNBOUNDED = "unbounded";
+
+
+  private static PCollection<KV<String, Long>> getSource(
+    final Pipeline p,
+    final String[] args) {
+
+    final String inputType = args[2];
+    if (inputType.compareTo(INPUT_TYPE_BOUNDED) == 0) {
+      final String inputFilePath = args[3];
+      return GenericSourceSink.read(p, inputFilePath)
+        .apply(ParDo.of(new DoFn<String, String>() {
+          @ProcessElement
+          public void processElement(@Element final String elem,
+                                     final OutputReceiver<String> out) {
+            final String[] splitt = elem.split("!");
+            out.outputWithTimestamp(splitt[0], new Instant(Long.valueOf(splitt[1])));
+          }
+        }))
+        .apply(MapElements.<String, KV<String, Long>>via(new SimpleFunction<String, KV<String, Long>>() {
+          @Override
+          public KV<String, Long> apply(final String line) {
+            final String[] words = line.split(" +");
+            final String documentId = words[0] + "#" + words[1];
+            final Long count = Long.parseLong(words[2]);
+            return KV.of(documentId, count);
+          }
+        }));
+    } else if (inputType.compareTo(INPUT_TYPE_UNBOUNDED) == 0) {
+      // unbounded
+      return p.apply(GenerateSequence
+        .from(1)
+        .withRate(2, Duration.standardSeconds(1))
 
 Review comment:
   Never mind. I wrongly thought that if Duration.millis(1) were used, we might
not be able to test the case where watermarks are inserted between elements.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
