johnyangk commented on a change in pull request #151: [NEMO-267] Consider
watermark holds in GroupByKeyAndWindowDoFnTransform
URL: https://github.com/apache/incubator-nemo/pull/151#discussion_r231353759
##########
File path:
compiler/frontend/beam/src/main/java/org/apache/nemo/compiler/frontend/beam/transform/GroupByKeyAndWindowDoFnTransform.java
##########
@@ -314,4 +345,31 @@ public TimerInternals timerInternalsForKey(final K key) {
return stateAndTimerForKey.timerInternals;
}
}
+
+ /**
+ * This class wraps the output collector to track the watermark hold of each key.
+ */
+ final class GBKWOutputCollector implements OutputCollector<WindowedValue<KV<K, Iterable<InputT>>>> {
+ private final OutputCollector<WindowedValue<KV<K, Iterable<InputT>>>> outputCollector;
+ GBKWOutputCollector(final OutputCollector<WindowedValue<KV<K, Iterable<InputT>>>> outputCollector) {
+ this.outputCollector = outputCollector;
+ }
+
+ @Override
+ public void emit(final WindowedValue<KV<K, Iterable<InputT>>> output) {
+ // adds the output timestamp to the watermark hold of each key
+ // +1 to the output timestamp because if the window is [0-5000), the timestamp is 4999
Review comment:
For an element to belong to the window [0, 5000), its timestamp must be one
of 0, 1, ..., 4999.
If you add 1 to an element whose timestamp is 4999, wouldn't that move the
element into a different window, [5000, 10000)?
Please let me know if I'm missing something here.
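
To make the concern concrete, here is a minimal sketch of fixed-window assignment using plain arithmetic. It assumes non-overlapping 5000 ms windows aligned at 0. The class and method names (FixedWindowDemo, windowStart, maxTimestamp) are hypothetical and are not part of Beam or Nemo; this only illustrates the boundary case and is not how the transform computes its holds.

```java
public final class FixedWindowDemo {
  private FixedWindowDemo() {
  }

  /** Inclusive start of the fixed window (of the given size) that contains the timestamp. */
  public static long windowStart(final long timestampMillis, final long sizeMillis) {
    return Math.floorDiv(timestampMillis, sizeMillis) * sizeMillis;
  }

  /** Largest timestamp that still belongs to the window, since the window end is exclusive. */
  public static long maxTimestamp(final long windowStartMillis, final long sizeMillis) {
    return windowStartMillis + sizeMillis - 1;
  }

  public static void main(final String[] args) {
    final long size = 5000L; // hypothetical window size taken from the [0, 5000) example
    for (final long ts : new long[] {0L, 4999L, 5000L}) {
      final long start = windowStart(ts, size);
      System.out.println(ts + " -> [" + start + ", " + (start + size) + ")");
    }
    // prints:
    // 0 -> [0, 5000)
    // 4999 -> [0, 5000)
    // 5000 -> [5000, 10000)
  }
}
```

Under these assumptions, 4999 + 1 = 5000 is assigned to [5000, 10000), which is the case I am asking about.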