[
https://issues.apache.org/jira/browse/FLINK-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351595#comment-17351595
]
Etienne Chauchot commented on FLINK-22587:
------------------------------------------
Hi [~sjwiesman], thanks for commenting. Yes, I agree that these triggering
behaviors are expected. The question is:
_In batch mode with the DataStream API, what is the proper trigger to set?_
Regarding the Table API: I already have a working pipeline using the Table API.
What I'm doing is a benchmark comparing Flink DataSet/DataStream/SQL with other
technologies such as Apache Beam, so I'd like to have a working DataStream
pipeline as well (I already have a working DataSet pipeline).
Regarding your workaround with TumblingEventTimeWindows: I am in batch mode, so
my data has no timestamps, which is why I did not try timestamp-based
triggers. That being said, I just tried it with
_DataStream.assignTimestampsAndWatermarks(WatermarkStrategy.noWatermarks());_
but the runtime complains about records having a Long.MIN_VALUE timestamp.
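
To illustrate the Long.MIN_VALUE complaint, here is a small stand-alone model
of an event-time tumbling window assigner. It is only a sketch: the class and
method names are hypothetical and this is not Flink's actual code. It assumes
that a record which never had a timestamp assigned carries Long.MIN_VALUE, and
that the assigner rejects such records. It also shows that if every record were
given the same constant timestamp (an option to explore, not verified against
Flink here), all records would fall into a single window.

```java
// Simplified stand-alone model of an event-time tumbling window assigner.
// Hypothetical names; this is NOT Flink's actual implementation.
public class TumblingWindowModel {

    // Assumption: marker carried by a record with no assigned timestamp.
    static final long NO_TIMESTAMP = Long.MIN_VALUE;

    // Returns the start of the window of length sizeMs that contains timestamp.
    // Rejects records that carry the "no timestamp" marker.
    static long windowStart(long timestamp, long sizeMs) {
        if (timestamp == NO_TIMESTAMP) {
            throw new IllegalStateException(
                    "Record has Long.MIN_VALUE timestamp (no timestamp assigned)");
        }
        // floorDiv keeps the alignment correct for negative timestamps too.
        return Math.floorDiv(timestamp, sizeMs) * sizeMs;
    }

    public static void main(String[] args) {
        long sizeMs = 60_000L;

        // Records sharing one constant timestamp all map to the same window.
        long constantTimestamp = 0L;
        System.out.println("window start: " + windowStart(constantTimestamp, sizeMs));

        // A record without a timestamp is rejected by the assigner.
        try {
            windowStart(NO_TIMESTAMP, sizeMs);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model, a bounded input whose records all share one timestamp ends up in
one window, which would only need to fire once at the end of the input.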
> Support aggregations in batch mode with DataStream API
> ------------------------------------------------------
>
> Key: FLINK-22587
> URL: https://issues.apache.org/jira/browse/FLINK-22587
> Project: Flink
> Issue Type: Bug
> Components: API / DataStream
> Affects Versions: 1.12.0, 1.13.0
> Reporter: Etienne Chauchot
> Priority: Major
>
> A pipeline like this *in batch mode* outputs no data:
> {code:java}
> stream.join(otherStream)
> .where(<KeySelector>)
> .equalTo(<KeySelector>)
> .window(GlobalWindows.create())
> .apply(<JoinFunction>)
> {code}
> Indeed, the default trigger for GlobalWindows is NeverTrigger, which never
> fires. If we set an _EventTimeTrigger_, it will fire with every element, as the
> watermark will be set to +INF (batch mode) and will pass the end of the
> global window with each new element. A _ProcessingTimeTrigger_ never fires
> either, and all elapsed-time or delta-based triggers would not be suited for
> batch.
> The same goes for _reduce()_ instead of join().
> So I guess we are missing something for batch support with DataStream.