[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/5405

Hi @aljoscha, you mentioned two points:

1. Events may arrive out of order in event-time processing.
2. We can use a WindowFunction or ProcessWindowFunction to filter out several windows by specifying the window start time and end time.

I have some different ideas:

1. When we deal with an out-of-order event-time stream, we can specify maxOutOfOrder to avoid skipping too many late elements. So when the job starts or restarts, the maxNumOfWindow to be skipped can be set to maxOutOfOrder / (the length of the tumbling window), so that late elements will not produce incorrect results. The number of windows to skip depends on the degree of out-of-orderness.
2. We need to skip several broken windows, but we don't know which windows are broken. We can only detect which window fires first; the several windows after it are broken too. The number should vary from one production job to another (according to maxOutOfOrder and the length of the window).

---
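The ratio suggested in that comment (maxOutOfOrder divided by the tumbling window length) can be sketched in plain Java. This is only an illustration, not code from the patch: the class `SkipWindowCount` and the method `windowsToSkip` are hypothetical names, and rounding the ratio up to a whole window is an assumption the comment does not state.

```java
import java.time.Duration;

// Hypothetical helper, not part of Flink or of PR #5405.
final class SkipWindowCount {

    // Number of tumbling windows that out-of-order events can still touch,
    // computed as ceil(maxOutOfOrder / windowSize).
    // Rounding up is an assumption; the comment only gives the plain ratio.
    static long windowsToSkip(Duration maxOutOfOrder, Duration windowSize) {
        if (windowSize.isZero() || windowSize.isNegative()) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        if (maxOutOfOrder.isNegative()) {
            throw new IllegalArgumentException("maxOutOfOrder must not be negative");
        }
        long lateness = maxOutOfOrder.toMillis();
        long size = windowSize.toMillis();
        return (lateness + size - 1) / size; // integer ceiling division
    }

    public static void main(String[] args) {
        // 25 s of allowed out-of-orderness with 10 s windows spans 3 windows.
        System.out.println(windowsToSkip(Duration.ofSeconds(25), Duration.ofSeconds(10)));
    }
}
```

With no out-of-orderness the result is 0, so nothing would be skipped under this sketch.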
[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/5405 I commented on the issue: https://issues.apache.org/jira/browse/FLINK-8477?focusedCommentId=16359834=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16359834 ---
[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/5405 ping @aljoscha ---
[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/5405

cc @aljoscha, please help review this patch.

![image](https://user-images.githubusercontent.com/9486140/35761522-6e00f4b8-08c4-11e8-8063-7ec015802428.png)

See the picture above. When a user chooses to run without a checkpoint to avoid catching up on data after a crash, and uses kafka#setStartFromLatest to consume the latest data, then without the skip API the job can produce broken data, which may trigger alerts in a monitoring scenario. If the user wants to skip the broken windows, they can have the choice to skip several windows after the first fire.

---
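The idea of skipping several windows after the first fire can be sketched as a small piece of bookkeeping in plain Java. This is a hedged illustration only and does not show the API proposed in the patch: `FirstWindowsSkipper` and `shouldEmit` are hypothetical names, windows are assumed to be tumbling windows identified by their start timestamp in milliseconds, and the state is assumed to be kept in memory so that it resets on every job start.

```java
// Hypothetical sketch, not the API added by PR #5405.
final class FirstWindowsSkipper {
    private final long windowsToSkip;     // how many windows to drop, counting the first fired one
    private final long windowSizeMillis;  // tumbling window length
    private boolean seenFirstFire = false;
    private long firstFiredWindowStart;

    FirstWindowsSkipper(long windowsToSkip, long windowSizeMillis) {
        if (windowsToSkip < 0 || windowSizeMillis <= 0) {
            throw new IllegalArgumentException("invalid skip configuration");
        }
        this.windowsToSkip = windowsToSkip;
        this.windowSizeMillis = windowSizeMillis;
    }

    // Returns true if the result of the window starting at windowStartMillis
    // should be emitted, false if it falls among the first windows to skip.
    boolean shouldEmit(long windowStartMillis) {
        if (!seenFirstFire) {
            seenFirstFire = true;
            firstFiredWindowStart = windowStartMillis;
        }
        // Windows that start before the first fired window are treated as broken too.
        if (windowStartMillis < firstFiredWindowStart) {
            return false;
        }
        long index = (windowStartMillis - firstFiredWindowStart) / windowSizeMillis;
        return index >= windowsToSkip;
    }
}
```

In a real job this check would presumably sit inside a ProcessWindowFunction and use the window start from its context; that wiring is left out here to keep the sketch self-contained.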