[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

2018-02-11 Thread Aitozi
Github user Aitozi commented on the issue:

https://github.com/apache/flink/pull/5405
  
Hi @aljoscha, you mentioned two points:
1. Events may arrive out of order in event-time processing.
2. We can use a WindowFunction or ProcessWindowFunction to filter out several 
windows by specifying the start time and end time of the window.
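
To make the second point concrete, here is a minimal sketch of the filtering decision in plain Java. The class name `WindowRangeFilter` and the skip bounds are hypothetical and not part of the PR or of Flink. Inside a ProcessWindowFunction the start and end would presumably come from `context.window().getStart()` and `context.window().getEnd()` on a `TimeWindow`.

```java
public class WindowRangeFilter {
    // Hypothetical bounds (epoch millis) of the span whose windows are treated as broken.
    private final long skipFromMillis;
    private final long skipUntilMillis;

    public WindowRangeFilter(long skipFromMillis, long skipUntilMillis) {
        this.skipFromMillis = skipFromMillis;
        this.skipUntilMillis = skipUntilMillis;
    }

    // A window [windowStart, windowEnd) is dropped when it overlaps [skipFrom, skipUntil).
    public boolean shouldEmit(long windowStart, long windowEnd) {
        boolean overlaps = windowStart < skipUntilMillis && windowEnd > skipFromMillis;
        return !overlaps;
    }

    public static void main(String[] args) {
        WindowRangeFilter filter = new WindowRangeFilter(60_000L, 180_000L);
        System.out.println(filter.shouldEmit(0L, 60_000L));        // window before the span
        System.out.println(filter.shouldEmit(60_000L, 120_000L));  // window inside the span
        System.out.println(filter.shouldEmit(180_000L, 240_000L)); // window after the span
    }
}
```

This approach requires knowing the broken span in advance, which is the limitation discussed in the points that follow.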

I have some different ideas:
1. When we deal with an out-of-order event-time stream, we may specify 
maxOutOfOrder so that not too many late elements are skipped. So when the job 
starts or restarts, the maxNumOfWindow to be skipped can be set to 
maxOutOfOrder / (the length of the tumbling window), so that the late elements 
will not produce incorrect results. The number of windows that need to be 
skipped depends on the degree of out-of-orderness.
2. We need to skip the several broken windows, but we don't know which 
windows are broken. We can just detect which window fires first; the several 
windows after it are broken too. The number should vary with the production 
setup (according to maxOutOfOrder and the length of the window).
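
As a rough illustration of the arithmetic above, the number of windows to skip could be derived as follows. The class and method names are hypothetical, and rounding up when maxOutOfOrder is not a multiple of the window length is my assumption, not something fixed by the PR.

```java
import java.time.Duration;

public class SkipWindowEstimate {
    // Returns maxOutOfOrder / windowSize, rounded up (assumption) so that a
    // partially affected window is also counted as broken.
    static long windowsToSkip(Duration maxOutOfOrder, Duration windowSize) {
        long size = windowSize.toMillis();
        if (size <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        long lateness = Math.max(0L, maxOutOfOrder.toMillis());
        return (lateness + size - 1) / size; // ceiling division on non-negative values
    }

    public static void main(String[] args) {
        // 30 s of allowed out-of-orderness with 10 s tumbling windows.
        System.out.println(windowsToSkip(Duration.ofSeconds(30), Duration.ofSeconds(10)));
        // 25 s of allowed out-of-orderness with 10 s tumbling windows.
        System.out.println(windowsToSkip(Duration.ofSeconds(25), Duration.ofSeconds(10)));
    }
}
```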


---


[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

2018-02-10 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/5405
  
I commented on the issue: 
https://issues.apache.org/jira/browse/FLINK-8477?focusedCommentId=16359834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16359834


---


[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

2018-02-09 Thread Aitozi
Github user Aitozi commented on the issue:

https://github.com/apache/flink/pull/5405
  
ping @aljoscha 


---


[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

2018-02-02 Thread Aitozi
Github user Aitozi commented on the issue:

https://github.com/apache/flink/pull/5405
  
cc @aljoscha please help review this patch.

![image](https://user-images.githubusercontent.com/9486140/35761522-6e00f4b8-08c4-11e8-8063-7ec015802428.png)
See the picture above. When a user chooses to run without a checkpoint to 
avoid catching up on data after a crash, and uses kafka#setStartFromLatest to 
consume the latest data, then without the skip API the job can produce broken 
data, which may trigger alerts in a monitoring scenario. If the user wants to 
skip the broken windows, they can choose to skip several windows after the 
first fire.
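
The "skip several windows after the first fire" behaviour can be sketched in plain Java as below. This is only an illustration under my own assumptions (tumbling windows, the first fired window counts as the first skipped one, state held in memory); the class `FirstFireSkipper` is hypothetical and is not the API added by this patch.

```java
public class FirstFireSkipper {
    private final long windowSizeMillis;
    private final int windowsToSkip;
    // Start of the first window seen after (re)start; MIN_VALUE means none seen yet.
    private long firstFiredStart = Long.MIN_VALUE;

    public FirstFireSkipper(long windowSizeMillis, int windowsToSkip) {
        this.windowSizeMillis = windowSizeMillis;
        this.windowsToSkip = windowsToSkip;
    }

    // Returns false for the first fired window and the windows that follow it,
    // windowsToSkip windows in total. Windows that start before the skip
    // boundary (including ones that fire late) are dropped as well.
    public boolean shouldEmit(long windowStart) {
        if (firstFiredStart == Long.MIN_VALUE) {
            firstFiredStart = windowStart;
        }
        long skipUntil = firstFiredStart + (long) windowsToSkip * windowSizeMillis;
        return windowStart >= skipUntil;
    }

    public static void main(String[] args) {
        // One-minute windows, drop the first two after start.
        FirstFireSkipper skipper = new FirstFireSkipper(60_000L, 2);
        System.out.println(skipper.shouldEmit(120_000L)); // first fired window, possibly partial
        System.out.println(skipper.shouldEmit(180_000L)); // second window, still skipped
        System.out.println(skipper.shouldEmit(240_000L)); // third window, emitted
    }
}
```

Because the state is not checkpointed, the skipper starts fresh on every restart, which matches the no-checkpoint scenario described above.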



---