Re: tumbling event time window , parallel

2019-09-02 Thread Fabian Hueske
> I would use a regular ProcessFunction, not a WindowProcessFunction. > The final WM depends on how the records were partitioned at t

RE: tumbling event time window , parallel

2019-09-02 Thread Hanan Yehudai
Hi, The paths of the files to read are distributed across all reader / source tasks and each task reads the files in order of their modification

Re: tumbling event time window , parallel

2019-08-26 Thread Fabian Hueske
> *From:* Fabian Hueske > *Sent:* Monday, August 26, 2019 12:38 PM > *To:* Hanan Yehudai > *Cc:* user@flink.apache.org > *Subject:* Re: tumbling event time window , parallel > Hi, > The paths of the files to read are distribute

RE: tumbling event time window , parallel

2019-08-26 Thread Hanan Yehudai
the WM will be the highest EVENT_TIME on my set of files. Thanks. From: Fabian Hueske Sent: Monday, August 26, 2019 12:38 PM To: Hanan Yehudai Cc: user@flink.apache.org Subject: Re: tumbling event time window , parallel Hi, The paths of the files to read are distributed across all reader

Re: tumbling event time window , parallel

2019-08-26 Thread Fabian Hueske
> *From:* Fabian Hueske > *Sent:* Monday, August 26, 2019 11:06 AM > *To:* Hanan Yehudai > *Cc:* user@flink.apache.org > *Subject:* Re: tumbling event time window , parallel > Hi, > C

RE: tumbling event time window , parallel

2019-08-26 Thread Hanan Yehudai
Subject: Re: tumbling event time window , parallel Hi, Can you share a few more details about the data source? Are you continuously ingesting files from a folder? You are correct that the parallelism should not affect the results, but there are a few things that can affect that: 1) non

Re: tumbling event time window , parallel

2019-08-26 Thread Fabian Hueske
Hi, Can you share a few more details about the data source? Are you continuously ingesting files from a folder? You are correct that the parallelism should not affect the results, but there are a few things that can affect that: 1) non-deterministic keys 2) out-of-order data with inappropriate
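
As an illustration of how out-of-order data can change the result of a tumbling event-time window, here is a minimal plain-Python sketch. It does not use the Flink API. The function name `tumbling_counts`, the parameter `max_out_of_orderness`, and the sample timestamps are all hypothetical. It assumes a simplified model in which the watermark trails the highest timestamp seen so far by a fixed bound, and a record is dropped as late once its window end is at or below the watermark.

```python
from collections import defaultdict


def tumbling_counts(timestamps, window_size, max_out_of_orderness):
    """Count records per tumbling event-time window, dropping late records.

    Simplifying assumptions (not Flink's exact rules):
    - watermark = (highest timestamp seen so far) - max_out_of_orderness
    - a record is late when its window end is <= the current watermark
    """
    counts = defaultdict(int)
    watermark = float("-inf")
    for ts in timestamps:
        window_start = ts - (ts % window_size)
        window_end = window_start + window_size
        if window_end <= watermark:
            continue  # the window has already fired; drop the late record
        counts[window_start] += 1
        watermark = max(watermark, ts - max_out_of_orderness)
    return dict(counts)


# The same five records in two different arrival orders (hypothetical data).
in_order = [1, 4, 7, 12, 15]
shuffled = [12, 1, 15, 4, 7]

print(tumbling_counts(in_order, 10, 0))
print(tumbling_counts(shuffled, 10, 0))
print(tumbling_counts(shuffled, 10, 15))
```

With a bound of 0, the shuffled order loses the three records that belong to the first window, because the watermark has already passed that window's end when they arrive. With a bound of 15, both arrival orders produce the same counts. Under this model, the arrival order of records at an operator (which partitioning and parallelism can change) matters only when the watermark runs ahead of the actual out-of-orderness of the data.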