I would say so, yes. But I don't consider ProcessWindowFunction to be low-level;
it's just the function to use for processing windows when you need
more information about the context.
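
As a rough illustration of the bookkeeping described in the quoted reply
(compare the watermark with the end of the window, and count firings per
window), here is a minimal, Flink-free sketch. The class and method names
(FiringTracker, onFiring, Kind) are hypothetical and not part of any Flink
API. In a real job the watermark would presumably come from the
ProcessWindowFunction's Context, and the counter would live in per-window
state instead of a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

/** Flink-free model of per-window firing bookkeeping (hypothetical names). */
class FiringTracker {
    enum Kind { EARLY, MAIN, LATE }

    // Stand-in for per-window state of a single key:
    // window max timestamp -> number of non-early firings seen so far.
    private final Map<Long, Integer> firings = new HashMap<>();

    /**
     * Classifies one firing of the window ending at windowMaxTimestamp.
     * Caveat: a window that is first created by a late element reports its
     * first firing as MAIN, since no earlier result exists for it.
     */
    Kind onFiring(long windowMaxTimestamp, long watermark) {
        if (watermark < windowMaxTimestamp) {
            // The watermark has not passed the window yet, so a custom
            // (for example count-based) trigger must have fired.
            return Kind.EARLY;
        }
        int n = firings.merge(windowMaxTimestamp, 1, Integer::sum);
        return n == 1 ? Kind.MAIN : Kind.LATE;
    }
}
```

Counting only the firings that happen once the watermark has passed the
window keeps early firings from being mistaken for the main result.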

Best,
Aljoscha

> On 14. Aug 2017, at 22:53, M Singh <mans2si...@yahoo.com> wrote:
> 
> Thanks Aljoscha for your response.
> 
> Just to clarify - the only way to handle the duplication scenario properly is 
> by using the ProcessWindowFunction - there is no high-level function for this.
> 
> Thanks again.
> 
> 
> On Wednesday, August 9, 2017 6:26 AM, Aljoscha Krettek <aljos...@apache.org> 
> wrote:
> 
> 
> Hi,
> 
> 1. You could use a ProcessWindowFunction instead of a WindowFunction. In 
> there, you can query the current watermark and thus determine why the firing 
> is happening. Also, in a ProcessWindowFunction you can keep per-window state, 
> which lets you record whether this is the first firing for a given window, 
> or the number of firings so far.
> 
> 2. This depends on whether the Trigger is purging or not. The default 
> EventTimeTrigger is not purging, meaning that all elements in the window will 
> be preserved after firing (until the watermark reaches the end of the window 
> plus the allowed lateness). You can turn this into a purging trigger using 
> PurgingTrigger.of(EventTimeTrigger.create()). You would specify this using 
> .trigger() on WindowedStream when constructing your windowed operation.
> 
> 3. It doesn't; you have to manually keep state in a ProcessWindowFunction to 
> distinguish between the different cases, as mentioned above.
> 
> 4. Currently, I'm afraid there are no examples, because this depends to a 
> large degree on the specifics of the application.
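
To make the consequence of point 2 concrete: with a non-purging trigger a
late firing carries the full window contents, so a consumer can overwrite
the previous result. With a purging trigger it carries only the elements
that arrived since the last firing, so the consumer has to add to what it
already holds. Below is a minimal, Flink-free sketch of such a consumer
for a per-window sum. WindowResultStore and its methods are hypothetical
names, and overwriting assumes that the results of one window arrive in
order.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical downstream consumer that collapses repeated window results. */
class WindowResultStore {
    // One entry per (key, window end); the string id format is arbitrary.
    private final Map<String, Long> sums = new HashMap<>();

    private static String id(String key, long windowEnd) {
        return key + "@" + windowEnd;
    }

    /** Non-purging trigger: each firing carries the full sum, so overwrite. */
    void overwrite(String key, long windowEnd, long fullSum) {
        sums.put(id(key, windowEnd), fullSum);
    }

    /** Purging trigger: each firing carries only new elements, so add. */
    void accumulate(String key, long windowEnd, long partialSum) {
        sums.merge(id(key, windowEnd), partialSum, Long::sum);
    }

    /** Returns the current sum for the window, or 0 if none was seen. */
    long get(String key, long windowEnd) {
        return sums.getOrDefault(id(key, windowEnd), 0L);
    }
}
```

Either way, the store ends up with one value per key and window. Which
method applies depends only on whether the trigger purges.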
> 
> Best,
> Aljoscha
> 
>> On 6. Aug 2017, at 18:45, M Singh <mans2si...@yahoo.com> wrote:
>> 
>> Hi Folks:
>> 
>> I am going through the Flink documentation, and it states the following:
>> 
>> "You should be aware that the elements emitted by a late firing should be 
>> treated as updated results of a previous computation, i.e., your data stream 
>> will contain multiple results for the same computation. Depending on your 
>> application, you need to take these duplicated results into account or 
>> deduplicate them."
>> 
>> I wanted to find out the following:
>> 
>> 1. How do we distinguish the late firing from the main firing?
>> 2. Does the late firing include all events or only late events?
>> 3. How does the late vs. main firing affect the associated window function?
>> 4. Are there any examples of how to handle these events and the 
>> deduplication mentioned in the documentation?
>> 
>> Thanks for your help.
>> 
>> Mans
> 
> 
> 
