I'm not sure I fully understand the scenario you envision. Are you saying you want to have some sort of window that batches (and deduplicates) up until a downstream map has finished processing the previous deduplicated batch, and then the window should emit the new batch?
If that's what you want, then I would say no, that implementation strategy is not a good fit for Flink (at present). With more understanding of your functional requirements, we might be able to suggest an approach that would be a better fit.

David

On Fri, Aug 16, 2019 at 10:58 PM Steven Nelson <snel...@sourceallies.com> wrote:
>
> Hello!
>
> I think I know the answer to this, but I thought I would go ahead and ask.
>
> We have a process that emits messages to our stream. These messages can
> include duplicates based on a certain key (we'll call it TheKey). Our Flink
> job reads the messages, keys by TheKey, and enters a window function. Right
> now we are using an EventTime session window with a custom aggregator, which
> keeps only the most recent message for each TheKey. It then emits those
> messages to a map function that does work based on the message.
>
> What we would like is a window function that keeps building up until the
> downstream map function has completed for the window's key, and then emits.
>
> Is this a pattern that Flink supports?
>
> -Steve
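For readers following the thread: the deduplication step Steven describes (keep only the most recent message per TheKey) can be sketched as plain logic, independent of Flink's windowing APIs. This is an illustrative sketch only; the `Message` class and field names are hypothetical, not taken from the actual job, and in a real Flink pipeline this fold would live inside a custom `AggregateFunction` applied to the keyed session window.

```java
import java.util.*;

public class DedupSketch {

    // Hypothetical message shape: a dedup key, an event timestamp, and a payload.
    static final class Message {
        final String theKey;
        final long timestamp;
        final String payload;
        Message(String theKey, long timestamp, String payload) {
            this.theKey = theKey;
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    // The core of the aggregator: fold each incoming message into a per-key
    // map, keeping only the message with the latest event timestamp.
    static Map<String, Message> dedupLatest(List<Message> batch) {
        Map<String, Message> latest = new HashMap<>();
        for (Message m : batch) {
            latest.merge(m.theKey, m,
                (old, neu) -> neu.timestamp >= old.timestamp ? neu : old);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Message> batch = Arrays.asList(
            new Message("a", 1L, "first"),
            new Message("a", 3L, "third"),
            new Message("b", 2L, "second"));
        Map<String, Message> out = dedupLatest(batch);
        System.out.println(out.get("a").payload); // latest message for key "a"
        System.out.println(out.get("b").payload); // only message for key "b"
    }
}
```

What Flink does not offer out of the box is the second half of the request: holding the window open until a downstream operator signals completion. Operators in a Flink dataflow have no built-in back-channel from a downstream map to an upstream window, which is why the reply above suggests rethinking the approach rather than forcing this pattern.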