Hi Shen,

In order for this to work well with watermark tracking, we have some initial ideas on https://issues.apache.org/jira/browse/BEAM-644

Kenn

On Wed, Jun 14, 2017 at 1:34 PM, Shen Li <[email protected]> wrote:

> Hi,
>
> I saw the DoFn#getAllowedTimestampSkew has been marked as deprecated. What
> if a user does want to rewind back the timestamp without violating the
> watermark?
>
> Consider the case where there is a GroupByKey followed by a ParDo. The
> GroupByKey transform groups tuples into one-hour windows. Say, each value
> of the output iterable of the GroupByKey remembers the timestamp of when it
> is created. The ParDo finds the max value in the iterable and wants to use
> its timestamp as the output timestamp. For example, the timestamp of the
> GroupByKey output might be 11 AM, but the timestamp of the max value might
> be 10:30 AM. Is it possible for the user-defined ParDo to rewind back the
> timestamp to 10:30 AM?
>
> As the runner knows the current watermark, should there be any API for the
> runner to notify the app of the allowedTimestampSkew?
>
> Thanks,
>
> Shen
