je-ik commented on issue #28554:
URL: https://github.com/apache/beam/issues/28554#issuecomment-1746498546
Yes, if the final checkpoint fails (for whatever reason), then the retry can
yield a different result than the first run. I think we have only two options:
a) best-effort, i.e. flushing our buffer in `flushData`, which is what it
was intended for, or
b) fail the pipeline, as draining is incompatible with stable DoFns.
Currently, we do b), which is actually semantically correct. We can change
the exception to be more explanatory or, even better, throw the exception
unconditionally if the DoFn requires stable input (currently it might fail
non-deterministically).
Because the choice between a) and b) might be up to the user, we might
introduce a flag in `FlinkPipelineOptions` that lets users opt in to switching
from b) to a).
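
To make the two options concrete, here is a minimal, self-contained sketch of
how such a switch could behave. It does not use Beam or Flink classes: the
names `DrainPolicySketch`, `DrainMode`, `BufferingOperator` and `onDrain` are
hypothetical and only model the behaviour described above; `onDrain` loosely
stands in for the point where `flushData` would be called.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Beam's Flink runner. */
class DrainPolicySketch {

  /** The two options from the comment: a) best-effort flush, b) fail. */
  enum DrainMode { BEST_EFFORT_FLUSH, FAIL_PIPELINE }

  /** Stand-in for an operator that buffers input until a checkpoint completes. */
  static final class BufferingOperator {
    private final boolean requiresStableInput;
    private final DrainMode mode;
    private final List<String> buffer = new ArrayList<>();
    private final List<String> emitted = new ArrayList<>();

    BufferingOperator(boolean requiresStableInput, DrainMode mode) {
      this.requiresStableInput = requiresStableInput;
      this.mode = mode;
    }

    /** Elements of a stable-input DoFn are held back; others pass through. */
    void process(String element) {
      if (requiresStableInput) {
        buffer.add(element);
      } else {
        emitted.add(element);
      }
    }

    /** Called on drain, when no further successful checkpoint is guaranteed. */
    void onDrain() {
      if (!requiresStableInput) {
        return; // nothing was buffered, draining is safe
      }
      if (mode == DrainMode.FAIL_PIPELINE) {
        // Option b): fail unconditionally, even with an empty buffer,
        // so the outcome does not depend on timing.
        throw new IllegalStateException(
            "Draining is incompatible with a DoFn that requires stable input");
      }
      // Option a): best effort, emit whatever is buffered without
      // the stability guarantee.
      emitted.addAll(buffer);
      buffer.clear();
    }

    List<String> emitted() {
      return emitted;
    }
  }
}
```

With `FAIL_PIPELINE` as the default, existing behaviour would be preserved and
users would have to choose the weaker guarantee explicitly.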