Re: [I] [Bug]: Unable to drain Flink job when RequiresStableInput is used [beam]

via GitHub Wed, 04 Oct 2023 02:33:35 -0700


je-ik commented on issue #28554:
URL: https://github.com/apache/beam/issues/28554#issuecomment-1746498546


   Yes, if the final checkpoint fails (for whatever reason), then the retry can 
yield a different result than the first run. I think we have only two options:
    a) best effort, i.e. flushing our buffer in `flushData`, which is what it 
was intended for, or
    b) fail the pipeline, as draining is incompatible with DoFns that require 
stable input.
    
   Currently, we do b), which is actually semantically correct. We can make 
the exception message more explanatory or, even better, throw the exception 
unconditionally if the DoFn requires stable input (currently it might fail 
non-deterministically).
   
   Because the choice between a) and b) might be up to the user, we might 
introduce a flag in `FlinkPipelineOptions` that lets the user opt in to 
switching from b) to a).
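
   To make the two options concrete, here is a minimal sketch of how such a 
switch could behave. It is plain Java and does not use Beam or Flink classes; 
`DrainPolicy`, `StableInputDrainSketch` and everything in it except the 
`flushData` name are hypothetical and only model the idea, they are not the 
runner's actual code or an existing `FlinkPipelineOptions` flag.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical model of an operator that buffers elements for a DoFn
 * requiring stable input. Not Beam code; names are illustrative only.
 */
class StableInputDrainSketch {

  /** Hypothetical user-facing switch between options b) and a). */
  enum DrainPolicy {
    FAIL,              // option b): draining is rejected (assumed default)
    BEST_EFFORT_FLUSH  // option a): flush the buffer without a checkpoint
  }

  private final DrainPolicy policy;
  private final boolean requiresStableInput;
  // Elements held back until a checkpoint would normally make them stable.
  private final List<String> buffer = new ArrayList<>();
  // Elements actually handed to the DoFn.
  private final List<String> emitted = new ArrayList<>();

  StableInputDrainSketch(DrainPolicy policy, boolean requiresStableInput) {
    this.policy = policy;
    this.requiresStableInput = requiresStableInput;
  }

  void processElement(String element) {
    if (requiresStableInput) {
      buffer.add(element);   // wait for a completed checkpoint
    } else {
      emitted.add(element);  // no stability requirement, pass through
    }
  }

  /** Models the hook invoked when the job is drained. */
  void flushData() {
    if (!requiresStableInput) {
      return; // nothing is buffered for ordinary DoFns
    }
    if (policy == DrainPolicy.FAIL) {
      // Thrown unconditionally, even when the buffer happens to be empty,
      // so the failure does not depend on timing.
      throw new UnsupportedOperationException(
          "Draining is incompatible with a DoFn that requires stable input");
    }
    // Best effort: a retry after a failed final checkpoint may see
    // different input, so stability is not guaranteed here.
    emitted.addAll(buffer);
    buffer.clear();
  }

  List<String> emitted() {
    return emitted;
  }
}
```

   With `FAIL`, a drain always fails for a stable-input DoFn regardless of 
what is buffered; with `BEST_EFFORT_FLUSH`, the buffered elements are emitted 
and the user accepts that a retry may produce a different result.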


