On Wed, Jun 8, 2016 at 10:13 AM, Ben Chambers <bchamb...@google.com.invalid>
wrote:

> - If failure occurs after finishBundle() but before the consumption is
> committed, then the bundle may be reprocessed, which leads to duplicated
> calls to processElement() and finishBundle().
>



> - If failure occurs after consumption is committed but before
> finishBundle(), then those elements which may have buffered state in the
> DoFn but not had their side-effects fully processed (since the
> finishBundle() was responsible for that) are lost.
>

I am trying to understand this better. Does this mean during
recovery/replay after a failure, the particular instance of DoFn that
existed before the worker failure would not be discarded, but might still
receive elements?  If a DoFn is caching some internal state, it should
always assume the worker its on might abruptly fail anytime and the state
would be lost, right?

Reply via email to