On Wed, Jun 8, 2016 at 10:15 AM, Dan Halperin <[email protected]> wrote:
> > I thought finishBundle()
> > exists simply as best-effort indication from the runner to user some chunk
> > of records have been processed.. not part of processing guarantees. Also
> > the term "bundle" itself is fairly loosely defined (may be intentionally).
> >
>
> No, finish bundle MUST be called by a runner before it can commit any work.
> This is akin to flushing a stream before closing it -- the DoFn may have some
> elements cached or pending and if you don't call finish bundle you will not
> have fully processed or produced all the elements.

I see. finishBundle() includes the context too (e.g. the DoFn could output more
elements there). Yeah, it should be called before the runner can
commit/checkpoint.
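To make the flushing analogy concrete, here is a minimal sketch of why skipping finishBundle() would lose elements. It deliberately does not use the Beam SDK: `BatchingFn` and `ToyRunner` are hypothetical stand-ins invented for illustration, and a plain `List<String>` plays the role of the output context.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a DoFn that batches its output; NOT the Beam API.
class BatchingFn {
    private final List<String> buffer = new ArrayList<>();
    private final int batchSize;

    BatchingFn(int batchSize) {
        this.batchSize = batchSize;
    }

    // Buffers each element and emits only once a full batch has accumulated.
    void processElement(String element, List<String> output) {
        buffer.add(element);
        if (buffer.size() >= batchSize) {
            flush(output);
        }
    }

    // Emits whatever is still buffered, which is why it needs an output too.
    void finishBundle(List<String> output) {
        flush(output);
    }

    private void flush(List<String> output) {
        output.addAll(buffer);
        buffer.clear();
    }
}

// Hypothetical toy runner: processes one bundle and returns what it would commit.
class ToyRunner {
    static List<String> runBundle(BatchingFn fn, List<String> bundle, boolean callFinishBundle) {
        List<String> output = new ArrayList<>();
        for (String element : bundle) {
            fn.processElement(element, output);
        }
        if (callFinishBundle) {
            fn.finishBundle(output);
        }
        // A real runner would commit/checkpoint only after this point.
        return output;
    }
}
```

With a batch size of 2 and a five-element bundle, the last element stays in the buffer unless finishBundle() runs, so a runner that committed without calling it would silently drop that element.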
