arunpandianp commented on PR #38814: URL: https://github.com/apache/beam/pull/38814#issuecomment-4842363660
> So it seems that we now skip caching this DoFn in more cases, namely for failures in flushInternal, which was previously separate from user code execution?
>
> For the multi-key case this seems needed, because we haven't called finishBundle yet and thus have an incomplete DoFn lifecycle. However, if it is the final key (or only key) within a batch, we may not be caching DoFns as aggressively as before, even though they are still valid to use.
>
> Could we defer the final finishKey if advance will return false? DoFn construction is expensive, so I'm worried that single-key processing may be affected more than we realize in cases where checkpoints do have errors.

Done. Moved the last flushStateInternal outside the DoFn lifecycle.
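
For illustration only, here is a minimal Python sketch of the ordering discussed above. It is not the code in this PR: `FakeDoFn`, `DoFnCache`, `run_batch`, and the `flush_state` callback are hypothetical stand-ins, and the real runner is Java. The sketch assumes that a DoFn is returned to the cache only after its lifecycle completes, that non-final keys flush state inside the lifecycle, and that the flush for the final key is deferred until after finishBundle.

```python
class FakeDoFn:
    """Hypothetical stand-in for a DoFn that is expensive to construct."""

    def __init__(self):
        self.events = []

    def start_bundle(self):
        self.events.append("start_bundle")

    def process(self, key):
        self.events.append(f"process:{key}")

    def finish_bundle(self):
        self.events.append("finish_bundle")


class DoFnCache:
    """Hypothetical cache that counts how many DoFns were constructed."""

    def __init__(self):
        self.constructed = 0
        self._idle = []

    def acquire(self):
        if self._idle:
            return self._idle.pop()
        self.constructed += 1
        return FakeDoFn()

    def release(self, dofn):
        self._idle.append(dofn)


def run_batch(keys, cache, flush_state):
    """Process a batch of keys with one DoFn instance.

    Assumption: flush_state(key) may raise, e.g. on a checkpoint error.
    """
    dofn = cache.acquire()
    lifecycle_complete = False
    try:
        dofn.start_bundle()
        for i, key in enumerate(keys):
            dofn.process(key)
            if i < len(keys) - 1:
                # Non-final key: the flush happens mid-lifecycle, so a
                # failure here leaves the DoFn in an incomplete state.
                flush_state(key)
        dofn.finish_bundle()
        lifecycle_complete = True
    finally:
        # Only a DoFn with a complete lifecycle goes back to the cache.
        if lifecycle_complete:
            cache.release(dofn)
    # Final key: the flush runs outside the DoFn lifecycle, so a failure
    # here no longer prevents the DoFn from being reused.
    if keys:
        flush_state(keys[-1])
```

Under these assumptions, a flush failure on the final (or only) key still propagates to the caller, but the DoFn has already been returned to the cache. A failure on an earlier key still discards the DoFn, which matches the multi-key concern in the quoted review.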
