mosche commented on PR #25817: URL: https://github.com/apache/beam/pull/25817#issuecomment-1482345878
@hemantsk I'm wondering: doesn't this significantly change processing semantics? If you retry at the DoFn level, you might generate duplicates in an uncontrollable way, which would be particularly bad with exactly-once sources. In particular, there might be partial failures, since a runner can output multiple elements and an error can happen midway through. To prevent that, you could of course buffer outputs before passing them on. However, doing that could quickly impact performance, especially with splittable ParDos.
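To make the concern concrete, here is a minimal plain-Python sketch. It is not Beam code and not what the PR implements; `retry_naive`, `retry_buffered`, and `make_flaky_fn` are hypothetical names used only for illustration. It assumes a processing function that emits two outputs per input and fails after the first output on its first call.

```python
from typing import Callable, List

Emit = Callable[[str], None]
ProcessFn = Callable[[str, Emit], None]


def retry_naive(fn: ProcessFn, element: str, emit: Emit, max_attempts: int = 3) -> None:
    """Retry fn while letting it emit downstream directly on every attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            fn(element, emit)
            return
        except RuntimeError:
            if attempt == max_attempts:
                raise


def retry_buffered(fn: ProcessFn, element: str, emit: Emit, max_attempts: int = 3) -> None:
    """Retry fn, but hold its outputs back until an attempt fully succeeds."""
    for attempt in range(1, max_attempts + 1):
        buffer: List[str] = []  # fresh buffer per attempt; dropped on failure
        try:
            fn(element, buffer.append)
        except RuntimeError:
            if attempt == max_attempts:
                raise
            continue
        for out in buffer:  # flush only after the attempt completed
            emit(out)
        return


def make_flaky_fn(fail_times: int) -> ProcessFn:
    """Hypothetical stand-in for a DoFn: emits two outputs, failing midway
    through on its first `fail_times` calls."""
    calls = {"n": 0}

    def fn(element: str, emit: Emit) -> None:
        calls["n"] += 1
        emit(f"{element}-a")
        if calls["n"] <= fail_times:
            raise RuntimeError("transient failure after partial output")
        emit(f"{element}-b")

    return fn


naive_out: List[str] = []
retry_naive(make_flaky_fn(1), "x", naive_out.append)

buffered_out: List[str] = []
retry_buffered(make_flaky_fn(1), "x", buffered_out.append)

print(naive_out)     # ['x-a', 'x-a', 'x-b'] -> 'x-a' is duplicated downstream
print(buffered_out)  # ['x-a', 'x-b']
```

The buffered variant avoids the duplicate, but it has to hold every output of an attempt in memory until that attempt finishes, which is the performance cost mentioned above.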
