Hello, I recently worked on adding a bundle retry for the Python SDK DirectRunner ( https://issues.apache.org/jira/browse/BEAM-2718).
The goal was to have a more reliable processing of bundles. The change included having any bundle retry be processed up to 4 times and making sure GroupByKey doesn't do partial write-outs in case of a retry. For 2.2.0, it will remain in opt-in mode with a message in case of failure, suggesting the use of the --direct_runner_bundle_retry flag. In our next release, it will be fully integrated (opt-in removed). If you have any questions about the addition, please let me know. Best, María
