boyuanzz commented on pull request #12572: URL: https://github.com/apache/beam/pull/12572#issuecomment-701043637
> Thanks, it looks fine in general for me. I left several questions, ptal.
>
> My main concern is the following:
>
> * Can we have a dataloss in case of failures during record processing while an offset of this partition is already committed in parallel pipeline's branch?

That's what `Reshuffle` is for. When record processing fails, the record is not re-read from Kafka Read; instead, it is re-read from `Reshuffle`.

> Also, the tests are very needed for this feature.

I'm thinking about adding tests with a mock Kafka. Do you have suggestions or ideas around testing?
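
To illustrate the `Reshuffle` argument above, here is a minimal toy model in plain Python. It is not Beam or Kafka code: `Source`, `run`, and `flaky_upper` are hypothetical names invented for this sketch, and the "checkpoint" is just an in-memory list standing in for whatever the runner persists at `Reshuffle`. It assumes, as the reply describes, that a failed record is retried from the materialized `Reshuffle` output rather than from the source.

```python
class Source:
    """Toy stand-in for one Kafka partition (hypothetical, not a real client)."""

    def __init__(self, records):
        self.records = list(records)
        self.committed_offset = 0  # offset a restarted reader would resume from
        self.reads = 0             # number of records fetched from the source

    def read_all(self):
        out = []
        for offset, value in enumerate(self.records):
            self.reads += 1
            out.append((offset, value))
        return out

    def commit(self, offset):
        # Committing offset N means "resume from N + 1 next time".
        self.committed_offset = max(self.committed_offset, offset + 1)


def run(source, process, max_attempts=3):
    # Step 1: read from the source exactly once.
    records = source.read_all()

    # Step 2: stand-in for Reshuffle. The records are materialized here,
    # so later retries do not need to go back to the source.
    checkpoint = list(records)

    # Step 3a: the commit branch commits offsets without waiting for processing.
    for offset, _ in checkpoint:
        source.commit(offset)

    # Step 3b: the processing branch retries failures from the checkpoint.
    results = []
    for _, value in checkpoint:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(process(value, attempt))
                break
            except RuntimeError:
                if attempt == max_attempts:
                    raise
    return results


def flaky_upper(value, attempt):
    # Fails once for "b" to simulate a transient processing error.
    if value == "b" and attempt == 1:
        raise RuntimeError("transient failure")
    return value.upper()


source = Source(["a", "b", "c"])
results = run(source, flaky_upper)
```

In this model every record is still processed even though all offsets were committed before the failure, and the source is read only once. Whether a real pipeline gives the same guarantee depends on how the runner persists `Reshuffle` output, which this sketch does not model.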
