AnishMahto commented on PR #56016: URL: https://github.com/apache/spark/pull/56016#issuecomment-4523874120
I left comments in the code explaining the concepts we discussed in these threads, and accepted the code suggestions.

By the way, in the near future I'm hoping to merge some automated/fuzz testing for idempotency, to provide additional signal for the idempotency argument and to guard against idempotency regressions. The idea is to compose two different `Scd1ForeachBatchHandler`s: one with a stubbed, faulty `Scd1BatchProcessor.mergeMicrobatchOntoAuxiliaryTable`, and one with the regular implementation. When both handlers run on the same generated microbatch, they should produce the same output (granted, the failing handler needs to run twice to recover from the failure).
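To make the testing idea concrete, here is a rough sketch in plain Python rather than against the real Spark classes. Everything in it is an assumption for illustration: `ToyHandler`, `FaultyToyHandler`, `upsert_latest`, `run_with_retry`, and the dict-based "tables" are hypothetical stand-ins, not the actual `Scd1ForeachBatchHandler` or `Scd1BatchProcessor` code, and the two-step "auxiliary then target" flow is only a guess at the shape of the real pipeline. The point is just the test structure: same generated microbatches, one handler that crashes partway through the auxiliary merge and is retried, and an equality check on the final state.

```python
import random


class InjectedFailure(Exception):
    """Raised by the faulty stub to simulate a crash partway through a batch."""


def upsert_latest(table, rows):
    # Toy SCD1-style upsert (assumption): per key, keep the row with the
    # highest sequence number. Re-applying the same rows changes nothing.
    for key, seq, value in rows:
        current = table.get(key)
        if current is None or seq > current[0]:
            table[key] = (seq, value)


class ToyHandler:
    """Hypothetical stand-in for the regular handler."""

    def __init__(self):
        self.auxiliary = {}  # stand-in for the auxiliary table
        self.target = {}     # stand-in for the final output

    def merge_microbatch_onto_auxiliary_table(self, rows):
        upsert_latest(self.auxiliary, rows)

    def handle(self, rows):
        self.merge_microbatch_onto_auxiliary_table(rows)
        aux_rows = [(k, s, v) for k, (s, v) in self.auxiliary.items()]
        upsert_latest(self.target, aux_rows)


class FaultyToyHandler(ToyHandler):
    """Same handler, but the auxiliary merge fails on the first attempt of each batch."""

    def __init__(self):
        super().__init__()
        self.fail_next = True

    def merge_microbatch_onto_auxiliary_table(self, rows):
        if self.fail_next:
            self.fail_next = False
            # Write only half of the batch, then crash, leaving partial state behind.
            upsert_latest(self.auxiliary, rows[: len(rows) // 2])
            raise InjectedFailure("simulated failure during auxiliary merge")
        upsert_latest(self.auxiliary, rows)
        self.fail_next = True  # arm the failure again for the next batch


def run_with_retry(handler, rows, max_attempts=2):
    # Re-run the same microbatch after an injected failure, as a retry would.
    for attempt in range(max_attempts):
        try:
            handler.handle(rows)
            return
        except InjectedFailure:
            if attempt == max_attempts - 1:
                raise


def generate_batch(rng):
    # Random (key, sequence, value) rows; the value is derived from key and
    # sequence so that duplicate rows are always identical.
    rows = []
    for _ in range(rng.randint(1, 8)):
        key = rng.choice("abcde")
        seq = rng.randrange(100)
        rows.append((key, seq, f"{key}@{seq}"))
    return rows


def fuzz_idempotency(num_seeds, batches_per_run=5):
    # Returns the number of seeds for which both handlers agreed.
    for seed in range(num_seeds):
        rng = random.Random(seed)
        regular, faulty = ToyHandler(), FaultyToyHandler()
        for _ in range(batches_per_run):
            batch = generate_batch(rng)
            regular.handle(batch)
            run_with_retry(faulty, batch)
            assert regular.target == faulty.target, f"divergence at seed {seed}"
    return num_seeds


if __name__ == "__main__":
    fuzz_idempotency(200)
```

Seeding the generator per run keeps any divergence reproducible from the seed in the assertion message. The real test would presumably compare table contents produced by the actual handlers instead of dicts.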
