je-ik commented on issue #24655: URL: https://github.com/apache/beam/issues/24655#issuecomment-1371955852
Sorry, I should have attached a link to the [ML thread](https://lists.apache.org/thread/2s3jx62wh0rz09dcmz816sl5y3dnq432). TL;DR: How stable input is achieved is runner-dependent, but every runner can ensure stability only at the boundary of an executable stage. If we have two DoFns, say `NonDeterministicDoFn` -> `StableDoFn`, and these two get fused into a single executable stage, then the contract of `@RequiresStableInput` for the `StableDoFn` is broken, because the runner can ensure stability only for the fused (`NonDeterministicDoFn` + `StableDoFn`) DoFn as a whole, which is executed in the harness. Retrying such a stage leads to non-stable input to the `StableDoFn` (due to the non-determinism of the leading DoFn). Therefore, fusion needs to be broken at the boundary of `@RequiresStableInput` to make sure that the executable stage starts with the stable DoFn.
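
To illustrate the fusion-breaking rule, here is a minimal toy sketch in plain Python. It does not use Beam's API or any runner's actual fuser: `Step`, `requires_stable_input`, `fuse_into_stages`, and `FormatDoFn` are hypothetical names invented for this example, and the two DoFn names mirror the ones in the comment. It assumes a linear chain of steps and a greedy fuser that starts a new stage whenever a step requires stable input.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for a DoFn in a linear pipeline."""
    name: str
    # Models the presence of @RequiresStableInput on the DoFn.
    requires_stable_input: bool = False


def fuse_into_stages(steps: List[Step]) -> List[List[Step]]:
    """Greedily fuse consecutive steps into stages.

    Fusion is broken in front of every step that requires stable input,
    so such a step is always the first one in its stage. The stage
    boundary is where a runner can make the input stable.
    """
    stages: List[List[Step]] = []
    for step in steps:
        if not stages or step.requires_stable_input:
            # Start a new executable stage at the stable-input boundary.
            stages.append([step])
        else:
            # Otherwise keep fusing into the current stage.
            stages[-1].append(step)
    return stages


pipeline = [
    Step("NonDeterministicDoFn"),
    Step("StableDoFn", requires_stable_input=True),
    Step("FormatDoFn"),  # hypothetical downstream step
]

stages = fuse_into_stages(pipeline)
stage_names = [[step.name for step in stage] for stage in stages]
print(stage_names)
# [['NonDeterministicDoFn'], ['StableDoFn', 'FormatDoFn']]
```

In this model a retry of the second stage replays the same stage input, so `StableDoFn` sees identical elements, while the non-deterministic step sits upstream of the boundary and is not re-executed as part of that retry.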
