kennknowles commented on PR #30545:
URL: https://github.com/apache/beam/pull/30545#issuecomment-1981548748
One question which wasn't in the design docs was how to implement: wrap
Reshuffle (aka build a composite that just invokes Reshuffle and relies on
everything built around it) or fork. This PR chose to fork. Pro/con:
Arguments for wrapping:
- less code
- runners that implement Reshuffle specially already will implement
Redistribute the same way
- if there is something I missed in how Reshuffle is treated, it will get
picked up because we are still using it
Arguments for forking:
- decouple whatever state a runner may store, and just generally decouple
their evolution
- people won't unpack their "Redistribute" and see a Reshuffle inside and
get the wrong idea
- (minor) can remove update compatibility path
- if there is something I missed in how Redistribute and Reshuffle are
different, we are free for them to diverge
So I chose forking but could be convinced otherwise. My way creates more
code and more work, because we need to make runners treat it specially - for
Dataflow we can do way better than the existing Reshuffle translation, for the
other runners it'll be a re-use of existing Reshuffle translation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]