kennknowles commented on PR #31682:
URL: https://github.com/apache/beam/pull/31682#issuecomment-2284473496
> This change causes an error when updating a Dataflow pipeline from a
> previous version, but the update can be allowed by passing:
>
> ```
> --transformNameMapping={"KafkaIO.Read/KafkaIO.Read/KafkaIO.Read.ReadFromKafkaViaSDF/KafkaIO.ReadSourceDescriptors/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey":""}
> ```
>
> @kennknowles Is that sufficient, or should I add an option to maintain
> the previous expensive behavior? Are there any other concerns with this change?
> Thanks!

Thanks for raising this. I hate to bring it in, because it is a pain, but we
do have a mechanism where the user can pass `--updateCompatibilityVersion`, and
for this one we really ought to use it. Example at
https://github.com/apache/beam/pull/28853/files#diff-b8cf6c3051a36c566f2f28f525449f456a88b05b3b4c17c814e6a55ba2ce36e9R77

For you, the work is to keep the old code as-is in a deprecated fork of
`expand` that is activated if the user requests it. Users who value
compatibility can and should set this field to the version they launched the
pipeline with. This allows us to make update-incompatible changes to the
default codepath without breaking users with long-running pipelines who want
to upgrade the SDK for some other reason. It is fine for users who want this
new improvement to have to pass the parameter you mentioned, or even to have
to drain.
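
To make the shape of that concrete, here is a minimal, Beam-free sketch of the decision the forked `expand` would make. Everything in it is an assumption for illustration: the class and method names are hypothetical, the cutover version `2.60.0` is a placeholder rather than the release this change lands in, and versions are assumed to be plain dotted numbers (no `-SNAPSHOT` suffix). In the real transform the requested version would come from the pipeline options, not from a method argument.

```java
// Hypothetical helper, not Beam API: decides which expansion to use.
final class UpdateCompatibilitySketch {

  /** Compares dotted numeric versions such as "2.53.0" component by component. */
  static int compareVersions(String a, String b) {
    String[] left = a.split("\\.");
    String[] right = b.split("\\.");
    int n = Math.max(left.length, right.length);
    for (int i = 0; i < n; i++) {
      // Missing components count as 0, so "2.60" equals "2.60.0".
      int l = i < left.length ? Integer.parseInt(left[i]) : 0;
      int r = i < right.length ? Integer.parseInt(right[i]) : 0;
      if (l != r) {
        return Integer.compare(l, r);
      }
    }
    return 0;
  }

  /**
   * Returns true when the deprecated expansion should be used.
   *
   * @param requestedVersion value the user passed for the compatibility
   *     version, or null if the option was not set
   * @param cutoverVersion first SDK version whose default expansion changed
   */
  static boolean useLegacyExpansion(String requestedVersion, String cutoverVersion) {
    if (requestedVersion == null || requestedVersion.isEmpty()) {
      // No compatibility requested: take the new default codepath.
      return false;
    }
    return compareVersions(requestedVersion, cutoverVersion) < 0;
  }

  public static void main(String[] args) {
    String cutover = "2.60.0"; // placeholder, not the real release number
    System.out.println(useLegacyExpansion(null, cutover));     // false
    System.out.println(useLegacyExpansion("2.55.0", cutover)); // true
    System.out.println(useLegacyExpansion("2.60.0", cutover)); // false
  }
}
```

With this shape, a pipeline launched before the cutover keeps the old graph as long as the user pins the version it was launched with, and everyone else gets the new default.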