Today, you can abbreviate arguments in Beam Python. This is generally
convenient since you can do things like specify `--r` instead of `--runner`,
and Beam will infer your intent.

Unfortunately, it also has unintended side effects. For example, specifying
`--u` will impact not just `--update`, but also
`--update_compatibility_version` (this caused the bug fixed by #34083
<https://github.com/apache/beam/pull/34083>), and specifying `--output`
like we do in most of our examples
<https://github.com/search?q=repo%3Aapache%2Fbeam+%22--output%22+language%3APython&type=code&l=Python>
will impact `--output_executable_path` as well (notably, this cannot be
fixed in an easy way like `--update_compatibility_version` since both
`--output` and `--output_executable_path` might be expecting strings).
There is no easy way to resolve this, and it only gets worse as we add
more flags over time. The `--output_executable_path` argument is
particularly relevant as we move more Python pipelines to Prism, which
depends on that arg. We could probably find a band-aid, but it would be
ugly and a temporary patch at best.
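
To make the `--output` collision concrete, here is a minimal sketch using
plain `argparse` rather than Beam's actual option classes. It assumes a
parser that only knows about `--output_executable_path`; the function name
and the `results.txt` value are made up for illustration.

```python
import argparse


def parse_with_default_abbrev(argv):
    # Hypothetical stand-in for a parser that defines only one flag.
    # allow_abbrev defaults to True, so any unique prefix of a flag matches it.
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_executable_path', default=None)
    # parse_known_args returns (namespace, list_of_unrecognized_args).
    return parser.parse_known_args(argv)


known, unknown = parse_with_default_abbrev(['--output', 'results.txt'])
# `--output` is treated as an abbreviation of `--output_executable_path`,
# so the value lands on the wrong option and nothing is left over.
print(known.output_executable_path, unknown)
```

Since this parser never sees a flag literally named `--output`, the prefix
is unambiguous from its point of view and it silently takes the value.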

To resolve this, I'm proposing making a one-time breaking change to get rid
of argument abbreviation. This will cause existing pipelines that use
this feature to no longer pick up abbreviated flags. In most cases,
this will lead to obvious changes in behavior or failures (e.g. a runner is
not specified correctly and now tries to run with the local runner), but in
some cases the issue may be more subtle. I do not think there is a great
way around this.
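
As a rough illustration of the behavior change, the sketch below uses
`argparse`'s `allow_abbrev` parameter on a toy parser. This is an
assumption about how abbreviation could be turned off in general, not a
description of what the PR does; the helper name and the `DirectRunner`
default are hypothetical.

```python
import argparse


def parse_runner(argv, allow_abbrev):
    # Toy parser with a single flag; the default value is a placeholder
    # standing in for "the local runner".
    parser = argparse.ArgumentParser(allow_abbrev=allow_abbrev)
    parser.add_argument('--runner', default='DirectRunner')
    return parser.parse_known_args(argv)


argv = ['--r', 'DataflowRunner']

# With abbreviation on, `--r` is accepted as `--runner`.
with_abbrev, leftover_on = parse_runner(argv, allow_abbrev=True)

# With abbreviation off, `--r` is not recognized: the default runner is
# kept and the abbreviated flag ends up in the unrecognized list.
without_abbrev, leftover_off = parse_runner(argv, allow_abbrev=False)

print(with_abbrev.runner, leftover_on)
print(without_abbrev.runner, leftover_off)
```

The second call is the "runner is not specified correctly" case: nothing
fails at parse time, and the pipeline just falls back to the default.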

I'd like to get thoughts - does anyone have objections or other ideas on
how we can handle this gracefully? I have
https://github.com/apache/beam/pull/34934 as a WIP PR to do this.

Thanks,
Danny
