hjtran opened a new pull request, #28984: URL: https://github.com/apache/beam/pull/28984
Motivating Issue
----------------

The Python SDK requires users to specify a unique label for every transform. The first time a transform is applied, its default label is usually unique, but every later application of the same transform requires the user to supply a new, unique label. This is tedious, because in many scenarios the user does not care about the labels. A few scenarios I have run into:

- Writing unit tests with multiple `assert_that()` calls, which requires a unique label for each assert
- Writing a pipeline that branches and applies the same transform to each branch
- Writing a linear pipeline that reuses the same transform (e.g. `LogElements` or `Deduplicate`)

Change Summary
--------------

This change adds a new `--auto_unique_labels` standard option. The option defaults to off, so the default behavior is unchanged. When the option is set, any transform applied with a non-unique label gets a generated label with an automatically incremented suffix. For example, if `Deduplicate` is applied twice, the second application is labeled `Deduplicate_1`. Further applications of `Deduplicate` are labeled `Deduplicate_n`.

Testing
-------

Added a new unit test that covers the new behavior. The rest of the `pipeline_test.py` tests still pass except for one, `test_display_data`. It also fails on my fork's master, so I'm assuming the failure is unrelated.
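To illustrate the suffixing rule described under Change Summary, here is a minimal standalone sketch. It does not use Beam and is not the code in this PR: the `LabelRegistry` class, its `register` method, and the error message are hypothetical names made up for illustration. Only the `auto_unique_labels` option name and the `Deduplicate_1` / `Deduplicate_n` label pattern come from the description above.

```python
class LabelRegistry:
    """Hypothetical helper that tracks the labels used in one pipeline.

    Illustration only; this is not part of the Beam API.
    """

    def __init__(self, auto_unique_labels=False):
        # Mirrors the proposed option: off by default.
        self.auto_unique_labels = auto_unique_labels
        self._used = set()

    def register(self, label):
        """Return the label to use, adding a numeric suffix if allowed."""
        if label in self._used:
            if not self.auto_unique_labels:
                # Default behavior: a duplicate label is an error.
                raise RuntimeError(f"Label {label!r} is already in use")
            # Find the first free suffix: label_1, label_2, ...
            suffix = 1
            while f"{label}_{suffix}" in self._used:
                suffix += 1
            label = f"{label}_{suffix}"
        self._used.add(label)
        return label


registry = LabelRegistry(auto_unique_labels=True)
labels = [registry.register("Deduplicate") for _ in range(3)]
print(labels)  # ['Deduplicate', 'Deduplicate_1', 'Deduplicate_2']
```

With `auto_unique_labels=False`, the second `register("Deduplicate")` call in this sketch raises instead of generating a suffix, matching the unchanged default behavior.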
