Scott Wegner created BEAM-3372:
----------------------------------
Summary: Duplicated 'zone' PipelineOption has inconsistent
documentation
Key: BEAM-3372
URL: https://issues.apache.org/jira/browse/BEAM-3372
Project: Beam
Issue Type: Bug
Components: runner-dataflow
Reporter: Scott Wegner
Assignee: Thomas Groh
Priority: Minor
Two different PipelineOptions interfaces define a 'zone' option: GcpOptions
[1] and DataflowPipelineWorkerPoolOptions [2]. It's not an error for an option to be
redefined, and internally Beam checks that the definitions are compatible.
In this case the two 'zone' definitions are compatible, but they have different
descriptions. This can be confusing, as setting one will also impact the other.
We should make improvements around duplicate PipelineOptions definitions for a
given runner. In this case, I propose we:
a) Update the @Description annotations so that they match.
b) Mark one of them as @Deprecated with a link to the other. Migrate code
references and plan to remove it in the next major version.
c) Add a test which checks all PipelineOptions on the DataflowRunner classpath
and verifies that any duplicates have the properties above (equivalent
definitions including @Description, and only one non-@Deprecated version).
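As a rough illustration of the check proposed in (c), here is a minimal, self-contained sketch using plain Java reflection. Everything in it is an assumption made for illustration: the Description annotation is a local stand-in for Beam's @Description, FirstOptions and SecondOptions are hypothetical interfaces with made-up description text, and findProblems is a hypothetical helper. A real test would need to discover the actual PipelineOptions interfaces on the DataflowRunner classpath, which this sketch does not attempt.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class DuplicateOptionCheck {

  /** Local stand-in for Beam's @Description so the sketch compiles on its own. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  public @interface Description {
    String value();
  }

  /** Hypothetical options interface; the description text is made up. */
  public interface FirstOptions {
    @Description("Zone in which to run (illustrative text A).")
    String getZone();
  }

  /** Hypothetical second interface redefining the same getter with different text. */
  public interface SecondOptions {
    @Description("Zone for worker VMs (illustrative text B).")
    String getZone();
  }

  /** Returns one message per violated rule for getters declared in more than one interface. */
  public static List<String> findProblems(List<Class<?>> optionInterfaces) {
    // Group zero-argument getters by name across all supplied interfaces.
    Map<String, List<Method>> byName = new TreeMap<>();
    for (Class<?> iface : optionInterfaces) {
      for (Method m : iface.getDeclaredMethods()) {
        if (m.getName().startsWith("get") && m.getParameterCount() == 0) {
          byName.computeIfAbsent(m.getName(), k -> new ArrayList<>()).add(m);
        }
      }
    }

    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, List<Method>> entry : byName.entrySet()) {
      List<Method> definitions = entry.getValue();
      if (definitions.size() < 2) {
        continue; // Not a duplicate, nothing to check.
      }
      Set<String> descriptions = new HashSet<>();
      int nonDeprecated = 0;
      for (Method m : definitions) {
        Description d = m.getAnnotation(Description.class);
        descriptions.add(d == null ? "" : d.value());
        if (!m.isAnnotationPresent(Deprecated.class)) {
          nonDeprecated++;
        }
      }
      if (descriptions.size() > 1) {
        problems.add(entry.getKey() + ": descriptions differ");
      }
      if (nonDeprecated > 1) {
        problems.add(entry.getKey() + ": more than one non-deprecated definition");
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    List<Class<?>> interfaces = Arrays.asList(FirstOptions.class, SecondOptions.class);
    for (String problem : findProblems(interfaces)) {
      System.out.println(problem);
    }
  }
}
```

With the two hypothetical interfaces above, both rules are reported as violated for getZone. Applying (a) and (b) to one of them, that is, matching the description text and adding @Deprecated to one getter, would make the list come back empty.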
[1]
https://github.com/apache/beam/blob/670941961845593d9a7e09b17c1bd117f27bf579/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcpOptions.java#L95
[2]
https://github.com/apache/beam/blob/670941961845593d9a7e09b17c1bd117f27bf579/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java#L175