[
https://issues.apache.org/jira/browse/BEAM-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Frances Perry reassigned BEAM-2450:
-----------------------------------
Assignee: (was: Frances Perry)
> Transform names and named applications should not be null or empty
> ------------------------------------------------------------------
>
> Key: BEAM-2450
> URL: https://issues.apache.org/jira/browse/BEAM-2450
> Project: Beam
> Issue Type: Bug
> Components: beam-model, sdk-java-core, sdk-py
> Reporter: Scott Wegner
> Priority: Minor
>
> The Beam SDK allows setting the name of a transform [1] and also naming the
> transform application [2]. If no name is specified on application, the name
> of the transform is used. If no name is specified for the transform, the
> class name is used.
> The application name serves as metadata for the applied PTransforms in the
> constructed graph. They are effectively extra display data (historically,
> PTransform names predate display data). The names are used by runners for UI
> and monitoring applications, such as the displayed pipeline graph in the
> Dataflow Monitoring UI [3].
> Currently there is no explicit validation on the specified application name.
> The current behavior seems to be:
> * null application names cause a NullPointerException at construction time.
> * Specifying the empty string compiles and succeeds in the DirectRunner, but
> causes strange behavior in Dataflow when rendering the graph in the UI. I
> have not tested the behavior of other runners.
> We should add explicit validation in the model on the specified transform
> name and application name. I propose that we disallow null and empty names.
> This is technically a breaking change as the SDK currently allows the empty
> string, but only because it is under-specified. The upgrade path for any
> pipelines broken by this change is simple: specify a non-empty name or
> fall back to the default class name.
> [1]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PTransform.java#L236
> [2]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java#L295
> [3]
> https://cloud.google.com/dataflow/pipelines/dataflow-monitoring-intf#viewing-a-pipeline
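
The validation proposed in the quoted description could look roughly like the sketch below. This is only an illustration and not code from the Beam SDK: the class TransformNames and the method validateName are hypothetical names, and the choice of exception types (NullPointerException for null, IllegalArgumentException for the empty string) is an assumption.

```java
import java.util.Objects;

// Hypothetical helper, not part of the Beam SDK.
final class TransformNames {
    private TransformNames() {}

    /**
     * Returns the given name unchanged if it is usable as a transform or
     * application name. Rejects null and the empty string at pipeline
     * construction time, so a bad name never reaches a runner.
     */
    static String validateName(String name) {
        // Assumption: keep the NullPointerException that null names already cause.
        Objects.requireNonNull(name, "transform name must not be null");
        if (name.isEmpty()) {
            // Assumption: the empty string is reported as an illegal argument.
            throw new IllegalArgumentException("transform name must not be empty");
        }
        return name;
    }
}
```

A caller that hits the new check can either pass a non-empty name or omit the name entirely so that the default class name applies.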
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)