+1, I think it makes more sense to name the application of a transform rather than the transform itself. (Still mulling on how best to do this with Python...)
On Wed, Jun 22, 2016 at 9:27 PM, Jean-Baptiste Onofré <[email protected]> wrote: > +1 > > Regards > JB > > > On 06/23/2016 12:17 AM, Ben Chambers wrote: >> >> Based on a recent PR (https://github.com/apache/incubator-beam/pull/468) I >> was reminded of the confusion around the use of >> .apply(transform.named(someName)) and .apply(someName, transform). This is >> one of things I’ve wanted to cleanup for a while. I’d like to propose a >> path towards removing this redundancy. >> >> First, some background -- why are there two ways to name things? When we >> added support for updating existing pipelines, we needed all applications >> to have unique user-provided names to allow diff’ing the pipelines. We >> found a few problems with the first approach -- using .named() to create a >> new transform -- which led to the introduction of the named apply: >> >> 1. When receiving an error about an application not having a name, it is >> not obvious that a name should be given to the *transform* >> 2. When using .named() to construct a new transform either the type >> information is lost or the composite transform has to override .named() >> >> We now generally suggest the use of .apply(someName, transform). It is >> easier to use and doesn’t lead to as much confusion around PTransform >> names >> and PTransform application names. >> >> To that end, I'd like to propose the following changes to the code and >> documentation: >> 1. Replace the usage of .named(name) in all examples and composites with >> the named-apply syntax. >> 2. Replace .named(name) with a protected PTransform constructor which >> takes >> a default name. If not provided, the default name will be derived from the >> class of the PTransform. >> 3. Use the protected constructor in composites (where appropriate) to >> ensure that the default application has a reasonable name. >> >> Users will benefit from having a single way of naming applications while >> building a pipeline. Any breakages due to the removal of .named should be >> easily fixed by either using the named application or by passing the name >> to the constructor of a composite. >> >> I’d like to hear any comments or opinions on this topic from the wider >> community. Please let me know what you think! >> >> -- Ben >> > > -- > Jean-Baptiste Onofré > [email protected] > http://blog.nanthrax.net > Talend - http://www.talend.com
