Re: Subclassing MapElements

[email protected] Tue, 28 Jan 2020 14:52:22 -0800

Jeff,

Cool, thanks for the thoughts, the point about  naming is a very good one. I 
think we will probably look to do more with static builder methods as we 
discussed.

Cheers!
-J

From: Jeff Klukas <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, January 28, 2020 at 12:57 PM
To: dev <[email protected]>
Subject: Re: Subclassing MapElements

Notice: This email is from an external sender.

Jason - I was going to suggest the same approach that you arrived at. I think 
using a static method to a MapElements instance is an elegant solution.

The one drawback I can think of with that approach is that you lose control 
over default naming. From your example usage:

    myPCollection.apply(MyMapper.of(…).exceptionsInto(…))

if MyMapper returns a MapElements instance, then this line  will appear in your 
pipeline graph as "MapWithFailures". If you do the more verbose thing and 
extend PTransform, then the default name would be "MyMapper" from the name of 
the class.

You can easily get around this problem by passing an explicit name to `apply` 
when using the transform (and indeed this style is encouraged in the Beam 
docs), but it could be a nuisance if you're used to relying on default names 
that match class names.

On Tue, Jan 28, 2020 at 6:41 PM [email protected]<mailto:[email protected]> 
<[email protected]<mailto:[email protected]>> wrote:
Yeah it just seems like a lot of boiler plate to do builders with lots of 
methods just to wrap a MapElements type for syntactic convenience. After 
thinking this over last night I’m wondering if it wouldn’t be better to use 
static factory methods for these, so rather than the following, which tends to 
be what we see in the PTransform style guide etc, but hides the MapElements 
instance and builder calls, closing off the callers opportunity to add 
additional tweaks to the transform:

public MyMapper extends PTransform<PCollection<U>, PCollection<V> {
    @Override
    public PCollection<T> expand(PCollection<U> input) {
        return MapElements.into(…).via(…);
    }
}
we might instead do something like this

public MyMapper {
    public static MapElements<U, V> of(…) {
        return MapElements.into(…).via(…);
    }
}

Which might be used like so (shown with usage of MapElements additional calls):

                myPCollection.apply(MyMapper.of(…).exceptionsInto(…))

For us, this saves us from having a lot of typedescriptor code and such ending 
up in high level pipeline setups, making it easier to understand.

-Jason

From: Kenneth Knowles <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Monday, January 27, 2020 at 9:10 PM
To: dev <[email protected]<mailto:[email protected]>>
Subject: Re: Subclassing MapElements

Notice: This email is from an external sender.

You can, with care, create an abstract Builder class that can be extended. You 
have to be careful to never call a concrete constructor and only call its own 
builder methods, leaving the final construction abstract. Basically a bunch of 
setter methods in a builder style.

Kenn

On Mon, Jan 27, 2020 at 9:08 PM Kenneth Knowles 
<[email protected]<mailto:[email protected]>> wrote:
It might be more trouble than it is worth, saving typing but adding complexity. 
Especially since you've got @AutoValue and @AutoValue.Builder to do all the 
heavy lifting anyhow 
(https://beam.apache.org/contribute/ptransform-style-guide/#api).

Kenn

Re: Subclassing MapElements

Reply via email to