Re: Subclassing MapElements

Jeff Klukas Tue, 28 Jan 2020 12:57:21 -0800

Jason - I was going to suggest the same approach that you arrived at. I
think using a static method to a MapElements instance is an elegant
solution.


The one drawback I can think of with that approach is that you lose control
over default naming. From your example usage:

    myPCollection.apply(MyMapper.of(…).exceptionsInto(…))

if MyMapper returns a MapElements instance, then this line  will appear in
your pipeline graph as "MapWithFailures". If you do the more verbose thing
and extend PTransform, then the default name would be "MyMapper" from the
name of the class.

You can easily get around this problem by passing an explicit name to
`apply` when using the transform (and indeed this style is encouraged in
the Beam docs), but it could be a nuisance if you're used to relying on
default names that match class names.

On Tue, Jan 28, 2020 at 6:41 PM [email protected] <[email protected]>
wrote:

> Yeah it just seems like a lot of boiler plate to do builders with lots of
> methods just to wrap a MapElements type for syntactic convenience. After
> thinking this over last night I’m wondering if it wouldn’t be better to use
> static factory methods for these, so rather than the following, which tends
> to be what we see in the PTransform style guide etc, but hides the
> MapElements instance and builder calls, closing off the callers opportunity
> to add additional tweaks to the transform:
>
>
>
> public MyMapper extends PTransform<PCollection<U>, PCollection<V> {
>
>     @Override
>
>     public PCollection<T> expand(PCollection<U> input) {
>
>         return MapElements.into(…).via(…);
>
>     }
>
> }
>
> we might instead do something like this
>
>
>
>
> public MyMapper {
>
>     public static MapElements<U, V> of(…) {
>
>         return MapElements.into(…).via(…);
>     }
> }
>
>
>
> Which might be used like so (shown with usage of MapElements additional
> calls):
>
>
>                 myPCollection.apply(MyMapper.of(…).exceptionsInto(…))
>
>
>
>
>
> For us, this saves us from having a lot of typedescriptor code and such
> ending up in high level pipeline setups, making it easier to understand.
>
>
>
> -Jason
>
>
>
> *From: *Kenneth Knowles <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Monday, January 27, 2020 at 9:10 PM
> *To: *dev <[email protected]>
> *Subject: *Re: Subclassing MapElements
>
>
>
> Notice: This email is from an external sender.
>
>
>
> You can, with care, create an abstract Builder class that can be extended.
> You have to be careful to never call a concrete constructor and only call
> its own builder methods, leaving the final construction abstract. Basically
> a bunch of setter methods in a builder style.
>
>
>
> Kenn
>
>
>
> On Mon, Jan 27, 2020 at 9:08 PM Kenneth Knowles <[email protected]> wrote:
>
> It might be more trouble than it is worth, saving typing but adding
> complexity. Especially since you've got @AutoValue and @AutoValue.Builder
> to do all the heavy lifting anyhow (
> https://beam.apache.org/contribute/ptransform-style-guide/#api).
>
>
>
> Kenn
>
>

Re: Subclassing MapElements

Reply via email to