Hi all,
I was encouraged to write about my troubles to use PCollections of
AutoValue classes with AvroCoder; because it seems like currently, this is
not possible.

As part of the changes to PAssert, I meant to create a SuccessOrFailure
class that could be passed in a PCollection to a `concludeTransform`, which
would be in charge of validating that all the assertions succeeded, and use
AvroCoder for serialization of that class. Consider this dummy example:

@AutoValue
abstract class FizzBuzz {
...
}

class FizzBuzzDoFn extends DoFn<Integer, FizzBuzz> {
...
}

1. The first problem was that the abstract class does not have any
attributes, so AvroCoder can not scrape them. For this, (with advice from
Kenn Knowles), the Coder would need to take the AutoValue-generated class:

.apply(ParDo.of(new FizzBuzzDoFn()))
.setCoder(AvroCoder.of((Class<FizzBuzz>) AutoValue_FizzBuzz.class))

2. This errored out saying that FizzBuzz and AutoValue_FizzBuzz are
incompatible classes, so I just tried bypassing the type system like so:

.setCoder(AvroCoder.of((Class) AutoValue_FizzBuzz.class))

3. This compiled properly, and encoding worked, but the problem came at
decoding, because Avro specifically requires the class to have a no-arg
constructor [1], and AutoValue-generated classes do not come with one. This
is a problem for several serialization frameworks, and we're not the first
ones to hit this [2], and the AutoValue people don't seem keen on adding
this.

Considering all that, it seems that the AutoValue-AvroCoder pair can not
currently work. We'd need a serialization framework that does not depend on
calling the no-arg constructor and then filling in the attributes with
reflection. I'm trying to check if SerializableCoder has different
deserialization techniques; but for PAssert, I just decided to use
POJO+AvroCoder.

I hope my experience may be useful to others, and maybe start a discussion
on how to enable users to have AutoValue classes in their PCollections.

Best
-P.

[1] -
http://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/reflect/package-summary.html?is-external=true
[2] - https://github.com/google/auto/issues/122

Reply via email to