https://github.com/apache/beam/pull/7147 starts adding the framework to do this (for POJOs we actually generate a constructor using ByteBuddy, but that might not be necessary for AutoValue).
I would start by writing the inference from AutoValue to a Schema. For example, see POJOUtils::schemaFromPojoClass or JavaBeanUtils::schemaFromJavaBeanClass.

Reuven

On Mon, Nov 26, 2018 at 6:08 AM Jeff Klukas <[email protected]> wrote:

> Reuven - How is the work on constructor support for ByteBuddy codegen
> going? Does it still look like that's going to be a feasible way forward
> for generating schemas/coders for AutoValue classes?
>
> On Thu, Nov 15, 2018 at 4:37 PM Reuven Lax <[email protected]> wrote:
>
>> I would hope so if possible.
>>
>> On Fri, Nov 16, 2018, 4:36 AM Kenneth Knowles <[email protected]> wrote:
>>
>>> Just some low-level detail: If there is no @DefaultSchema annotation but
>>> it is an @AutoValue class, can schema inference go ahead with the
>>> AutoValueSchema? Then the user doesn't have to do anything.
>>>
>>> Kenn
>>>
>>> On Wed, Nov 14, 2018 at 6:14 AM Reuven Lax <[email protected]> wrote:
>>>
>>>> We already have a framework for ByteBuddy codegen for JavaBean Row
>>>> interfaces, which should hopefully be easy to extend to AutoValue (and more
>>>> efficient than using reflection). I'm working on adding constructor support
>>>> to this right now.
>>>>
>>>> On Wed, Nov 14, 2018 at 12:29 AM Jeff Klukas <[email protected]>
>>>> wrote:
>>>>
>>>>> Sounds, then, like we need to define a new `AutoValueSchema extends
>>>>> SchemaProvider` and users would opt in to this via the DefaultSchema
>>>>> annotation:
>>>>>
>>>>> @DefaultSchema(AutoValueSchema.class)
>>>>> @AutoValue
>>>>> public abstract class MyClass ...
>>>>>
>>>>> Since we already have the JavaBean and JavaField reflection-based
>>>>> schema providers to use as a guide, it sounds like it may be best to try to
>>>>> implement this using reflection rather than implementing an AutoValue
>>>>> extension.
>>>>>
>>>>> A reflection-based approach here would hinge on being able to discover
>>>>> the package-private constructor for the concrete class and read its types.
>>>>> Those types would define the schema, and the fromRow implementation would
>>>>> call the discovered constructor.
>>>>>
>>>>> On Mon, Nov 12, 2018 at 10:02 AM Reuven Lax <[email protected]> wrote:
>>>>>
>>>>>> On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Reuven - A SchemaProvider makes sense. It's not clear to me, though,
>>>>>>> whether that's more limited than a Coder. Do all values of the schema have
>>>>>>> to be simple types, or does Beam SQL support nested schemas?
>>>>>>>
>>>>>>
>>>>>> Nested schemas, collection types (lists and maps), and collections of
>>>>>> nested types are all supported.
>>>>>>
>>>>>>>
>>>>>>> Put another way, would a user be able to create an AutoValue class
>>>>>>> comprised of simple types and then use that as a field inside another
>>>>>>> AutoValue class? I can see how that's possible with Coders, but it's not
>>>>>>> clear whether that's possible with Row schemas.
>>>>>>>
>>>>>>
>>>>>> Yes, this is explicitly supported.
>>>>>>
>>>>>>>
>>>>>>> On Fri, Nov 9, 2018 at 8:22 PM Reuven Lax <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Jeff,
>>>>>>>>
>>>>>>>> I would suggest a slightly different approach. Instead of
>>>>>>>> generating a coder, write a SchemaProvider that generates a schema for
>>>>>>>> AutoValue. Once a PCollection has a schema, a coder is not needed (as Beam
>>>>>>>> knows how to encode any type with a schema), and it will work seamlessly
>>>>>>>> with Beam SQL (in fact you don't need to write a transform to turn it into
>>>>>>>> a Row if a schema is registered).
>>>>>>>>
>>>>>>>> We already do this for POJOs and basic JavaBeans. I'm happy to help
>>>>>>>> do this for AutoValue.
>>>>>>>>
>>>>>>>> Reuven
>>>>>>>>
>>>>>>>> On Sat, Nov 10, 2018 at 5:50 AM Jeff Klukas <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all - I'm looking for some review and commentary on a proposed
>>>>>>>>> design for providing built-in Coders for AutoValue classes. There's
>>>>>>>>> existing discussion in BEAM-1891 [0] about using AvroCoder, but that's
>>>>>>>>> blocked on an incompatibility between AutoValue and Avro's reflection
>>>>>>>>> machinery that doesn't look resolvable.
>>>>>>>>>
>>>>>>>>> I wrote up a design document [1] that instead proposes using
>>>>>>>>> AutoValue's extension API to automatically generate a Coder for each
>>>>>>>>> AutoValue class that users generate. A similar technique could be used to
>>>>>>>>> generate conversions to and from Row for use with BeamSql.
>>>>>>>>>
>>>>>>>>> I'd appreciate review of the design and thoughts on whether this
>>>>>>>>> seems feasible to support within the Beam codebase. I may be missing a
>>>>>>>>> simpler approach.
>>>>>>>>>
>>>>>>>>> [0] https://issues.apache.org/jira/browse/BEAM-1891
>>>>>>>>> [1] https://docs.google.com/document/d/1ucoik4WzUDfilqIz3I1AuMHc1J8DE6iv7gaUCDI42BI/edit?usp=sharing
>>>>>>>>>
>>>>>>>>
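
For reference, the reflection-based idea discussed in this thread (discover the non-public constructor of the concrete class, read its parameter types, and call it to build instances) might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses plain java.lang.reflect with no Beam or AutoValue dependency, AutoValue_MyClass is a hand-written stand-in for the class AutoValue would generate, and ConstructorDiscoverySketch, findGeneratedConstructor, fieldTypes, and create are hypothetical names, not Beam APIs. It also assumes a top-level class and the AutoValue_<ClassName> naming convention.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;

// Hypothetical helper, not part of Beam: a sketch of constructor discovery.
final class ConstructorDiscoverySketch {

  // Locates the concrete class by the AutoValue_<ClassName> naming convention
  // (assumed here, top-level classes only) and returns its single constructor.
  static Constructor<?> findGeneratedConstructor(Class<?> abstractClass) {
    String fullName = abstractClass.getName();
    // Empty prefix when the class lives in the default package.
    String pkgPrefix = fullName.substring(0, fullName.lastIndexOf('.') + 1);
    String generatedName = pkgPrefix + "AutoValue_" + abstractClass.getSimpleName();
    try {
      Class<?> generated =
          Class.forName(generatedName, true, abstractClass.getClassLoader());
      Constructor<?>[] ctors = generated.getDeclaredConstructors();
      if (ctors.length != 1) {
        throw new IllegalStateException(
            "Expected exactly one constructor on " + generatedName);
      }
      // The constructor is not public, so it must be made accessible first.
      ctors[0].setAccessible(true);
      return ctors[0];
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("No generated class " + generatedName, e);
    }
  }

  // The constructor's parameter types, in order; these would drive the schema.
  static Class<?>[] fieldTypes(Class<?> abstractClass) {
    return findGeneratedConstructor(abstractClass).getParameterTypes();
  }

  // Rough analogue of a fromRow step: build an instance from ordered values.
  static <T> T create(Class<T> abstractClass, Object... values) {
    try {
      Object instance = findGeneratedConstructor(abstractClass).newInstance(values);
      return abstractClass.cast(instance);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not invoke generated constructor", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(fieldTypes(MyClass.class)));
    MyClass value = create(MyClass.class, "widgets", 3);
    System.out.println(value.name() + " " + value.count());
  }
}

// Hypothetical stand-in for a user's @AutoValue class.
abstract class MyClass {
  abstract String name();
  abstract int count();
}

// Hand-written stand-in for the subclass AutoValue would generate, with a
// package-private constructor that takes every property in declaration order.
final class AutoValue_MyClass extends MyClass {
  private final String name;
  private final int count;

  AutoValue_MyClass(String name, int count) {
    this.name = name;
    this.count = count;
  }

  @Override
  String name() {
    return name;
  }

  @Override
  int count() {
    return count;
  }
}
```

Note that constructor parameter names are not available through reflection unless the code is compiled with -parameters, so a real schema provider would likely need to pair these types with the abstract getter methods to recover field names.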
