Just some low-level detail: If there is no @DefaultSchema annotation but it is an @AutoValue class, can schema inference go ahead with the AutoValueSchema? Then the user doesn't have to do anything.
Kenn

On Wed, Nov 14, 2018 at 6:14 AM Reuven Lax <[email protected]> wrote:

> We already have a framework for ByteBuddy codegen for JavaBean Row
> interfaces, which should hopefully be easy to extend to AutoValue (and
> more efficient than using reflection). I'm working on adding constructor
> support to this right now.
>
> On Wed, Nov 14, 2018 at 12:29 AM Jeff Klukas <[email protected]> wrote:
>
>> Sounds, then, like we need to define a new `AutoValueSchema extends
>> SchemaProvider` and users would opt in to this via the DefaultSchema
>> annotation:
>>
>> @DefaultSchema(AutoValueSchema.class)
>> @AutoValue
>> public abstract class MyClass ...
>>
>> Since we already have the JavaBean and JavaField reflection-based schema
>> providers to use as a guide, it sounds like it may be best to try to
>> implement this using reflection rather than implementing an AutoValue
>> extension.
>>
>> A reflection-based approach here would hinge on being able to discover
>> the package-private constructor for the concrete class and read its
>> types. Those types would define the schema, and the fromRow
>> implementation would call the discovered constructor.
>>
>> On Mon, Nov 12, 2018 at 10:02 AM Reuven Lax <[email protected]> wrote:
>>
>>> On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas <[email protected]>
>>> wrote:
>>>
>>>> Reuven - A SchemaProvider makes sense. It's not clear to me, though,
>>>> whether that's more limited than a Coder. Do all values of the schema
>>>> have to be simple types, or does Beam SQL support nested schemas?
>>>
>>> Nested schemas, collection types (lists and maps), and collections of
>>> nested types are all supported.
>>>
>>>> Put another way, would a user be able to create an AutoValue class
>>>> composed of simple types and then use that as a field inside another
>>>> AutoValue class? I can see how that's possible with Coders, but it's
>>>> not clear whether that's possible with Row schemas.
>>>
>>> Yes, this is explicitly supported.
>>>
>>>> On Fri, Nov 9, 2018 at 8:22 PM Reuven Lax <[email protected]> wrote:
>>>>
>>>>> Hi Jeff,
>>>>>
>>>>> I would suggest a slightly different approach. Instead of generating
>>>>> a coder, write a SchemaProvider that generates a schema for
>>>>> AutoValue. Once a PCollection has a schema, a coder is not needed (as
>>>>> Beam knows how to encode any type with a schema), and it will work
>>>>> seamlessly with Beam SQL (in fact you don't need to write a transform
>>>>> to turn it into a Row if a schema is registered).
>>>>>
>>>>> We already do this for POJOs and basic JavaBeans. I'm happy to help
>>>>> do this for AutoValue.
>>>>>
>>>>> Reuven
>>>>>
>>>>> On Sat, Nov 10, 2018 at 5:50 AM Jeff Klukas <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi all - I'm looking for some review and commentary on a proposed
>>>>>> design for providing built-in Coders for AutoValue classes. There's
>>>>>> existing discussion in BEAM-1891 [0] about using AvroCoder, but
>>>>>> that's blocked on an incompatibility between AutoValue and Avro's
>>>>>> reflection machinery that doesn't look resolvable.
>>>>>>
>>>>>> I wrote up a design document [1] that instead proposes using
>>>>>> AutoValue's extension API to automatically generate a Coder for each
>>>>>> AutoValue class that users generate. A similar technique could be
>>>>>> used to generate conversions to and from Row for use with BeamSql.
>>>>>>
>>>>>> I'd appreciate review of the design and thoughts on whether this
>>>>>> seems feasible to support within the Beam codebase. I may be missing
>>>>>> a simpler approach.
>>>>>>
>>>>>> [0] https://issues.apache.org/jira/browse/BEAM-1891
>>>>>> [1]
>>>>>> https://docs.google.com/document/d/1ucoik4WzUDfilqIz3I1AuMHc1J8DE6iv7gaUCDI42BI/edit?usp=sharing
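
For readers following the thread, the reflection-based idea Jeff describes (discover the package-private constructor of the generated concrete class, read its parameter types, and call it when converting back from a Row) might look roughly like the sketch below. This is plain Java reflection only, not Beam code: `Point`, the hand-written `AutoValue_Point` stand-in, and the `ConstructorDiscoverySketch` helper are all hypothetical names invented for illustration, and the sketch assumes the generated class lives in the same package, is named `AutoValue_` plus the class name, and declares exactly one constructor.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a user's @AutoValue class. The annotation is
// omitted so the sketch compiles without the AutoValue processor.
abstract class Point {
  abstract String name();
  abstract int count();
}

// Hand-written stand-in for the subclass AutoValue would generate: named
// AutoValue_<Class>, with a package-private constructor.
final class AutoValue_Point extends Point {
  private final String name;
  private final int count;

  AutoValue_Point(String name, int count) {
    this.name = name;
    this.count = count;
  }

  @Override String name() { return name; }
  @Override int count() { return count; }
}

// Hypothetical helper, not part of Beam.
final class ConstructorDiscoverySketch {
  private ConstructorDiscoverySketch() {}

  // Locates the single constructor of the generated class for abstractClass.
  static Constructor<?> findConstructor(Class<?> abstractClass) {
    String fullName = abstractClass.getName();
    int lastDot = fullName.lastIndexOf('.');
    String packagePrefix = fullName.substring(0, lastDot + 1);
    // Assumption: nested classes map to AutoValue_Outer_Inner.
    String generatedName =
        "AutoValue_" + fullName.substring(lastDot + 1).replace('$', '_');
    try {
      Class<?> generated =
          Class.forName(packagePrefix + generatedName, true, abstractClass.getClassLoader());
      Constructor<?>[] constructors = generated.getDeclaredConstructors();
      if (constructors.length != 1) {
        throw new IllegalStateException("Expected exactly one constructor on " + generated);
      }
      // The constructor is package-private, so open it up for reflective calls.
      constructors[0].setAccessible(true);
      return constructors[0];
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("No generated class found for " + fullName, e);
    }
  }

  // The constructor's parameter types, in order; these would drive the schema.
  static List<Class<?>> fieldTypes(Class<?> abstractClass) {
    return Arrays.asList(findConstructor(abstractClass).getParameterTypes());
  }

  // What a fromRow-style conversion would do once field values are extracted.
  static <T> T create(Class<T> abstractClass, Object... args) {
    try {
      return abstractClass.cast(findConstructor(abstractClass).newInstance(args));
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not instantiate " + abstractClass, e);
    }
  }
}
```

With these stand-ins, `ConstructorDiscoverySketch.fieldTypes(Point.class)` yields the constructor's parameter types and `ConstructorDiscoverySketch.create(Point.class, "a", 3)` builds an instance through the discovered constructor. Parameter names are deliberately not read here, since they are only available reflectively when the code is compiled with `-parameters`; a real provider would more likely pair the types with the abstract getter methods.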
