Once https://github.com/apache/beam/pull/7289 goes in, the field-order problem will be solved as well. I'll go ahead and send a PR adding support for AutoValue, as there will be very little delta by then.
Reuven

On Sun, Dec 2, 2018 at 9:44 PM Reuven Lax <[email protected]> wrote:

> Thinking about this a bit more - I suspect we already have almost all the code we need.
>
> The code to infer a schema from a Java Bean will probably work with little change on AutoValue, as it's essentially just a fancy Java Bean. The Java Bean generated getters should also work. I think all that needs to be done is to generate a constructor. The tricky thing is that the order of fields in the AutoValue_XXX constructor may not match the order of the fields in the schema, so we will need to generate an intermediate constructor that generates the correct call. (Alternatively, we can try to detect the schema from the constructor instead of from the getters, which should give us a schema with matching field order.)
>
> Reuven
>
> On Thu, Nov 29, 2018 at 9:30 AM Reuven Lax <[email protected]> wrote:
>
>> https://github.com/apache/beam/pull/7147 starts adding the framework to do this (for POJOs we actually generate a constructor using ByteBuddy, but that might not be necessary for AutoValue).
>>
>> I would start by writing the inference from AutoValue to a Schema. For example, see PojoUtils::schemaFromPojoClass or JavaBeanUtils::schemaFromJavaBeanClass.
>>
>> Reuven
>>
>> On Mon, Nov 26, 2018 at 6:08 AM Jeff Klukas <[email protected]> wrote:
>>
>>> Reuven - How is the work on constructor support for ByteBuddy codegen going? Does it still look like that's going to be a feasible way forward for generating schemas/coders for AutoValue classes?
>>>
>>> On Thu, Nov 15, 2018 at 4:37 PM Reuven Lax <[email protected]> wrote:
>>>
>>>> I would hope so if possible.
>>>>
>>>> On Fri, Nov 16, 2018, 4:36 AM Kenneth Knowles <[email protected]> wrote:
>>>>
>>>>> Just some low-level detail: If there is no @DefaultSchema annotation but it is an @AutoValue class, can schema inference go ahead with the AutoValueSchema? Then the user doesn't have to do anything.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Wed, Nov 14, 2018 at 6:14 AM Reuven Lax <[email protected]> wrote:
>>>>>
>>>>>> We already have a framework for ByteBuddy codegen for JavaBean Row interfaces, which should hopefully be easy to extend to AutoValue (and more efficient than using reflection). I'm working on adding constructor support to this right now.
>>>>>>
>>>>>> On Wed, Nov 14, 2018 at 12:29 AM Jeff Klukas <[email protected]> wrote:
>>>>>>
>>>>>>> Sounds, then, like we need to define a new `AutoValueSchema extends SchemaProvider` and users would opt in to this via the DefaultSchema annotation:
>>>>>>>
>>>>>>> @DefaultSchema(AutoValueSchema.class)
>>>>>>> @AutoValue
>>>>>>> public abstract class MyClass ...
>>>>>>>
>>>>>>> Since we already have the JavaBean and JavaField reflection-based schema providers to use as a guide, it sounds like it may be best to try to implement this using reflection rather than implementing an AutoValue extension.
>>>>>>>
>>>>>>> A reflection-based approach here would hinge on being able to discover the package-private constructor for the concrete class and read its types. Those types would define the schema, and the fromRow implementation would call the discovered constructor.
>>>>>>>
>>>>>>> On Mon, Nov 12, 2018 at 10:02 AM Reuven Lax <[email protected]> wrote:
>>>>>>>
>>>>>>>> On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Reuven - A SchemaProvider makes sense. It's not clear to me, though, whether that's more limited than a Coder. Do all values of the schema have to be simple types, or does Beam SQL support nested schemas?
>>>>>>>>
>>>>>>>> Nested schemas, collection types (lists and maps), and collections of nested types are all supported.
>>>>>>>>
>>>>>>>>> Put another way, would a user be able to create an AutoValue class composed of simple types and then use that as a field inside another AutoValue class? I can see how that's possible with Coders, but it's not clear whether that's possible with Row schemas.
>>>>>>>>
>>>>>>>> Yes, this is explicitly supported.
>>>>>>>>
>>>>>>>>> On Fri, Nov 9, 2018 at 8:22 PM Reuven Lax <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Jeff,
>>>>>>>>>>
>>>>>>>>>> I would suggest a slightly different approach. Instead of generating a coder, write a SchemaProvider that generates a schema for AutoValue. Once a PCollection has a schema, a coder is not needed (as Beam knows how to encode any type with a schema), and it will work seamlessly with Beam SQL (in fact you don't need to write a transform to turn it into a Row if a schema is registered).
>>>>>>>>>>
>>>>>>>>>> We already do this for POJOs and basic JavaBeans. I'm happy to help do this for AutoValue.
>>>>>>>>>>
>>>>>>>>>> Reuven
>>>>>>>>>>
>>>>>>>>>> On Sat, Nov 10, 2018 at 5:50 AM Jeff Klukas <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all - I'm looking for some review and commentary on a proposed design for providing built-in Coders for AutoValue classes. There's existing discussion in BEAM-1891 [0] about using AvroCoder, but that's blocked on an incompatibility between AutoValue and Avro's reflection machinery that doesn't look resolvable.
>>>>>>>>>>>
>>>>>>>>>>> I wrote up a design document [1] that instead proposes using AutoValue's extension API to automatically generate a Coder for each AutoValue class that users generate.
>>>>>>>>>>> A similar technique could be used to generate conversions to and from Row for use with BeamSql.
>>>>>>>>>>>
>>>>>>>>>>> I'd appreciate review of the design and thoughts on whether this seems feasible to support within the Beam codebase. I may be missing a simpler approach.
>>>>>>>>>>>
>>>>>>>>>>> [0] https://issues.apache.org/jira/browse/BEAM-1891
>>>>>>>>>>> [1] https://docs.google.com/document/d/1ucoik4WzUDfilqIz3I1AuMHc1J8DE6iv7gaUCDI42BI/edit?usp=sharing
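
For readers following the thread, here is a minimal, JDK-only sketch of the two ideas discussed above: discovering a package-private constructor by reflection, and permuting values held in schema field order into constructor parameter order before making the call. It is not Beam's implementation and does not use ByteBuddy or AutoValue. The class `AutoValue_Point`, the helper `CtorReorder`, and the field names are all hypothetical stand-ins. The parameter names are passed in explicitly as an assumption, because plain reflection does not reliably expose them unless the class was compiled with `-parameters`.

```java
import java.lang.reflect.Constructor;
import java.util.List;

// Hypothetical, hand-written stand-in for an AutoValue-generated class.
final class AutoValue_Point {
  private final String label;
  private final int x;
  private final int y;

  // Package-private constructor; parameter order here is label, x, y.
  AutoValue_Point(String label, int x, int y) {
    this.label = label;
    this.x = x;
    this.y = y;
  }

  String getLabel() { return label; }
  int getX() { return x; }
  int getY() { return y; }
}

// Hypothetical helper: maps values in schema order onto constructor order.
final class CtorReorder {
  // perm[i] is the schema index of the field feeding constructor parameter i.
  static int[] permutation(List<String> schemaFields, List<String> ctorParams) {
    int[] perm = new int[ctorParams.size()];
    for (int i = 0; i < perm.length; i++) {
      int idx = schemaFields.indexOf(ctorParams.get(i));
      if (idx < 0) {
        throw new IllegalArgumentException("No schema field for " + ctorParams.get(i));
      }
      perm[i] = idx;
    }
    return perm;
  }

  // Rearranges values from schema order into constructor order.
  static Object[] reorder(Object[] rowValues, int[] perm) {
    Object[] args = new Object[perm.length];
    for (int i = 0; i < perm.length; i++) {
      args[i] = rowValues[perm[i]];
    }
    return args;
  }

  // Looks up the non-public constructor by parameter types and invokes it.
  static <T> T construct(Class<T> clazz, Class<?>[] paramTypes, Object[] args) {
    try {
      Constructor<T> ctor = clazz.getDeclaredConstructor(paramTypes);
      ctor.setAccessible(true);
      return ctor.newInstance(args);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not invoke constructor", e);
    }
  }
}

final class ReorderDemo {
  public static void main(String[] args) {
    // Assumed schema order (for example, as inferred from getters).
    List<String> schemaFields = List.of("x", "y", "label");
    // Assumed constructor parameter order.
    List<String> ctorParams = List.of("label", "x", "y");
    Object[] rowValues = {3, 4, "p"};

    int[] perm = CtorReorder.permutation(schemaFields, ctorParams);
    Object[] ctorArgs = CtorReorder.reorder(rowValues, perm);
    AutoValue_Point point = CtorReorder.construct(
        AutoValue_Point.class,
        new Class<?>[] {String.class, int.class, int.class},
        ctorArgs);
    System.out.println(point.getLabel() + " " + point.getX() + " " + point.getY());
  }
}
```

The permutation is computed once per class and reused for every value, which is roughly the role the "intermediate constructor" plays in the thread. Inferring the schema from the constructor instead, as also suggested above, would make the permutation the identity.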
