Once https://github.com/apache/beam/pull/7289 goes in, the field order will
be solved as well. I'll go ahead and send a PR adding support for
AutoValue, as there will be very little delta by then.

Reuven

On Sun, Dec 2, 2018 at 9:44 PM Reuven Lax <[email protected]> wrote:

> Thinking about this a bit more - I suspect we already have almost all the
> code we need.
>
> The code to infer a schema from a Java Bean will probably work with little
> change on AutoValue, as it's essentially just a fancy Java Bean. The Java
> Bean generated getters should also work. I think all that needs to be done
> is to generate a constructor. The tricky thing is that the order of fields
> in the AutoValue_XXX constructor may not match the order of the fields in
> the schema, so we will need to generate an intermediate constructor that
> generates the correct call. (alternatively we can try and detect the schema
> from the constructor instead of from the getters, which should give us a
> schema with matching field order).
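
The "intermediate constructor" idea above can be illustrated with a minimal sketch in plain Java. It is not Beam code: the class name ConstructorArgReorderer and its methods are hypothetical, and it assumes the constructor parameter names are already known by some other means (plain reflection does not expose them unless the class was compiled with -parameters). It only shows the reordering step, i.e. taking values laid out in schema order and rearranging them into the order the generated constructor expects.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Beam): maps schema-ordered values to constructor order. */
final class ConstructorArgReorderer {
  // indexInSchema[i] is the schema position of the i-th constructor parameter.
  private final int[] indexInSchema;

  ConstructorArgReorderer(List<String> schemaFieldNames, List<String> constructorParamNames) {
    // Index the schema fields by name so each constructor parameter can be looked up.
    Map<String, Integer> schemaPosition = new HashMap<>();
    for (int i = 0; i < schemaFieldNames.size(); i++) {
      schemaPosition.put(schemaFieldNames.get(i), i);
    }
    indexInSchema = new int[constructorParamNames.size()];
    for (int i = 0; i < constructorParamNames.size(); i++) {
      String name = constructorParamNames.get(i);
      Integer position = schemaPosition.get(name);
      if (position == null) {
        throw new IllegalArgumentException("No schema field named " + name);
      }
      indexInSchema[i] = position;
    }
  }

  /** Returns the given schema-ordered values rearranged into constructor parameter order. */
  Object[] toConstructorOrder(Object[] schemaOrderedValues) {
    Object[] args = new Object[indexInSchema.length];
    for (int i = 0; i < args.length; i++) {
      args[i] = schemaOrderedValues[indexInSchema[i]];
    }
    return args;
  }
}
```

For example, with schema fields (age, id, name) and constructor parameters (name, id, age), the values {30, 7L, "ann"} would be passed to the constructor as {"ann", 7L, 30}. Computing the permutation once per class keeps the per-element cost to a single array copy.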
>
> Reuven
>
> On Thu, Nov 29, 2018 at 9:30 AM Reuven Lax <[email protected]> wrote:
>
>> https://github.com/apache/beam/pull/7147 starts adding the framework to
>> do this (for POJOs we actually generate a constructor using ByteBuddy, but
>> that might not be necessary for AutoValue).
>>
>> I would start by writing the inference from AutoValue to a Schema. For
>> example, see POJOUtils::schemaFromPojoClass or
>> JavaBeanUtils::schemaFromJavaBeanClass.
>>
>> Reuven
>>
>> On Mon, Nov 26, 2018 at 6:08 AM Jeff Klukas <[email protected]> wrote:
>>
>>> Reuven - How is the work on constructor support for ByteBuddy codegen
>>> going? Does it still look like that's going to be a feasible way forward
>>> for generating schemas/coders for AutoValue classes?
>>>
>>> On Thu, Nov 15, 2018 at 4:37 PM Reuven Lax <[email protected]> wrote:
>>>
>>>> I would hope so if possible.
>>>>
>>>> On Fri, Nov 16, 2018, 4:36 AM Kenneth Knowles <[email protected]> wrote:
>>>>
>>>>> Just some low-level detail: If there is no @DefaultSchema annotation
>>>>> but it is an @AutoValue class, can schema inference go ahead with the
>>>>> AutoValueSchema? Then the user doesn't have to do anything.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Wed, Nov 14, 2018 at 6:14 AM Reuven Lax <[email protected]> wrote:
>>>>>
>>>>>> We already have a framework for ByteBuddy codegen for JavaBean Row
>>>>>> interfaces, which should hopefully be easy to extend to AutoValue (and
>>>>>> more efficient than using reflection). I'm working on adding constructor
>>>>>> support to this right now.
>>>>>>
>>>>>> On Wed, Nov 14, 2018 at 12:29 AM Jeff Klukas <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Sounds, then, like we need to define a new `AutoValueSchema
>>>>>>> extends SchemaProvider` and users would opt-in to this via the
>>>>>>> DefaultSchema annotation:
>>>>>>>
>>>>>>> @DefaultSchema(AutoValueSchema.class)
>>>>>>> @AutoValue
>>>>>>> public abstract class MyClass ...
>>>>>>>
>>>>>>> Since we already have the JavaBean and JavaField reflection-based
>>>>>>> schema providers to use as a guide, it sounds like it may be best to
>>>>>>> try to implement this using reflection rather than implementing an
>>>>>>> AutoValue extension.
>>>>>>>
>>>>>>> A reflection-based approach here would hinge on being able to
>>>>>>> discover the package-private constructor for the concrete class and read
>>>>>>> its types. Those types would define the schema, and the fromRow
>>>>>>> implementation would call the discovered constructor.
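
As a rough illustration of the reflection-based approach described above, the following plain-Java sketch discovers a non-public constructor, reads its parameter types, and calls it. Everything here is a stand-in: Point and AutoValue_Point are hand-written to mimic what AutoValue would generate (no AutoValue or Beam dependency is used), and the helper names findConstructor and fromValues are hypothetical.

```java
import java.lang.reflect.Constructor;

/** Hypothetical sketch (not Beam code) of discovering and calling a generated constructor. */
final class AutoValueConstructorSketch {
  // Stand-in for a user's @AutoValue class.
  abstract static class Point {
    abstract int x();
    abstract String label();
  }

  // Hand-written stand-in for the concrete subclass AutoValue would generate.
  static final class AutoValue_Point extends Point {
    private final int x;
    private final String label;

    // Package-private constructor, as in the generated code.
    AutoValue_Point(int x, String label) {
      this.x = x;
      this.label = label;
    }

    @Override int x() { return x; }
    @Override String label() { return label; }
  }

  /** Finds the single declared constructor, including non-public ones. */
  static Constructor<?> findConstructor(Class<?> generatedClass) {
    Constructor<?>[] constructors = generatedClass.getDeclaredConstructors();
    if (constructors.length != 1) {
      throw new IllegalStateException("Expected exactly one constructor on " + generatedClass);
    }
    Constructor<?> constructor = constructors[0];
    constructor.setAccessible(true); // needed because the constructor is not public
    return constructor;
  }

  /** Builds a Point from values given in constructor order. */
  static Point fromValues(Object... values) {
    try {
      return (Point) findConstructor(AutoValue_Point.class).newInstance(values);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not invoke generated constructor", e);
    }
  }
}
```

Here findConstructor(AutoValue_Point.class).getParameterTypes() yields int and String, which is the type information a schema could be derived from. Note that the parameter types alone do not carry field names, so names would still have to come from the abstract getters or another source.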
>>>>>>>
>>>>>>> On Mon, Nov 12, 2018 at 10:02 AM Reuven Lax <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Reuven - A SchemaProvider makes sense. It's not clear to me,
>>>>>>>>> though, whether that's more limited than a Coder. Do all values of the
>>>>>>>>> schema have to be simple types, or does Beam SQL support nested
>>>>>>>>> schemas?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Nested schemas, collection types (lists and maps), and collections
>>>>>>>> of nested types are all supported.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Put another way, would a user be able to create an AutoValue class
>>>>>>>>> comprised of simple types and then use that as a field inside another
>>>>>>>>> AutoValue class? I can see how that's possible with Coders, but it's
>>>>>>>>> not clear whether that's possible with Row schemas.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, this is explicitly supported.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Nov 9, 2018 at 8:22 PM Reuven Lax <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Jeff,
>>>>>>>>>>
>>>>>>>>>> I would suggest a slightly different approach. Instead of
>>>>>>>>>> generating a coder, write a SchemaProvider that generates a schema
>>>>>>>>>> for AutoValue. Once a PCollection has a schema, a coder is not needed
>>>>>>>>>> (as Beam knows how to encode any type with a schema), and it will work
>>>>>>>>>> seamlessly with Beam SQL (in fact you don't need to write a transform
>>>>>>>>>> to turn it into a Row if a schema is registered).
>>>>>>>>>>
>>>>>>>>>> We already do this for POJOs and basic JavaBeans. I'm happy to
>>>>>>>>>> help do this for AutoValue.
>>>>>>>>>>
>>>>>>>>>> Reuven
>>>>>>>>>>
>>>>>>>>>> On Sat, Nov 10, 2018 at 5:50 AM Jeff Klukas <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all - I'm looking for some review and commentary on a
>>>>>>>>>>> proposed design for providing built-in Coders for AutoValue classes.
>>>>>>>>>>> There's existing discussion in BEAM-1891 [0] about using AvroCoder,
>>>>>>>>>>> but that's blocked on an incompatibility between AutoValue and Avro's
>>>>>>>>>>> reflection machinery that doesn't look resolvable.
>>>>>>>>>>>
>>>>>>>>>>> I wrote up a design document [1] that instead proposes using
>>>>>>>>>>> AutoValue's extension API to automatically generate a Coder for each
>>>>>>>>>>> AutoValue class that users generate. A similar technique could be
>>>>>>>>>>> used to generate conversions to and from Row for use with BeamSql.
>>>>>>>>>>>
>>>>>>>>>>> I'd appreciate review of the design and thoughts on whether this
>>>>>>>>>>> seems feasible to support within the Beam codebase. I may be
>>>>>>>>>>> missing a simpler approach.
>>>>>>>>>>>
>>>>>>>>>>> [0] https://issues.apache.org/jira/browse/BEAM-1891
>>>>>>>>>>> [1]
>>>>>>>>>>> https://docs.google.com/document/d/1ucoik4WzUDfilqIz3I1AuMHc1J8DE6iv7gaUCDI42BI/edit?usp=sharing
>>>>>>>>>>>
>>>>>>>>>>
