https://github.com/apache/beam/pull/7147 starts adding the framework to do
this (for POJOs we actually generate a constructor using ByteBuddy, but
that might not be necessary for AutoValue).

I would start by writing the inference from AutoValue to a Schema. For
example, see PojoUtils::schemaFromPojoClass or
JavaBeanUtils::schemaFromJavaBeanClass.
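
As a rough illustration of what that inference involves, here is a minimal
JDK-only sketch. It is not Beam's code and does not use the Beam or AutoValue
APIs: the class AutoValueReflectionSketch, its methods inferFields and
findConstructor, and the Person/GeneratedPerson stand-ins are all hypothetical
names made up for this example. It assumes every abstract zero-argument
method on the class is a property, and that the concrete subclass has exactly
one constructor.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names exist in Beam or AutoValue.
final class AutoValueReflectionSketch {

  // Stand-in for a user's @AutoValue class (annotation omitted so only the JDK is needed).
  abstract static class Person {
    abstract String name();

    abstract int age();
  }

  // Hand-written stand-in for the subclass that AutoValue would generate.
  static final class GeneratedPerson extends Person {
    private final String name;
    private final int age;

    // Package-private, like the constructor of a generated class.
    GeneratedPerson(String name, int age) {
      this.name = name;
      this.age = age;
    }

    @Override
    String name() {
      return name;
    }

    @Override
    int age() {
      return age;
    }
  }

  /** Maps each abstract, zero-argument, non-void method name to its return type. */
  static Map<String, Class<?>> inferFields(Class<?> abstractClass) {
    // getDeclaredMethods() returns methods in no particular order, so sort by name.
    Map<String, Class<?>> fields = new TreeMap<>();
    for (Method m : abstractClass.getDeclaredMethods()) {
      if (Modifier.isAbstract(m.getModifiers())
          && m.getParameterCount() == 0
          && m.getReturnType() != void.class) {
        fields.put(m.getName(), m.getReturnType());
      }
    }
    return fields;
  }

  /** Returns the single declared constructor of the concrete class, made accessible. */
  static Constructor<?> findConstructor(Class<?> concreteClass) {
    Constructor<?>[] ctors = concreteClass.getDeclaredConstructors();
    if (ctors.length != 1) {
      throw new IllegalStateException("Expected exactly one constructor, found " + ctors.length);
    }
    ctors[0].setAccessible(true);
    return ctors[0];
  }

  public static void main(String[] args) throws Exception {
    System.out.println(inferFields(Person.class));
    Constructor<?> ctor = findConstructor(GeneratedPerson.class);
    Person p = (Person) ctor.newInstance("Ada", 36);
    System.out.println(p.name() + " " + p.age());
  }
}
```

One caveat with this approach: reflection does not guarantee method order, so
the sketch sorts fields by name. Matching fields to constructor parameters
would need extra care, since constructor parameters are positional.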

Reuven

On Mon, Nov 26, 2018 at 6:08 AM Jeff Klukas <[email protected]> wrote:

> Reuven - How is the work on constructor support for ByteBuddy codegen
> going? Does it still look like that's going to be a feasible way forward
> for generating schemas/coders for AutoValue classes?
>
> On Thu, Nov 15, 2018 at 4:37 PM Reuven Lax <[email protected]> wrote:
>
>> I would hope so if possible.
>>
>> On Fri, Nov 16, 2018, 4:36 AM Kenneth Knowles <[email protected]> wrote:
>>
>>> Just some low-level detail: If there is no @DefaultSchema annotation but
>>> it is an @AutoValue class, can schema inference go ahead with the
>>> AutoValueSchema? Then the user doesn't have to do anything.
>>>
>>> Kenn
>>>
>>> On Wed, Nov 14, 2018 at 6:14 AM Reuven Lax <[email protected]> wrote:
>>>
>>>> We already have a framework for ByteBuddy codegen for JavaBean Row
>>>> interfaces, which should hopefully be easy to extend to AutoValue (and more
>>>> efficient than using reflection). I'm working on adding constructor support
>>>> to this right now.
>>>>
>>>> On Wed, Nov 14, 2018 at 12:29 AM Jeff Klukas <[email protected]>
>>>> wrote:
>>>>
>>>>> Sounds, then, like we need to define a new `AutoValueSchema extends
>>>>> SchemaProvider` and users would opt-in to this via the DefaultSchema
>>>>> annotation:
>>>>>
>>>>> @DefaultSchema(AutoValueSchema.class)
>>>>> @AutoValue
>>>>> public abstract class MyClass ...
>>>>>
>>>>> Since we already have the JavaBean and JavaField reflection-based
>>>>> schema providers to use as a guide, it sounds like it may be best to try
>>>>> to implement this using reflection rather than implementing an AutoValue
>>>>> extension.
>>>>>
>>>>> A reflection-based approach here would hinge on being able to discover
>>>>> the package-private constructor for the concrete class and read its types.
>>>>> Those types would define the schema, and the fromRow implementation would
>>>>> call the discovered constructor.
>>>>>
>>>>> On Mon, Nov 12, 2018 at 10:02 AM Reuven Lax <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Reuven - A SchemaProvider makes sense. It's not clear to me, though,
>>>>>>> whether that's more limited than a Coder. Do all values of the schema
>>>>>>> have to be simple types, or does Beam SQL support nested schemas?
>>>>>>>
>>>>>>
>>>>>> Nested schemas, collection types (lists and maps), and collections of
>>>>>> nested types are all supported.
>>>>>>
>>>>>>>
>>>>>>> Put another way, would a user be able to create an AutoValue class
>>>>>>> comprised of simple types and then use that as a field inside another
>>>>>>> AutoValue class? I can see how that's possible with Coders, but it's
>>>>>>> not clear whether that's possible with Row schemas.
>>>>>>>
>>>>>>
>>>>>> Yes, this is explicitly supported.
>>>>>>
>>>>>>>
>>>>>>> On Fri, Nov 9, 2018 at 8:22 PM Reuven Lax <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Jeff,
>>>>>>>>
>>>>>>>> I would suggest a slightly different approach. Instead of
>>>>>>>> generating a coder, write a SchemaProvider that generates a schema
>>>>>>>> for AutoValue. Once a PCollection has a schema, a coder is not needed
>>>>>>>> (as Beam knows how to encode any type with a schema), and it will
>>>>>>>> work seamlessly with Beam SQL (in fact you don't need to write a
>>>>>>>> transform to turn it into a Row if a schema is registered).
>>>>>>>>
>>>>>>>> We already do this for POJOs and basic JavaBeans. I'm happy to help
>>>>>>>> do this for AutoValue.
>>>>>>>>
>>>>>>>> Reuven
>>>>>>>>
>>>>>>>> On Sat, Nov 10, 2018 at 5:50 AM Jeff Klukas <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all - I'm looking for some review and commentary on a proposed
>>>>>>>>> design for providing built-in Coders for AutoValue classes. There's
>>>>>>>>> existing discussion in BEAM-1891 [0] about using AvroCoder, but that's
>>>>>>>>> blocked on incompatibilities between AutoValue and Avro's reflection
>>>>>>>>> machinery that don't look resolvable.
>>>>>>>>>
>>>>>>>>> I wrote up a design document [1] that instead proposes using
>>>>>>>>> AutoValue's extension API to automatically generate a Coder for each
>>>>>>>>> AutoValue class that users generate. A similar technique could be
>>>>>>>>> used to generate conversions to and from Row for use with BeamSql.
>>>>>>>>>
>>>>>>>>> I'd appreciate review of the design and thoughts on whether this
>>>>>>>>> seems feasible to support within the Beam codebase. I may be missing a
>>>>>>>>> simpler approach.
>>>>>>>>>
>>>>>>>>> [0] https://issues.apache.org/jira/browse/BEAM-1891
>>>>>>>>> [1]
>>>>>>>>> https://docs.google.com/document/d/1ucoik4WzUDfilqIz3I1AuMHc1J8DE6iv7gaUCDI42BI/edit?usp=sharing
>>>>>>>>>
>>>>>>>>
