On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas <[email protected]> wrote:

> Reuven - A SchemaProvider makes sense. It's not clear to me, though,
> whether that's more limited than a Coder. Do all values of the schema have
> to be simple types, or does Beam SQL support nested schemas?
>

Nested schemas, collection types (lists and maps), and collections of
nested types are all supported.

>
> Put another way, would a user be able to create an AutoValue class
> comprised of simple types and then use that as a field inside another
> AutoValue class? I can see how that's possible with Coders, but it's not
> clear whether that's possible with Row schemas.
>

Yes, this is explicitly supported.

>
> On Fri, Nov 9, 2018 at 8:22 PM Reuven Lax <[email protected]> wrote:
>
>> Hi Jeff,
>>
>> I would suggest a slightly different approach. Instead of generating a
>> coder, write a SchemaProvider that generates a schema for AutoValue. Once
>> a PCollection has a schema, a coder is not needed (as Beam knows how to
>> encode any type with a schema), and it will work seamlessly with Beam SQL
>> (in fact you don't need to write a transform to turn it into a Row if a
>> schema is registered).
>>
>> We already do this for POJOs and basic JavaBeans. I'm happy to help do
>> this for AutoValue.
>>
>> Reuven
>>
>> On Sat, Nov 10, 2018 at 5:50 AM Jeff Klukas <[email protected]> wrote:
>>
>>> Hi all - I'm looking for some review and commentary on a proposed design
>>> for providing built-in Coders for AutoValue classes. There's existing
>>> discussion in BEAM-1891 [0] about using AvroCoder, but that's blocked on
>>> incompatibilities between AutoValue and Avro's reflection machinery that
>>> don't look resolvable.
>>>
>>> I wrote up a design document [1] that instead proposes using AutoValue's
>>> extension API to automatically generate a Coder for each AutoValue class
>>> that users generate. A similar technique could be used to generate
>>> conversions to and from Row for use with BeamSql.
>>>
>>> I'd appreciate review of the design and thoughts on whether this seems
>>> feasible to support within the Beam codebase. I may be missing a simpler
>>> approach.
>>>
>>> [0] https://issues.apache.org/jira/browse/BEAM-1891
>>> [1]
>>> https://docs.google.com/document/d/1ucoik4WzUDfilqIz3I1AuMHc1J8DE6iv7gaUCDI42BI/edit?usp=sharing
>>>
>>
