On Mon, Nov 12, 2018 at 11:38 PM Jeff Klukas <[email protected]> wrote:
> Reuven - A SchemaProvider makes sense. It's not clear to me, though,
> whether that's more limited than a Coder. Do all values of the schema have
> to be simple types, or does Beam SQL support nested schemas?
>

Nested schemas, collection types (lists and maps), and collections of nested
types are all supported.

>
> Put another way, would a user be able to create an AutoValue class
> comprised of simple types and then use that as a field inside another
> AutoValue class? I can see how that's possible with Coders, but it's not
> clear whether that's possible with Row schemas.
>

Yes, this is explicitly supported.

>
> On Fri, Nov 9, 2018 at 8:22 PM Reuven Lax <[email protected]> wrote:
>
>> Hi Jeff,
>>
>> I would suggest a slightly different approach. Instead of generating a
>> coder, write a SchemaProvider that generates a schema for AutoValue. Once
>> a PCollection has a schema, a coder is not needed (as Beam knows how to
>> encode any type with a schema), and it will work seamlessly with Beam SQL
>> (in fact you don't need to write a transform to turn it into a Row if a
>> schema is registered).
>>
>> We already do this for POJOs and basic JavaBeans. I'm happy to help do
>> this for AutoValue.
>>
>> Reuven
>>
>> On Sat, Nov 10, 2018 at 5:50 AM Jeff Klukas <[email protected]> wrote:
>>
>>> Hi all - I'm looking for some review and commentary on a proposed design
>>> for providing built-in Coders for AutoValue classes. There's existing
>>> discussion in BEAM-1891 [0] about using AvroCoder, but that's blocked on
>>> an incompatibility between AutoValue and Avro's reflection machinery that
>>> doesn't look resolvable.
>>>
>>> I wrote up a design document [1] that instead proposes using AutoValue's
>>> extension API to automatically generate a Coder for each AutoValue class
>>> that users generate. A similar technique could be used to generate
>>> conversions to and from Row for use with Beam SQL.
>>>
>>> I'd appreciate review of the design and thoughts on whether this seems
>>> feasible to support within the Beam codebase. I may be missing a simpler
>>> approach.
>>>
>>> [0] https://issues.apache.org/jira/browse/BEAM-1891
>>> [1]
>>> https://docs.google.com/document/d/1ucoik4WzUDfilqIz3I1AuMHc1J8DE6iv7gaUCDI42BI/edit?usp=sharing
>>>
>>
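
To make the nested-schema point from the top of the thread concrete, here is a rough, stdlib-only sketch of how a schema might be inferred from the abstract getters of an AutoValue-style class, recursing into fields that are themselves value types. This is not Beam's SchemaProvider API and not the design from the linked document; `ValueType`, `inferSchema`, `Location`, and `Visit` are all hypothetical names invented for illustration, and `ValueType` is only a runtime-visible stand-in for an AutoValue annotation.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class NestedSchemaSketch {

  // Hypothetical marker standing in for an AutoValue-style annotation.
  @Retention(RetentionPolicy.RUNTIME)
  @interface ValueType {}

  // A value type made only of simple fields.
  @ValueType
  abstract static class Location {
    abstract double latitude();
    abstract double longitude();
  }

  // A value type that uses another value type as a field.
  @ValueType
  abstract static class Visit {
    abstract String userId();
    abstract Location location();
  }

  /**
   * Maps each abstract no-arg getter to a type name, or to a nested map
   * when the getter returns another @ValueType class.
   */
  static Map<String, Object> inferSchema(Class<?> clazz) {
    // TreeMap gives a stable field order; reflection order is unspecified.
    Map<String, Object> fields = new TreeMap<>();
    for (Method m : clazz.getDeclaredMethods()) {
      if (!Modifier.isAbstract(m.getModifiers()) || m.getParameterCount() != 0) {
        continue; // only abstract getters count as fields in this sketch
      }
      Class<?> type = m.getReturnType();
      boolean nested = type.isAnnotationPresent(ValueType.class);
      fields.put(m.getName(), nested ? inferSchema(type) : type.getSimpleName());
    }
    return fields;
  }

  public static void main(String[] args) {
    // Prints {location={latitude=double, longitude=double}, userId=String}
    System.out.println(inferSchema(Visit.class));
  }
}
```

The nested map for `location` mirrors the idea of a row-typed field inside an outer schema; a real provider would also need to handle collections, nullability, and object construction, none of which this sketch attempts.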
