Ok, so actually SchemaRegistry is based on TypeDescriptors, so it does not have this limitation (I was wrong about that).
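
For anyone following along, here is a minimal sketch of why keying on a type descriptor rather than on a Class avoids the erasure problem. It uses only java.lang.reflect; TypeToken and TypeTokenDemo are hypothetical names made up for illustration and are not Beam's TypeDescriptor, though the anonymous-subclass trick is the same general idea. The last call shows a case where the token is created while T is still unbound, so only a type variable gets recorded.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a type descriptor; not Beam's TypeDescriptor class. */
abstract class TypeToken<T> {
  private final Type type;

  protected TypeToken() {
    // An anonymous subclass records its generic superclass (e.g. TypeToken<List<String>>)
    // in class metadata, which is still readable at runtime despite erasure.
    ParameterizedType superclass = (ParameterizedType) getClass().getGenericSuperclass();
    this.type = superclass.getActualTypeArguments()[0];
  }

  Type getType() {
    return type;
  }
}

class TypeTokenDemo {
  /** Inside a generic method, T is still a type variable when the token is created. */
  static <T> Type listOf() {
    return new TypeToken<List<T>>() {}.getType();
  }

  public static void main(String[] args) {
    // Erasure: both lists share a single runtime class, so a Class-keyed lookup cannot tell them apart.
    System.out.println(new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass());

    // Token created at a call site with concrete types: the full parameterized type is captured.
    System.out.println(new TypeToken<List<String>>() {}.getType());

    // Token created where T is unbound: the captured argument is a TypeVariable, not String.
    System.out.println(TypeTokenDemo.<String>listOf());
  }
}
```
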
However, I'm still not sure that the @DefaultSchema annotation-based registration would work here. Right now it tries to infer a schema eagerly, which clearly would not work. I guess we could create a SchemaProvider that lazily resolves the schema only upon use, when we should have a good TypeDescriptor.

However, I'm still worried that we often won't have a good type descriptor. It works well for DoFn, because usually the user's DoFn is a concrete class with resolved types. I'm not sure that this is easy to do with AutoValue; the user can't create a concrete subclass of their AutoValue class, as that won't work with the code AutoValue generates.

Reuven

On Sun, Feb 10, 2019 at 8:00 PM Kenneth Knowles <[email protected]> wrote:

> Hmm, this is a huge limitation relative to the CoderRegistry, which very
> explicitly does support constructing parameterized coders via
> CoderProvider. The root CoderProvider is still keyed on raw type, but the
> CoderProvider is passed inferred coders for the concrete parameters. Here's
> how List.class is registered:
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L116
>
> The one thing that _is_ required for this is that at the call site a good
> TypeDescriptor is captured. That is mostly automatic for DoFns, hence the
> CoderRegistry works fairly well. There are special methods in various user
> fns and boilerplate in transforms like MapElements to provide a good
> TypeDescriptor.
>
> Kenn
>
> On Sun, Feb 10, 2019 at 5:11 PM Reuven Lax <[email protected]> wrote:
>
>> This is an interesting question.
>>
>> In general, I don't think schema inference can handle these generics
>> today. Right now the SchemaRegistry is keyed off of Java class, and due to
>> type erasure all different instances of MyClass<T> will look the same.
>>
>> Now it might be possible to include generic type parameters in the
>> registry. You would not be able to use the @DefaultSchema annotation to
>> infer a schema, but you might be able to dynamically register a schema
>> using a TypeDescriptor. Unfortunately I think this would only sometimes
>> work. E.g. my experience has been that given a type T you can often figure
>> out T using reflection, but if there are nested types (e.g. List<T>) then
>> Java doesn't always preserve these types for introspection.
>>
>> In sum, I think we could do a bit better for these types of classes, but
>> not a whole lot better.
>>
>> Reuven
>>
>> On Mon, Feb 4, 2019 at 6:02 AM Jeff Klukas <[email protected]> wrote:
>>
>>> I've started experimenting with Beam schemas in the context of creating
>>> custom AutoValue-based classes and using AutoValueSchema to generate
>>> schemas and thus coders.
>>>
>>> AFAICT, schemas need to have types fully specified, so it doesn't appear
>>> to be possible to define an AutoValue class with a type parameter and then
>>> create a schema for it. Basically, I want to confirm whether the following
>>> type would ever be possible to create a schema for:
>>>
>>> @DefaultSchema(AutoValueSchema.class)
>>> @AutoValue
>>> public abstract class MyClass<T> {
>>>   public abstract T getField1();
>>>   public abstract String getField2();
>>>   public static <T> MyClass<T> of(T field1, String field2) {
>>>     return new AutoValue_MyClass<>(field1, field2);
>>>   }
>>> }
>>>
>>> This may be an entirely reasonable limitation of the schema machinery,
>>> but I want to make sure I'm not missing something.
>>>
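
As a rough illustration of the CoderProvider arrangement Kenn describes in the quoted message (a root lookup keyed on raw type, with the provider handed the coders already inferred for the type arguments), here is a toy sketch. ToyCoderRegistry and ToyCoderDemo are hypothetical names, the strings merely stand in for coder instances, and none of this is Beam's actual CoderRegistry API.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy registry (hypothetical, not Beam's CoderRegistry): providers are keyed on raw type. */
class ToyCoderRegistry {
  // Each provider receives the "coders" (plain strings here) inferred for the type arguments.
  private final Map<Class<?>, Function<List<String>, String>> providers = new HashMap<>();

  void register(Class<?> rawType, Function<List<String>, String> provider) {
    providers.put(rawType, provider);
  }

  String coderFor(Type type) {
    Class<?> raw;
    List<String> componentCoders = new ArrayList<>();
    if (type instanceof ParameterizedType) {
      ParameterizedType parameterized = (ParameterizedType) type;
      raw = (Class<?>) parameterized.getRawType();
      // Infer a coder for every type argument first, then hand them to the raw type's provider.
      for (Type argument : parameterized.getActualTypeArguments()) {
        componentCoders.add(coderFor(argument));
      }
    } else if (type instanceof Class) {
      raw = (Class<?>) type;
    } else {
      // Type variables and wildcards land here: without a fully resolved type, inference fails.
      throw new IllegalArgumentException("Unresolved type: " + type);
    }
    Function<List<String>, String> provider = providers.get(raw);
    if (provider == null) {
      throw new IllegalArgumentException("No provider registered for " + raw);
    }
    return provider.apply(componentCoders);
  }
}

class ToyCoderDemo {
  public static void main(String[] args) {
    ToyCoderRegistry registry = new ToyCoderRegistry();
    registry.register(String.class, none -> "StringUtf8Coder");
    registry.register(ArrayList.class, parts -> "ListCoder(" + parts.get(0) + ")");

    // The anonymous subclass keeps ArrayList<String> readable as its generic superclass.
    Type listOfString = new ArrayList<String>() {}.getClass().getGenericSuperclass();
    System.out.println(registry.coderFor(listOfString));
  }
}
```

The lookup only succeeds when the Type passed in is fully resolved, which is the same precondition the thread keeps coming back to.
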
