Re: Schemas for classes with type parameters

Reuven Lax Sun, 10 Feb 2019 20:24:51 -0800

Ok, so actually SchemaRegistry is based on TypeDescriptors, so it does not
have this limitation (I was wrong about that).


However, I'm still not sure that the @DefaultSchema annotation-based
registration would work here. Right now it tries to infer a schema eagerly,
which clearly would not work. I guess we could create a SchemaProvider that
lazily resolved the schema only upon use, when we should have a good
TypeDescriptor. However, I'm still worried that we often won't have a good
type descriptor. It works well for DoFn, because usually the user's DoFn is
a concrete class with resolved types. I'm not sure that this is easy to do
with AutoValue; the user can't create a concrete subclass of their
AutoValue class, as that won't work with the code AutoValue generates.
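
To make the erasure and type-capture point concrete, here is a minimal
sketch in plain Java with no Beam dependency. Box and Token are
hypothetical stand-ins that I made up for illustration: Box plays the role
of a generic class like MyClass<T>, and Token imitates the
anonymous-subclass capture idiom that a TypeDescriptor is created with. It
is not Beam's implementation, only an approximation of the idea.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

public class ErasureSketch {
  /** Hypothetical stand-in for a generic value class such as MyClass<T>. */
  static class Box<T> {
    final T value;

    Box(T value) {
      this.value = value;
    }
  }

  /** Minimal type token: the anonymous subclass records its generic superclass. */
  abstract static class Token<T> {
    final Type type;

    protected Token() {
      ParameterizedType self = (ParameterizedType) getClass().getGenericSuperclass();
      this.type = self.getActualTypeArguments()[0];
    }
  }

  /** Erasure: both instantiations share one runtime class. */
  public static boolean sameRuntimeClass() {
    return new Box<String>("a").getClass() == new Box<Integer>(1).getClass();
  }

  /** Concrete call site: the token records Box<String>, so the argument is String. */
  public static Type concreteArgument() {
    Type captured = new Token<Box<String>>() {}.type;
    return ((ParameterizedType) captured).getActualTypeArguments()[0];
  }

  /** Generic call site: T is still an unresolved type variable when captured. */
  public static <T> Type genericArgument() {
    Type captured = new Token<Box<T>>() {}.type;
    return ((ParameterizedType) captured).getActualTypeArguments()[0];
  }

  public static void main(String[] args) {
    System.out.println(sameRuntimeClass());
    System.out.println(concreteArgument() == String.class);
    System.out.println(genericArgument() instanceof TypeVariable);
  }
}
```

The generic call site is the case I'm worried about: a static factory such
as of(T field1, String field2) is generic in T, so a token created inside
it would only see the type variable, not the caller's concrete type.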

Reuven

On Sun, Feb 10, 2019 at 8:00 PM Kenneth Knowles <[email protected]> wrote:

> Hmm, this is a huge limitation relative to the CoderRegistry, which very
> explicitly does support constructing parameterized coders via
> CoderProvider. The root CoderProvider is still keyed on raw type but the
> CoderProvider is passed inferred coders for the concrete parameters. Here's
> how List.class is registered:
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L116
>
> The one thing that _is_ required for this is that at the call site a good
> TypeDescriptor is captured. That is mostly automatic for DoFns, hence the
> CoderRegistry works fairly well. There are special methods in various user
> fns and boilerplate in transforms like MapElements to provide a good
> TypeDescriptor.
>
> Kenn
>
> On Sun, Feb 10, 2019 at 5:11 PM Reuven Lax <[email protected]> wrote:
>
>> This is an interesting question.
>>
>> In general, I don't think schema inference can handle these generics
>> today. Right now the SchemaRegistry is keyed off of Java class, and due to
>> type erasure all different instances of MyClass<T> will look the same.
>>
>> Now it might be possible to include generic type parameters in the
>> registry. You would not be able to use the @DefaultSchema annotation to
>> infer a schema, but you might be able to dynamically register a schema
>> using a TypeDescriptor. Unfortunately I think this would only sometimes
>> work. E.g., my experience has been that given a type T you can often figure
>> out T using reflection, but if there are nested types (e.g. List<T>) then
>> Java doesn't always preserve these types for introspection.
>>
>> In sum, I think we could do a bit better for these types of classes, but
>> not a whole lot better.
>>
>> Reuven
>>
>> On Mon, Feb 4, 2019 at 6:02 AM Jeff Klukas <[email protected]> wrote:
>>
>>> I've started experimenting with Beam schemas in the context of creating
>>> custom AutoValue-based classes and using AutoValueSchema to generate
>>> schemas and thus coders.
>>>
>>> AFAICT, schemas need to have types fully specified, so it doesn't appear
>>> to be possible to define an AutoValue class with a type parameter and then
>>> create a schema for it. Basically, I want to confirm whether the following
>>> type would ever be possible to create a schema for:
>>>
>>> @DefaultSchema(AutoValueSchema.class)
>>> @AutoValue
>>> public abstract class MyClass<T> {
>>>   public abstract T getField1();
>>>   public abstract String getField2();
>>>   public static <T> MyClass<T> of(T field1, String field2) {
>>>     return new AutoValue_MyClass<>(field1, field2);
>>>   }
>>> }
>>>
>>> This may be an entirely reasonable limitation of the schema machinery,
>>> but I want to make sure I'm not missing something.
>>>
>>
