Thank you!

It would be amazing if you or somebody else could copy and paste this into
our documentation.

On Fri, Jan 9, 2015 at 11:44 AM, Stephan Ewen <se...@apache.org> wrote:

> Hi everyone!
>
> We recently introduced type hints for the Java API. Since that is a pretty
> useful feature, I wanted to quickly explain what it is.
>
> Kudos to Timo Walther, who did a large part of this work.
>
>
> *Background*
>
> Flink tries to figure out as much information about what types enter and
> leave user functions as possible.
>
>  - For the POJO API (where one refers to field names), we need that
> information to make checks (for typos and type compatibility) before the
> job is executed.
>
>  - For the upcoming logical programs (see roadmap draft) we need this to
> know the "schema" of functions.
>
>  - The more we know, the better serialization and data layout schemes the
> compiler/optimizer can develop. That is quite important for the memory
> usage paradigm in Flink (work on serialized data inside/outside the heap
> and make serialization very cheap).
>
>  - Finally, it also spares users having to worry about serialization
> frameworks and having to register types at those frameworks.
>
>
> *Problem*
>
> Scala is an easy case, because it preserves generic type information
> (ClassTags / Type Manifests), but Java erases generic type info in most
> cases.
>
> We do reflection analysis on the user function classes to get the generic
> types. This logic also contains some simple type inference in case the
> functions have type variables (such as a MapFunction<T, Tuple2<T, Long>>).
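
To illustrate the reflection analysis described above, here is a minimal sketch in plain Java. It is not Flink's actual type extraction code: `MapFn`, `ParseLength`, and `mapFnTypeArguments` are hypothetical names invented for this example. It only shows that a concrete function class keeps its generic type arguments in the class file, where `java.lang.reflect` can read them.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;

public class GenericSignatureDemo {

    /** Hypothetical stand-in for a user function interface; not Flink's MapFunction. */
    interface MapFn<I, O> {
        O map(I value);
    }

    /** A concrete user function: its type arguments are recorded in the class file. */
    static class ParseLength implements MapFn<String, Integer> {
        @Override
        public Integer map(String value) {
            return value.length();
        }
    }

    /** Returns the type arguments the class declares for MapFn, or null if none are found. */
    static Type[] mapFnTypeArguments(Class<?> functionClass) {
        for (Type iface : functionClass.getGenericInterfaces()) {
            if (iface instanceof ParameterizedType) {
                ParameterizedType pt = (ParameterizedType) iface;
                if (pt.getRawType() == MapFn.class) {
                    return pt.getActualTypeArguments();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Type[] types = mapFnTypeArguments(ParseLength.class);
        // prints "[class java.lang.String, class java.lang.Integer]"
        System.out.println(Arrays.toString(types));
    }
}
```

The sketch only inspects directly implemented interfaces. Walking superclasses and resolving type variables, as the mail describes, would need additional logic.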
>
> We cannot reliably figure out the data types of functions in every case in
> Java. Some issues remain with generic lambdas (we are trying to solve this
> with the Java community, see below) and with generic type variables that we
> cannot infer.
>
>
> *Solution: Type Hints*
>
> To make these cases work easily, a recent addition to the 0.9-SNAPSHOT
> master introduced type hints. They allow you to tell the system the types
> that it cannot infer.
>
> You can write code like
>
> DataSet<SomeType> result = dataSet
>         .map(new MyGenericNonInferrableFunction<Long, SomeType>())
>         .returns(SomeType.class);
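
For readers wondering why such a function is not inferrable, the following plain-Java sketch may help. It does not use Flink: `MapFn`, `Converter`, `declaredOutputType`, and `resolveOutputType` are hypothetical names made up for the illustration. The point is that the type arguments given at the call site (`<Long, String>` here) are erased, so reflection only sees the type variable, and an explicit class token has to fill the gap.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

public class ErasureHintDemo {

    /** Hypothetical stand-in for a user function interface; not a Flink class. */
    interface MapFn<I, O> {
        O map(I value);
    }

    /** Generic function: O is only fixed at the call site, which erasure discards. */
    static class Converter<I, O> implements MapFn<I, O> {
        @Override
        public O map(I value) {
            return null; // the body is irrelevant for this demonstration
        }
    }

    /** Reads the declared output type argument of MapFn from the function's class. */
    static Type declaredOutputType(MapFn<?, ?> function) {
        for (Type iface : function.getClass().getGenericInterfaces()) {
            if (iface instanceof ParameterizedType
                    && ((ParameterizedType) iface).getRawType() == MapFn.class) {
                return ((ParameterizedType) iface).getActualTypeArguments()[1];
            }
        }
        throw new IllegalArgumentException("Not a direct MapFn implementation");
    }

    /** Uses the reflected type if it is concrete; otherwise falls back to the caller's hint. */
    static Class<?> resolveOutputType(MapFn<?, ?> function, Class<?> hint) {
        Type declared = declaredOutputType(function);
        if (declared instanceof Class) {
            return (Class<?>) declared;
        }
        if (hint == null) {
            throw new IllegalStateException(
                    "Cannot infer output type " + declared + "; a hint is required");
        }
        return hint;
    }

    public static void main(String[] args) {
        MapFn<Long, String> fn = new Converter<Long, String>();
        // Reflection sees only the type variable O, not String.
        System.out.println(declaredOutputType(fn) instanceof TypeVariable); // prints "true"
        // The explicit class token plays the role of the type hint.
        System.out.println(resolveOutputType(fn, String.class)); // prints "class java.lang.String"
    }
}
```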
>
>
> To make specification of generic types easier, it also comes with a parser
> for simple string representations of generic types:
>
>   .returns("Tuple2<Integer, my.SomeType>")
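
As a rough idea of what parsing such a string involves, here is a small recursive-descent sketch in plain Java. It is an assumption-laden toy, not the parser that ships with Flink: the class `TypeStringParserDemo`, its `TypeNode` tree, and the accepted grammar (a dotted name followed by optional comma-separated arguments in angle brackets) are all invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class TypeStringParserDemo {

    /** A parsed type: a (possibly qualified) name plus zero or more type arguments. */
    static final class TypeNode {
        final String name;
        final List<TypeNode> arguments = new ArrayList<>();

        TypeNode(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            if (arguments.isEmpty()) {
                return name;
            }
            StringBuilder sb = new StringBuilder(name).append('<');
            for (int i = 0; i < arguments.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(arguments.get(i));
            }
            return sb.append('>').toString();
        }
    }

    private final String input;
    private int pos = 0;

    private TypeStringParserDemo(String input) {
        this.input = input;
    }

    /** Parses the whole string into a TypeNode tree, rejecting trailing input. */
    static TypeNode parse(String text) {
        TypeStringParserDemo parser = new TypeStringParserDemo(text);
        TypeNode node = parser.parseType();
        parser.skipSpaces();
        if (parser.pos != text.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return node;
    }

    /** type := name [ '<' type { ',' type } '>' ] */
    private TypeNode parseType() {
        skipSpaces();
        int start = pos;
        while (pos < input.length()
                && (Character.isJavaIdentifierPart(input.charAt(pos)) || input.charAt(pos) == '.')) {
            pos++;
        }
        if (pos == start) {
            throw new IllegalArgumentException("Expected a type name at position " + pos);
        }
        TypeNode node = new TypeNode(input.substring(start, pos));
        skipSpaces();
        if (pos < input.length() && input.charAt(pos) == '<') {
            do {
                pos++; // consume '<' on the first pass, ',' on later passes
                node.arguments.add(parseType());
                skipSpaces();
            } while (pos < input.length() && input.charAt(pos) == ',');
            if (pos >= input.length() || input.charAt(pos) != '>') {
                throw new IllegalArgumentException("Expected '>' at position " + pos);
            }
            pos++; // consume '>'
        }
        return node;
    }

    private void skipSpaces() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
    }

    public static void main(String[] args) {
        TypeNode node = parse("Tuple2<Integer, my.SomeType>");
        System.out.println(node.name);             // prints "Tuple2"
        System.out.println(node.arguments.size()); // prints "2"
        System.out.println(node);                  // prints "Tuple2<Integer, my.SomeType>"
    }
}
```

A real implementation would additionally have to map the parsed names to actual classes; the sketch stops at the syntax tree.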
>
>
> We suggest using this instead of the "ResultTypeQueryable" workaround that
> has been used in some cases.
>
>
> *Improving Type information in Java*
>
> One Flink committer (Timo Walther) has actually become active in the
> Eclipse JDT compiler community and in the OpenJDK community to try and
> improve the way type information is available for lambdas.
>
>
> Greetings,
> Stephan
>
