I am adding a derivative of that text to the docs right now.
On Fri, Jan 9, 2015 at 11:54 AM, Robert Metzger <rmetz...@apache.org> wrote:

> Thank you!
>
> It would be amazing if you or somebody else could copy paste this into our
> documentation.
>
> On Fri, Jan 9, 2015 at 11:44 AM, Stephan Ewen <se...@apache.org> wrote:
>
> > Hi everyone!
> >
> > We recently introduced type hints for the Java API. Since that is a pretty
> > useful feature, I wanted to quickly explain what it is.
> >
> > Kudos to Timo Walther, who did a large part of this work.
> >
> >
> > *Background*
> >
> > Flink tries to figure out as much information about what types enter and
> > leave user functions as possible.
> >
> > - For the POJO API (where one refers to field names), we need that
> >   information to make checks (for typos and type compatibility) before the
> >   job is executed.
> >
> > - For the upcoming logical programs (see roadmap draft) we need this to
> >   know the "schema" of functions.
> >
> > - The more we know, the better serialization and data layout schemes the
> >   compiler/optimizer can develop. That is quite important for the memory
> >   usage paradigm in Flink (work on serialized data inside/outside the heap
> >   and make serialization very cheap).
> >
> > - Finally, it also spares users having to worry about serialization
> >   frameworks and having to register types at those frameworks.
> >
> >
> > *Problem*
> >
> > Scala is an easy case, because it preserves generic type information
> > (ClassTags / Type Manifests), but Java erases generic type info in most
> > cases.
> >
> > We do reflection analysis on the user function classes to get the generic
> > types. This logic also contains some simple type inference in case the
> > functions have type variables (such as a MapFunction<T, Tuple2<T, Long>>).
> >
> > Not in all cases can we figure out the data types of functions reliably in
> > Java. Some issues remain with generic lambdas (we are trying to solve this
> > with the Java community, see below) and with generic type variables that we
> > cannot infer.
> >
> >
> > *Solution: Type Hints*
> >
> > To make these cases work easily, a recent addition to the 0.9-SNAPSHOT
> > master introduced type hints. They allow you to tell the system types that
> > it cannot infer.
> >
> > You can write code like
> >
> >     DataSet<SomeType> result =
> >         dataSet.map(new MyGenericNonInferrableFunction<Long, SomeType>())
> >                .returns(SomeType.class);
> >
> >
> > To make specification of generic types easier, it also comes with a parser
> > for simple string representations of generic types:
> >
> >     .returns("Tuple2<Integer, my.SomeType>")
> >
> >
> > We suggest using this instead of the "ResultTypeQueryable" workaround that
> > has been used in some cases.
> >
> >
> > *Improving Type information in Java*
> >
> > One Flink committer (Timo Walther) has actually become active in the
> > Eclipse JDT compiler community and in the OpenJDK community to try and
> > improve the way type information is available for lambdas.
> >
> >
> > Greetings,
> > Stephan
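
For readers who want to see the erasure problem from the quoted message in isolation, here is a minimal sketch in plain Java. It does not use Flink at all: the MapFunction interface below is a hypothetical stand-in, and outputType with its hint parameter is an invented helper that only mimics the idea behind .returns(...). It is not how Flink's type extraction is actually implemented.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class TypeHintSketch {

    /** Hypothetical stand-in for a user function interface (not Flink's MapFunction). */
    public interface MapFunction<I, O> {
        O map(I value);
    }

    /** Concrete type arguments are recorded in the class file, so reflection can read them. */
    public static class Length implements MapFunction<String, Integer> {
        @Override
        public Integer map(String value) {
            return value.length();
        }
    }

    /** The output type is a type variable; what T is for a given instance is erased. */
    public static class Identity<T> implements MapFunction<T, T> {
        @Override
        public T map(T value) {
            return value;
        }
    }

    /**
     * Returns the output type of fn if reflection finds a concrete class for it,
     * otherwise falls back to the caller-supplied hint (which may be null).
     */
    public static Class<?> outputType(MapFunction<?, ?> fn, Class<?> hint) {
        for (Type iface : fn.getClass().getGenericInterfaces()) {
            if (iface instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) iface;
                if (p.getRawType() == MapFunction.class) {
                    Type out = p.getActualTypeArguments()[1];
                    if (out instanceof Class) {
                        return (Class<?>) out; // inferred without help
                    }
                }
            }
        }
        return hint; // reflection could not resolve the type
    }

    public static void main(String[] args) {
        // Concrete arguments: the output type is found by reflection.
        System.out.println(outputType(new Length(), null));

        // Type variable: nothing can be inferred, so the result is the (null) hint.
        System.out.println(outputType(new Identity<Long>(), null));

        // Same function, but the caller supplies the missing type as a hint.
        System.out.println(outputType(new Identity<Long>(), Long.class));

        // A lambda's runtime class carries no generic signature, so a hint is needed here too.
        MapFunction<String, Integer> lambda = s -> s.length();
        System.out.println(outputType(lambda, Integer.class));
    }
}
```

The Identity<Long> and lambda cases correspond to the two situations the message names as unresolved (type variables that cannot be inferred, and lambdas), which is where an explicit hint has to fill the gap.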