Hi, thanks for the nice explanation and the great work! This will simplify our Graph API-lives a lot ^^
Cheers,
V.

On 9 January 2015 at 11:59, Stephan Ewen <se...@apache.org> wrote:

> I am adding a derivative of that text to the docs right now.
>
> On Fri, Jan 9, 2015 at 11:54 AM, Robert Metzger <rmetz...@apache.org> wrote:
>
> > Thank you!
> >
> > It would be amazing if you or somebody else could copy paste this into
> > our documentation.
> >
> > On Fri, Jan 9, 2015 at 11:44 AM, Stephan Ewen <se...@apache.org> wrote:
> >
> > > Hi everyone!
> > >
> > > We recently introduced type hints for the Java API. Since that is a
> > > pretty useful feature, I wanted to quickly explain what it is.
> > >
> > > Kudos to Timo Walther, who did a large part of this work.
> > >
> > >
> > > *Background*
> > >
> > > Flink tries to figure out as much information about what types enter
> > > and leave user functions as possible.
> > >
> > > - For the POJO API (where one refers to field names), we need that
> > >   information to make checks (for typos and type compatibility) before
> > >   the job is executed.
> > >
> > > - For the upcoming logical programs (see roadmap draft) we need this
> > >   to know the "schema" of functions.
> > >
> > > - The more we know, the better serialization and data layout schemes
> > >   the compiler/optimizer can develop. That is quite important for the
> > >   memory usage paradigm in Flink (work on serialized data
> > >   inside/outside the heap and make serialization very cheap).
> > >
> > > - Finally, it also spares users having to worry about serialization
> > >   frameworks and having to register types at those frameworks.
> > >
> > >
> > > *Problem*
> > >
> > > Scala is an easy case, because it preserves generic type information
> > > (ClassTags / Type Manifests), but Java erases generic type info in
> > > most cases.
> > >
> > > We do reflection analysis on the user function classes to get the
> > > generic types. This logic also contains some simple type inference in
> > > case the functions have type variables (such as a
> > > MapFunction<T, Tuple2<T, Long>>).
> > >
> > > Not in all cases can we figure out the data types of functions
> > > reliably in Java. Some issues remain with generic lambdas (we are
> > > trying to solve this with the Java community, see below) and with
> > > generic type variables that we cannot infer.
> > >
> > >
> > > *Solution: Type Hints*
> > >
> > > To make these cases work easily, a recent addition to the
> > > 0.9-SNAPSHOT master introduced type hints. They allow you to tell the
> > > system types that it cannot infer.
> > >
> > > You can write code like
> > >
> > >     DataSet<SomeType> result =
> > >         dataSet.map(new MyGenericNonInferrableFunction<Long, SomeType>())
> > >                .returns(SomeType.class);
> > >
> > >
> > > To make specification of generic types easier, it also comes with a
> > > parser for simple string representations of generic types:
> > >
> > >     .returns("Tuple2<Integer, my.SomeType>")
> > >
> > >
> > > We suggest using this instead of the "ResultTypeQueryable" workaround
> > > that has been used in some cases.
> > >
> > >
> > > *Improving Type information in Java*
> > >
> > > One Flink committer (Timo Walther) has actually become active in the
> > > Eclipse JDT compiler community and in the OpenJDK community to try and
> > > improve the way type information is available for lambdas.
> > >
> > >
> > > Greetings,
> > > Stephan
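
For anyone following the thread who wants to see the erasure problem in isolation, here is a minimal sketch in plain Java. It does not use Flink at all: the MapFunction interface, the two function classes, and the describe() helper are hypothetical stand-ins invented for this illustration, and the reflection shown is only a rough approximation of the kind of analysis described above, not Flink's actual type extraction code.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.stream.Collectors;

public class TypeErasureSketch {

    // Hypothetical stand-in for a user-function interface; NOT Flink's real class.
    interface MapFunction<I, O> {
        O map(I value);
    }

    // Concrete type arguments are recorded in the class file, so reflection can read them.
    static class ParseLength implements MapFunction<String, Integer> {
        public Integer map(String value) {
            return value.length();
        }
    }

    // T is bound only at the call site (new Identity<Long>()), and that binding is erased.
    static class Identity<T> implements MapFunction<T, T> {
        public T map(T value) {
            return value;
        }
    }

    // Returns the type arguments that fn's class declares for MapFunction, as a string.
    static String describe(MapFunction<?, ?> fn) {
        for (Type iface : fn.getClass().getGenericInterfaces()) {
            if (iface instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) iface;
                if (p.getRawType() == MapFunction.class) {
                    return Arrays.stream(p.getActualTypeArguments())
                            .map(Type::getTypeName)
                            .collect(Collectors.joining(", "));
                }
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // prints "java.lang.String, java.lang.Integer"
        System.out.println(describe(new ParseLength()));
        // prints "T, T": the Long used at the call site is not recoverable
        System.out.println(describe(new Identity<Long>()));
    }
}
```

The second case is the situation in which reflection alone has nothing concrete to work with, which is the gap that the .returns(...) hints quoted above are meant to fill.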