I like the idea very much!

In my opinion, the DataSet is not quite the right place to put that
functionality. I think the UnaryUDFOperator or the BinaryUDFOperator would
be better. After all, these hooks are only necessary for UDFs.

One more suggestion:

 - Can the TypeExtractor initially return a special "Unknown" type? The
returns() method can then override that type. That way we can also keep a
bit more of the eager initialization.

 - The collection execution, for example, works without specific type
information. It only needs the ability to clone, which is easily possible
with an "unknown" type information that creates a "default serializer"
which simply uses Kryo to clone.

That way, one could also use Java 8 lambdas inside the IDE with collection
execution, and on the cluster with the properly compiled code from Maven.
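
To make the idea concrete, here is a rough sketch. All class names
(SketchTypeInfo, SketchUdfOperator, ReturnsHintSketch) are made up for
illustration and are not existing Flink classes, and plain Java
serialization stands in for the Kryo-based default serializer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Small demo entry point (hypothetical, for illustration only).
class ReturnsHintSketch {
    public static void main(String[] args) {
        SketchUdfOperator op = new SketchUdfOperator(null); // extraction failed
        System.out.println(op.getResultType().name);
        op.returns("Tuple2<String,String>");                // hint overrides it
        System.out.println(op.requireKnownType().name);
    }
}

// Hypothetical stand-in for type information; not the real Flink API.
class SketchTypeInfo {
    // Placeholder used when the result type could not be extracted.
    static final SketchTypeInfo UNKNOWN = new SketchTypeInfo("Unknown");

    final String name;

    SketchTypeInfo(String name) {
        this.name = name;
    }

    boolean isUnknown() {
        return this == UNKNOWN;
    }

    // Generic deep copy that needs no specific type information.
    // Assumption: Java serialization is used here in place of Kryo.
    @SuppressWarnings("unchecked")
    <T extends Serializable> T copy(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy value", e);
        }
    }
}

// Hypothetical UDF operator: the type is set eagerly in the constructor,
// falling back to UNKNOWN, and returns() can override it afterwards.
class SketchUdfOperator {
    private SketchTypeInfo resultType;

    SketchUdfOperator(SketchTypeInfo extracted) {
        this.resultType = (extracted == null) ? SketchTypeInfo.UNKNOWN : extracted;
    }

    SketchUdfOperator returns(String typeName) {
        this.resultType = new SketchTypeInfo(typeName);
        return this;
    }

    SketchTypeInfo getResultType() {
        return resultType;
    }

    // Execution modes that need a concrete type would call this check.
    SketchTypeInfo requireKnownType() {
        if (resultType.isUnknown()) {
            throw new IllegalStateException(
                "Result type unknown; call returns(...) on the operator.");
        }
        return resultType;
    }
}
```

In this sketch the collection execution would only ever call copy(), so it
works with the "Unknown" placeholder, while anything that needs the concrete
type goes through requireKnownType().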

Stephan





On Mon, Nov 3, 2014 at 12:23 PM, Timo Walther <[email protected]> wrote:

> Hey,
>
> I have made a small prototype for a map-operator
>
> env.fromElements(1, 2, 3)
>  .map((i) -> new Tuple2<String,String>()).returns("Tuple2<String,String>")
>  .print();
>
> you can find my solution here:
> https://github.com/twalthr/incubator-flink/commit/3ce2d3c86cf2457e02986f7a2d858304bbefea58
>
> Actually, I like this solution best because it looks very simple to the
> user. Furthermore, we can move the type extraction part into the operator,
> which makes more sense to me.
>
> What do you think?
>
> Greetings,
> Timo
>
>
>
>
>
>
> On 02.11.2014 16:22, Stephan Ewen wrote:
>
>> An alternative would be to go for
>>
>> env.fromElements(1, 2, 3)
>> .flatMap((Integer i, Collector<Integer> o) -> o.collect(i) ,
>> returns("Integer"))
>> .print();
>>
>> "returns" would here be a static method that creates the type info.
>>
>> That would require adding an additional parameter, but would allow us to
>> keep the immediate checks. Deferring the checks will make things harder
>> to understand for users as well...
>> On 30.10.2014 11:44, "Stephan Ewen" <[email protected]> wrote:
>>
>> I think that would look nice.
>>
>> How easy is that to implement? With that change, we could not initialize
>> the type info in the constructor any more, but would have to change
>> everything to lazy initialization, which makes it complicated and error
>> prone...
>>
>> On Wed, Oct 29, 2014 at 4:26 PM, Timo Walther <[email protected]> wrote:
>>
>>  What do you think about something like:
>>>
>>> env.fromElements(1, 2, 3)
>>> .flatMap((Integer i, Collector<Integer> o) -> o.collect(i)).returns("Integer")
>>> .print();
>>>
>>> This looks to me like the most readable and user-friendly solution. We
>>> only need to change the internals of DataSet a little bit, such that a
>>> possible TypeExtractor Exception is stored temporarily and thrown by the
>>> operator that follows if "returns()" was not called.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>>
>>> On 28.10.2014 15:34, Stephan Ewen wrote:
>>>
>>>> Is it possible to use a static method "hint" to create the hinting
>>>> wrapper function?
>>>>
>>>> Something like
>>>>
>>>> DataSet.map(hint( (x) -> x.toString() , String.class));
>>>>
>>>> If we go for option (1), I would suggest calling the methods just "from"
>>>> and overloading them for String, Class, and TypeInformation.
>>>>
>>>>
>>>> Stephan
>>>>
>>>>
>>>> On Tue, Oct 28, 2014 at 3:27 PM, Timo Walther <[email protected]>
>>>> wrote:
>>>>
>>>>   Hi all,
>>>>
>>>>> Until recently, the Eclipse JDT compiler was the only compiler that
>>>>> included generic signatures for lambda expressions in class files,
>>>>> which is necessary to use them in a type-safe way in Flink.
>>>>> Unfortunately, this "feature" was considered a "bug" and was removed
>>>>> in Eclipse 4.4.1. This is why lambdas do not work properly with the
>>>>> current version of Eclipse. I have opened a bug for that (see
>>>>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=449063).
>>>>>
>>>>> The question is: independent of the decision of the Eclipse JDT team,
>>>>> how do we want to deal with missing return type information?
>>>>>
>>>>> Option 1)
>>>>> Add a separate TypeInformation argument to each Java API operator.
>>>>> Leads to a blown-up API...
>>>>> .map((x)->x + 1, TypeInformation.fromString("Integer"))
>>>>> .flatMap((in, out)->out.collect(in), TypeInformation.fromClass(Integer.class))
>>>>>
>>>>> Option 2)
>>>>> Introduce a wrapper class which implements ResultTypeQueryable. Leads
>>>>> to complicated syntax...
>>>>> .map(TypeHint.map((x)->x + 1, "Integer"));
>>>>> .map(TypeHint.map((x)->x + 1, Integer.class));
>>>>>
>>>>> What are your opinions? Or any other ideas?
>>>>>
>>>>>
>>>>> Regards,
>>>>> Timo
>>>>>
>>>>>
>>>>>
>
