Oh, sorry, Flavio!
I didn't see Hawin's questions :-(

Thanks Stephan for picking up!

2015-08-14 17:43 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:

> Any insight about these 2 questions..?
> On 12 Aug 2015 17:38, "Flavio Pompermaier" <pomperma...@okkam.it> wrote:
>
>> This is something I've never understood in depth: isn't a mapper created
>> for each record? If it's created only once per task manager, then it's not so
>> different from mapPartition. What am I missing here?
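
Inline, in case a concrete picture helps: as far as I understand it, the function object is instantiated once per parallel task and map() is then called on that same instance for every record, whereas mapPartition() is called once with all records of the partition. A minimal sketch of that difference in plain Java, without Flink; RecordMapper, PartitionMapper and LifecycleDemo are hypothetical stand-ins, not Flink's API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a per-record function (not Flink's MapFunction).
interface RecordMapper<I, O> {
    O map(I value);
}

// Hypothetical stand-in for a per-partition function (not Flink's MapPartitionFunction).
interface PartitionMapper<I, O> {
    void mapPartition(Iterable<I> values, List<O> out);
}

// One instance, but map() is invoked once per record.
class LengthMapper implements RecordMapper<String, Integer> {
    int calls = 0;

    @Override
    public Integer map(String s) {
        calls++;
        return s.length();
    }
}

// One instance, and mapPartition() is invoked once for the whole partition.
class LengthPartitionMapper implements PartitionMapper<String, Integer> {
    int calls = 0;

    @Override
    public void mapPartition(Iterable<String> values, List<Integer> out) {
        calls++;
        for (String s : values) {
            out.add(s.length());
        }
    }
}

class LifecycleDemo {
    // Simulates what a single parallel task does with a per-record function.
    static <I, O> List<O> runMap(RecordMapper<I, O> mapper, List<I> partition) {
        List<O> out = new ArrayList<>();
        for (I record : partition) {
            out.add(mapper.map(record));
        }
        return out;
    }

    // Simulates what a single parallel task does with a per-partition function.
    static <I, O> List<O> runMapPartition(PartitionMapper<I, O> mapper, List<I> partition) {
        List<O> out = new ArrayList<>();
        mapper.mapPartition(partition, out);
        return out;
    }

    public static void main(String[] args) {
        List<String> partition = Arrays.asList("a", "bb", "ccc");
        LengthMapper perRecord = new LengthMapper();
        LengthPartitionMapper perPartition = new LengthPartitionMapper();
        System.out.println(runMap(perRecord, partition) + " map() calls: " + perRecord.calls);
        System.out.println(runMapPartition(perPartition, partition)
                + " mapPartition() calls: " + perPartition.calls);
    }
}
```

So in both cases there is a single function object; the difference is only how often the user method is entered and whether it sees one record or the whole partition.
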
>>
>> And then a more philosophical question: all big data frameworks somehow
>> need to manage memory very efficiently (Flink has even thought to reserve
>> a fraction of the entire memory in order to have control over it). Wouldn't
>> it be simpler if Java finally released some APIs (even marked as unsafe,
>> it doesn't change that much) to allow full control of the
>> memory? It would make a lot of sense for all big data platforms (at least
>> for non-UDF code...).
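
Inline: part of this is already possible with the standard library. A rough sketch of keeping values in one preallocated region instead of one object per value, using java.nio.ByteBuffer.allocateDirect; the LongSlab class and its layout are made up for illustration and are not meant to describe how Flink's memory management is implemented:

```java
import java.nio.ByteBuffer;

// Hypothetical fixed-capacity store for long values in one preallocated buffer.
class LongSlab {
    private final ByteBuffer buffer;
    private int size = 0;

    LongSlab(int capacity) {
        // A direct buffer's contents may reside outside the garbage-collected heap.
        this.buffer = ByteBuffer.allocateDirect(capacity * Long.BYTES);
    }

    // Appends a value; returns false when the slab is full instead of growing.
    boolean add(long value) {
        if ((size + 1) * Long.BYTES > buffer.capacity()) {
            return false;
        }
        buffer.putLong(size * Long.BYTES, value);
        size++;
        return true;
    }

    // Reads the value stored at the given position.
    long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return buffer.getLong(index * Long.BYTES);
    }

    int size() {
        return size;
    }
}

class LongSlabDemo {
    public static void main(String[] args) {
        LongSlab slab = new LongSlab(2);
        slab.add(7L);
        slab.add(-1L);
        System.out.println("stored " + slab.size() + " values, full: " + !slab.add(3L));
    }
}
```

The memory budget is fixed up front and no per-value objects are created, which is the kind of control the question is asking about, at the cost of doing the serialization by hand.
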
>>
>> Best,
>> Flavio
>> On 12 Aug 2015 12:44, "Timo Walther" <twal...@apache.org> wrote:
>>
>>> Hello Michael,
>>>
>>> every time you code a Java program you should avoid object creation if
>>> you want an efficient program, because every created object needs to be
>>> garbage collected later (which slows down your program performance).
>>> You can have small Pojos, just try to avoid the call "new" in your
>>> functions:
>>>
>>> Instead of:
>>>
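>>> import org.apache.flink.api.common.functions.MapFunction;
>>>
>>> class Mapper implements MapFunction<String,Pojo> {
>>> // Allocates a new Pojo for every record.
>>> public Pojo map(String s) {
>>>     Pojo p = new Pojo();
>>>     p.f = s;
>>>     return p;
>>> }
>>> }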
>>>
>>> do:
>>>
>>> class Mapper implements MapFunction<String,Pojo> {
>>> // Created once per Mapper instance and reused for every record.
>>> private Pojo p = new Pojo();
>>> public Pojo map(String s) {
>>>     p.f = s;
>>>     return p;
>>> }
>>> }
>>>
>>> Then an object is only created once per Mapper and not per record.
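
Inline: the two variants above can be compared in a self-contained sketch that runs without Flink. SimpleMapFunction, the Pojo class and ReuseDemo are hypothetical stand-ins written only for this example; the sketch counts how many distinct objects each mapper hands out:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for Flink's MapFunction so the sketch compiles on its own.
interface SimpleMapFunction<I, O> {
    O map(I value);
}

// Hypothetical POJO with a single public field, as in the mail above.
class Pojo {
    public String f;
}

// Allocates a fresh Pojo for every record.
class AllocatingMapper implements SimpleMapFunction<String, Pojo> {
    @Override
    public Pojo map(String s) {
        Pojo p = new Pojo();
        p.f = s;
        return p;
    }
}

// Reuses one Pojo for the lifetime of the mapper instance.
class ReusingMapper implements SimpleMapFunction<String, Pojo> {
    private final Pojo p = new Pojo();

    @Override
    public Pojo map(String s) {
        p.f = s;
        return p;
    }
}

class ReuseDemo {
    // Counts distinct object identities returned by the mapper for the given input.
    static int distinctInstances(SimpleMapFunction<String, Pojo> mapper, List<String> input) {
        Set<Pojo> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (String s : input) {
            seen.add(mapper.map(s));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println("allocating: " + distinctInstances(new AllocatingMapper(), input));
        System.out.println("reusing: " + distinctInstances(new ReusingMapper(), input));
    }
}
```

One caveat with the reusing variant: the returned object is overwritten by the next call, so anything that keeps a reference to an earlier result has to copy it first.
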
>>>
>>> Hope this helps.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>>
>>> On 12.08.2015 11:53, Michael Huelfenhaus wrote:
>>>
>>>> Hello
>>>>
>>>> I have a question about the programming of user-defined functions: is
>>>> it still the case, as in the old Stratosphere times, that object creation
>>>> should be avoided at all costs? Because in some of the examples there are
>>>> now Tuples and other objects created before returning them.
>>>>
>>>> I am going to have a streaming plan with at least 6 steps, and I am going
>>>> to use Pojos. Performance-wise, is it a big improvement to define one big
>>>> Pojo that can be used by all the steps, or is it better to have smaller
>>>> ones that send less data but create more objects?
>>>>
>>>> Thanks
>>>> Michael
>>>>
>>>
>>>
