Fwd: If you use Spark 1.5 and disabled Tungsten mode ...

Reynold Xin Tue, 20 Oct 2015 18:11:26 -0700

With Jerry's permission, sending this back to the dev list to close the
loop.



---------- Forwarded message ----------
From: Jerry Lam <[email protected]>
Date: Tue, Oct 20, 2015 at 3:54 PM
Subject: Re: If you use Spark 1.5 and disabled Tungsten mode ...
To: Reynold Xin <[email protected]>


Yup, coarse grained mode works just fine. :)
The difference is that by default, coarse grained mode uses 1 core per
task. If I constrain the job to 20 cores in total, at most 20 tasks can run
at the same time. With fine grained mode, however, I cannot set the total
number of cores, so 200+ tasks could be running at the same time (it is
dynamic). So maybe the calculation of how much memory to acquire fails when
the number of cores is not known ahead of time, because you cannot assume
that X tasks are running in an executor? Just my guess...
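
To make the guess above concrete, here is a rough sketch of the arithmetic.
It assumes a fixed memory pool split evenly across the tasks running at the
same time. That is a simplification for illustration, not Spark's actual
allocator, and the 8 MiB pool size and the function names are made up. Only
the 65536-byte request size comes from the error message in this thread.

```python
# Illustrative only: models an even split of a fixed memory pool across
# concurrently running tasks. This is not Spark's actual allocator.

PAGE_BYTES = 65536  # request size from the error message in this thread


def per_task_share(pool_bytes: int, running_tasks: int) -> int:
    """Bytes each task gets if the pool is divided evenly."""
    if running_tasks < 1:
        raise ValueError("running_tasks must be at least 1")
    return pool_bytes // running_tasks


def can_acquire_page(pool_bytes: int, running_tasks: int) -> bool:
    """True if an even share still covers one PAGE_BYTES request."""
    return per_task_share(pool_bytes, running_tasks) >= PAGE_BYTES


# Hypothetical pool size, chosen only to make the arithmetic visible.
pool = 8 * 1024 * 1024  # 8 MiB

for tasks in (20, 200):
    share = per_task_share(pool, tasks)
    print(tasks, share, can_acquire_page(pool, tasks))
```

With these made-up numbers, 20 tasks each get about 419 KB and the request
fits, while 200 tasks each get about 42 KB and it does not. Whether this is
what actually happens in fine grained mode is still just a guess.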


On Tue, Oct 20, 2015 at 6:24 PM, Reynold Xin <[email protected]> wrote:

> Can you try coarse-grained mode and see if it is the same?
>
>
> On Tue, Oct 20, 2015 at 3:20 PM, Jerry Lam <[email protected]> wrote:
>
>> Hi Reynold,
>>
>> Yes, I'm using 1.5.1. I see them quite often. Sometimes it recovers but
>> sometimes it does not. For one particular job, it failed every time with
>> the acquire-memory issue. I'm using Spark on Mesos with fine grained mode.
>> Does it make a difference?
>>
>> Best Regards,
>>
>> Jerry
>>
>> On Tue, Oct 20, 2015 at 5:27 PM, Reynold Xin <[email protected]> wrote:
>>
>>> Jerry - I think that's been fixed in 1.5.1. Do you still see it?
>>>
>>> On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam <[email protected]> wrote:
>>>
>>>> I disabled it because of the "Could not acquire 65536 bytes of memory"
>>>> error. It ends up failing the job. So for now, I'm not touching it.
>>>>
>>>> On Tue, Oct 20, 2015 at 4:48 PM, charmee <[email protected]> wrote:
>>>>
>>>>> We had disabled tungsten after we found a few performance issues, but
>>>>> had to re-enable it because we found that with a large number of
>>>>> group by fields, the shuffle keeps failing if tungsten is disabled.
>>>>>
>>>>> Here is an excerpt from one of our engineers with his analysis.
>>>>>
>>>>> With Tungsten Enabled (default in spark 1.5):
>>>>> ~90 files of 0.5G each:
>>>>>
>>>>> Ingest (after applying broadcast lookups) : 54 min
>>>>> Aggregation (~30 fields in group by and another 40 in aggregation) : 18 min
>>>>>
>>>>> With Tungsten Disabled:
>>>>>
>>>>> Ingest : 30 min
>>>>> Aggregation : Erroring out
>>>>>
>>>>> On smaller tests we found that joins are slow with tungsten enabled.
>>>>> With GROUP BY, the job does not work at all when tungsten is disabled.
>>>>>
>>>>> Hope this helps.
>>>>>
>>>>> -Charmee
>>>>>
>>>>>
>>>>
>>>
>>
>
