+1 to a better default as well.

We were working fine until we ran against a real dataset that was much
larger than the test dataset we were using locally. It took me a couple
of days and digging through many logs to figure out that this value was
what was causing the problem.
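
For anyone else who hits this, here's a rough sketch of where the
current default comes from, based on my reading of the Spark 1.x YARN
code (the constant names and values below are from what I understand of
YarnSparkHadoopUtil; treat them as assumptions for your version):

    // default overhead = max(7% of executor memory, 384 MB)
    val MEMORY_OVERHEAD_FACTOR = 0.07
    val MEMORY_OVERHEAD_MIN = 384 // MB
    def defaultOverheadMb(executorMemoryMb: Int): Int =
      math.max((MEMORY_OVERHEAD_FACTOR * executorMemoryMb).toInt,
               MEMORY_OVERHEAD_MIN)
    // e.g. an 8 GB executor: max((0.07 * 8192).toInt, 384) = 573 MB

Until the default changes, the workaround is to set the value (in MB)
explicitly at submit time, e.g. roughly 10% of an 8 GB executor:

    spark-submit --conf spark.yarn.executor.memoryOverhead=819 ...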

On Sat, Feb 28, 2015 at 11:38 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Having good out-of-box experience is desirable.
>
> +1 on increasing the default.
>
>
> On Sat, Feb 28, 2015 at 8:27 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> There was a recent discussion about whether to increase this kind of
>> default fraction, or indeed make it configurable. I believe the
>> suggestion there too was that 9-10% is a safer default.
>>
>> Advanced users can lower the resulting overhead value; it may still
>> have to be increased in some cases, but a fatter default may make this
>> kind of surprise less frequent.
>>
>> I'd support increasing the default; any other thoughts?
>>
>> On Sat, Feb 28, 2015 at 3:34 PM, Koert Kuipers <ko...@tresata.com> wrote:
>> > hey,
>> > running my first map-red-like (meaning disk-to-disk, avoiding
>> > in-memory RDDs) computation in spark on yarn, i immediately got
>> > bitten by a too-low spark.yarn.executor.memoryOverhead. however, it
>> > took me about an hour to find out this was the cause. at first i
>> > observed failing shuffles leading to restarting of tasks, then i
>> > realized this was because executors could not be reached, then i
>> > noticed containers got shut down and reallocated in resourcemanager
>> > logs (no mention of errors; it seemed the containers finished their
>> > business and shut down successfully), and finally i found the
>> > reason in nodemanager logs.
>> >
>> > i don't think this is a pleasant first experience. i realize
>> > spark.yarn.executor.memoryOverhead needs to be set differently from
>> > situation to situation, but shouldn't the default be a somewhat
>> > higher value so that these errors are unlikely, and then the
>> > experts who are willing to deal with these errors can tune it
>> > lower? so why not make the default 10% instead of 7%? that gives
>> > something that works in most situations out of the box (at the cost
>> > of being a little wasteful). it worked for me.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>
