Hello,

In some of our Spark applications, when writing output to HDFS we hit the
following error about the Spark YARN executor memory overhead:

WARN yarn.YarnAllocator: Container killed by YARN for exceeding memory
limits. 3.0 GB of 3 GB physical memory used. Consider boosting
spark.yarn.executor.memoryOverhead.

When we set a higher overhead everything works correctly, but we'd like to
know what is actually in this overhead so we can fix or change our code
accordingly.
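
For reference, this is roughly how we raise it today (app name and values
are illustrative, a spark-shell style sketch rather than our real job; the
same setting can of course be passed with --conf on spark-submit):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("our-app") // illustrative name
  .set("spark.executor.memory", "2g") // executor heap
  // off-heap overhead in MB; the default is 10% of executor memory, with a 384 MB minimum
  .set("spark.yarn.executor.memoryOverhead", "1024")

val sc = new SparkContext(conf)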

The documentation states that it covers VM overheads, interned strings and
other native overheads, but that is really vague.

We considered two possible culprits:
- interned strings, but since Java 7 they live on the heap and should no
longer count towards the overhead;
- the buffers used when writing to HDFS, since we use Hadoop MultipleOutputs
(see the sketch right after this list).
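
To give an idea of the write path: our real code goes through Hadoop's
MultipleOutputs, but the minimal sketch below uses the old mapred
MultipleTextOutputFormat instead to keep it short; class names, keys and
paths are made up. The point is the same one-file-per-key fan-out, where
every open record writer holds its own stream buffer outside the heap:

import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

// One physical file per key: every open RecordWriter keeps its own buffer.
class KeyBasedOutputFormat extends MultipleTextOutputFormat[Any, Any] {
  // drop the key from the written line
  override def generateActualKey(key: Any, value: Any): Any = NullWritable.get()
  // route each record to a directory named after its key
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    key.toString + "/" + name
}

object MultiOutputSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("multi-output-sketch"))
    val records = sc.parallelize(Seq(("2016-01-01", "a"), ("2016-01-02", "b")))
    // Every distinct key seen by a task opens another HDFS output stream,
    // so a task writing many keys holds many write buffers at once.
    records.saveAsHadoopFile(
      "hdfs:///tmp/multi-output-sketch",
      classOf[String],
      classOf[String],
      classOf[KeyBasedOutputFormat])
    sc.stop()
  }
}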

So if someone could enlighten us about what exactly goes into this overhead,
it would be very helpful.

Regards,

-- 


*Olivier Devoisin*
Data Infrastructure Engineer
olivier.devoi...@contentsquare.com
http://www.contentsquare.com
50 Avenue Montaigne - 75008 Paris
