On 11 Mar 2016, at 23:01, Alexander Pivovarov <[email protected]> wrote:
> Forgot to mention. To avoid unnecessary container termination, add the following setting to YARN:
>
>   yarn.nodemanager.vmem-check-enabled = false

That can kill performance on a shared cluster: if your container code starts to swap, the performance of everything suffers. A good ops team will decline such a request on a multi-tenant cluster.

In such a cluster: ask for the amount of memory you think you actually need, and let the scheduler find space for it. This not only stops you killing cluster performance, it means that on a busy cluster you get the same memory and CPU as you would on an idle one, so more consistent workloads (and nobody else swapping your code out).

Regarding the numbers, people need to remember that if they are running Python work in the cluster, they need to include more headroom.

If you are going to turn off memory monitoring, have a play with yarn.nodemanager.pmem-check-enabled=false too.
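
For anyone who does go down the "turn the checks off" route, and for the preferred route of simply requesting enough memory up front, the settings look roughly like this. A sketch only: the yarn-site.xml fragment shows the two NodeManager flags named above, and the spark-submit line is an illustrative Spark-on-YARN example where the 4g executor size, the 1024 MB spark.yarn.executor.memoryOverhead, and my_job.py are made-up values to show where the extra Python headroom goes, not recommendations.

  <!-- yarn-site.xml: only if you really must disable the checks -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>

  # Better: size the request properly and leave headroom for PySpark workers
  spark-submit \
    --master yarn \
    --executor-memory 4g \
    --conf spark.yarn.executor.memoryOverhead=1024 \
    my_job.py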
