Hi Yuichiro,

The way to avoid this is to boost spark.yarn.executor.memoryOverhead until
the executors have enough off-heap headroom to stay under their YARN limits.
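
For example (a sketch, not an exact command): the overhead is specified in
megabytes, and something like 2 GB on top of a 16 GB executor heap is a
reasonable starting point to tune from; "your-app.jar" is a placeholder:

  spark-submit \
    --master yarn-cluster \
    --conf spark.executor.memory=16g \
    --conf spark.yarn.executor.memoryOverhead=2048 \
    your-app.jar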

-Sandy

On Tue, Mar 24, 2015 at 11:49 AM, Yuichiro Sakamoto <ks...@muc.biglobe.ne.jp> wrote:

> Hello.
>
> We use ALS (collaborative filtering) from Spark MLlib on YARN.
> The Spark version is 1.2.0, bundled with CDH 5.3.1.
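>
> For reference, a minimal sketch of this kind of MLlib ALS call in the
> Scala API (the input path and the rank/iteration/lambda values are
> illustrative):
>
>   import org.apache.spark.mllib.recommendation.{ALS, Rating}
>
>   // Parse "user,item,rating" lines into MLlib Rating objects.
>   val ratings = sc.textFile("hdfs:///path/to/ratings").map { line =>
>     val Array(user, item, rating) = line.split(",")
>     Rating(user.toInt, item.toInt, rating.toDouble)
>   }
>   // Train the factorization model: rank = 10, 10 iterations, lambda = 0.01.
>   val model = ALS.train(ratings, 10, 10, 0.01)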
>
> 1,000,000,000 records (5,000,000 users and 5,000,000 items) are used for
> training with ALS.
> This large amount of data increases virtual memory usage, and the YARN
> NodeManager kills the Spark executor process. Spark relaunches the
> executor after it is killed, but the replacement is killed again, and
> eventually the whole Spark job is terminated.
>
> # It seems the Spark executor is killed because virtual memory usage,
> # increased by shuffle or disk writes, exceeds YARN's threshold.
>
> To avoid this, we currently set 'yarn.nodemanager.vmem-check-enabled' to
> false (see the snippet below), and the job then completes successfully,
> but this does not seem to be the appropriate solution.
> If you know a better way to tune Spark, please let me know.
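>
> For reference, this workaround uses the standard YARN property, set in
> yarn-site.xml on each NodeManager:
>
>   <property>
>     <name>yarn.nodemanager.vmem-check-enabled</name>
>     <value>false</value>
>   </property>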
>
> Our machine specs and Spark settings are as follows:
> 1) Six machines, each with 32 GB of physical memory.
> 2) Spark settings:
> - spark.executor.memory=16g
> - spark.closure.serializer=org.apache.spark.serializer.KryoSerializer
> - spark.rdd.compress=true
> - spark.shuffle.memoryFraction=0.4
>
> Thanks,
> Yuichiro Sakamoto
>
