Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-28 Thread Gourav Sengupta
Hi Michael, I think that is what I am trying to show here, since the documentation mentions: "NOTE: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos) or LOCAL_DIRS (YARN) environment variables set by the cluster manager." So, in a way I am supporting your
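To illustrate the quoted note: on a standalone or Mesos cluster the scratch location would typically be set through spark-env.sh rather than spark.local.dir (the path below is only a placeholder):

    # conf/spark-env.sh (standalone/Mesos); this overrides spark.local.dir
    export SPARK_LOCAL_DIRS=/data/spark-scratch

Under YARN, the corresponding LOCAL_DIRS variable is set by the cluster manager itself, which is exactly why spark.local.dir has no effect on executors there.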

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-28 Thread Michael Shtelma
Hi, in YARN mode this property is used only by the driver. Executors will use the properties coming from YARN for storing temporary files. Best, Michael On Wed, Mar 28, 2018 at 7:37 AM, Gourav Sengupta wrote: > Hi, > > > As per documentation in:
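A sketch of where those YARN-provided directories come from, assuming a stock Hadoop setup (the paths are placeholders): the NodeManager hands containers the locations configured via yarn.nodemanager.local-dirs in yarn-site.xml, which executors then see as LOCAL_DIRS:

    <!-- yarn-site.xml on each NodeManager host -->
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/data1/yarn/local,/data2/yarn/local</value>
    </property>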

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-27 Thread Gourav Sengupta
Hi, as per the documentation at https://spark.apache.org/docs/latest/configuration.html: spark.local.dir (default: /tmp): Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a
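For a non-YARN deployment, setting this property might look like the following (the path is a placeholder):

    # conf/spark-defaults.conf
    spark.local.dir  /data/spark-scratch

    # or per application:
    spark-submit --conf spark.local.dir=/data/spark-scratch ...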

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-26 Thread Michael Shtelma
Hi Keith, Thanks for the suggestion! I have solved this already. The problem was that the YARN process was not responding to start/stop commands and had not applied my configuration changes. I killed it and restarted my cluster, and after that YARN started using
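For reference, on a plain Apache Hadoop install the restart described here would look roughly like this; the exact commands vary by distribution, so treat this as a sketch:

    # restart the YARN daemons so the new local-dir configuration is picked up
    $HADOOP_HOME/sbin/stop-yarn.sh
    $HADOOP_HOME/sbin/start-yarn.sh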

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-26 Thread Keith Chapman
Hi Michael, sorry for the late reply. I guess you may have to set it through the Hadoop core-site.xml file. The property you need to set is "hadoop.tmp.dir", which defaults to "/tmp/hadoop-${user.name}". Regards, Keith. http://keith-chapman.com On Mon, Mar 19, 2018 at 1:05 PM, Michael Shtelma
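Spelled out, that suggestion would go into core-site.xml like this (the value is a placeholder):

    <!-- core-site.xml -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/data/hadoop-tmp/${user.name}</value>
    </property>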

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-19 Thread Michael Shtelma
Hi Keith, Thank you for the idea! I have tried it, and now the executor command looks like this: /bin/bash -c /usr/java/latest//bin/java -server -Xmx51200m '-Djava.io.tmpdir=my_prefered_path'
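One way to verify that the flag actually reached the executor JVMs is to inspect the running processes on a worker node (illustrative only):

    # on a NodeManager host: show executor JVMs carrying the tmpdir flag
    ps -ef | grep -- '-Djava.io.tmpdir'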

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-19 Thread Keith Chapman
Can you try setting spark.executor.extraJavaOptions to include -Djava.io.tmpdir=someValue? Regards, Keith. http://keith-chapman.com On Mon, Mar 19, 2018 at 10:29 AM, Michael Shtelma wrote: > Hi Keith, > > Thank you for your answer! > I have done this, and it is working for
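On the command line, that suggestion would look something like this (the path is a placeholder):

    spark-submit \
      --conf 'spark.executor.extraJavaOptions=-Djava.io.tmpdir=/data/spark-tmp' \
      ...

Note that java.io.tmpdir covers the JVM's temporary files; the blockmgr-* directories from the subject line live under spark.local.dir (or, on YARN, the directories in LOCAL_DIRS).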

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-19 Thread Michael Shtelma
Hi Keith, Thank you for your answer! I have done this, and it is working for the Spark driver. I would like to do the same for the executors as well, so that the setting is used on all the nodes where I have executors running. Best, Michael On Mon, Mar 19, 2018 at 6:07 PM, Keith
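For completeness, the driver-side setting referred to here can be passed like this (a sketch; the path is a placeholder):

    spark-submit \
      --driver-java-options '-Djava.io.tmpdir=/data/spark-tmp' \
      ...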

Re: Running out of space on /tmp file system while running spark job on yarn because of size of blockmgr folder

2018-03-19 Thread Keith Chapman
Hi Michael, You could either set spark.local.dir through the Spark conf or set the java.io.tmpdir system property. Regards, Keith. http://keith-chapman.com On Mon, Mar 19, 2018 at 9:59 AM, Michael Shtelma wrote: > Hi everybody, > > I am running a Spark job on YARN, and my problem is
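Since the symptom is blockmgr-* folders filling /tmp, a quick way to confirm which directory Spark is actually writing to is to look for them directly (illustrative):

    # blockmgr-<uuid> and spark-<uuid> directories are created under the active local dir
    ls -d /tmp/blockmgr-* /tmp/spark-*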