I have Spark running on YARN.

I set yarn.nodemanager.local-dirs to /data01/yarn/nm,/data02/yarn/nm.
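For reference, this is the sort of yarn-site.xml entry I mean (a sketch of the setting described above):

```xml
<!-- yarn-site.xml: comma-separated list of NodeManager local directories -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data01/yarn/nm,/data02/yarn/nm</value>
</property>
```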

When I look at the YARN executor container logs, I see that the BlockManager files
are indeed created under /data01/yarn/nm and /data02/yarn/nm.

But the output files to be uploaded to S3 are still created in /tmp on the slaves.

I do not want Spark to write heavy files to /tmp, because /tmp is only 5 GB.

The Spark slaves have two big additional disks, /data01 and /data02, attached.

Probably I can set spark.local.dir to /data01/tmp,/data02/tmp.

But the Spark master also writes some files to spark.local.dir,
and my master box has only one additional disk, /data01.

So, which should I use for spark.local.dir:

spark.local.dir=/data01/tmp

or

spark.local.dir=/data01/tmp,/data02/tmp

?
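If it helps, this is the sort of spark-defaults.conf entry I have in mind (a sketch; it assumes /data01/tmp and /data02/tmp already exist and are writable by the Spark/YARN user on every node where the setting takes effect):

```
# spark-defaults.conf -- candidate setting (sketch)
# Spark spreads scratch files (shuffle output, spills) across all listed dirs.
spark.local.dir    /data01/tmp,/data02/tmp
```

As far as I understand from the Spark configuration docs, on YARN this setting is overridden for executors by the LOCAL_DIRS environment variable that the NodeManager sets from yarn.nodemanager.local-dirs, so it would mainly affect processes not launched by YARN (e.g. the master/driver box).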
