Thanks, Shao. :-) I am wondering whether Spark will rebalance the storage overhead at runtime… since there is still some available space on the other nodes.
Best,
Yifan LI

> On 06 May 2015, at 14:57, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
> I think you could configure multiple disks through spark.local.dir; the
> default is /tmp. Even so, if your intermediate data is larger than the
> available disk space, you will still hit this issue.
>
> spark.local.dir    /tmp    Directory to use for "scratch" space in Spark,
> including map output files and RDDs that get stored on disk. This should be
> on a fast, local disk in your system. It can also be a comma-separated list
> of multiple directories on different disks. NOTE: In Spark 1.0 and later this
> will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos) or LOCAL_DIRS
> (YARN) environment variables set by the cluster manager.
>
> 2015-05-06 20:35 GMT+08:00 Yifan LI <iamyifa...@gmail.com>:
> Hi,
>
> I am running my GraphX application on Spark, but it failed because of an
> error on one executor node (on which the available HDFS space is small):
> "no space left on device".
>
> I can understand why it happened: my vertex(-attribute) RDD was growing
> bigger and bigger during the computation, so at some point the request on
> that node may have been larger than the available space.
>
> But is there any way to avoid this kind of error? I am sure that the overall
> disk space of all nodes is enough for my application.
>
> Thanks in advance!
>
> Best,
> Yifan LI
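P.S. Based on your suggestion, here is the minimal sketch I plan to try
(the mount points /mnt/disk1 and /mnt/disk2 are just placeholders for my
cluster's actual disks):

    # conf/spark-defaults.conf
    spark.local.dir    /mnt/disk1/spark-tmp,/mnt/disk2/spark-tmp

or, equivalently, when constructing the SparkConf in the driver:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("graphx-app")
      // Comma-separated list: scratch data (shuffle map output files,
      // RDD blocks spilled to disk) is spread across all listed directories.
      .set("spark.local.dir", "/mnt/disk1/spark-tmp,/mnt/disk2/spark-tmp")
    val sc = new SparkContext(conf)

If I understand the NOTE you quoted correctly, on Standalone/Mesos this
setting is overridden by SPARK_LOCAL_DIRS and on YARN by LOCAL_DIRS, so in
those deployments I would set the corresponding environment variable on each
worker instead.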