[ https://issues.apache.org/jira/browse/SPARK-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-5418.
------------------------------
    Resolution: Duplicate

> Output directory for shuffle should consider left space of each directory set in conf
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-5418
>                 URL: https://issues.apache.org/jira/browse/SPARK-5418
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 1.2.0
>         Environment: Ubuntu, others should be similar
>            Reporter: ding
>            Priority: Minor
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> I set multiple directories in the conf spark.local.dir as "scratch" space. One
> of them (e.g. /mnt/disk1) has 30G of free space, while the others (e.g.
> /mnt/disk2) have 100G. In the current version, Spark uses a hash to decide
> which directory is used for "scratch" space, so each directory has the same
> chance of being chosen. After hundreds of iterations of PageRank, a "No space
> left" exception is thrown and the driver crashes. This does not make sense,
> since there is still 70G+ of free space in the other directories. We should
> take the free space on each directory into account when deciding which
> directory should be the map output dir. I will send a PR for this.
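
To illustrate the difference the report describes, here is a minimal sketch (not Spark's actual code, and not the proposed PR) contrasting a hash-based directory choice with one weighted by free space. The function names `pick_by_hash`, `pick_by_free_space`, and `free_bytes` are hypothetical, `zlib.crc32` merely stands in for whatever hash Spark applies to the file name, and the weighting scheme is one assumed way to "consider" free space.

```python
import random
import shutil
import zlib
from typing import Callable, Optional, Sequence


def pick_by_hash(filename: str, dirs: Sequence[str]) -> str:
    """Hash-based choice: every directory is equally likely,
    no matter how much free space it has (the behavior the report describes)."""
    return dirs[zlib.crc32(filename.encode("utf-8")) % len(dirs)]


def free_bytes(path: str) -> int:
    """Free space on the filesystem that holds `path`, in bytes."""
    return shutil.disk_usage(path).free


def pick_by_free_space(
    dirs: Sequence[str],
    required_bytes: int = 0,
    free_fn: Callable[[str], int] = free_bytes,
    rng: Optional[random.Random] = None,
) -> str:
    """Assumed alternative: skip directories that cannot hold the output,
    then choose among the rest with probability proportional to free space."""
    rng = rng or random.Random()
    free = {d: free_fn(d) for d in dirs}
    candidates = [d for d in dirs if free[d] > required_bytes]
    if not candidates:
        raise OSError("No space left in any configured local directory")
    weights = [free[d] for d in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With the figures from the report (30G on /mnt/disk1, 100G on /mnt/disk2), the weighted variant would send most new files to /mnt/disk2 and would stop using /mnt/disk1 once it can no longer hold the requested size, whereas the hash-based variant keeps splitting files evenly between the two.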