Hello,

I've seen a couple of emails on this issue but could not find anything that 
solves my situation.

I tried reducing the partitioning level, enabling consolidateFiles, and 
increasing the sizeInFlight limit, but nothing helped. The shuffle manager is 
sort, which is the default. Any advice?

15/07/29 10:37:01 WARN TaskSetManager: Lost task 34.0 in stage 11.0 (TID 331, 
localhost): FetchFailed(BlockManagerId(driver, localhost, 43437), shuffleId=3, 
mapId=0, reduceId=34, message=
org.apache.spark.shuffle.FetchFailedException: 
/tmp/spark-71109b28-0f89-4e07-a521-5ff0a943472a/blockmgr-eda0751d-fd21-4229-93b0-2ee2546edf5a/0d/shuffle_3_0_0.index
 (Too many open files)
..
..
15/07/29 10:37:01 INFO Executor: Executor is trying to kill task 9.0 in stage 
11.0 (TID 306)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 20 in 
stage 11.0 failed 1 times, most recent failure: Lost task 20.0 in stage 11.0 
(TID 317, localhost): java.io.FileNotFoundException: 
/tmp/spark-71109b28-0f89-4e07-a521-5ff0a943472a/blockmgr-eda0751d-fd21-4229-93b0-2ee2546edf5a/1b/temp_shuffle_a3a9815a-677a-4342-94a2-1e083d758bcc
 (Too many open files)

My fs is ext4 and ulimit -n is currently 1024.
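
In case it helps anyone reproduce the check, here is a minimal sketch of reading the per-process open-file limit programmatically. It uses Python's standard resource module (Unix only); the function name open_file_limits is just an illustrative choice. It reports the limit of the Python process itself, which is normally inherited from the launching shell, so the JVM running Spark may see a different value if it is started some other way.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    # RLIMIT_NOFILE is the maximum number of open file descriptors;
    # the soft value is the one that `ulimit -n` shows in a shell.
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit:", soft)
    print("hard limit:", hard)
```

The soft limit is what triggers "Too many open files"; it can be raised up to the hard limit without extra privileges.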

Thanks
Saif
