Thank you both, I will take a look, but

1.       For high-shuffle tasks, is it expected for the system to need the limit 
and thresholds set this high? I hope there are no bad consequences.

2.       I will try to work around the lack of admin access and see if I can 
change this with only user rights.

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Wednesday, July 29, 2015 12:59 PM
To: Ellafi, Saif A.
Cc: <user@spark.apache.org>
Subject: Re: Too many open files

Please increase limit for open files:

http://stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux
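For reference, a quick way to confirm what limit the Spark JVM actually sees once 
you have raised it (a sketch; it relies on the HotSpot-specific 
com.sun.management.UnixOperatingSystemMXBean, so it may not be available on every 
JVM):

    import java.lang.management.ManagementFactory
    import com.sun.management.UnixOperatingSystemMXBean

    // HotSpot-specific bean; reports this JVM's file-descriptor usage and limit.
    ManagementFactory.getOperatingSystemMXBean match {
      case os: UnixOperatingSystemMXBean =>
        println(s"open file descriptors: ${os.getOpenFileDescriptorCount}")
        // should be well above 1024 after raising ulimit -n
        println(s"max file descriptors:  ${os.getMaxFileDescriptorCount}")
      case _ =>
        println("file-descriptor counts not available on this JVM/OS")
    }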


On Jul 29, 2015, at 8:39 AM, <saif.a.ell...@wellsfargo.com> wrote:
Hello,

I’ve seen a couple of emails on this issue but could not find anything that solves 
my situation.

I tried reducing the partitioning level, enabling consolidateFiles, and increasing 
the sizeInFlight limit, but it still did not help. The spill manager is sort, which 
is the default. Any advice?

15/07/29 10:37:01 WARN TaskSetManager: Lost task 34.0 in stage 11.0 (TID 331, 
localhost): FetchFailed(BlockManagerId(driver, localhost, 43437), shuffleId=3, 
mapId=0, reduceId=34, message=
org.apache.spark.shuffle.FetchFailedException: 
/tmp/spark-71109b28-0f89-4e07-a521-5ff0a943472a/blockmgr-eda0751d-fd21-4229-93b0-2ee2546edf5a/0d/shuffle_3_0_0.index
 (Too many open files)
..
..
15/07/29 10:37:01 INFO Executor: Executor is trying to kill task 9.0 in stage 
11.0 (TID 306)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 20 in 
stage 11.0 failed 1 times, most recent failure: Lost task 20.0 in stage 11.0 
(TID 317, localhost): java.io.FileNotFoundException: 
/tmp/spark-71109b28-0f89-4e07-a521-5ff0a943472a/blockmgr-eda0751d-fd21-4229-93b0-2ee2546edf5a/1b/temp_shuffle_a3a9815a-677a-4342-94a2-1e083d758bcc
 (Too many open files)

My filesystem is ext4 and currently ulimit -n is 1024.

Thanks
Saif
