You should probably increase the open-file (file handle) limit for the user
that all the Spark processes (master & workers) run as,
e.g.
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
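
For example, on most Linux systems you can raise the limit persistently in
/etc/security/limits.conf (assuming here the processes run as a user called
"spark"; 65536 is only an illustration, anything comfortably above your
current 1024 should do):

    spark    soft    nofile    65536
    spark    hard    nofile    65536

Log out and back in (or restart the Spark daemons), then check the new limit
from that user's shell with:

    ulimit -n

You can also bump it just for the current shell with "ulimit -n 65536" before
starting the master/workers, if the hard limit allows it.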

On 29 July 2015 at 18:39, <saif.a.ell...@wellsfargo.com> wrote:

>  Hello,
>
> I’ve seen a couple of emails on this issue but could not find anything that
> solves my situation.
>
> I tried reducing the partitioning level, enabling consolidateFiles and
> increasing the sizeInFlight limit, but still no help. The shuffle manager is
> sort, which is the default; any advice?
>
> 15/07/29 10:37:01 WARN TaskSetManager: Lost task 34.0 in stage 11.0 (TID
> 331, localhost): FetchFailed(BlockManagerId(driver, localhost, 43437),
> shuffleId=3, mapId=0, reduceId=34, message=
> org.apache.spark.shuffle.FetchFailedException:
> /tmp/spark-71109b28-0f89-4e07-a521-5ff0a943472a/blockmgr-eda0751d-fd21-4229-93b0-2ee2546edf5a/0d/shuffle_3_0_0.index
> (Too many open files)
> ..
> ..
> 15/07/29 10:37:01 INFO Executor: Executor is trying to kill task 9.0 in
> stage 11.0 (TID 306)
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 20
> in stage 11.0 failed 1 times, most recent failure: Lost task 20.0 in stage
> 11.0 (TID 317, localhost): java.io.FileNotFoundException:
> /tmp/spark-71109b28-0f89-4e07-a521-5ff0a943472a/blockmgr-eda0751d-fd21-4229-93b0-2ee2546edf5a/1b/temp_shuffle_a3a9815a-677a-4342-94a2-1e083d758bcc
> (Too many open files)
>
> My filesystem is ext4 and currently ulimit -n is 1024.
>
> Thanks
> Saif
>
>

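For reference, the tweaks mentioned in the quoted message are normally set in
spark-defaults.conf (or via SparkConf); the property names below are from the
Spark 1.x docs and the values are only illustrative:

    spark.shuffle.consolidateFiles   true
    spark.reducer.maxSizeInFlight    96m

Note that consolidateFiles only takes effect with the hash shuffle manager;
with the default sort manager the number of open files is driven mostly by the
partition count and the number of concurrent tasks, which is why raising the
per-user file handle limit tends to be the more effective fix here.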