No, I don't. I ran all Spark processes as a user with ulimit = unlimited.

From: Mayur Rustagi <mayur.rust...@gmail.com>
Reply-To: "user@spark.incubator.apache.org" <user@spark.incubator.apache.org>
Date: Thursday, February 13, 2014 12:34 PM
To: "user@spark.incubator.apache.org" <user@spark.incubator.apache.org>
Subject: [External] Re: Too many open files
The limit could be on any of the machines (including the master). Do you have Ganglia set up?

Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi

On Thu, Feb 13, 2014 at 7:13 AM, Korb, Michael [USA] <korb_mich...@bah.com> wrote:

Hi,

When I submit a job to a cluster, I get a large string of errors like this:

WARN TaskSetManager: Loss was due to java.io.FileNotFoundException
java.io.FileNotFoundException: /tmp/spark-local* (Too many open files)

All the answers I can find say to increase ulimit, but I have it set to unlimited (for the user running the Spark daemons as well as the user submitting the job) and am still getting the error. I'm creating an RDD like this: sc.textFile("/path/to/files/*").persist(StorageLevel.MEMORY_ONLY_SER()), and then running a series of maps and filters on the data. There are about 2k files totaling about 230 GB, and my current cluster is 3 nodes with 32 cores each, with spark.executor.memory set to 32g. I've tried different StorageLevel settings but still hit the same error. Interestingly, the job works if I write and submit it with PySpark, but I want to get it working in Java.

Thanks,
Mike
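For reference, here is a minimal Java sketch of the pattern described above (a glob over many input files, persisted with MEMORY_ONLY_SER, followed by maps and filters). The master URL, app name, input path, and the particular map/filter functions are placeholders rather than the original job:

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.storage.StorageLevel;

    public class OpenFilesRepro {
        public static void main(String[] args) {
            // Placeholder master URL and app name; the real job runs on a
            // 3-node standalone cluster as described above.
            JavaSparkContext sc = new JavaSparkContext("spark://master:7077", "open-files-repro");

            // Glob over ~2k input files, persisted as serialized in-memory blocks.
            // The path is hypothetical.
            JavaRDD<String> lines = sc.textFile("/path/to/files/*")
                    .persist(StorageLevel.MEMORY_ONLY_SER());

            // Stand-ins for the "series of maps and filters" mentioned in the post.
            JavaRDD<String> trimmed = lines.map(new Function<String, String>() {
                public String call(String s) { return s.trim(); }
            });
            JavaRDD<String> nonEmpty = trimmed.filter(new Function<String, Boolean>() {
                public Boolean call(String s) { return !s.isEmpty(); }
            });

            System.out.println("Non-empty lines: " + nonEmpty.count());
            sc.stop();
        }
    }

Note that the file-handle limit that matters is the one in effect for the worker and executor JVM processes on each node, which is not necessarily the same as what ulimit -n reports in an interactive shell for the submitting user.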
