Usually you have to restart the machines (or log in again) for the ulimit
change to take effect, since /etc/security/limits.conf is only applied to new
login sessions; processes started before the change keep their old limit.
What operation are you running? Are you repartitioning into too many
partitions?
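
If it helps, here is a minimal sketch you can paste into spark-shell (the
input path and the target partition count are placeholders, not from your
job). It first asks each executor which nofile limit its JVM actually sees,
and then shrinks the partition count so fewer temp_shuffle_* files are open
at the same time:

import scala.sys.process._

// Ask each executor what "ulimit -n" reports in its shell. If this still
// prints 1024, the new limit never reached the worker processes and the
// workers need a restart / fresh login session.
val limits = sc.parallelize(1 to sc.defaultParallelism, sc.defaultParallelism)
  .map(_ => Seq("sh", "-c", "ulimit -n").!!.trim)
  .distinct()
  .collect()
println("nofile limit seen on executors: " + limits.mkString(", "))

// Fewer partitions means fewer shuffle temp files open at once.
// coalesce shrinks the partition count without triggering a full shuffle.
val rdd = sc.textFile("hdfs:///some/input")  // placeholder path
println("partitions before: " + rdd.partitions.length)
val smaller = rdd.coalesce(200)              // placeholder target count
println("partitions after: " + smaller.partitions.length)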

Thanks
Best Regards

On Mon, Mar 30, 2015 at 4:52 PM, Masf <masfwo...@gmail.com> wrote:

> Hi
>
> I have a problem with temp data in Spark. I have set
> spark.shuffle.manager to "SORT". In /etc/security/limits.conf I set the
> following values:
> *               soft    nofile  1000000
> *               hard    nofile  1000000
> In spark-env.sh I set ulimit -n 1000000
> I've restarted the Spark service and it keeps crashing (Too many open
> files).
>
> How can I resolve this? I'm running Spark 1.2.0 on Cloudera 5.3.2.
>
> java.io.FileNotFoundException:
> /tmp/spark-local-20150330115312-37a7/2f/temp_shuffle_c4ba5bce-c516-4a2a-9e40-56121eb84a8c
> (Too many open files)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>         at
> org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
>         at
> org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:360)
>         at
> org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:355)
>         at scala.Array$.fill(Array.scala:267)
>         at
> org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:355)
>         at
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
>         at
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>         at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 15/03/30 11:54:18 WARN TaskSetManager: Lost task 22.0 in stage 3.0 (TID
> 27, localhost): java.io.FileNotFoundException:
> /tmp/spark-local-20150330115312-37a7/2f/temp_shuffle_c4ba5bce-c516-4a2a-9e40-56121eb84a8c
> (Too many open files)
>         ... (same stack trace as above)
>
>
>
> Thanks!
> --
>
>
> Regards.
> Miguel Ángel
>
