You could also try raising the `soft` `nofile` limit in /etc/security/limits.conf to a very high value, if you haven't done so already.
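A minimal sketch of the relevant limits.conf entries (the values are illustrative; the hard limit must be at least as high as the soft one):

    # /etc/security/limits.conf -- raise the open-file limit for all users
    *    soft    nofile    1048576
    *    hard    nofile    1048576

Note that pam_limits applies these at login, so you may need to log out and back in (or restart the worker processes) before the new limits take effect.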
On Fri, Apr 3, 2015 at 2:09 AM Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Did you try these?
>
> - Disable shuffle spilling: spark.shuffle.spill=false
> - Enable log rotation:
>
>     sparkConf.set("spark.executor.logs.rolling.strategy", "size")
>       .set("spark.executor.logs.rolling.size.maxBytes", "1024")
>       .set("spark.executor.logs.rolling.maxRetainedFiles", "3")
>
> Thanks
> Best Regards
>
> On Fri, Apr 3, 2015 at 9:09 AM, a mesar <amesa...@gmail.com> wrote:
>
>> Yes, even with spark.cleaner.ttl set there is no cleanup. We pass
>> --properties-file spark-dev.conf to spark-submit, where spark-dev.conf
>> contains:
>>
>>     spark.master                    spark://10.250.241.66:7077
>>     spark.logConf                   true
>>     spark.cleaner.ttl               1800
>>     spark.executor.memory           10709m
>>     spark.cores.max                 4
>>     spark.shuffle.consolidateFiles  true
>>
>> On Thu, Apr 2, 2015 at 7:12 PM, Tathagata Das <t...@databricks.com> wrote:
>>
>>> Are you saying that even with spark.cleaner.ttl set, your files are
>>> not getting cleaned up?
>>>
>>> TD
>>>
>>> On Thu, Apr 2, 2015 at 8:23 AM, andrem <amesa...@gmail.com> wrote:
>>>
>>>> Apparently Spark Streaming 1.3.0 is not cleaning up its internal files,
>>>> and the worker nodes eventually run out of inodes. We see tons of old
>>>> shuffle_*.data and *.index files that are never deleted. How do we get
>>>> Spark to remove these files?
>>>>
>>>> We have a simple standalone app with one RabbitMQ receiver and a
>>>> two-node cluster (2 x r3.large AWS instances). The batch interval is
>>>> 10 minutes, after which we process the data and write the results to a
>>>> DB. No windowing or state management is used.
>>>>
>>>> I've pored over the documentation and tried setting the following
>>>> properties, but they have not helped. As a workaround we're using a
>>>> cron script that periodically cleans up old files, but this has a bad
>>>> smell to it.
>>>>
>>>> SPARK_WORKER_OPTS in spark-env.sh on every worker node:
>>>>   spark.worker.cleanup.enabled true
>>>>   spark.worker.cleanup.interval
>>>>   spark.worker.cleanup.appDataTtl
>>>>
>>>> Also tried on the driver side:
>>>>   spark.cleaner.ttl
>>>>   spark.shuffle.consolidateFiles true
>>>>
>>>> --
>>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Worker-runs-out-of-inodes-tp22355.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
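For reference, the worker-side cleanup settings mentioned above would look something like this in spark-env.sh (a sketch with illustrative values; the intervals are in seconds). One caveat worth noting: in standalone mode the worker cleanup only removes directories of applications that have already stopped, so it may not help a continuously running streaming job:

    # spark-env.sh on each worker node -- illustrative values
    # Periodic cleanup of application directories (stopped applications only):
    # check every 30 minutes, remove data older than 7 days
    export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
      -Dspark.worker.cleanup.interval=1800 \
      -Dspark.worker.cleanup.appDataTtl=604800"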