Hi guys,

I keep running into a strange problem where my jobs start to fail with the 
dreaded "Resubmitted (resubmitted due to lost executor)" message because too 
many temp files have accumulated from previous runs.

Both /var/run and /spill have enough disk space left, but after a certain 
number of jobs have run, subsequent jobs struggle to complete. There are a lot 
of failures without any exception message, only the lost executor message 
mentioned above. As soon as I clear out /var/run/spark/work/ and the spill 
disk, everything goes back to normal.
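
In case it helps with narrowing this down, here is a minimal sketch of a check 
that could be run on a worker before and after cleaning up. It assumes a POSIX 
system and Python 3. The helper names fs_usage and count_entries are made up 
for illustration, and the idea that the filesystem is running out of inodes 
rather than bytes is only a guess, not something confirmed here.

```python
import os


def fs_usage(path):
    """Return (free_bytes_fraction, free_inodes_fraction) for the filesystem
    that holds `path`. The inode fraction is None when the filesystem does
    not report an inode total."""
    st = os.statvfs(path)
    free_bytes = st.f_bavail / st.f_blocks if st.f_blocks else 0.0
    free_inodes = st.f_favail / st.f_files if st.f_files else None
    return free_bytes, free_inodes


def count_entries(path):
    """Count all files and directories below `path` (excluding `path` itself)."""
    total = 0
    for _root, dirs, files in os.walk(path):
        total += len(dirs) + len(files)
    return total


if __name__ == "__main__":
    # Paths taken from the description above; adjust for the actual setup.
    for p in ("/var/run/spark/work", "/spill"):
        if not os.path.exists(p):
            print(p, "does not exist on this machine")
            continue
        free_bytes, free_inodes = fs_usage(p)
        print(p, "entries:", count_entries(p))
        print(p, "free bytes fraction:", round(free_bytes, 3))
        print(p, "free inodes fraction:", free_inodes)
```

If the free inode fraction is close to zero while the free byte fraction is 
still high, that would match "enough disk space left" together with "too many 
temp files".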

Thanks for any hint,
- Marius

