GitHub user pankajarora12 opened a pull request:
https://github.com/apache/spark/pull/4770
[CORE][YARN] SPARK-6011: Used Current Working directory for sparklocaldirs
instead of Application Directory so that spark-local-files gets deleted when
executor exits abruptly.
Spark uses the current application directory to store shuffle files for all
executors. When an executor is killed abruptly, the shutdown hook in
DiskBlockManager.scala does not get a chance to run, so these files remain
until the application finishes.
This causes out-of-disk-space errors for long-running or indefinitely running
applications.
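The failure mode described above can be reproduced outside Spark. The sketch
below is only an analogy, not Spark code: it uses Python's atexit hook in place
of the JVM shutdown hook, assumes a POSIX system where Popen.kill() sends
SIGKILL, and uses a made-up file name (shuffle_0_0_0.data) and directory
prefix. It shows that an exit-time cleanup never runs when the process is
killed abruptly, so the scratch file is left behind.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Child process: creates a scratch file, registers an exit-time cleanup
# (standing in for a JVM shutdown hook), reports readiness, then idles
# like a long-running executor.
CHILD_SOURCE = """
import atexit, os, sys, time
path = sys.argv[1]
open(path, "w").close()
atexit.register(os.remove, path)
print("ready", flush=True)
time.sleep(60)
"""


def file_survives_sigkill():
    """Return True if the child's scratch file is still on disk after SIGKILL."""
    workdir = tempfile.mkdtemp(prefix="app-dir-")  # hypothetical "application dir"
    scratch = os.path.join(workdir, "shuffle_0_0_0.data")  # hypothetical name
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE, scratch],
        stdout=subprocess.PIPE,
    )
    try:
        proc.stdout.readline()  # block until the child has created the file
        proc.kill()             # SIGKILL on POSIX: exit hooks cannot run
        proc.wait()
        return os.path.exists(scratch)
    finally:
        proc.stdout.close()
        shutil.rmtree(workdir, ignore_errors=True)


if __name__ == "__main__":
    print("scratch file left behind:", file_survives_sigkill())
```

Because nothing inside the killed process can clean up, the only reliable
cleanup is one performed from outside it, which is what the fix relies on.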
In this fix I use the current working directory, which is inside the executor's
directory, to store shuffle files instead of the application's directory, so
that YARN clears those directories when the executor is killed.
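The effect of the change can be illustrated with a small simulation. This is a
sketch under stated assumptions, not the patch itself: the directory names
(application_, container_01, spark-local-) are illustrative, and the step that
deletes the container directory stands in for the cleanup that YARN is
described as doing when an executor is killed.

```python
import os
import shutil
import tempfile


def make_local_dir(root):
    """Create a spark-local-style scratch directory under root (illustrative naming)."""
    return tempfile.mkdtemp(prefix="spark-local-", dir=root)


def leaks_after_executor_kill(use_container_dir):
    """Return True if the scratch directory outlives a killed executor."""
    # Assumed layout: the container (executor) directory sits inside the
    # application directory.
    app_dir = tempfile.mkdtemp(prefix="application_")
    container_dir = os.path.join(app_dir, "container_01")
    os.mkdir(container_dir)
    try:
        root = container_dir if use_container_dir else app_dir
        local_dir = make_local_dir(root)

        # The executor is killed: none of its own cleanup code runs.
        # Assumed behavior: only the container directory is removed now;
        # the application directory stays until the application ends.
        shutil.rmtree(container_dir)

        return os.path.exists(local_dir)
    finally:
        shutil.rmtree(app_dir, ignore_errors=True)  # application finishes


if __name__ == "__main__":
    print("application dir as root leaks:", leaks_after_executor_kill(False))
    print("container dir as root leaks:  ", leaks_after_executor_kill(True))
```

With the application directory as the root, the scratch directory survives the
kill; with the container's working directory as the root, it is removed along
with the container.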
-Pankaj
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/pankajarora12/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4770.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4770
----
commit d6bfba3d7b9236a02a7e91233f8e512bea761af0
Author: pankaj.arora <[email protected]>
Date: 2014-05-31T11:11:05Z
[SPARK-1979] Added Error Handling if user passes application params with
--arg
commit 7c838862d69095ad9ccf9fa8e3ff9a582b0e647d
Author: pankaj arora <[email protected]>
Date: 2015-02-25T17:33:50Z
Merge upstream
commit 3db9a19baec2f7d891f0b1a18d89907d633c3c02
Author: pankaj arora <[email protected]>
Date: 2015-02-25T18:21:47Z
[CORE] SPARK-6011: Used Current Working directory for sparklocaldirs
instead of Application Directory so that spark-local-files gets deleted when
executor exits abruptly.
----