GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/895
[SPARK-1940] Enabling rolling of executor logs, and automatic cleanup of
old executor logs
Currently, in the default log4j configuration, all the executor logs get
sent to the file <code>[executor-working-dir]/stderr</code>. This does not
allow log files to be rolled, so old logs cannot be removed.
Using log4j's RollingFileAppender allows log4j logs to be rolled, but the
logs are then written to a separate set of files rather than to
<code>stdout</code> and <code>stderr</code>. As a result, the logs are no
longer visible in the Spark web UI, which reads only the files
<code>stdout</code> and <code>stderr</code>. Furthermore, it still does not
allow stdout and stderr to be cleared periodically when a large amount of
output is written to them (e.g. by an explicit println inside a map function).
This PR solves this by implementing a simple RollingFileAppender within
Spark (disabled by default). When enabled (using the configuration parameter
`spark.executor.rollingLogs.enabled`), the logs are rolled over by time
interval (daily, by default). Old logs (older than 7 days, by default) are
deleted automatically. The web UI can show the logs across the rolled-over
files.
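The behaviour described above (roll by time interval, delete old rolled files,
read across rolled files) can be illustrated with a minimal sketch. This is
not the code in this PR: it is written in Python for brevity, and the class
name `RollingFileWriter`, its parameters, and the timestamp-suffix naming
scheme are all assumptions made for illustration only.

```python
import os
import time


class RollingFileWriter:
    """Hypothetical sketch: append to <directory>/<name>, roll over by time."""

    def __init__(self, directory, name="stderr", interval_s=86400,
                 retention_s=7 * 86400, clock=time.time):
        self.directory = directory
        self.name = name
        self.interval_s = interval_s      # assumed default: roll daily
        self.retention_s = retention_s    # assumed default: keep 7 days
        self.clock = clock                # injectable so it can be tested
        self.active_path = os.path.join(directory, name)
        self.next_rollover = clock() + interval_s

    def write(self, text):
        now = self.clock()
        if now >= self.next_rollover:
            self._roll_over(now)
        with open(self.active_path, "a") as f:
            f.write(text)

    def _roll_over(self, now):
        # Move the active file aside under a zero-padded timestamp suffix,
        # so that sorting by name is the same as sorting by rollover time.
        if os.path.exists(self.active_path):
            rolled = "%s.%015d" % (self.active_path, int(now))
            os.replace(self.active_path, rolled)
        self.next_rollover = now + self.interval_s
        self._delete_old(now)

    def rolled_files(self):
        # Rolled files are "<name>.<digits>"; anything else is ignored.
        prefix = self.name + "."
        names = [n for n in os.listdir(self.directory)
                 if n.startswith(prefix) and n[len(prefix):].isdigit()]
        return [os.path.join(self.directory, n) for n in sorted(names)]

    def _delete_old(self, now):
        # Age is measured from the rollover time encoded in the file name.
        for path in self.rolled_files():
            rolled_at = int(path.rsplit(".", 1)[1])
            if now - rolled_at > self.retention_s:
                os.remove(path)

    def read_all(self):
        """Concatenate rolled files (oldest first), then the active file."""
        paths = self.rolled_files()
        if os.path.exists(self.active_path):
            paths.append(self.active_path)
        chunks = []
        for p in paths:
            with open(p) as f:
                chunks.append(f.read())
        return "".join(chunks)
```

In this sketch the active file keeps its original name, so a reader that only
knows about `stderr` still sees the newest output, while `read_all` shows how
a viewer could stitch the rolled-over files back together in order.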
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tdas/spark rolling-logs
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/895.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #895
----
commit cb4fb6d804df9a094589ee6c88304c74d5325954
Author: Tathagata Das <[email protected]>
Date: 2014-05-17T04:25:53Z
Added FileAppender and RollingFileAppender to generate rolling executor
logs.
commit 931f8fb3822e29194a406cb85b3a02d4554cab96
Author: Tathagata Das <[email protected]>
Date: 2014-05-19T06:28:00Z
Changed log viewer in Spark web UI to handle rolling log files.
commit adf49103e707060631ef926bd57cd832fff11d47
Author: Tathagata Das <[email protected]>
Date: 2014-05-19T06:28:19Z
Merge remote-tracking branch 'apache-github/master' into rolling-logs
commit 6cc09c74fee97790356bac653431430b0e26bf9f
Author: Tathagata Das <[email protected]>
Date: 2014-05-27T21:37:54Z
Fixed bugs in rolling logs, and added more debug statements.
----