GitHub user tdas opened a pull request:

    https://github.com/apache/spark/pull/895

    [SPARK-1940] Enabling rolling of executor logs, and automatic cleanup of
    old executor logs

    Currently, in the default log4j configuration, all executor logs are
    sent to the file `[executor-working-dir]/stderr`. This does not allow
    log files to be rolled, so old logs cannot be removed.
    
    Using the log4j RollingFileAppender allows log4j logs to be rolled, but
    the logs are then written to a different set of files rather than
    `stdout` and `stderr`. As a result, the logs are no longer visible in
    the Spark web UI, which only reads the files `stdout` and `stderr`.
    Furthermore, it still does not allow stdout and stderr themselves to be
    cleared periodically when a large amount of output is written to them
    directly (e.g. by an explicit println inside a map function).
    
    This PR solves this by implementing a simple RollingFileAppender within
    Spark (disabled by default). When enabled (using the configuration
    parameter `spark.executor.rollingLogs.enabled`), the logs are rolled
    over by time interval (daily, by default). Old logs (older than 7 days,
    by default) are deleted automatically. The web UI can show logs across
    the rolled-over files.
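
    For illustration only, here is a minimal Python sketch of the time-based
    rollover and cleanup policy described above. It is not the PR's Scala
    implementation: the function names, the `--YYYY-MM-DD` file suffix, the
    epoch-aligned intervals, and measuring a log's age from the start of its
    interval are all assumptions made for this example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def period_start(ts, interval):
    """Return the start of the rolling interval that contains `ts`.

    Intervals are aligned to the Unix epoch (an assumption of this sketch),
    so a one-day interval rolls over at midnight.
    """
    elapsed_intervals = (ts - EPOCH) // interval  # timedelta // timedelta -> int
    return EPOCH + elapsed_intervals * interval


def rolled_name(base, start):
    """Hypothetical name for a rolled-over file, e.g. 'stderr--2014-05-27'."""
    return f"{base}--{start:%Y-%m-%d}"


def expired(rolled, now, retention):
    """Return the names of rolled files whose interval began more than
    `retention` ago. `rolled` maps file name -> interval start."""
    return sorted(name for name, start in rolled.items()
                  if now - start > retention)


if __name__ == "__main__":
    now = datetime(2014, 5, 27, 21, 37, 54)
    daily = timedelta(days=1)
    # The active file keeps its plain name; on rollover it is renamed.
    print(rolled_name("stderr", period_start(now, daily)))
    rolled = {
        "stderr--2014-05-18": datetime(2014, 5, 18),
        "stderr--2014-05-26": datetime(2014, 5, 26),
    }
    # With the 7-day default mentioned above, only the older file is removed.
    print(expired(rolled, now, timedelta(days=7)))
```

    Keeping the active file under its plain name (`stderr`) and renaming it
    only on rollover is one way the web UI could keep reading `stdout` and
    `stderr` while still finding the older, suffixed files next to them.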

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark rolling-logs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/895.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #895
    
----
commit cb4fb6d804df9a094589ee6c88304c74d5325954
Author: Tathagata Das <[email protected]>
Date:   2014-05-17T04:25:53Z

    Added FileAppender and RollingFileAppender to generate rolling executor
    logs.

commit 931f8fb3822e29194a406cb85b3a02d4554cab96
Author: Tathagata Das <[email protected]>
Date:   2014-05-19T06:28:00Z

    Changed log viewer in Spark web UI to handle rolling log files.

commit adf49103e707060631ef926bd57cd832fff11d47
Author: Tathagata Das <[email protected]>
Date:   2014-05-19T06:28:19Z

    Merge remote-tracking branch 'apache-github/master' into rolling-logs

commit 6cc09c74fee97790356bac653431430b0e26bf9f
Author: Tathagata Das <[email protected]>
Date:   2014-05-27T21:37:54Z

    Fixed bugs in rolling logs, and added more debug statements.

----


