GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/4149
[SPARK-5147][Streaming] Delete the received data WAL log periodically
This is a refactored fix based on @jerryshao's PR #4037.
This enables deletion of old WAL files containing the received block data.
Improvements over #4037:
- Respects the rememberDuration of all receiver streams. In #4037, if
there were two receiver streams with different remember durations, the deletion
would have been based on the shortest remember duration, thus deleting data
prematurely for the receiver stream with the longer remember duration.
- Adds a unit test covering creation of the receiver WAL, automatic deletion, and
respect for the remember duration.
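
To illustrate the first point, here is a minimal sketch of the idea in Python. It is not the Scala code in this PR. The names `ReceiverStream`, `cleanup_threshold_ms`, and `premature_threshold_ms` are hypothetical, and the sketch assumes the cleanup threshold is simply the batch time minus a remember duration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReceiverStream:
    """Hypothetical stand-in for a receiver input stream."""
    name: str
    remember_duration_ms: int


def cleanup_threshold_ms(batch_time_ms: int,
                         streams: List[ReceiverStream]) -> Optional[int]:
    """WAL data older than the returned time is assumed safe to delete.

    Uses the LONGEST remember duration, so no stream loses data that it
    still needs. Returns None when there are no receiver streams.
    """
    if not streams:
        return None
    longest = max(s.remember_duration_ms for s in streams)
    return batch_time_ms - longest


def premature_threshold_ms(batch_time_ms: int,
                           streams: List[ReceiverStream]) -> Optional[int]:
    """The problematic variant: uses the SHORTEST remember duration."""
    if not streams:
        return None
    shortest = min(s.remember_duration_ms for s in streams)
    return batch_time_ms - shortest


streams = [
    ReceiverStream("short", remember_duration_ms=10_000),
    ReceiverStream("long", remember_duration_ms=60_000),
]
print(cleanup_threshold_ms(100_000, streams))    # 40000
print(premature_threshold_ms(100_000, streams))  # 90000
```

With the shortest duration, everything before t=90000 would be deleted, although the "long" stream still needs data back to t=40000.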
@jerryshao I am going to merge this ASAP to make it into 1.2.1. Thanks for the
initial draft of this PR. It made my job much easier.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tdas/spark SPARK-5147
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4149.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4149
----
commit 2736fd1fa9cbe148ec44b44f97f0aee2d5a42ec7
Author: jerryshao <[email protected]>
Date: 2015-01-14T06:26:21Z
Delete the old WAL log periodically
commit 2579b270e670afad71e185d94d13eb9099d5b54b
Author: Tathagata Das <[email protected]>
Date: 2015-01-21T23:55:28Z
Refactored the fix to make sure that the cleanup respects the remember
duration of all the receiver streams
----