GitHub user ivoson opened a pull request:
https://github.com/apache/spark/pull/21400
[SPARK-24351][SS]offsetLog/commitLog purge thresholdBatchId should be
computed with cu…
## What changes were proposed in this pull request?
In continuous processing (CP) mode, compute the thresholdBatchId used to purge
offsetLog/commitLog metadata from the current committed epoch instead of
currentBatchId. This avoids purging all of the committed metadata in some
cases, as described in the JIRA.
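The difference can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `purgeThreshold`, `purge`, and the numbers in the usage note are hypothetical, and the sketch assumes that purging removes every log entry whose id is below the threshold and that a purge only happens once more than `minLogEntriesToMaintain` entries could exist.

```scala
// Hypothetical, simplified model of the purge-threshold computation.
// None of these names are taken from the Spark code base.
object PurgeThresholdSketch {

  /** Returns the purge threshold for `latestId`, or None when there are
    * not yet more than `minLogEntriesToMaintain` entries to keep. */
  def purgeThreshold(latestId: Long, minLogEntriesToMaintain: Int): Option[Long] =
    if (minLogEntriesToMaintain < latestId) Some(latestId - minLogEntriesToMaintain)
    else None

  /** Models a purge: entries with ids below `threshold` are dropped
    * (assumed semantics). */
  def purge(log: Set[Long], threshold: Long): Set[Long] =
    log.filter(_ >= threshold)

  /** Applies the purge only when a threshold exists. */
  def purgeWith(log: Set[Long], latestId: Long, minLogEntriesToMaintain: Int): Set[Long] =
    purgeThreshold(latestId, minLogEntriesToMaintain)
      .map(purge(log, _))
      .getOrElse(log)
}
```

For example, suppose epochs 0 to 5 have been committed, the current batch id has already advanced to 200, and 100 entries should be retained. `PurgeThresholdSketch.purgeWith((0L to 5L).toSet, 200L, 100)` yields an empty set, because the threshold of 100 lies above every committed entry. Basing the threshold on the committed epoch instead, `PurgeThresholdSketch.purgeWith((0L to 5L).toSet, 5L, 100)`, leaves all six entries in place.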
## How was this patch tested?
Manually tested.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ivoson/spark branch-cp-meta
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21400.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21400
----
commit d72fc13ce91e2be6b5960d1fd84ec3086b7f0265
Author: Huang Tengfei <tengfei.h@...>
Date: 2018-05-22T15:02:08Z
offsetLog/commitLog purge thresholdBatchId should be computed with current
committed epoch but not currentBatchId in CP mode
Change-Id: Ice9710f1236a3935966efb35f4ba280560f973a1
----