GitHub user gaborgsomogyi opened a pull request:

    https://github.com/apache/spark/pull/21430

    [SPARK-23991][DSTREAMS] Fix data loss when WAL write fails in allocateBlocksToBatch

    ## What changes were proposed in this pull request?
    
    When blocks are being allocated to a batch and the WAL write fails, the
    blocks are still removed from the received block queue. This causes data
    loss, because the next allocation will not find those blocks in the queue.
    
    In this PR, blocks are removed from the received block queue only if the
    WAL write succeeded.
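
The ordering described above can be illustrated with a minimal sketch. This is not the Spark code touched by this PR: it is a simplified Python model, and the names `BlockTracker`, `write_to_log`, `received_queue`, and `batch_allocations` are hypothetical. It assumes the WAL write reports success as a boolean. The point is only that the queue is snapshotted first and drained after the write has succeeded.

```python
from collections import deque


class BlockTracker:
    """Toy model of a received-block tracker (hypothetical, not Spark's class)."""

    def __init__(self, write_to_log):
        # write_to_log(batch_time, blocks) -> bool; assumed to return True
        # only when the write-ahead log record was persisted.
        self._write_to_log = write_to_log
        self.received_queue = deque()
        self.batch_allocations = {}

    def add_block(self, block):
        self.received_queue.append(block)

    def allocate_blocks_to_batch(self, batch_time):
        # Snapshot the queued blocks WITHOUT removing them yet.
        candidates = list(self.received_queue)
        if not self._write_to_log(batch_time, candidates):
            # WAL write failed: leave the queue untouched so that a later
            # allocation can still pick up the same blocks.
            return False
        # WAL write succeeded: only now drop the allocated blocks.
        for _ in candidates:
            self.received_queue.popleft()
        self.batch_allocations[batch_time] = candidates
        return True
```

With a log writer that fails once and then succeeds, the first call leaves both blocks queued and the second call allocates them, so nothing is lost across the failure.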
    
    ## How was this patch tested?
    
    Additional unit test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gaborgsomogyi/spark SPARK-23991

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21430
    
----
commit 2d35dfacd54d747e6a4167d46234d4b3ce87529b
Author: Gabor Somogyi <gabor.g.somogyi@...>
Date:   2018-05-25T12:52:36Z

    [SPARK-23991][DSTREAMS] Fix data loss when WAL write fails in allocateBlocksToBatch

----

