[GitHub] spark pull request #21616: [SPARK-24552][core] Use unique id instead of atte...

vanzin Fri, 22 Jun 2018 13:25:55 -0700

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/21616


    [SPARK-24552][core] Use unique id instead of attempt number for writes [branch-2.2].

    This passes a unique attempt id to the Hadoop APIs, because the
    attempt number is reused when a stage is retried. When attempt
    numbers are reused, sources that track data by partition id and
    attempt number may incorrectly clean up data, because the same
    attempt number can be both committed and aborted.
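
The failure mode described above can be illustrated with a small toy model. This is only a sketch, not Spark or Hadoop code: `ToySink`, `run_scenario`, and both id functions are hypothetical names invented for the illustration, and the `(stage attempt, attempt number)` pair is just one assumed way to build a unique id, not necessarily the one this patch uses.

```python
# Toy model (hypothetical; not Spark or Hadoop code) of a sink that tracks
# written data by (partition id, attempt id).


class ToySink:
    def __init__(self):
        self.staged = {}     # (partition, attempt_id) -> data not yet committed
        self.committed = {}  # (partition, attempt_id) -> committed data

    def write(self, partition, attempt_id, data):
        self.staged[(partition, attempt_id)] = data

    def commit(self, partition, attempt_id):
        key = (partition, attempt_id)
        self.committed[key] = self.staged.pop(key)

    def abort(self, partition, attempt_id):
        # Clean up everything the sink has recorded under this key.
        key = (partition, attempt_id)
        self.staged.pop(key, None)
        self.committed.pop(key, None)


def attempt_number_only(stage_attempt, attempt_number):
    # The id restarts at 0 in a retried stage, so it can collide.
    return attempt_number


def unique_id(stage_attempt, attempt_number):
    # Assumed scheme for the sketch: include the stage attempt in the id.
    return (stage_attempt, attempt_number)


def run_scenario(make_id):
    sink = ToySink()
    partition = 7
    first = make_id(stage_attempt=0, attempt_number=0)
    retry = make_id(stage_attempt=1, attempt_number=0)

    sink.write(partition, first, "rows from stage attempt 0")
    sink.write(partition, retry, "rows from stage attempt 1")
    sink.commit(partition, retry)  # the task in the retried stage commits
    sink.abort(partition, first)   # the task from the original stage is aborted
    return sorted(sink.committed.values())


lost = run_scenario(attempt_number_only)
kept = run_scenario(unique_id)
print(lost)  # [] : the abort removed the committed data
print(kept)  # ['rows from stage attempt 1']
```

With the attempt number alone, the commit and the abort refer to the same key, so the abort removes committed data. With an id that is unique per attempt, the abort only touches the aborted attempt's own data.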


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-24552-2.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21616.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21616
    
----
commit 88679a0631bb3ddd6707c2f2b81f8886bf837fd8
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-06-22T19:58:16Z

    [SPARK-24552][core] Use unique id instead of attempt number for writes [branch-2.2].
    
    This passes a unique attempt id to the Hadoop APIs, because the
    attempt number is reused when a stage is retried. When attempt
    numbers are reused, sources that track data by partition id and
    attempt number may incorrectly clean up data, because the same
    attempt number can be both committed and aborted.

----

