[GitHub] spark pull request #21606: [SPARK-24552][core][SQL] Use task ID instead of a...

vanzin Thu, 21 Jun 2018 13:00:32 -0700

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/21606


    [SPARK-24552][core][SQL] Use task ID instead of attempt number for writes.

    This passes the unique task attempt ID, instead of the attempt number,
    to v2 data sources, because attempt numbers are reused when stages are
    retried. When attempt numbers are reused, sources that track data by
    partition ID and attempt number may incorrectly clean up data, because
    the same attempt number can be both committed and aborted.
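
    The failure mode described above can be illustrated with a toy model.
    This is only a sketch and does not use any Spark API: ToySink, run, and
    the use_task_id flag are hypothetical names, and the itertools counter
    merely stands in for a task ID that is unique across stage attempts.
    One partition is written and committed in the first stage attempt, then
    written again and aborted after a stage retry.

```python
from itertools import count


class ToySink:
    """Hypothetical sink that tracks staged data under a caller-supplied key."""

    def __init__(self):
        self.data = {}
        self.committed = set()

    def write(self, key, rows):
        self.data[key] = rows

    def commit(self, key):
        self.committed.add(key)

    def abort(self, key):
        # Cleanup on abort: drop everything tracked under this key.
        self.data.pop(key, None)
        self.committed.discard(key)


def run(use_task_id):
    """Return the keys still committed after one commit and one retried abort."""
    sink = ToySink()
    task_ids = count()  # assumption: stands in for a globally unique task ID
    partition = 0

    # First stage attempt: the task for partition 0 writes and commits.
    tid = next(task_ids)
    key = (partition, tid if use_task_id else 0)
    sink.write(key, ["row-a"])
    sink.commit(key)

    # Stage retry: the attempt number starts again at 0, while the task ID
    # keeps increasing. This second task writes and is then aborted.
    tid = next(task_ids)
    key = (partition, tid if use_task_id else 0)
    sink.write(key, ["row-a"])
    sink.abort(key)

    return sorted(sink.committed)
```

    In this model, keying by (partition, attempt number) lets the abort from
    the retried stage remove the data that was already committed, while
    keying by (partition, task ID) leaves the committed entry untouched.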

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-24552.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21606.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21606
    
----
commit 6c60d1462c34f01610ada50c989832775b6fd117
Author: Ryan Blue <blue@...>
Date:   2018-06-13T19:50:00Z

    SPARK-24552: Use task ID instead of attempt number for v2 writes.

commit 2e6552460eed3013e649b06b16a1d14b1e542e2d
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-06-21T17:21:00Z

    Rename attemptId -> taskId for clarity.

commit 3561723341c3062ba7d8682ea272c549b4bdc245
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-06-21T17:28:12Z

    Use task ID instead of attempt for the Hadoop API too.

commit d5a079d439740f3067722d4e8c9e8e94f292017c
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-06-21T18:37:54Z

    Merge branch 'master' into SPARK-24552.2

commit fdcd39c852e9a2d70da95c37da04190910e7b2f0
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-06-21T18:51:48Z

    Log message update.

commit 7233a5fd7b154e2a1400c5fac11d0356a22f5f98
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-06-21T18:57:02Z

    Javadoc updates.

----


