Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16959#discussion_r103552112
  
    --- Diff: 
core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala
 ---
    @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite extends 
SparkFunSuite with BeforeAndAfter {
         sc.runJob(rdd, 
OutputCommitFunctions(tempDir.getAbsolutePath).callCanCommitMultipleTimes _,
            0 until rdd.partitions.size)
       }
    +
    +  test("SPARK-19631: Do not allow failed attempts to be authorized for 
committing") {
    +    val stage: Int = 1
    +    val partition: Int = 1
    +    val failedAttempt: Int = 0
    +    outputCommitCoordinator.stageStart(stage, maxPartitionId = 1)
    +    outputCommitCoordinator.taskCompleted(stage, partition, attemptNumber 
= failedAttempt,
    +      reason = ExecutorLostFailure("0", exitCausedByApp = true, None))
    +    assert(!outputCommitCoordinator.canCommit(stage, partition, 
failedAttempt))
    --- End diff --
    
    BTW this is very unlikely to happen with filesystems that have atomic moves 
(or close enough to that), such as HDFS. If all attempts have the same set of 
outputs, you'd end up with the correct output eventually.
    
    It might be trickier to reason about with filesystems that don't have that 
feature (hello S3), depending on the actual semantics of how (and if) tasks 
fail.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to