Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/22112
  
    @cloud-fan @tgravescs I was wondering if we could get an ETA on this landing?
    
    Also, I tried running something analogous to the example script from the description of https://issues.apache.org/jira/browse/SPARK-23207, but for RDDs. However, it did not manifest the correctness problem even before this patch was applied. Are there any ways to reliably reproduce this with a minimal script?
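    
    For context, my understanding of the example script in the JIRA description is roughly the following Dataset version (paraphrased from memory rather than copied verbatim; the `.as[Long]` is only there to pick up the built-in encoder for the identity maps):
    
    ```
    import scala.sys.process._
    import spark.implicits._

    import org.apache.spark.TaskContext
    // Same shape as the RDD script further below, but using Dataset.repartition
    // instead of coalesce(shuffle = true).
    val df = spark.range(0, 1000 * 1000, 1).as[Long].repartition(200).map { x =>
      x
    }.repartition(200).map { x =>
      if (TaskContext.get.attemptNumber == 0 && TaskContext.get.partitionId < 2) {
        throw new Exception("pkill -f -n java".!!) // Kills the newest Java process, ideally an executor
      }
      x
    }
    df.distinct().count()
    ```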
    
    The RDD script below was run in my Spark shell against a single-node Spark standalone cluster with 2 workers, in client mode, with the external shuffle service enabled. It does not reproduce the issue.
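    
    One way to confirm the external shuffle service flag actually reached the shell's SparkContext (just a sanity check against the standard `spark.shuffle.service.enabled` key):
    
    ```
    // Evaluates to true when the external shuffle service is enabled for this context.
    sc.getConf.getBoolean("spark.shuffle.service.enabled", false)
    ```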
    
    ```
    import scala.sys.process._

    import org.apache.spark.TaskContext
    val res = sc.parallelize(0 until 1000 * 1000, 1).coalesce(200, shuffle = true).map { x =>
      x
    }.coalesce(200, shuffle = true).map { x =>
      if (TaskContext.get.attemptNumber == 0 && TaskContext.get.partitionId < 2) {
        throw new Exception("pkill -f -n java".!!) // Kills the newest Java process, ideally the executors
      }
      x
    }
    res.distinct().count()
    ```

