Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 @tgravescs thanks for testing it out! I've created https://issues.apache.org/jira/browse/SPARK-25341 and https://issues.apache.org/jira/browse/SPARK-25342 to track the follow-up work. I think these two together are the long-term solution. Users can sort or checkpoint to eliminate the indeterminacy, or use reliable shuffle storage to avoid fetch failures (someone is proposing this on the dev list). If users can't avoid it and hit the issue, this PR provides a final guard that reruns some stages to get the correct result. For Spark 2.4 we just fail the job, and we will finish the above two tickets in Spark 3.0.
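To illustrate why a sort removes the indeterminacy, here is a minimal plain-Python sketch, not Spark code. It assumes the problem comes from an order-sensitive partitioning step such as round-robin assignment; the `round_robin` helper and the sample data are hypothetical.

```python
from typing import List, Sequence


def round_robin(records: Sequence[int], num_partitions: int) -> List[List[int]]:
    """Assign records to partitions by arrival position (hypothetical helper)."""
    partitions: List[List[int]] = [[] for _ in range(num_partitions)]
    for position, record in enumerate(records):
        partitions[position % num_partitions].append(record)
    return partitions


# The same records arriving in two different orders,
# e.g. a first attempt and a rerun after a fetch failure.
first_attempt = [3, 1, 4, 2]
rerun = [1, 4, 2, 3]

# Without a sort, partition contents depend on arrival order.
unsorted_differs = round_robin(first_attempt, 2) != round_robin(rerun, 2)

# Sorting first makes the assignment depend only on the data itself.
sorted_matches = round_robin(sorted(first_attempt), 2) == round_robin(sorted(rerun), 2)
```

In this toy model a rerun of the unsorted input places records in different partitions than the first attempt did, while the sorted input is partitioned identically on every attempt.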