Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/21353

As I've mentioned elsewhere, stages are currently submitted sequentially. That is, for a stage X, all the stage dependencies of X are completed before the tasks within X start. This change proposes to violate that invariant, and it's not obvious that this is a safe approach. The questions we need to answer are:

* How can we attempt to validate that this is indeed safe to change, and will not break the scheduler or things dependent on it in subtle ways?
* What benefits do we derive from adding the additional risk of a scheduler change, rather than handling continuous shuffles entirely at the RDD layer?
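
To make the invariant concrete, here is a minimal sketch of sequential stage submission. It is a toy model written for illustration, not Spark's actual DAGScheduler code; the function `run_stages` and the stage names are hypothetical.

```python
# Toy model of sequential stage submission; not Spark's actual DAGScheduler code.
from typing import Dict, List, Set


def run_stages(deps: Dict[str, List[str]]) -> List[str]:
    """Return the order in which stages run, given each stage's parent stages.

    A stage is submitted only once every parent stage has completed,
    which is the invariant described above.
    """
    completed: Set[str] = set()
    order: List[str] = []
    pending = set(deps)
    while pending:
        # Stages whose parents have all completed are ready to submit.
        ready = sorted(s for s in pending if all(p in completed for p in deps[s]))
        if not ready:
            raise ValueError("cycle or missing parent in stage graph")
        for stage in ready:
            # Check the invariant explicitly before "running" the stage's tasks.
            assert all(p in completed for p in deps[stage])
            order.append(stage)
            completed.add(stage)
        pending -= set(ready)
    return order


# Hypothetical three-stage job: two map stages feeding one reduce stage.
stage_deps = {"map_a": [], "map_b": [], "reduce": ["map_a", "map_b"]}
print(run_stages(stage_deps))
```

In this model `reduce` can never start while `map_a` or `map_b` is still running. A continuous shuffle needs the reader and writer stages alive at the same time, which is the behavior the proposed change would have to allow.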