GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/21918
[SPARK-24821][Core] Fail fast when submitted job compute on a subset of all the partitions for a barrier stage

## What changes were proposed in this pull request?

Add a check in `DAGScheduler.submitJob()` to make sure we are not launching a barrier stage on only a subset of all the partitions (one example is the `first()` operation); otherwise, fail fast.

## How was this patch tested?

Added a new test case in `BarrierStageOnSubmittedSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark SPARK-24821

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21918.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21918

----

commit b93d21267d6204f25c8fabeec681d1b6e9ebffb6
Author: Xingbo Jiang <xingbo.jiang@...>
Date: 2018-07-30T15:30:33Z

    Fail fast when submitted job compute on a subset of all the partitions for a barrier stage

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
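For context, the fail-fast condition described above can be sketched outside Spark. The following is a hypothetical Java sketch, not the actual `DAGScheduler` code: the method name `checkBarrierStagePartitions` and its signature are invented for illustration. It shows the idea that a barrier stage must compute every partition of the RDD, so a job such as `first()`, which launches on only partition 0, should be rejected before the stage is submitted.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class BarrierCheckSketch {

    // Hypothetical stand-in for the check performed at job-submission time:
    // a barrier stage is only valid if the job targets every partition.
    static void checkBarrierStagePartitions(boolean isBarrier,
                                            int numPartitions,
                                            List<Integer> partitions) {
        List<Integer> all =
            IntStream.range(0, numPartitions).boxed().collect(Collectors.toList());
        if (isBarrier && !partitions.containsAll(all)) {
            throw new IllegalArgumentException(
                "Barrier stages must compute all " + numPartitions
                + " partitions, not a subset: " + partitions);
        }
    }

    public static void main(String[] args) {
        // A full run over all 4 partitions passes the check.
        checkBarrierStagePartitions(true, 4, List.of(0, 1, 2, 3));

        // A first()-style job touches only partition 0, so it must fail fast.
        try {
            checkBarrierStagePartitions(true, 4, List.of(0));
            System.out.println("no error");
        } catch (IllegalArgumentException e) {
            System.out.println("fail fast: " + e.getMessage());
        }
    }
}
```

The point of doing this in `submitJob()` rather than at task-launch time is that the job is rejected immediately, before any stage or task is scheduled.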