xuanyuanking commented on a change in pull request #25420: [SPARK-28699][Core]
Cache an indeterminate RDD could lead to incorrect result while stage rerun
URL: https://github.com/apache/spark/pull/25420#discussion_r312947461
##########
File path:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
##########
@@ -2710,7 +2710,7 @@ class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with TimeLi
     assert(countSubmittedMapStageAttempts() === 2)
   }
-  test("SPARK-23207: retry all the succeeding stages when the map stage is indeterminate") {
+  ignore("SPARK-23207: retry all the succeeding stages when the map stage is indeterminate") {
Review comment:
I'm ignoring this test because of the behavior change: with the current
approach, we need to abort the stage of the current mapStage.
Since we will eventually support stage rerun, I suggest skipping this behavior
for now; we can directly support the cache scenario once SPARK-25341 is merged.
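
To make the trade-off concrete, here is a minimal sketch of the decision being discussed. It is a hypothetical model, not the DAGScheduler code: `MapStage`, `Action`, `onFetchFailure` and the `rerunSupported` flag are names invented for illustration, and the assumption is that a fetch failure on an indeterminate map stage can only be handled safely by rerunning the succeeding stages or, when that is not available, by aborting.

```scala
object IndeterminateStageSketch {
  // Hypothetical outcomes; these are not Spark classes.
  sealed trait Action
  case object RetryMissingTasksOnly extends Action
  case object RetrySucceedingStages extends Action
  case object AbortStage extends Action

  // Hypothetical stand-in for a shuffle map stage.
  final case class MapStage(id: Int, indeterminate: Boolean)

  // Assumed decision on a fetch failure:
  //  - deterministic output: recomputing only the lost tasks is safe
  //  - indeterminate output, rerun supported: retry all succeeding stages
  //  - indeterminate output, rerun not supported: abort
  def onFetchFailure(stage: MapStage, rerunSupported: Boolean): Action =
    if (!stage.indeterminate) RetryMissingTasksOnly
    else if (rerunSupported) RetrySucceedingStages
    else AbortStage

  def main(args: Array[String]): Unit = {
    val stage = MapStage(id = 0, indeterminate = true)
    println(onFetchFailure(stage, rerunSupported = false))
    println(onFetchFailure(stage, rerunSupported = true))
  }
}
```

In this sketch the ignored test corresponds to the `RetrySucceedingStages` branch, which would only become reachable for the cache scenario once stage rerun is supported.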