Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/21927#discussion_r208123913
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -1946,4 +1990,11 @@ private[spark] object DAGScheduler {
// Number of consecutive stage attempts allowed before a stage is aborted
val DEFAULT_MAX_CONSECUTIVE_STAGE_ATTEMPTS = 4
+
+ // Error message when running a barrier stage that have unsupported RDD chain pattern.
+ val ERROR_MESSAGE_RUN_BARRIER_WITH_UNSUPPORTED_RDD_CHAIN_PATTERN =
+ "[SPARK-24820][SPARK-24821]: Barrier execution mode does not allow the following pattern of " +
+ "RDD chain within a barrier stage:\n1. Ancestor RDDs that have different number of " +
+ "partitions from the resulting RDD (eg. union()/coalesce()/first()/PartitionPruningRDD);\n" +
--- End diff --
collect() is expensive though?
---