mridulm commented on a change in pull request #34122:
URL: https://github.com/apache/spark/pull/34122#discussion_r794718066



##########
File path: core/src/main/scala/org/apache/spark/Dependency.scala
##########
@@ -145,20 +148,26 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag](
   }
 
   /**
-   * Returns true if push-based shuffle is disabled for this stage or empty RDD,
-   * or if the shuffle merge for this stage is finalized, i.e. the shuffle merge
-   * results for all partitions are available.
+   * Returns true if the RDD is an empty RDD or if the shuffle merge for this shuffle is
+   * finalized.
    */
   def shuffleMergeFinalized: Boolean = {
-    // Empty RDD won't be computed therefore shuffle merge finalized should be true by default.
-    if (shuffleMergeEnabled && numPartitions > 0) {
-      _shuffleMergedFinalized
+    _shuffleMergedFinalized
+  }

Review comment:
       Thinking about this more: we are changing the semantics of `shuffleMergeFinalized` here, and that method was publicly exposed.
   What used to be `shuffleMergeFinalized` is now `isShuffleMergeFinalizedIfEnabled`, so the earlier behavior is exposed only through a different method.
   
   We should preserve the earlier semantics of the existing public method.
   For this PR, that would mean:
   
   1) Rename `shuffleMergeFinalized` to `isShuffleMergeFinalized` (a `private[spark]` method).
   
   2) Rename `isShuffleMergeFinalizedIfEnabled` to `shuffleMergeFinalized`, which preserves the earlier behavior as well.
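   
   To make the proposal concrete, here is a rough, standalone sketch of how the two methods could relate after the renames. It is not the actual `ShuffleDependency` code: the class name `ShuffleDependencySketch` and the `markFinalized()` helper are hypothetical, and the `else true` branch is an assumption inferred from the removed doc comment ("Returns true if push-based shuffle is disabled for this stage or empty RDD").

```scala
// Hypothetical, simplified stand-in for ShuffleDependency (not the real Spark class).
class ShuffleDependencySketch(
    val shuffleMergeEnabled: Boolean,
    val numPartitions: Int) {

  // Raw finalization flag, as in the diff above.
  private var _shuffleMergedFinalized: Boolean = false

  // Hypothetical helper so the sketch can flip the flag.
  def markFinalized(): Unit = {
    _shuffleMergedFinalized = true
  }

  // Step 1: the raw flag. In Spark this would be private[spark];
  // it is left public here only so the sketch compiles on its own.
  def isShuffleMergeFinalized: Boolean = _shuffleMergedFinalized

  // Step 2: keeps the earlier public semantics. Assumed to return true when
  // push-based shuffle is disabled or the RDD is empty, else the raw flag.
  def shuffleMergeFinalized: Boolean = {
    if (shuffleMergeEnabled && numPartitions > 0) {
      isShuffleMergeFinalized
    } else {
      true
    }
  }
}
```

   With this shape, existing callers of `shuffleMergeFinalized` would see no behavior change, while internal code that needs the raw flag would call `isShuffleMergeFinalized`.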
   
   Thoughts?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


