tgravescs commented on a change in pull request #28053: [SPARK-29153][CORE]Add 
ability to merge resource profiles within a stage with Stage Level Scheduling
URL: https://github.com/apache/spark/pull/28053#discussion_r401855194
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
 ##########
 @@ -447,10 +449,27 @@ private[spark] class DAGScheduler(
       stageResourceProfiles: HashSet[ResourceProfile]): ResourceProfile = {
     logDebug(s"Merging stage rdd profiles: $stageResourceProfiles")
     val resourceProfile = if (stageResourceProfiles.size > 1) {
-      // add option later to actually merge profiles - SPARK-29153
-      throw new IllegalArgumentException("Multiple ResourceProfile's specified in the RDDs for " +
-        "this stage, please resolve the conflicting ResourceProfile's as Spark doesn't" +
-        "currently support merging them.")
+      if (shouldMergeResourceProfiles) {
+        val startResourceProfile = stageResourceProfiles.head
+        val mergedProfile = stageResourceProfiles.drop(1)
+          .foldLeft(startResourceProfile)((a, b) => mergeResourceProfiles(a, b))
+        // compared merged profile with existing ones so we we don't add it over and over again
+        // if the user runs the same operation multiple times
+        val resProfile = sc.resourceProfileManager.getEquivalentProfile(mergedProfile)
+        resProfile match {
+          case Some(existingRp) => existingRp
+          case None =>
+            // this ResourceProfile could be different if it was merged so we have to add it to
+            // our ResourceProfileManager
+            sc.resourceProfileManager.addResourceProfile(mergedProfile)
+            mergedProfile
+        }
+      } else {
+        throw new IllegalArgumentException("Multiple ResourceProfiles specified in the RDDs for " +
+          "this stage, either resolve the conflicting ResourceProfile's yourself or enable " +
 
 Review comment:
   Oops, I should have removed the apostrophe there as well. The message means
the user has specified multiple resource profiles on RDDs that went into the
same stage. They should resolve that conflict, either by choosing a different
profile that meets the needs of both or by removing one. I'm not sure how to
say all that in an exception message, but I was planning on documenting the
merge behavior in the follow-on doc JIRA.
