otterc commented on a change in pull request #32007:
URL: https://github.com/apache/spark/pull/32007#discussion_r624885672



##########
File path: core/src/main/scala/org/apache/spark/storage/BlockId.scala
##########
@@ -87,6 +87,29 @@ case class ShufflePushBlockId(shuffleId: Int, mapIndex: Int, reduceId: Int) exte
   override def name: String = "shufflePush_" + shuffleId + "_" + mapIndex + "_" + reduceId
 }
 
+@DeveloperApi
+case class ShuffleMergedBlockId(appId: String, shuffleId: Int, reduceId: Int) extends BlockId {
+  override def name: String = "mergedShuffle_" + appId + "_" + shuffleId + "_" + reduceId + ".data"
+}
+
+@DeveloperApi
+case class ShuffleMergedIndexBlockId(
+  appId: String,
+  shuffleId: Int,
+  reduceId: Int) extends BlockId {
+  override def name: String =
+    "mergedShuffle_" + appId + "_" + shuffleId + "_" + reduceId + ".index"

Review comment:
   > If ESS does not support the new RPC, how is the Spark application
supposed to behave?

   This is the case where the cluster does not support push-based shuffle but
the client is sending push-based shuffle messages. Here it should be okay for
the executor to fail early, because it is attempting push-based shuffle, which
will not work. We can throw a `SparkException` with a message such as
"push-based shuffle is not supported by the cluster, so please turn it off".
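
   A rough sketch of the fail-early check, only to illustrate the idea. The `SparkException` class below is a local stand-in so the snippet runs without Spark on the classpath, and `PushShuffleCheck`, `requirePushSupport`, and the `essSupportsPush` flag are hypothetical names, not part of this PR.

```scala
// Local stand-in for org.apache.spark.SparkException (assumption: used only
// so this sketch compiles without a Spark dependency).
class SparkException(message: String) extends Exception(message)

object PushShuffleCheck {
  // Hypothetical helper: `essSupportsPush` would be derived from however the
  // executor learns that the ESS rejected the new push-based shuffle RPC.
  def requirePushSupport(pushEnabled: Boolean, essSupportsPush: Boolean): Unit = {
    if (pushEnabled && !essSupportsPush) {
      // Fail early instead of continuing with a shuffle mode that cannot work.
      throw new SparkException(
        "Push-based shuffle is not supported by the cluster, so please turn it off.")
    }
  }
}
```

   The check is a no-op when push-based shuffle is disabled, so applications that do not use it are unaffected.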
   
   > If we are taking this path, it would be better for ESS to manage the 
merger location entirely - and not have executors create/update it (as 
discussed above). It will help ESS evolve independently.
   
   ESS cannot create the `merge_manager` directory under the application local
directory because it does not have permission to do so. App local dirs have
permission `750`, and the shuffle service is part of the NM process, which is
usually run by the `yarn` user.
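
   To spell out the permission point, here is a small sketch that decodes mode `750` with the JDK's `PosixFilePermissions`. `LocalDirPerms` and its fields are hypothetical names for illustration. The snippet only inspects the permission bits and does not touch a real directory.

```scala
import java.nio.file.attribute.{PosixFilePermission, PosixFilePermissions}

object LocalDirPerms {
  // Mode 750 in symbolic form: owner rwx, group r-x, others none.
  val appLocalDir: java.util.Set[PosixFilePermission] =
    PosixFilePermissions.fromString("rwxr-x---")

  // Creating a subdirectory requires write permission on the parent directory.
  val ownerCanCreate: Boolean = appLocalDir.contains(PosixFilePermission.OWNER_WRITE)
  val groupCanCreate: Boolean = appLocalDir.contains(PosixFilePermission.GROUP_WRITE)
  val othersCanCreate: Boolean = appLocalDir.contains(PosixFilePermission.OTHERS_WRITE)
}
```

   Only the owner bit grants write access, so a process running as a different user cannot create a directory there, whether it reaches the directory through the group bits or the others bits.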
   



