zhouyejoe commented on a change in pull request #32007:
URL: https://github.com/apache/spark/pull/32007#discussion_r615513625



##########
File path: core/src/main/scala/org/apache/spark/storage/BlockId.scala
##########
@@ -87,6 +87,29 @@ case class ShufflePushBlockId(shuffleId: Int, mapIndex: Int, reduceId: Int) exte
   override def name: String = "shufflePush_" + shuffleId + "_" + mapIndex + "_" + reduceId
 }
 
+@DeveloperApi
+case class ShuffleMergedBlockId(appId: String, shuffleId: Int, reduceId: Int) extends BlockId {
+  override def name: String = "mergedShuffle_" + appId + "_" + shuffleId + "_" + reduceId + ".data"
+}
+
+@DeveloperApi
+case class ShuffleMergedIndexBlockId(
+  appId: String,
+  shuffleId: Int,
+  reduceId: Int) extends BlockId {
+  override def name: String =
+    "mergedShuffle_" + appId + "_" + shuffleId + "_" + reduceId + ".index"

Review comment:
       @otterc Could the merge_dir be created under the
blockmgr_UUID.randomUUID dir? Since the RegisterExecutor message already sends
this blockmgr_UUID.randomUUID dir to the ESS, the ESS would know which local
dirs to use for the merge dir. In our internal version, the ESS takes the dirs
from the first RegisterExecutor message as the list of merge dirs.

Consider the scenario described: Executor1 gets the local dirs list
"/grid/[a-c]/yarn/usercache/test/appcache/application_id/" and Executor2 gets
the local dirs list "/grid/[d-f]". Each of them creates the merge_dir under its
own local dirs, for example Executor1 creates "/grid/[a-c]/........./merge_dir"
and Executor2 creates "/grid/[d-f]/........./merge_dir". If Executor1 happens
to be the first one to register with the local ESS, the ESS will only use the
dirs under "/grid/[a-c]".

During executor registration, however, the actual dirs in the message are
"/grid/a/yarn/tmp/usercache/testuser/appCache/application_id/blockmgr_RandomID",
"/grid/b/yarn/tmp/usercache/testuser/appCache/application_id/blockmgr_RandomID",
"/grid/c/yarn/tmp/usercache/testuser/appCache/application_id/blockmgr_RandomID".
Internally, we trim these dirs to
"/grid/[a-c]/yarn/tmp/usercache/testuser/appCache/application_id/" and assume
the merge dirs are
"/grid/[a-c]/yarn/tmp/usercache/testuser/appCache/application_id/mergedirs".
Even though Executor2 created other "/grid/[d-f]/........./merge_dir" dirs,
they will not be used, because the ESS only uses the dirs from Executor1.

By the same logic, if we move merge_dir to a subdir of the blockmgr_RandomID
dirs, this would still work, right? The only downside would be the empty
merge_dirs/subdirs created by the other executors.
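
To make the "first registration wins" behaviour concrete, here is a minimal
sketch of the two layouts being compared. It is not the ESS implementation:
MergeDirSketch, MergeDirRegistry, siblingMergeDir, nestedMergeDir and the
default name "merge_dir" are all hypothetical names chosen for illustration,
and the sketch assumes '/'-separated paths.

```scala
// Hypothetical sketch only; none of these names come from Spark itself.
object MergeDirSketch {

  // Current internal approach as described above: drop the trailing
  // blockmgr_* component and place the merge dir next to it.
  def siblingMergeDir(blockMgrDir: String, mergeDirName: String = "merge_dir"): String = {
    val trimmed = blockMgrDir.stripSuffix("/")
    val cut = trimmed.lastIndexOf('/')
    require(cut > 0, s"expected a nested directory path, got: $blockMgrDir")
    trimmed.substring(0, cut) + "/" + mergeDirName
  }

  // Proposed alternative: keep the blockmgr_* dir and nest the merge dir in it.
  def nestedMergeDir(blockMgrDir: String, mergeDirName: String = "merge_dir"): String =
    blockMgrDir.stripSuffix("/") + "/" + mergeDirName

  // Remembers the merge dirs derived from the first registration and
  // ignores the local dirs sent by any later executor.
  final class MergeDirRegistry(toMergeDir: String => String) {
    private var mergeDirs: Option[Seq[String]] = None

    def register(localDirs: Seq[String]): Seq[String] = synchronized {
      if (mergeDirs.isEmpty) {
        mergeDirs = Some(localDirs.map(toMergeDir))
      }
      mergeDirs.get
    }
  }
}
```

With either function plugged into the registry, a second executor registering
with dirs under a different set of disks gets back the merge dirs derived from
the first executor, which is why its own merge dirs would stay empty.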



