Ngone51 commented on a change in pull request #28911:
URL: https://github.com/apache/spark/pull/28911#discussion_r458097873



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -1391,10 +1391,12 @@ package object config {
 
   private[spark] val SHUFFLE_HOST_LOCAL_DISK_READING_ENABLED =
     ConfigBuilder("spark.shuffle.readHostLocalDisk")
-      .doc(s"If enabled (and `${SHUFFLE_USE_OLD_FETCH_PROTOCOL.key}` is disabled and external " +
-        s"shuffle `${SHUFFLE_SERVICE_ENABLED.key}` is enabled), shuffle " +
-        "blocks requested from those block managers which are running on the same host are read " +
-        "from the disk directly instead of being fetched as remote blocks over the network.")
+      .doc(s"If enabled (and `${SHUFFLE_USE_OLD_FETCH_PROTOCOL.key}` is disabled and 1) external " +
+        s"shuffle `${SHUFFLE_SERVICE_ENABLED.key}` is enabled or 2) ${DYN_ALLOCATION_ENABLED.key}" +
+        s" is disabled), shuffle blocks requested from those block managers which are running on " +

Review comment:
       > As of Spark 3 we no longer require dynamic allocation to have a 
shuffle service.
   
   This is exactly what I mentioned in my P.S. I can update the documentation 
to cover this feature.
   
   But please also note that dynamic allocation without an external shuffle 
service is still an experimental feature that is disabled by default. Its main 
problem is that the user needs to configure when shuffle files are deleted, and 
most users have no idea how to choose that. By default, shuffle files are not 
removed until GC happens on the driver side. It also means executors won't come 
and go more frequently than under dynamic allocation with a shuffle service. 
Therefore, I think the problem we were discussing above is the more general one 
that arises when using dynamic allocation with a shuffle service.
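
   To make the condition in the proposed doc string concrete, here is a rough 
sketch in plain Scala. It is not Spark's implementation: `HostLocalReadSketch`, 
`hostLocalReadPossible`, and `experimentalConf` are hypothetical names, the 
defaults passed to `flag` are my assumption about the current defaults, and the 
`30min` timeout is only an example value. The config keys themselves are the 
existing Spark keys.

```scala
// Hypothetical sketch: restates the condition from the proposed doc string.
// It does not mirror Spark's internal code.
object HostLocalReadSketch {

  // Reads a boolean flag from a plain key/value map, falling back to an
  // assumed default when the key is absent.
  private def flag(conf: Map[String, String], key: String, default: Boolean): Boolean =
    conf.get(key).map(_.toBoolean).getOrElse(default)

  // True when host-local shuffle blocks could be read directly from disk,
  // according to the wording proposed in this diff.
  def hostLocalReadPossible(conf: Map[String, String]): Boolean = {
    val readHostLocal  = flag(conf, "spark.shuffle.readHostLocalDisk", default = true)
    val oldProtocol    = flag(conf, "spark.shuffle.useOldFetchProtocol", default = false)
    val shuffleService = flag(conf, "spark.shuffle.service.enabled", default = false)
    val dynAllocation  = flag(conf, "spark.dynamicAllocation.enabled", default = false)

    readHostLocal && !oldProtocol && (shuffleService || !dynAllocation)
  }

  // Dynamic allocation without an external shuffle service (experimental).
  // The timeout is what the user has to choose; "30min" is an example only.
  val experimentalConf: Map[String, String] = Map(
    "spark.dynamicAllocation.enabled"                 -> "true",
    "spark.shuffle.service.enabled"                   -> "false",
    "spark.dynamicAllocation.shuffleTracking.enabled" -> "true",
    "spark.dynamicAllocation.shuffleTracking.timeout" -> "30min"
  )
}
```

   Under the proposed wording, `experimentalConf` falls outside both branches 
(no shuffle service, dynamic allocation enabled), which is the case the quoted 
remark is about.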






