attilapiros commented on a change in pull request #25299: [SPARK-27651][Core]
Avoid the network when shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#discussion_r357140899
##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -1075,6 +1075,24 @@ package object config {
.booleanConf
.createWithDefault(false)
+ private[spark] val SHUFFLE_HOST_LOCAL_DISK_READING_ENABLED =
+ ConfigBuilder("spark.shuffle.readHostLocalDisk.enabled")
Review comment:
What about disabling this feature when `spark.shuffle.useOldFetchProtocol`
is true (and updating the error message along with the corresponding docs; see
below)?
`spark.shuffle.useOldFetchProtocol` already exists and is documented in one of
the migration guides (although I do not know why it lives in
sql-migration-guide.md).
https://github.com/apache/spark/blob/master/docs/sql-migration-guide.md:
> Since Spark 3.0, we use a new protocol for fetching shuffle blocks, for
external shuffle service users, we need to upgrade the server correspondingly.
Otherwise, we'll get the error message UnsupportedOperationException:
Unexpected message: FetchShuffleBlocks. If it is hard to upgrade the shuffle
service right now, you can still use the old protocol by setting
spark.shuffle.useOldFetchProtocol to true.
I am creating a new jira and PR with this.
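A minimal sketch of the suggested guard, purely illustrative: the object and
method names are assumed (not from the PR), and only the two config key names
come from this thread. The idea is to fail fast when host-local disk reading is
enabled together with the old fetch protocol, since host-local reads depend on
the new `FetchShuffleBlocks` message.

```scala
// Hypothetical illustration of the guard proposed above; not the actual PR code.
object ShuffleConfCheck {
  // Reject the incompatible combination early, with an error message that
  // points users at the relevant configs (as the review suggests).
  def validate(useOldFetchProtocol: Boolean, readHostLocalDisk: Boolean): Unit = {
    if (useOldFetchProtocol && readHostLocalDisk) {
      throw new IllegalArgumentException(
        "spark.shuffle.readHostLocalDisk.enabled requires the new shuffle " +
          "fetch protocol; set spark.shuffle.useOldFetchProtocol=false or " +
          "disable host-local disk reading.")
    }
  }
}
```

Such a check would run once at SparkConf validation time, so the incompatible
pair is reported at startup rather than as an
`UnsupportedOperationException` during a shuffle fetch.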
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]