attilapiros commented on a change in pull request #24499: [SPARK-25888][Core]
Serve local disk persisted blocks by the external service after releasing
executor by dynamic allocation
URL: https://github.com/apache/spark/pull/24499#discussion_r279997057
##########
File path:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -179,6 +179,18 @@ public ManagedBuffer getBlockData(
return getSortBasedShuffleBlockData(executor, shuffleId, mapId, reduceId);
}
+ public ManagedBuffer getBlockData(
Review comment:
The call chain leads to the Worker (I have not found any other place
where this is called):
https://github.com/apache/spark/blob/df3a80da4270c7b5eddb83383d8149ed64b25a66/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala#L753-L755
I can see several possible solutions here:
1) Mix "spark.shuffle.service.enabled" into this condition, i.e. switch off
the cleanup when the external shuffle service is enabled, and log a warning
that explains this decision and the reason behind it.
2) Add a `require` (which throws an `IllegalArgumentException`) that fails
when both "spark.storage.cleanupFilesAfterExecutorExit" and the external
shuffle service are enabled.
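To make the two options concrete, here is a minimal Java sketch. It is not Spark code: the class and method names are hypothetical, the configuration is modeled as a plain `Map<String, String>` instead of `SparkConf`, the warning goes to stderr instead of a logger, and the default values used for the two flags are assumptions.

```java
import java.util.Map;

// Hypothetical helper, only meant to illustrate the two options above.
final class CleanupPolicySketch {
    static final String CLEANUP_KEY = "spark.storage.cleanupFilesAfterExecutorExit";
    static final String SHUFFLE_SERVICE_KEY = "spark.shuffle.service.enabled";

    // Reads a boolean flag from the map; the defaults passed in are assumptions.
    static boolean flag(Map<String, String> conf, String key, boolean defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    // Option 1: skip the cleanup when the external shuffle service is enabled
    // and only warn about the decision.
    static boolean cleanupWithWarning(Map<String, String> conf) {
        boolean cleanup = flag(conf, CLEANUP_KEY, true);
        boolean shuffleService = flag(conf, SHUFFLE_SERVICE_KEY, false);
        if (cleanup && shuffleService) {
            System.err.println("WARN: " + CLEANUP_KEY + " is ignored because "
                + SHUFFLE_SERVICE_KEY + " is enabled; the external shuffle service "
                + "still has to serve the files of released executors.");
            return false;
        }
        return cleanup;
    }

    // Option 2: reject the configuration when both settings are enabled.
    static boolean cleanupOrFail(Map<String, String> conf) {
        boolean cleanup = flag(conf, CLEANUP_KEY, true);
        boolean shuffleService = flag(conf, SHUFFLE_SERVICE_KEY, false);
        if (cleanup && shuffleService) {
            throw new IllegalArgumentException(CLEANUP_KEY + " and "
                + SHUFFLE_SERVICE_KEY + " must not both be enabled");
        }
        return cleanup;
    }
}
```

With both flags set to "true", `cleanupWithWarning` returns `false` after printing the warning, while `cleanupOrFail` throws. Option 1 keeps existing deployments running; option 2 forces the user to resolve the conflict explicitly.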
As far as I can see, standalone mode already has some weaknesses in this area:
https://github.com/apache/spark/blob/df3a80da4270c7b5eddb83383d8149ed64b25a66/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala#L779-L791
I am open to other options. Do you agree that the first one, with the
warning, is a good solution here?