attilapiros commented on issue #24499: [SPARK-27677][Core] Serve local disk 
persisted blocks by the external service after releasing executor by dynamic 
allocation
URL: https://github.com/apache/spark/pull/24499#issuecomment-493194196
 
 
   In the 74f3461 commit I have extended the external shuffle service with the 
capability to remove disk-persisted RDD blocks. Two new 
`BlockTransferMessage`s are introduced:
   - `RemoveBlocks`: contains the removable block IDs and the executor ID. 
Because of the message-encoding possibilities I chose this over an array of 
(executor ID, block IDs) pairs, which would generate a bit less network 
traffic: there would be only one request for all the released executors that 
were running on the same host.
   - `BlocksRemoved`: a reply message containing the number of successfully 
removed blocks.
   
   I think the critical change in this commit is definitely 
`org.apache.spark.network.shuffle.ExternalShuffleClient#removeBlocks`. An 
`RpcEndpointRef` would also provide what we need here, but the current 
solution seems simpler to me.
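
   The request/reply shape of that client call can be sketched as a plain 
asynchronous exchange: send `RemoveBlocks`, get back the `BlocksRemoved` 
count. All names below (`Transport`, `sendRemoveBlocks`, the future-based 
return) are illustrative assumptions, not the actual `ExternalShuffleClient` 
signature in the PR.

```java
import java.util.concurrent.CompletableFuture;

// Hedged sketch of the removeBlocks client call: a single request carrying
// the executor ID and block IDs, answered by the removed-block count
// (the BlocksRemoved reply). Names here are illustrative, not Spark's API.
public class RemoveBlocksClientSketch {

    // Stand-in for the network layer talking to the external shuffle service.
    interface Transport {
        CompletableFuture<Integer> sendRemoveBlocks(String execId, String[] blockIds);
    }

    // The client-side call: fire one request, hand back the async reply.
    static CompletableFuture<Integer> removeBlocks(
            Transport transport, String execId, String[] blockIds) {
        return transport.sendRemoveBlocks(execId, blockIds);
    }

    public static void main(String[] args) throws Exception {
        // Fake service that pretends every requested block was removed.
        Transport fake = (execId, blockIds) ->
            CompletableFuture.completedFuture(blockIds.length);
        int removed = removeBlocks(
            fake, "exec-1", new String[] {"rdd_0_0", "rdd_0_1"}).get();
        System.out.println("removed=" + removed);
    }
}
```

   Compared with routing this through an `RpcEndpointRef`, a direct call on 
the existing shuffle client avoids introducing an extra RPC endpoint for a 
single message pair, which is the simplicity argument above.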
   
   @vanzin the config you asked for is not introduced yet, but I have already 
thought about its name: what about `spark.shuffle.service.fetch.rdd.enabled`?
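
   If the flag lands under that name, enabling it might look like the 
fragment below. This is a hedged config sketch: the `fetch.rdd` key is only 
the name proposed above, and `spark.shuffle.service.enabled` is the existing 
prerequisite flag for the external shuffle service.

```java
import org.apache.spark.SparkConf;

// Config fragment: opt in to serving disk-persisted RDD blocks from the
// external shuffle service. The fetch.rdd key is the name proposed in this
// comment and is not merged yet at the time of writing.
SparkConf conf = new SparkConf()
    .set("spark.shuffle.service.enabled", "true")             // existing flag
    .set("spark.shuffle.service.fetch.rdd.enabled", "true");  // proposed flag
```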
