attilapiros commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-638950902
> Hey @attilapiros can you explain to me why you think we need to test the different kinds of block fetches? When we migrate we're always migrating to disk, so I'm not seeing how it should matter.

I meant testing the different kinds of block *uploads* (not fetches): streaming uploads versus non-streaming ones. Those are currently exercised by `BlockManagerDecommissionSuite` and controlled by the flag `spark.network.maxRemoteBlockSizeFetchToMem`.

But you are right that shuffle uploads should always end up in the file system. To achieve this we should change this condition:

https://github.com/apache/spark/blob/ccb8827c61e3178c83ae505dbddd30b1a6a2d361/core/src/main/scala/org/apache/spark/network/netty/NettyBlockTransferService.scala#L171

to

```scala
val asStream = blockId.isInternalShuffle ||
  blockData.size() > conf.get(config.MAX_REMOTE_BLOCK_SIZE_FETCH_TO_MEM)
```
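For reference, here is a minimal, self-contained sketch of what the proposed `isInternalShuffle` predicate is assumed to check (the helper itself is an assumption here, written as a standalone function rather than a method on `BlockId`): it should be true exactly for the block IDs that are backed by shuffle files on disk, i.e. the shuffle data and index blocks.

```scala
import org.apache.spark.storage.{BlockId, ShuffleDataBlockId, ShuffleIndexBlockId}

object ShuffleBlockCheck {
  // Hypothetical standalone version of the assumed BlockId.isInternalShuffle
  // helper: true exactly for blocks that are backed by shuffle files on disk
  // (the data and index blocks), so an upload of either is always streamed
  // straight to the file system instead of being buffered in memory.
  def isInternalShuffle(blockId: BlockId): Boolean =
    blockId.isInstanceOf[ShuffleDataBlockId] ||
      blockId.isInstanceOf[ShuffleIndexBlockId]
}
```

With such a condition in place, the streaming and non-streaming upload paths for the remaining block types could still be exercised separately in `BlockManagerDecommissionSuite` by lowering `spark.network.maxRemoteBlockSizeFetchToMem` (e.g. to `1b`) so that even small blocks take the streaming path.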
