holdenk commented on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-639013472
> > Hey @attilapiros, can you explain to me why you think we need to test the different kinds of block fetches? When we migrate we're always migrating to disk, so I'm not seeing how it should matter.
>
> I meant to test the different kinds of block uploads (not fetches): streaming uploads and non-streaming ones. Those are currently tested by `BlockManagerDecommissionSuite`, controlled by the flag `spark.network.maxRemoteBlockSizeFetchToMem`.
>
> But you are right, shuffle uploads should always manifest in the file system. To achieve this we should change this condition:
>
> https://github.com/apache/spark/blob/ccb8827c61e3178c83ae505dbddd30b1a6a2d361/core/src/main/scala/org/apache/spark/network/netty/NettyBlockTransferService.scala#L171
>
> to
>
> ```scala
> val asStream = blockId.isInternalShuffle ||
>   blockData.size() > conf.get(config.MAX_REMOTE_BLOCK_SIZE_FETCH_TO_MEM)
> ```

Yeah, I know that we test it explicitly right now, but I'm not seeing why, so I think maybe just taking out the test entirely is better.
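
For reference, a minimal sketch of how the streaming-upload path can be forced by lowering `spark.network.maxRemoteBlockSizeFetchToMem`, roughly the mechanism `BlockManagerDecommissionSuite` relies on. The master/app settings and the replicated-persist trigger below are illustrative assumptions, not the actual suite code:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object StreamingUploadSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local-cluster[2, 1, 1024]")
      .setAppName("streaming-upload-sketch")
      // With a 1 KB threshold, any larger block hits the asStream branch of
      // NettyBlockTransferService.uploadBlock instead of being held in memory.
      .set("spark.network.maxRemoteBlockSizeFetchToMem", "1k")
    val sc = new SparkContext(conf)
    try {
      // Replicated persistence triggers uploadBlock to the peer executor, so
      // these blocks exercise the streaming-upload path under the tiny threshold.
      val rdd = sc.parallelize(1 to 100000, 2).map(_.toString)
      rdd.persist(StorageLevel.MEMORY_ONLY_2)
      rdd.count()
    } finally {
      sc.stop()
    }
  }
}
```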
