cloud-fan opened a new pull request #23510: [SPARK-26590][SQL][CORE] make 
fetch-block-to-disk backward compatible
URL: https://github.com/apache/spark/pull/23510
 
 
   ## What changes were proposed in this pull request?
   
   This is a followup of https://github.com/apache/spark/pull/16989
   
   The fetch-block-to-disk feature is disabled by default because it is not 
compatible with external shuffle services prior to Spark 2.2: the client sends 
a stream request to fetch block chunks, which an old shuffle service does not 
support.
   
   This PR proposes a new approach:
   1. Extend `ChunkFetchRequest` with an optional `fetchAsStream` boolean 
flag. The flag is encoded into the message only when it is true. A 
`ChunkFetchRequest` from an old client does not carry this flag, so the flag 
is treated as false for it.
   2. The server side handles the new flag in `ChunkFetchRequest`. If the 
flag is true, it returns a new chunk stream response; otherwise it returns a 
normal chunk fetch response.
   3. When the client side sends a `ChunkFetchRequest` with 
`fetchAsStream=true`, it sets up two callbacks, one for the new chunk stream 
response and one for the normal chunk fetch response. This is necessary 
because the server side may be an old version that ignores the 
`fetchAsStream` flag.
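
   To illustrate step 1, here is a minimal sketch of an optional trailing 
flag. It is not the actual patch: the class `OptionalFlagSketch`, the 
simplified `Request` type, and the assumed wire layout (an 8-byte stream id, 
a 4-byte chunk index, then one optional byte) are hypothetical and only meant 
to show why an old decoder can ignore the extra byte.

```java
import java.nio.ByteBuffer;

public class OptionalFlagSketch {
    // Hypothetical, simplified stand-in for a chunk fetch request.
    static final class Request {
        final long streamId;
        final int chunkIndex;
        final boolean fetchAsStream;

        Request(long streamId, int chunkIndex, boolean fetchAsStream) {
            this.streamId = streamId;
            this.chunkIndex = chunkIndex;
            this.fetchAsStream = fetchAsStream;
        }
    }

    // Assumed layout: the flag byte is appended only when the flag is true,
    // so a request with the flag unset is byte-identical to the old format.
    static ByteBuffer encode(Request r) {
        int length = 8 + 4 + (r.fetchAsStream ? 1 : 0);
        ByteBuffer buf = ByteBuffer.allocate(length);
        buf.putLong(r.streamId);
        buf.putInt(r.chunkIndex);
        if (r.fetchAsStream) {
            buf.put((byte) 1);
        }
        buf.flip();
        return buf;
    }

    // New decoder: a missing trailing byte means the flag is false.
    static Request decodeNew(ByteBuffer buf) {
        long streamId = buf.getLong();
        int chunkIndex = buf.getInt();
        boolean fetchAsStream = buf.hasRemaining() && buf.get() != 0;
        return new Request(streamId, chunkIndex, fetchAsStream);
    }

    // Old decoder: reads the fixed fields and never looks at trailing bytes.
    static Request decodeOld(ByteBuffer buf) {
        return new Request(buf.getLong(), buf.getInt(), false);
    }

    public static void main(String[] args) {
        ByteBuffer wire = encode(new Request(7L, 3, true));
        System.out.println("encoded bytes: " + wire.remaining());
        System.out.println("new decoder sees flag: " + decodeNew(wire.duplicate()).fetchAsStream);
        System.out.println("old decoder sees flag: " + decodeOld(wire.duplicate()).fetchAsStream);
    }
}
```

   Appending the flag only when it is true keeps requests from new clients 
that do not ask for a stream identical on the wire to requests from old 
clients.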
   
   This is fully compatible:
   1. new client <-> new server: Definitely fine
   2. old client <-> new server: The `ChunkFetchRequest` message doesn't have 
the `fetchAsStream` flag, so the server treats it as a normal fetch request 
and returns a normal chunk fetch response.
   3. new client <-> old server: The `ChunkFetchRequest` message contains the 
`fetchAsStream` flag, but the old server doesn't know about it and stops 
reading the message right before the `fetchAsStream` part. The old server then 
returns a normal chunk fetch response, and the new client accepts it.
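
   The three pairings above can be walked through with a small model. This 
is only a sketch under stated assumptions: `CompatMatrixSketch`, 
`ResponseKind`, `serve`, and `fetch` are hypothetical names, and the model 
assumes a new client always sets the flag. It is not the real client or 
server code and does not replace testing against actual shuffle service 
versions.

```java
public class CompatMatrixSketch {
    // The two response types a client may receive (hypothetical names).
    enum ResponseKind { CHUNK_FETCH, CHUNK_STREAM }

    // A new server honours the flag; an old server never reads it,
    // so it always answers with a normal chunk fetch response.
    static ResponseKind serve(boolean serverIsNew, boolean flagOnWire) {
        return (serverIsNew && flagOnWire)
            ? ResponseKind.CHUNK_STREAM
            : ResponseKind.CHUNK_FETCH;
    }

    // Models one fetch and reports which client callback handled the response.
    static String fetch(boolean clientIsNew, boolean serverIsNew) {
        // Assumption: only a new client puts the flag on the wire.
        boolean flagOnWire = clientIsNew;
        ResponseKind kind = serve(serverIsNew, flagOnWire);

        StringBuilder outcome = new StringBuilder();
        // A new client registers both callbacks; an old client has only the first.
        Runnable onChunkFetch = () -> outcome.append("normal chunk fetch response");
        Runnable onChunkStream = () -> outcome.append("chunk stream response");

        if (kind == ResponseKind.CHUNK_STREAM) {
            if (!clientIsNew) {
                throw new IllegalStateException("old client cannot handle a stream response");
            }
            onChunkStream.run();
        } else {
            onChunkFetch.run();
        }
        return outcome.toString();
    }

    public static void main(String[] args) {
        System.out.println("new client, new server: " + fetch(true, true));
        System.out.println("old client, new server: " + fetch(false, true));
        System.out.println("new client, old server: " + fetch(true, false));
    }
}
```

   In this model an old client never receives a stream response, because the 
server returns one only when the flag was present in the request.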
   
   TODO: set up different versions of the shuffle service and test this.
   
   ## How was this patch tested?
   
   Existing tests.
   

