otterc commented on pull request #30062:
URL: https://github.com/apache/spark/pull/30062#issuecomment-722502275


   > Why would the error happen? Does it mean we calculate the wrong message 
length somewhere when allocating the buffer?
   
   In `RemoteBlockPushResolver` we reuse the `trackerBuf` when we serialize 
the `chunkTracker`. The code is 
[here](https://github.com/linkedin/spark/blob/a8dd6f58fe65db34770ac4165192188fe3b98639/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java#L864).
   Once the bytes are written to the file, the `trackerBuf` is cleared.
   `trackerBuf` is a Java heap buffer allocated with an initial capacity that 
expands on demand.
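   
   For illustration, here is a minimal sketch of that reuse pattern with 
Netty's `ByteBuf` API (the class and method names are hypothetical, not the 
actual `RemoteBlockPushResolver` code):
   ```java
   import io.netty.buffer.ByteBuf;
   import io.netty.buffer.Unpooled;

   import java.io.IOException;
   import java.nio.channels.GatheringByteChannel;

   // Hypothetical names; not the actual RemoteBlockPushResolver code.
   class TrackerWriter {
       // Heap buffer with an initial capacity; Netty grows it on demand.
       private final ByteBuf trackerBuf = Unpooled.buffer(4 * 1024);

       void flushTracker(byte[] serializedTracker, GatheringByteChannel out)
               throws IOException {
           // writeBytes calls ensureWritable internally, so the buffer
           // expands if the tracker is larger than the remaining capacity.
           trackerBuf.writeBytes(serializedTracker);
           while (trackerBuf.isReadable()) {
               trackerBuf.readBytes(out, trackerBuf.readableBytes());
           }
           // clear() only resets the reader/writer indices; the (possibly
           // expanded) capacity is kept, so the buffer can be reused.
           trackerBuf.clear();
       }
   }
   ```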
   
   In the magnet-upstream branch this was done with `buf.writeBytes(bytes)`; 
the code is 
[here](https://github.com/linkedin/spark/blob/7478a3edff46b77d325d30bd952d6ba0a2f479ff/common/network-common/src/main/java/org/apache/spark/network/protocol/Encoders.java#L102). 
The implementation of `AbstractByteBuf.writeBytes` also calls `ensureWritable` 
before writing to the buffer:
   ```java
       @Override
       public ByteBuf writeBytes(byte[] src, int srcIndex, int length) {
           ensureWritable(length);
           setBytes(writerIndex, src, srcIndex, length);
           writerIndex += length;
           return this;
       }
   ``` 
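
   So any write path that bypasses `writeBytes` and uses raw `setBytes` has to 
reserve the space itself. A quick demonstration of that part of the `ByteBuf` 
contract (not Spark code, just an illustrative sketch):
   ```java
   import io.netty.buffer.ByteBuf;
   import io.netty.buffer.Unpooled;

   class EnsureWritableDemo {
       // setBytes writes at an absolute index and never grows the buffer,
       // so a caller bypassing writeBytes must reserve the space itself.
       static void writeManually(ByteBuf buf, byte[] src) {
           buf.ensureWritable(src.length); // grow capacity if needed
           buf.setBytes(buf.writerIndex(), src, 0, src.length);
           buf.writerIndex(buf.writerIndex() + src.length); // advance manually
       }

       public static void main(String[] args) {
           ByteBuf buf = Unpooled.buffer(4); // deliberately small capacity
           writeManually(buf, new byte[]{1, 2, 3, 4, 5, 6, 7, 8});
           System.out.println(buf.readableBytes()); // prints 8; buffer grew
       }
   }
   ```
   Without the explicit `ensureWritable`, the `setBytes` call past the current 
capacity would throw an `IndexOutOfBoundsException`.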

