Victsm commented on a change in pull request #30433:
URL: https://github.com/apache/spark/pull/30433#discussion_r528850602



##########
File path: 
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
##########
@@ -827,13 +833,16 @@ void resetChunkTracker() {
     void updateChunkInfo(long chunkOffset, int mapIndex) throws IOException {
       long idxStartPos = -1;
       try {
-        // update the chunk tracker to meta file before index file
-        writeChunkTracker(mapIndex);
         idxStartPos = indexFile.getFilePointer();
        logger.trace("{} shuffleId {} reduceId {} updated index current {} updated {}",
          appShuffleId.appId, appShuffleId.shuffleId, reduceId, this.lastChunkOffset,
          chunkOffset);
-        indexFile.writeLong(chunkOffset);
+        indexFile.write(Longs.toByteArray(chunkOffset));
+        // Chunk bitmap should be written to the meta file after the index file because if there are
+        // any exceptions during writing the offset to the index file, meta file should not be
+        // updated. If the update to the index file is successful but the update to meta file isn't
+        // then the index file position is reset in the catch clause.
+        writeChunkTracker(mapIndex);
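
For context, a minimal sketch of the surrounding try/catch as the new comment describes it, including the catch-clause reset of the index file position. The class wrapper, constructor, and the writeChunkTracker stub below are illustrative assumptions, not code from the PR:

```java
import com.google.common.primitives.Longs;
import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch of the write ordering: index file first, then chunk tracker (meta file),
// with the index position rolled back if either write throws.
class ChunkInfoWriteSketch {
  private final RandomAccessFile indexFile;

  ChunkInfoWriteSketch(RandomAccessFile indexFile) {
    this.indexFile = indexFile;
  }

  void updateChunkInfo(long chunkOffset, int mapIndex) throws IOException {
    long idxStartPos = -1;
    try {
      // Remember where the index write starts so it can be undone on failure.
      idxStartPos = indexFile.getFilePointer();
      indexFile.write(Longs.toByteArray(chunkOffset));
      // The meta file is only touched after the index write has succeeded.
      writeChunkTracker(mapIndex);
    } catch (IOException ioe) {
      if (idxStartPos != -1) {
        // The index write may have succeeded while the meta update failed,
        // so rewind the index file to keep the two files consistent.
        indexFile.seek(idxStartPos);
      }
      throw ioe;
    }
  }

  // Placeholder for the meta-file (chunk bitmap) update.
  private void writeChunkTracker(int mapIndex) throws IOException {
    // ...
  }
}
```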

Review comment:
       There are temporary issues as well.
   For example, a user's job might be filling up the YARN local dir with its temporary data, which would then lead to IOExceptions for merge writes.
   When this happens, it could trigger a cleanup script that purges the temporary data from that job to recover the disk.
   In this scenario, the IOExceptions could be temporary.
   
   As for the too-large-chunk issue, a chunk would only grow too large if we hit many consecutive IOExceptions.
   Once the disk recovers, the following chunks would go back to the normal size.
   Just 2 consecutive IOExceptions within the same chunk are far from enough to cause the potential too-large-chunk issue.
   
   If we want a reasonable policy to stop early because of too many IOExceptions, I think that policy should require more than 2 consecutive IOExceptions during chunk metadata writes to trigger.
   Note that the same issue could affect writing the data file as well.
   If we want to address this holistically, it would require some state tracking.
   If so, I'd rather track it on the client side than on the server side.
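   
   To make that concrete, a rough sketch of what such a threshold could look like, assuming hypothetical names and a placeholder limit (the only claim above is that it should be more than 2 consecutive IOExceptions):

```java
import java.io.IOException;

// Hypothetical "give up only after several consecutive failures" policy for
// chunk metadata writes; names and the threshold value are illustrative only.
class ChunkMetaFailurePolicy {
  // The argument above only says this should be larger than 2; 4 is a placeholder.
  private static final int MAX_CONSECUTIVE_IO_FAILURES = 4;
  private int consecutiveIoFailures = 0;

  // Call after a successful chunk metadata write: the disk has recovered.
  void onWriteSuccess() {
    consecutiveIoFailures = 0;
  }

  // Call after a failed chunk metadata write; rethrows once the streak is too long.
  void onWriteFailure(IOException cause) throws IOException {
    consecutiveIoFailures++;
    if (consecutiveIoFailures > MAX_CONSECUTIVE_IO_FAILURES) {
      throw new IOException(
        consecutiveIoFailures + " consecutive chunk metadata write failures", cause);
    }
  }
}
```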



