otterc commented on pull request #34934:
URL: https://github.com/apache/spark/pull/34934#issuecomment-1016956929


   > @otterc I reproduced this issue today and sent an email to you with the logs 
and spark confs you requested 
[here](https://github.com/apache/spark/pull/35076#issuecomment-1010458523). 
I highly suspect that 
[SPARK-37675](https://issues.apache.org/jira/browse/SPARK-37675) and 
[SPARK-37793](https://issues.apache.org/jira/browse/SPARK-37793) share the same 
root cause. Please let me know if there is anything else I can do.
   
   @pan3793 Would you be able to add these changes and rerun this test?
   1. Log the reduceId in the iterator for which the assertion fails. Changing 
the assertion to this will work:  
   `assert(numChunks > 0, s"zero chunks for $blockId")`
   
   2. In `RemoteBlockPushResolver.finalizeShuffleMerge`, add this condition so 
that a partition's `mapTracker` and `reduceId` are added to the results only 
when its `mapTracker` is non-empty:
   ```
            try {
              // This can throw IOException, which marks this shuffle partition
              // as not merged.
              partition.finalizePartition();
              if (partition.mapTracker.getCardinality() > 0) { // needs to be added
                bitmaps.add(partition.mapTracker);
                reduceIds.add(partition.reduceId);
                sizes.add(partition.getLastChunkOffset());
              }
            } catch (IOException ioe) {
              logger.warn("Exception while finalizing shuffle partition {}_{} {} {}",
                msg.appId, msg.appAttemptId, msg.shuffleId, partition.reduceId, ioe);
            } finally {
              partition.closeAllFilesAndDeleteIfNeeded(false);
            }
   ```
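
To make the intent of the guard easier to see in isolation, here is a minimal, self-contained sketch. It is not the `RemoteBlockPushResolver` code: the `Partition` class and the `nonEmptyReduceIds` helper are hypothetical, and `java.util.BitSet` stands in for the real map tracker bitmap. It only illustrates that a partition with an empty tracker is left out of the results.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class FinalizeGuardSketch {
    // Hypothetical stand-in for a merged shuffle partition.
    // BitSet replaces the real map tracker bitmap (an assumption for this sketch).
    static final class Partition {
        final int reduceId;
        final BitSet mapTracker;

        Partition(int reduceId, BitSet mapTracker) {
            this.reduceId = reduceId;
            this.mapTracker = mapTracker;
        }
    }

    // Collects the reduce IDs of partitions whose tracker has at least one bit set.
    static List<Integer> nonEmptyReduceIds(List<Partition> partitions) {
        List<Integer> reduceIds = new ArrayList<>();
        for (Partition partition : partitions) {
            if (partition.mapTracker.cardinality() > 0) { // same idea as the guard above
                reduceIds.add(partition.reduceId);
            }
        }
        return reduceIds;
    }

    public static void main(String[] args) {
        BitSet merged = new BitSet();
        merged.set(3); // pretend the map output with index 3 was merged

        List<Partition> partitions = List.of(
            new Partition(0, new BitSet()), // nothing merged, should be skipped
            new Partition(1, merged));

        System.out.println(nonEmptyReduceIds(partitions)); // prints [1]
    }
}
```

If the guard is in place, a reduce partition that never received any blocks should not show up in the reported `reduceIds`, which is what I would like to confirm from the logs.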
   Please let me know if you can rerun with these changes and share the logs 
with me. 
   Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


