ahmarsuhail opened a new pull request, #14329:
URL: https://github.com/apache/iceberg/pull/14329

   This PR:
   
   * Adds ConnectionClosedException to the list of retryable exceptions. These 
exceptions surface fairly often and currently lead to task failures.
   
   * Removes the `read()` call from `abort()`.
   
   When `abort()` is called on stream close, it currently performs a `read()`.
   
   When the abort follows a reset after an exception, in `resetForRetry()`, 
that `read()` throws an exception:
   
   ```
   software.amazon.awssdk.core.exception.RetryableException: Data read has a different checksum than expected. Was 0x06625dea76e80c661d03ca53f74a3f11, but expected 0x00000000000000000000000000000000
   ```
   
   This seems to be because, after a failure, the underlying stream returns 
-1, which triggers the SDK's checksum validation. Since the read did not 
complete, the checksums were not updated, so both the validation and the 
abort() fail.
   
   While this does not appear to have any functional impact, it adds a lot of 
noise to the logs. Instead of doing a `read()` to check for EOF, EOF can be 
determined by comparing the current position in the stream with the content 
length. S3A does the same, here: 
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L732
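
For illustration only, here is a minimal sketch of the position-based EOF check described above. `ReadProgress` and its methods are hypothetical names invented for this example; they are not the classes touched by this PR, and the actual change may track position differently.

```java
/**
 * Hypothetical helper (not part of Iceberg or the AWS SDK): tracks how far a
 * read has progressed so that EOF can be decided without calling read().
 */
final class ReadProgress {
  private final long contentLength;
  private long pos;

  ReadProgress(long startPos, long contentLength) {
    if (startPos < 0 || contentLength < startPos) {
      throw new IllegalArgumentException("invalid start position or content length");
    }
    this.pos = startPos;
    this.contentLength = contentLength;
  }

  /** Records the return value of a read call; -1 (EOF) and 0 leave pos unchanged. */
  void advance(int bytesRead) {
    if (bytesRead > 0) {
      pos += bytesRead;
    }
  }

  /** Bytes of the object that have not been consumed yet. */
  long remaining() {
    return Math.max(0L, contentLength - pos);
  }

  /** True once every expected byte has been consumed; no extra read() is needed. */
  boolean atEndOfContent() {
    return remaining() == 0;
  }
}
```

With a helper like this, the close path could consult `atEndOfContent()` when deciding how to release the underlying stream, instead of probing it with a `read()` that may trip checksum validation after a failed request.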


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

