dramaticlly commented on issue #4168: URL: https://github.com/apache/iceberg/issues/4168#issuecomment-1185805540
> Looks like one problem is that the exception is [logged but not rethrown in the async future](https://github.com/apache/iceberg/blob/master/aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputStream.java#L309-L318). Because no exception is thrown, the call to `CompletableFuture::join` in `completeMultipartUpload` succeeds and should return null for that part.

Yes @rdblue, and I think we also observed that; it aligns with what @danielcweeks suggested earlier in https://github.com/apache/iceberg/issues/4168#issuecomment-1123000990. I think rethrowing after logging is desired, but subsequent multipart uploads will also fail due to the previous abort. We also attempted to reproduce an aborted MPU in integration tests and verified that we can assert the thrown exceptions on close.

> Either way, it looks like some exception should cause close to fail, but that is apparently ignored since the writer creates metadata for the file.

Yep. Any suggestions/pointers that help us understand why this is happening would be of tremendous help. Today we saw 3 more occurrences of this; some are related to Flink TM out-of-memory/heap issues and some are not, so we have to disable S3FileIO for Flink until this is root-caused and prevention is in place. On the other hand, we have had S3FileIO turned on for Spark for months now and have never seen any issue of this kind.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected]
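The swallowed-exception pattern discussed above can be illustrated with a minimal, self-contained sketch (this is not the actual `S3OutputStream` code; the class and method names here are hypothetical). If the async upload task catches and only logs its exception, the future completes "successfully" with `null`, so `join()` raises nothing and `close()` proceeds to write metadata; rethrowing after logging makes `join()` fail with a `CompletionException` instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical sketch of the failure mode: a part-upload error that is
// logged but not rethrown leaves the future "successful" with a null result.
public class SwallowedAsyncFailure {

    // Mimics the problematic pattern: the failure is logged, not rethrown.
    static CompletableFuture<String> uploadPartLoggingOnly() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                throw new RuntimeException("part upload failed");
            } catch (RuntimeException e) {
                System.err.println("log only: " + e.getMessage());
                return null; // future completes normally with null
            }
        });
    }

    // The suggested fix: rethrow after logging so join() propagates the error.
    static CompletableFuture<String> uploadPartRethrowing() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                throw new RuntimeException("part upload failed");
            } catch (RuntimeException e) {
                System.err.println("log then rethrow: " + e.getMessage());
                throw e; // join() will surface this as a CompletionException
            }
        });
    }

    public static void main(String[] args) {
        // Swallowed: join() succeeds and yields null, so close() sees no error.
        String part = uploadPartLoggingOnly().join();
        System.out.println("swallowed result = " + part);

        // Rethrown: join() fails, which would make close() fail as desired.
        try {
            uploadPartRethrowing().join();
        } catch (CompletionException e) {
            System.out.println("propagated: " + e.getCause().getMessage());
        }
    }
}
```

As noted above, rethrowing alone is not a complete fix: once the multipart upload has been aborted, the remaining part uploads will also fail, so the later failures are a consequence of the first one.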
