krisdas opened a new issue #4168:
URL: https://github.com/apache/iceberg/issues/4168


   Flink version: 1.13.2
   Iceberg version: 0.12
   We were trying S3FileIO instead of the default HadoopFileIO to ingest data 
into Iceberg on S3.
   
   While committing/saving a checkpoint, we see the call stack below while a 
data file is being uploaded. It looks like the upload of the data file (for 
example: s3/bucket/.../iceberg_db.db/iceberg_table/data/file.parquet) fails.
   
   `
   exception: { [-]
         exception_class:  java.util.concurrent.CompletionException
         exception_message:  java.io.UncheckedIOException: java.nio.file.NoSuchFileException: /tmp/s3fileio-224113294985193160.tmp
         stacktrace:  java.util.concurrent.CompletionException: java.io.UncheckedIOException: java.nio.file.NoSuchFileException: /tmp/s3fileio-224113294985193160.tmp
        at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
        at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: java.io.UncheckedIOException: java.nio.file.NoSuchFileException: /tmp/s3fileio-224113294985193160.tmp
        at software.amazon.awssdk.utils.FunctionalUtils.asRuntimeException(FunctionalUtils.java:180)
        at software.amazon.awssdk.utils.FunctionalUtils.lambda$safeSupplier$4(FunctionalUtils.java:110)
        at software.amazon.awssdk.utils.FunctionalUtils.invokeSafely(FunctionalUtils.java:136)
        at software.amazon.awssdk.core.sync.RequestBody.fromFile(RequestBody.java:88)
        at software.amazon.awssdk.core.sync.RequestBody.fromFile(RequestBody.java:99)
        at org.apache.iceberg.aws.s3.S3OutputStream.lambda$uploadParts$1(S3OutputStream.java:237)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
        ... 3 more
   Caused by: java.nio.file.NoSuchFileException: /tmp/s3fileio-224113294985193160.tmp
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
        at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
        at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:149)
        at java.base/sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
        at java.base/java.nio.file.Files.readAttributes(Files.java:1764)
        at java.base/java.nio.file.Files.size(Files.java:2381)
        at software.amazon.awssdk.core.sync.RequestBody.lambda$fromFile$0(RequestBody.java:88)
        at software.amazon.awssdk.utils.FunctionalUtils.lambda$safeSupplier$4(FunctionalUtils.java:108)
        ... 8 more
   `
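   
   To make the shape of the trace easier to follow, here is a minimal sketch 
in plain Java that produces the same exception chain. It only illustrates how 
a missing local staging file surfaces through an async task; it does not show 
why the file went missing, and it is not the Iceberg or AWS SDK code. The 
class `MissingStagingFileDemo` and its methods are hypothetical names made up 
for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical demo class: not part of Iceberg or the AWS SDK.
public class MissingStagingFileDemo {

    // Mimics the shape of the failing frames: an async task that reads the
    // size of a local staging file, converting IOException to an unchecked one.
    static CompletableFuture<Long> sizeAsync(Path stagingFile) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return Files.size(stagingFile);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Waits for the task and returns the simple class names of the
    // exception chain seen by the caller, outermost first.
    static String failureChain(Path stagingFile) {
        try {
            sizeAsync(stagingFile).join();
            return "no failure";
        } catch (CompletionException e) {
            StringBuilder chain = new StringBuilder();
            for (Throwable t = e; t != null; t = t.getCause()) {
                if (chain.length() > 0) {
                    chain.append(" -> ");
                }
                chain.append(t.getClass().getSimpleName());
            }
            return chain.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a staging file, then delete it before the async task reads it.
        Path staging = Files.createTempFile("s3fileio-", ".tmp");
        Files.delete(staging);
        System.out.println(failureChain(staging));
    }
}
```

   If the staging file is gone by the time the task runs, the caller sees a 
CompletionException wrapping an UncheckedIOException wrapping a 
NoSuchFileException, which matches the nesting in the trace above.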
   
   However, the Iceberg manifest file still references that data file 
(s3/bucket/.../iceberg_db.db/iceberg_table/data/file.parquet). As a result, 
Iceberg treats the data file as valid and existing.
   
   Running another query that needs to read the above data file then throws 
an exception.
   
   Has anybody seen this scenario? We are trying to reproduce it for further 
investigation.
   
   Code link : 
https://github.com/apache/iceberg/blob/master/aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputStream.java#L291


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


