Hi,

We are running Flink 1.20.1 and see a strange issue when reading a
savepoint from MinIO/S3 into a HashMap state backend. The error suggests
that the file is missing, but when we check the S3 bucket, the file is
there. The failure is intermittent and only happens from time to time, so
we suspect an environmental issue. Are there any options available to make
Flink retry the read?

This is the exception we see:

org.apache.flink.runtime.state.BackendBuildingException: Failed when trying to restore heap backend
        at org.apache.flink.runtime.state.heap.HeapKeyedStateBackendBuilder.restoreState(HeapKeyedStateBackendBuilder.java:174)
        at org.apache.flink.runtime.state.heap.HeapKeyedStateBackendBuilder.build(HeapKeyedStateBackendBuilder.java:108)
        at org.apache.flink.runtime.state.hashmap.HashMapStateBackend.createKeyedStateBackend(HashMapStateBackend.java:119)
        at org.apache.flink.runtime.state.hashmap.HashMapStateBackend.createKeyedStateBackend(HashMapStateBackend.java:61)
        at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$3(StreamTaskStateInitializerImpl.java:446)
        at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:173)
        at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:137)
        at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:457)
        at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:203)
        at org.apache.flink.state.api.input.StreamOperatorContextBuilder.build(StreamOperatorContextBuilder.java:129)
        at org.apache.flink.state.api.input.KeyedStateInputFormat.open(KeyedStateInputFormat.java:176)
        at org.apache.flink.state.api.input.KeyedStateInputFormat.open(KeyedStateInputFormat.java:66)
        at org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:92)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:113)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:71)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:338)
Caused by: com.facebook.presto.hive.s3.PrestoS3FileSystem$UnrecoverableS3OperationException: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 1841CDA87F9BAC8F; S3 Extended Request ID: e5c2c7654856b7589c81653d762ab26f50a21aeaa0de520cb1263639b9f43328; Proxy: null), S3 Extended Request ID: e5c2c7654856b7589c81653d762ab26f50a21aeaa0de520cb1263639b9f43328 (Path: s3://aiops-ir-lifecycle/savepoints/savepoint-7a276c-8ba7a1a7741b/2bef5371-e008-4e36-a0fe-c7e6fe11c844)
        at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$2(PrestoS3FileSystem.java:1114)
        at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:139)
        at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:1099)
        at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:1084)
        at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.seekStream(PrestoS3FileSystem.java:1077)
        at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$read$1(PrestoS3FileSystem.java:1021)
        at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:139)
        at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.read(PrestoS3FileSystem.java:1020)
        at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:244)
        at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:263)
        at java.base/java.io.FilterInputStream.read(FilterInputStream.java:82)
        at org.apache.flink.fs.s3presto.common.HadoopDataInputStream.read(HadoopDataInputStream.java:88)
        at java.base/java.io.DataInputStream.readInt(DataInputStream.java:381)
        at org.apache.flink.core.io.VersionedIOReadableWritable.read(VersionedIOReadableWritable.java:47)
        at org.apache.flink.runtime.state.KeyedBackendSerializationProxy.read(KeyedBackendSerializationProxy.java:143)
        at org.apache.flink.runtime.state.restore.FullSnapshotRestoreOperation.readMetaData(FullSnapshotRestoreOperation.java:194)
        at org.apache.flink.runtime.state.restore.FullSnapshotRestoreOperation.restoreKeyGroupsInStateHandle(FullSnapshotRestoreOperation.java:171)
        at org.apache.flink.runtime.state.restore.FullSnapshotRestoreOperation.access$100(FullSnapshotRestoreOperation.java:113)
        at org.apache.flink.runtime.state.restore.FullSnapshotRestoreOperation$1.next(FullSnapshotRestoreOperation.java:158)
        at org.apache.flink.runtime.state.restore.FullSnapshotRestoreOperation$1.next(FullSnapshotRestoreOperation.java:140)
        at org.apache.flink.runtime.state.heap.HeapSavepointRestoreOperation.restore(HeapSavepointRestoreOperation.java:116)
        at org.apache.flink.runtime.state.heap.HeapSavepointRestoreOperation.restore(HeapSavepointRestoreOperation.java:58)
        at org.apache.flink.runtime.state.heap.HeapKeyedStateBackendBuilder.restoreState(HeapKeyedStateBackendBuilder.java:171)
        ... 15 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 1841CDA87F9BAC8F; S3 Extended Request ID: e5c2c7654856b7589c81653d762ab26f50a21aeaa0de520cb1263639b9f43328; Proxy: null), S3 Extended Request ID: e5c2c7654856b7589c81653d762ab26f50a21aeaa0de520cb1263639b9f43328
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1912)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1450)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1419)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1183)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:838)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:805)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:779)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:735)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:717)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:581)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5593)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5540)
        at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1574)
        at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$2(PrestoS3FileSystem.java:1102)
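
To make the question concrete, the kind of retry we have in mind looks
roughly like the sketch below. Judging by the exception name
(UnrecoverableS3OperationException), the 404 seems to be treated as
non-retryable inside the Presto file system, so this would have to wrap
whatever triggers the savepoint read on our side. The class and method
names (RetryOnMissingKey, withRetry, looksLikeMissingKey) are hypothetical
helpers of ours and not Flink API, and matching on the "NoSuchKey" text in
the cause chain is only an assumption based on the message above.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper: retries an action only when it fails with an S3 "NoSuchKey" error. */
final class RetryOnMissingKey {

    private RetryOnMissingKey() {}

    /**
     * Calls the action up to maxAttempts times, sleeping for the given pause
     * between attempts. Failures that do not look like a missing key are
     * rethrown immediately; the last missing-key failure is rethrown once
     * all attempts are used up.
     */
    static <T> T withRetry(Callable<T> action, int maxAttempts, Duration pause) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!looksLikeMissingKey(e)) {
                    throw e; // unrelated failure, do not retry
                }
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pause.toMillis());
                }
            }
        }
        throw last;
    }

    /** Walks the cause chain (bounded depth) looking for "NoSuchKey" in any message. */
    static boolean looksLikeMissingKey(Throwable t) {
        int depth = 0;
        for (Throwable c = t; c != null && depth < 32; c = c.getCause(), depth++) {
            String message = c.getMessage();
            if (message != null && message.contains("NoSuchKey")) {
                return true;
            }
        }
        return false;
    }
}
```

We would call it as withRetry(() -> runSavepointRead(), 3,
Duration.ofSeconds(5)), where runSavepointRead() stands for our own code
that submits the job reading the savepoint. A retry like this only helps if
the object becomes visible shortly afterwards; it would just delay the
failure if the key is really gone.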



Thanks

JM
