aidadenski opened a new issue, #6777:
URL: https://github.com/apache/paimon/issues/6777

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.
   
   
   ### Paimon version
   
   0.8.2
   
   ### Compute Engine
   
   Flink
   
   ### Minimal reproduce step
   
   1. Run a Flink job writing to Paimon on Object Storage (S3).
   
   2. Simulate a network timeout specifically during the "Commit/Rename" phase of a checkpoint.
   
   3. Trigger the file system client to retry the rename operation.
   
   4. Observe that the target snapshot file is deleted.
   
   5. Wait for Flink failover; the job gets stuck in a restart loop, failing with a FileNotFoundException:
   `2025-11-12 18:05:09
   java.lang.RuntimeException: java.io.FileNotFoundException: File 's3://dt-warehouse/paimon-warehouse/nd_game_sjmy_cdm.db/dwd_sjmy_gmlog_boss_user_bekilllist/manifest/manifest-list-20f67c4d-ba05-4dbf-a26a-24f229b949cc-32' not found, Possible causes: 1.snapshot expires too fast, you can configure 'snapshot.time-retained' option with a larger value. 2. consumption is too slow, you can improve the performance of consumption (For example, increasing parallelism).
        at org.apache.paimon.flink.sink.AsyncLookupSinkWrite.<init>(AsyncLookupSinkWrite.java:75)
        at org.apache.paimon.flink.sink.FlinkSink.lambda$createWriteProvider$672a9d60$1(FlinkSink.java:147)
        at org.apache.paimon.flink.sink.TableWriteOperator.initializeState(TableWriteOperator.java:78)
        at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:122)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:316)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:306)
        at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:759)
        at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:734)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:699)
        at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:971)
        at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:940)
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:764)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:574)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 's3://dt-warehouse/paimon-warehouse/nd_game_sjmy_cdm.db/dwd_sjmy_gmlog_boss_user_bekilllist/manifest/manifest-list-20f67c4d-ba05-4dbf-a26a-24f229b949cc-32' not found, Possible causes: 1.snapshot expires too fast, you can configure 'snapshot.time-retained' option with a larger value. 2. consumption is too slow, you can improve the performance of consumption (For example, increasing parallelism).
        at org.apache.paimon.utils.ObjectsCache.readSegments(ObjectsCache.java:143)
        at org.apache.paimon.utils.ObjectsCache.read(ObjectsCache.java:93)
        at org.apache.paimon.utils.ObjectsFile.readWithIOException(ObjectsFile.java:149)
        at org.apache.paimon.utils.ObjectsFile.read(ObjectsFile.java:134)
        at org.apache.paimon.utils.ObjectsFile.read(ObjectsFile.java:105)
        at org.apache.paimon.manifest.ManifestList.readDataManifests(ManifestList.java:90)
        at org.apache.paimon.operation.ManifestsReader.readManifests(ManifestsReader.java:128)
        at org.apache.paimon.operation.ManifestsReader.read(ManifestsReader.java:114)
        at org.apache.paimon.operation.AbstractFileStoreScan.readManifests(AbstractFileStoreScan.java:417)
        at org.apache.paimon.operation.AbstractFileStoreScan.plan(AbstractFileStoreScan.java:257)
        at org.apache.paimon.operation.AbstractFileStoreWrite.scanExistingFileMetas(AbstractFileStoreWrite.java:491)
        at org.apache.paimon.operation.AbstractFileStoreWrite.createWriterContainer(AbstractFileStoreWrite.java:440)`
   
   ### What doesn't meet your expectations?
   
   The rename operation should be idempotent, or it should check whether the target file exists before treating the operation as failed. If, during a retry, the source is missing but the target exists and matches expectations, the rename should be treated as a success. At the very least, the target file should not be deleted.
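
The expected behavior could be sketched roughly as follows. This is only an illustration on the local file system using `java.nio.file`, not Paimon's actual commit code or its `FileIO` API; the class `IdempotentRename`, the method `renameIdempotent`, and the file names are hypothetical, and comparing full file content is just one assumed way to decide that the target "matches expectations".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class IdempotentRename {

    /**
     * Hypothetical helper: moves src to dst without overwriting.
     * Returns true if dst holds the expected content afterwards.
     * It never deletes dst, even when the rename cannot be confirmed.
     */
    static boolean renameIdempotent(Path src, Path dst, byte[] expected) throws IOException {
        if (Files.exists(src)) {
            if (Files.exists(dst)) {
                // dst was already produced (by an earlier attempt or another writer):
                // report whether it matches, but do not overwrite or delete it.
                return Arrays.equals(Files.readAllBytes(dst), expected);
            }
            // Without REPLACE_EXISTING, move throws FileAlreadyExistsException
            // if dst appears concurrently, so an existing target is never clobbered.
            Files.move(src, dst);
            return true;
        }
        // Source is missing: a previous attempt may have succeeded even though
        // the caller only saw a timeout. Treat a matching target as success.
        return Files.exists(dst) && Arrays.equals(Files.readAllBytes(dst), expected);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-demo");
        Path src = dir.resolve("snapshot-1.tmp");
        Path dst = dir.resolve("snapshot-1");
        byte[] content = "snapshot".getBytes(StandardCharsets.UTF_8);
        Files.write(src, content);

        boolean first = renameIdempotent(src, dst, content);
        // Simulated retry: the first rename succeeded, but the client retries anyway.
        boolean retry = renameIdempotent(src, dst, content);

        System.out.println(first + " " + retry + " " + Files.exists(dst));
    }
}
```

In this sketch the retry finds the source gone and the target present with the expected bytes, so it reports success and leaves the target in place. On an object store such as S3, rename is typically implemented as copy plus delete, so the existence and content checks would need to go through the store's own client rather than `java.nio.file`.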
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!

