leosanqing opened a new issue, #1483: URL: https://github.com/apache/fluss/issues/1483
### Search before asking

- [x] I searched in the [issues](https://github.com/alibaba/fluss/issues) and found nothing similar.

### Fluss version

0.7.0 (latest release)

### Please describe the bug 🐞

Environment: Flink 1.18 on YARN 3.2.x; Fluss 0.8-SNAPSHOT on 3 nodes, with the coordinator on node 01 and tablet servers on nodes 02 and 03.

After writing data to a Fluss table, querying the table in streaming mode fails with the exception below (batch mode works normally):

```
2025-08-05 10:32:07,672 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: pk_table01[4] -> ConstraintEnforcer[5] -> Sink: Collect table sink (1/1) (34300f6d721719fa11f7a22b725d4020_cbc357ccb763df2852fee8c4fc7d55f2_0_8) switched from RUNNING to FAILED on container_1752656188192_0009_01_000002 @ bigdata03 (dataPort=40855).
java.lang.RuntimeException: One or more fetchers have encountered exception
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:263) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:185) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:147) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:419) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:562) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746) ~[flink-dist-1.18.1.jar:1.18.1]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) ~[flink-dist-1.18.1.jar:1.18.1]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_361]
Caused by: java.lang.RuntimeException: SplitFetcher thread 0 received unexpected exception while polling the records
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:168) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:117) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_361]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_361]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_361]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_361]
    ... 1 more
Caused by: com.alibaba.fluss.exception.FlussRuntimeException: Failed to get snapshot metadata
    at com.alibaba.fluss.client.table.scanner.TableScan.createBatchScanner(TableScan.java:127) ~[fluss-flink-1.18-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.flink.source.reader.FlinkSourceSplitReader.checkSnapshotSplitOrStartNext(FlinkSourceSplitReader.java:375) ~[fluss-flink-1.18-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.flink.source.reader.FlinkSourceSplitReader.fetch(FlinkSourceSplitReader.java:143) ~[fluss-flink-1.18-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:165) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:117) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_361]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_361]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_361]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_361]
    ... 1 more
Caused by: java.util.concurrent.ExecutionException: com.alibaba.fluss.exception.KvSnapshotNotExistException: Failed to get kv snapshot metadata for table bucket TableBucket{tableId=5, bucket=0} and snapshot id 0. Error: /tmp/fluss-remote-data/kv/fluss/pk_table01-5/0/snap-0/_METADATA (No such file or directory)
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_361]
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) ~[?:1.8.0_361]
    at com.alibaba.fluss.client.table.scanner.TableScan.createBatchScanner(TableScan.java:125) ~[fluss-flink-1.18-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.flink.source.reader.FlinkSourceSplitReader.checkSnapshotSplitOrStartNext(FlinkSourceSplitReader.java:375) ~[fluss-flink-1.18-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at com.alibaba.fluss.flink.source.reader.FlinkSourceSplitReader.fetch(FlinkSourceSplitReader.java:143) ~[fluss-flink-1.18-0.8-SNAPSHOT.jar:0.8-SNAPSHOT]
    at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:165) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:117) ~[flink-connector-files-1.18.1.jar:1.18.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_361]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_361]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_361]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_361]
    ... 1 more
Caused by: com.alibaba.fluss.exception.KvSnapshotNotExistException: Failed to get kv snapshot metadata for table bucket TableBucket{tableId=5, bucket=0} and snapshot id 0. Error: /tmp/fluss-remote-data/kv/fluss/pk_table01-5/0/snap-0/_METADATA (No such file or directory)
```

I found the `_METADATA` file on node 01 (coordinator), but the snapshot data is on node 03 (tablet server). The remote data directory `/tmp/fluss-remote-data` appears to be a node-local path, so each node sees only the files it wrote itself.

### Solution

Switch the remote data path to HDFS, as in a production environment.

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!
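
For reference, a minimal sketch of what the HDFS workaround from the Solution section might look like. It assumes that `remote.data.dir` in `conf/server.yaml` is the option behind the `/tmp/fluss-remote-data` path in the error, and that the same value is set on the coordinator and on every tablet server. The NameNode host, port, and directory are hypothetical placeholders, not values from this cluster.

```yaml
# conf/server.yaml on the coordinator and all tablet servers (sketch, not a verified setup).
# Assumption: remote.data.dir controls where KV snapshots and their _METADATA files are written.
# "namenode-host", 8020, and the directory below are hypothetical placeholders.
remote.data.dir: hdfs://namenode-host:8020/fluss-remote-data
```

With a shared filesystem, the `_METADATA` file and the snapshot data files should resolve to the same location no matter which node reads them.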
