zhangzhaohuazai commented on issue #9254: URL: https://github.com/apache/hudi/issues/9254#issuecomment-1646790547
I have now increased the S3 timeout configuration in core-site.xml (a sketch of the kind of settings I mean is at the end of this comment), but I am hitting a new error once the number of table writes approaches 10 million:

```
2023-07-23 07:23:27
org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator for 'stream_write: metasource_sync_hudi01' (operator 8217d97d977888787b7cc15b629051fb).
    at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.failJob(OperatorCoordinatorHolder.java:545)
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$start$0(StreamWriteOperatorCoordinator.java:187)
    at org.apache.hudi.sink.utils.NonThrownExecutor.handleException(NonThrownExecutor.java:146)
    at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:133)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Executor executes action [commits the instant 20230723072228761] error
    ... 6 more
Caused by: org.apache.hudi.exception.HoodieException: Error fetching partition paths from metadata table
    at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:318)
    at org.apache.hudi.table.action.compact.HoodieCompactor.generateCompactionPlan(HoodieCompactor.java:279)
    at org.apache.hudi.table.action.compact.ScheduleCompactionActionExecutor.scheduleCompaction(ScheduleCompactionActionExecutor.java:123)
    at org.apache.hudi.table.action.compact.ScheduleCompactionActionExecutor.execute(ScheduleCompactionActionExecutor.java:93)
    at org.apache.hudi.table.HoodieFlinkMergeOnReadTable.scheduleCompaction(HoodieFlinkMergeOnReadTable.java:112)
    at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableServiceInternal(BaseHoodieWriteClient.java:1349)
    at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableService(BaseHoodieWriteClient.java:1326)
    at org.apache.hudi.client.BaseHoodieWriteClient.scheduleCompactionAtInstant(BaseHoodieWriteClient.java:1005)
    at org.apache.hudi.client.BaseHoodieWriteClient.scheduleCompaction(BaseHoodieWriteClient.java:996)
    at org.apache.hudi.util.CompactionUtil.scheduleCompaction(CompactionUtil.java:65)
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$notifyCheckpointComplete$2(StreamWriteOperatorCoordinator.java:246)
    at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:130)
    ... 3 more
Caused by: org.apache.hudi.exception.HoodieException: Error occurs when executing flatMap
    at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingFlatMapWrapper$1(FunctionWrapper.java:50)
    at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:747)
    at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:721)
    at java.util.stream.AbstractTask.compute(AbstractTask.java:316)
    at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
    at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
    at java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:714)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.apache.hudi.client.common.HoodieFlinkEngineContext.flatMap(HoodieFlinkEngineContext.java:135)
    at org.apache.hudi.metadata.FileSystemBackedTableMetadata.getAllPartitionPaths(FileSystemBackedTableMetadata.java:86)
    at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:316)
    ... 14 more
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://metadatahudi/metasource_sync_hudi28/e5aff3d2-d85a-4598-8581-81bf235d7be4_2-4-0_20230723065928010.parquet
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2344)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2226)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2160)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1961)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$9(S3AFileSystem.java:1940)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1940)
    at org.apache.hudi.metadata.FileSystemBackedTableMetadata.lambda$getAllPartitionPaths$4e4c112d$1(FileSystemBackedTableMetadata.java:88)
    at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingFlatMapWrapper$1(FunctionWrapper.java:48)
    ... 31 more
```

And Flink restarts after encountering this error, kicking off a new round of data writes.
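For reference, below is a minimal sketch of the kind of S3A timeout tuning in core-site.xml referred to above. The property names are standard Hadoop S3A connector settings; the specific values are illustrative assumptions, not the exact values from my cluster:

```xml
<!-- core-site.xml: illustrative S3A client tuning.
     Values below are examples only, not the exact ones used. -->
<configuration>
  <!-- Socket timeout for S3 requests, in milliseconds -->
  <property>
    <name>fs.s3a.connection.timeout</name>
    <value>600000</value>
  </property>
  <!-- Timeout for establishing the connection to S3, in milliseconds -->
  <property>
    <name>fs.s3a.connection.establish.timeout</name>
    <value>60000</value>
  </property>
  <!-- Maximum number of attempts the S3A client makes per request -->
  <property>
    <name>fs.s3a.attempts.maximum</name>
    <value>20</value>
  </property>
</configuration>
```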
