lm520hy opened a new issue, #8781:
URL: https://github.com/apache/seatunnel/issues/8781

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22)
 and found no similar issues.
   
   
   ### What happened
   
   After Spark wrote data to the Hudi table, the table's metadata configuration changed. When SeaTunnel then synchronized data into the same table, it failed with an error while reading Hudi's metadata.
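
   For context, whether Hudi's internal metadata table is in use is recorded in `.hoodie/hoodie.properties`. The fragment below is a hedged illustration (the key is a real Hudi table property, but the value is an assumption, not copied from the affected table) of the kind of change a Spark write with the metadata table enabled can leave behind:

    ```properties
    # .hoodie/hoodie.properties -- illustrative fragment, not from the real table.
    # A Spark write with hoodie.metadata.enable=true (the default in recent
    # Hudi releases) registers the metadata-table partitions it maintains:
    hoodie.table.metadata.partitions=files
    ```

   Once such a key is present, the archival path in the stack trace below appears to consult the metadata table, while SeaTunnel's Java write client falls back to `FileSystemBackedTableMetadata`, whose `getLatestCompactionTime` throws `UnsupportedOperationException`.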
   
   ### SeaTunnel Version
   
   2.3.8
   
   ### SeaTunnel Config
   
   ```conf
   env {
     parallelism = 1
     job.mode = "BATCH"
   }
   source {
     FakeSource {
       parallelism = 1
       result_table_name = "fake2"
       row.num = 16
        schema = {
          fields {
            id = "int"
            name = "string"
            price = "double"
            ts = "bigint"
          }
        }
        rows = [
          {
            kind = INSERT
            fields = [7, "l", 1100, 117]
          }
        ]
     }
   
   }
    sink {
      Hudi {
        table_dfs_path = "hdfs:///hudi/"
        table_name = "hudi_mor_tbl2"
        table_type = "COPY_ON_WRITE"
        conf_files_path = "/soft/hadoop/etc/hadoop/hdfs-site.xml;/soft/hadoop/etc/hadoop/core-site.xml;/soft/hadoop/etc/hadoop/yarn-site.xml"
        batch_size = 10000
      }
    }
    ```
   
   ### Running Command
   
   _No response_
   
   ### Error Exception
   
   ```log
   2025-02-20 19:27:42,144 INFO  [a.h.c.t.t.HoodieActiveTimeline] 
[st-multi-table-sink-writer-2] - Loaded instants upto : 
Option{val=[20250220192742010__clean__COMPLETED__20250220192742120]}
   2025-02-20 19:27:42,145 INFO  [o.a.h.c.t.HoodieTableConfig   ] 
[st-multi-table-sink-writer-2] - Loading table properties from 
hdfs:/hudi/default/hudi_mor_tbl2/.hoodie/hoodie.properties
   2025-02-20 19:27:42,147 WARN  [o.a.s.e.s.TaskExecutionService] 
[BlockingWorker-TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000}] - [localhost]:5801 [seatunnel-825957] [5.1] Exception in 
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask@1f445812
   java.lang.RuntimeException: java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table
        at 
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:253)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:70)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:693)
 [seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1018)
 [seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.api.tracing.MDCRunnable.run(MDCRunnable.java:39) 
[seatunnel-starter.jar:2.3.8]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_381]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_381]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_381]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_381]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_381]
   Caused by: java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table
        at 
org.apache.seatunnel.api.sink.multitablesink.MultiTableSinkWriter.prepareCommit(MultiTableSinkWriter.java:258)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:188)
 ~[seatunnel-starter.jar:2.3.8]
        ... 17 more
   Caused by: java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table
        at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
~[?:1.8.0_381]
        at java.util.concurrent.FutureTask.get(FutureTask.java:192) 
~[?:1.8.0_381]
        at 
org.apache.seatunnel.api.sink.multitablesink.MultiTableSinkWriter.prepareCommit(MultiTableSinkWriter.java:256)
 ~[seatunnel-starter.jar:2.3.8]
        at 
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:188)
 ~[seatunnel-starter.jar:2.3.8]
        ... 17 more
   Caused by: org.apache.hudi.exception.HoodieException: Error limiting instant 
archival based on metadata table
        at 
org.apache.hudi.client.HoodieTimelineArchiver.getInstantsToArchive(HoodieTimelineArchiver.java:520)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.HoodieTimelineArchiver.archiveIfRequired(HoodieTimelineArchiver.java:165)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieTableServiceClient.archive(BaseHoodieTableServiceClient.java:782)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.archive(BaseHoodieWriteClient.java:867)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.autoArchiveOnCommit(BaseHoodieWriteClient.java:596)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.mayBeCleanAndArchive(BaseHoodieWriteClient.java:562)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.postWrite(BaseHoodieWriteClient.java:528)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.HoodieJavaWriteClient.insert(HoodieJavaWriteClient.java:141)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.connectors.seatunnel.hudi.sink.writer.HudiRecordWriter.flush(HudiRecordWriter.java:160)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.connectors.seatunnel.hudi.sink.writer.HudiRecordWriter.prepareCommit(HudiRecordWriter.java:186)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.connectors.seatunnel.hudi.sink.writer.HudiSinkWriter.prepareCommit(HudiSinkWriter.java:101)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.api.sink.multitablesink.MultiTableSinkWriter.lambda$prepareCommit$4(MultiTableSinkWriter.java:241)
 ~[seatunnel-starter.jar:2.3.8]
        ... 6 more
   Caused by: java.lang.UnsupportedOperationException
        at 
org.apache.hudi.metadata.FileSystemBackedTableMetadata.getLatestCompactionTime(FileSystemBackedTableMetadata.java:280)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.HoodieTimelineArchiver.getInstantsToArchive(HoodieTimelineArchiver.java:510)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.HoodieTimelineArchiver.archiveIfRequired(HoodieTimelineArchiver.java:165)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieTableServiceClient.archive(BaseHoodieTableServiceClient.java:782)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.archive(BaseHoodieWriteClient.java:867)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.autoArchiveOnCommit(BaseHoodieWriteClient.java:596)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.mayBeCleanAndArchive(BaseHoodieWriteClient.java:562)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.BaseHoodieWriteClient.postWrite(BaseHoodieWriteClient.java:528)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.hudi.client.HoodieJavaWriteClient.insert(HoodieJavaWriteClient.java:141)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.connectors.seatunnel.hudi.sink.writer.HudiRecordWriter.flush(HudiRecordWriter.java:160)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.connectors.seatunnel.hudi.sink.writer.HudiRecordWriter.prepareCommit(HudiRecordWriter.java:186)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.connectors.seatunnel.hudi.sink.writer.HudiSinkWriter.prepareCommit(HudiSinkWriter.java:101)
 ~[connector-hudi-2.3.8.jar:2.3.8]
        at 
org.apache.seatunnel.api.sink.multitablesink.MultiTableSinkWriter.lambda$prepareCommit$4(MultiTableSinkWriter.java:241)
 ~[seatunnel-starter.jar:2.3.8]
        ... 6 more
   2025-02-20 19:27:42,152 INFO  [o.a.s.e.s.TaskExecutionService] 
[BlockingWorker-TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000}] - [localhost]:5801 [seatunnel-825957] [5.1] taskDone, 
taskId = 70000, taskGroup = TaskGroupLocation{jobId=944918221583024129, 
pipelineId=1, taskGroupId=50000}
   2025-02-20 19:27:42,152 INFO  [o.a.s.e.s.TaskExecutionService] 
[BlockingWorker-TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000}] - [localhost]:5801 [seatunnel-825957] [5.1] task 70000 
error with exception: [java.lang.RuntimeException: java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table], cancel other task in taskGroup 
TaskGroupLocation{jobId=944918221583024129, pipelineId=1, taskGroupId=50000}.
   2025-02-20 19:27:42,152 WARN  [o.a.s.e.s.TaskExecutionService] 
[BlockingWorker-TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000}] - [localhost]:5801 [seatunnel-825957] [5.1] Interrupted 
task 60000 - 
org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask@4fe9a7d5
   2025-02-20 19:27:42,152 INFO  [o.a.s.e.s.TaskExecutionService] 
[BlockingWorker-TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000}] - [localhost]:5801 [seatunnel-825957] [5.1] taskDone, 
taskId = 60000, taskGroup = TaskGroupLocation{jobId=944918221583024129, 
pipelineId=1, taskGroupId=50000}
   2025-02-20 19:27:42,154 INFO  [.a.h.c.t.HoodieTableMetaClient] 
[ForkJoinPool.commonPool-worker-1] - Loading HoodieTableMetaClient from 
hdfs:///hudi//default/hudi_mor_tbl2
   2025-02-20 19:27:42,155 INFO  [o.a.h.c.t.HoodieTableConfig   ] 
[ForkJoinPool.commonPool-worker-1] - Loading table properties from 
hdfs:/hudi/default/hudi_mor_tbl2/.hoodie/hoodie.properties
   2025-02-20 19:27:42,158 INFO  [.a.h.c.t.HoodieTableMetaClient] 
[ForkJoinPool.commonPool-worker-1] - Finished Loading Table of type 
COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
hdfs:///hudi//default/hudi_mor_tbl2
   2025-02-20 19:27:42,158 INFO  [.a.h.c.t.HoodieTableMetaClient] 
[ForkJoinPool.commonPool-worker-1] - Loading Active commit timeline for 
hdfs:///hudi//default/hudi_mor_tbl2
   2025-02-20 19:27:42,159 INFO  [o.a.s.e.s.TaskExecutionService] 
[BlockingWorker-TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000}] - [localhost]:5801 [seatunnel-825957] [5.1] taskGroup 
TaskGroupLocation{jobId=944918221583024129, pipelineId=1, taskGroupId=50000} 
complete with FAILED
   2025-02-20 19:27:42,160 INFO  [o.a.s.e.s.TaskExecutionService] 
[BlockingWorker-TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000}] - [localhost]:5801 [seatunnel-825957] [5.1] task 60000 
error with exception: [java.lang.RuntimeException: java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table], cancel other task in taskGroup 
TaskGroupLocation{jobId=944918221583024129, pipelineId=1, taskGroupId=50000}.
   2025-02-20 19:27:42,160 INFO  [o.a.s.e.s.TaskExecutionService] 
[hz.main.seaTunnel.task.thread-6] - [localhost]:5801 [seatunnel-825957] [5.1] 
Task TaskGroupLocation{jobId=944918221583024129, pipelineId=1, 
taskGroupId=50000} complete with state FAILED
   2025-02-20 19:27:42,160 INFO  [a.h.c.t.t.HoodieActiveTimeline] 
[ForkJoinPool.commonPool-worker-1] - Loaded instants upto : 
Option{val=[20250220192742010__clean__COMPLETED__20250220192742120]}
   2025-02-20 19:27:42,160 INFO  [o.a.h.c.u.CleanerUtils        ] 
[ForkJoinPool.commonPool-worker-1] - Cleaned failed attempts if any
   2025-02-20 19:27:42,160 INFO  [o.a.s.e.s.CoordinatorService  ] 
[hz.main.seaTunnel.task.thread-6] - [localhost]:5801 [seatunnel-825957] [5.1] 
Received task end from execution TaskGroupLocation{jobId=944918221583024129, 
pipelineId=1, taskGroupId=50000}, state FAILED
   2025-02-20 19:27:42,161 INFO  [.a.h.c.t.HoodieTableMetaClient] 
[ForkJoinPool.commonPool-worker-1] - Loading HoodieTableMetaClient from 
hdfs:///hudi//default/hudi_mor_tbl2
   2025-02-20 19:27:42,161 INFO  [o.a.s.a.e.LoggingEventHandler ] 
[hz.main.generic-operation.thread-36] - log event: 
ReaderCloseEvent(createdTime=1740050862160, jobId=944918221583024129, 
eventType=LIFECYCLE_READER_CLOSE)
   2025-02-20 19:27:42,162 INFO  [o.a.h.c.t.HoodieTableConfig   ] 
[ForkJoinPool.commonPool-worker-1] - Loading table properties from 
hdfs:/hudi/default/hudi_mor_tbl2/.hoodie/hoodie.properties
   2025-02-20 19:27:42,162 INFO  [o.a.s.e.s.d.p.PhysicalVertex  ] 
[hz.main.seaTunnel.task.thread-6] - Job SeaTunnel_Job (944918221583024129), 
Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-FakeSource]-SourceTask (1/1)] 
turned from state RUNNING to FAILED.
   2025-02-20 19:27:42,162 INFO  [o.a.s.e.s.d.p.PhysicalVertex  ] 
[hz.main.seaTunnel.task.thread-6] - Job SeaTunnel_Job (944918221583024129), 
Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-FakeSource]-SourceTask (1/1)] 
state process is stopped
   2025-02-20 19:27:42,162 ERROR [o.a.s.e.s.d.p.PhysicalVertex  ] 
[hz.main.seaTunnel.task.thread-6] - Job SeaTunnel_Job (944918221583024129), 
Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-FakeSource]-SourceTask (1/1)] 
end with state FAILED and Exception: java.lang.RuntimeException: 
java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table
        at 
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:253)
        at 
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
        at 
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
        at 
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
        at 
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:70)
        at 
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
        at 
org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
        at 
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
        at 
org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
        at 
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
        at 
org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:693)
        at 
org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1018)
        at org.apache.seatunnel.api.tracing.MDCRunnable.run(MDCRunnable.java:39)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
   Caused by: java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table
        at 
org.apache.seatunnel.api.sink.multitablesink.MultiTableSinkWriter.prepareCommit(MultiTableSinkWriter.java:258)
        at 
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:188)
        ... 17 more
   Caused by: java.util.concurrent.ExecutionException: 
org.apache.hudi.exception.HoodieException: Error limiting instant archival 
based on metadata table
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at 
org.apache.seatunnel.api.sink.multitablesink.MultiTableSinkWriter.prepareCommit(MultiTableSinkWriter.java:256)
        ... 18 more
   Caused by: org.apache.hudi.exception.HoodieException: Error limiting instant 
archival based on metadata table
   ```
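
   Given the bottom-most `UnsupportedOperationException` in `FileSystemBackedTableMetadata.getLatestCompactionTime`, one hedged workaround (an assumption based on the stack trace, not a verified fix) is to keep the Spark writer from maintaining the metadata table for this table, so both writers stay on file-system-backed metadata:

    ```properties
    # Hypothetical Spark-side write option. hoodie.metadata.enable is a real
    # Hudi config key, but whether disabling it avoids this failure is untested.
    hoodie.metadata.enable=false
    ```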
   
   ### Zeta or Flink or Spark Version
   
   _No response_
   
   ### Java or Scala Version
   
   _No response_
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

