zlinsc opened a new issue, #10285:
URL: https://github.com/apache/hudi/issues/10285
**Describe the problem you faced**
The Hudi sink job cannot restart normally from a checkpoint because of an `InvalidAvroMagicException` thrown from `CleanFunction`.
**Environment Description**
* Hudi version : 0.14.0
* Hadoop version : 3.2.1
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no
**Stacktrace**
```java
2023-12-08 22:34:28,663 ERROR org.apache.hudi.sink.CleanFunction [] - Executor executes action [wait for cleaning finish] error
org.apache.hudi.exception.HoodieException: org.apache.hudi.org.apache.avro.InvalidAvroMagicException: Not an Avro data file
    at org.apache.hudi.common.util.CompactionUtils.getCompactionPlan(CompactionUtils.java:201) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.util.CompactionUtils.getCompactionPlan(CompactionUtils.java:189) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.util.CompactionUtils.lambda$getCompactionPlansByTimeline$4(CompactionUtils.java:163) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_251]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[?:1.8.0_251]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_251]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_251]
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_251]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_251]
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[?:1.8.0_251]
    at org.apache.hudi.common.util.CompactionUtils.getCompactionPlansByTimeline(CompactionUtils.java:164) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.util.CompactionUtils.getAllPendingCompactionPlans(CompactionUtils.java:133) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.util.CompactionUtils.getAllPendingCompactionOperations(CompactionUtils.java:213) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.init(AbstractTableFileSystemView.java:121) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.init(HoodieTableFileSystemView.java:115) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.<init>(HoodieTableFileSystemView.java:109) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.view.FileSystemViewManager.createInMemoryFileSystemView(FileSystemViewManager.java:176) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.view.FileSystemViewManager.lambda$createViewManager$8be8b1a6$1(FileSystemViewManager.java:270) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.view.FileSystemViewManager.lambda$getFileSystemView$1(FileSystemViewManager.java:115) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660) ~[?:1.8.0_251]
    at org.apache.hudi.common.table.view.FileSystemViewManager.getFileSystemView(FileSystemViewManager.java:114) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.table.HoodieTable.getHoodieView(HoodieTable.java:341) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.table.action.clean.CleanPlanner.<init>(CleanPlanner.java:93) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:105) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:151) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.execute(CleanPlanActionExecutor.java:177) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.scheduleCleaning(HoodieFlinkCopyOnWriteTable.java:359) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.client.BaseHoodieTableServiceClient.scheduleTableServiceInternal(BaseHoodieTableServiceClient.java:628) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.client.BaseHoodieTableServiceClient.clean(BaseHoodieTableServiceClient.java:751) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:861) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:834) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.sink.CleanFunction.lambda$open$0(CleanFunction.java:70) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:130) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_251]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_251]
Caused by: org.apache.hudi.org.apache.avro.InvalidAvroMagicException: Not an Avro data file
    at org.apache.hudi.org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:57) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:207) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeCompactionPlan(TimelineMetadataUtils.java:169) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    at org.apache.hudi.common.util.CompactionUtils.getCompactionPlan(CompactionUtils.java:198) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
    ... 35 more
```
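
**Additional context**

The `Caused by` frames show the failure happens while `TimelineMetadataUtils.deserializeCompactionPlan` reads a pending compaction plan from the timeline, so one plausible cause is a truncated or zero-length plan file under the table's `.hoodie` directory (for example a `<instant>.compaction.requested` left half-written by a crash). Below is a minimal diagnostic sketch for locating such a file. The base path `hdfs:///tmp/hudi_table` is a hypothetical placeholder, and the sketch reads the files with the plain (unshaded) Avro `DataFileReader` rather than Hudi's internal, shaded copy:

```java
import java.io.IOException;

import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.SeekableInput;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.FsInput;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TimelineAvroCheck {
  public static void main(String[] args) throws IOException {
    // Hypothetical table base path -- replace with the real one.
    Path timeline = new Path("hdfs:///tmp/hudi_table/.hoodie");
    Configuration conf = new Configuration();
    FileSystem fs = timeline.getFileSystem(conf);

    for (FileStatus status : fs.listStatus(timeline)) {
      Path file = status.getPath();
      if (!file.getName().endsWith(".compaction.requested")) {
        continue;
      }
      // A zero-length plan file cannot start with the Avro magic bytes.
      if (status.getLen() == 0) {
        System.out.println("EMPTY plan file: " + file);
        continue;
      }
      // Open the file as an Avro data file, mirroring the failing call
      // path; a bad or missing magic header surfaces as an IOException
      // (InvalidAvroMagicException extends IOException).
      try (SeekableInput in = new FsInput(file, conf);
           DataFileReader<GenericRecord> reader =
               new DataFileReader<>(in, new GenericDatumReader<>())) {
        System.out.println("OK: " + file + " schema=" + reader.getSchema().getName());
      } catch (IOException e) {
        System.out.println("CORRUPT: " + file + " -> " + e.getMessage());
      }
    }
  }
}
```

This mirrors the failing call path from the stack trace: `DataFileReader` rejects any file whose first bytes are not the Avro magic, which is exactly what a truncated or empty requested-compaction plan file would trigger on restart.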