shenbachand opened a new issue, #9787: URL: https://github.com/apache/hudi/issues/9787
We are using AWS Managed Apache Flink to process streaming data and write it to S3 through the Hudi connector. In addition, I run an AWS Glue ETL job that performs GDPR-related custom data deletions (both soft and hard deletes) on the same Hudi data in S3, following the deletes section of the Hudi Spark guide (https://hudi.apache.org/docs/quick-start-guide#deletes).

Flink application: Flink 1.15.2, Hudi 0.13.0
Glue job: Glue 4.0, Spark 3.3.0, Hudi 0.12.1

While the Flink streaming job is writing data to S3, I execute the Glue job to delete certain records. The records are deleted (verified in AWS Athena), but the Flink job then throws the following exception:

```
org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator for 'stream_write: vcdp_enhanced_hudi_s3_output' (operator 175ec69964e38e7016a35f5d0892d9ac).
	at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.failJob(OperatorCoordinatorHolder.java:556)
	at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$start$0(StreamWriteOperatorCoordinator.java:190)
	at org.apache.hudi.sink.utils.NonThrownExecutor.handleException(NonThrownExecutor.java:142)
	at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.hudi.exception.HoodieException: Executor executes action [initialize instant 20230919161134978] error
	... 6 more
Caused by: java.lang.IllegalArgumentException
	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:633)
	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:614)
	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:223)
	at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:283)
	at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:233)
	at org.apache.hudi.client.HoodieFlinkWriteClient.commit(HoodieFlinkWriteClient.java:111)
	at org.apache.hudi.client.HoodieFlinkWriteClient.commit(HoodieFlinkWriteClient.java:74)
	at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:199)
	at org.apache.hudi.sink.StreamWriteOperatorCoordinator.doCommit(StreamWriteOperatorCoordinator.java:537)
	at org.apache.hudi.sink.StreamWriteOperatorCoordinator.commitInstant(StreamWriteOperatorCoordinator.java:513)
	at org.apache.hudi.sink.StreamWriteOperatorCoordinator.commitInstant(StreamWriteOperatorCoordinator.java:484)
	at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$initInstant$6(StreamWriteOperatorCoordinator.java:402)
	at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:130)
	... 3 more
```

As a result, the Flink job enters a restart loop and cannot recover from this state. I would greatly appreciate any advice or assistance in resolving this issue. Thanks.
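For reference, the hard-delete step in the Glue job follows the pattern from the linked quick-start deletes section. Below is a minimal PySpark sketch of that pattern; the table path, record key field (`record_key`), precombine field (`ts`), and filter are illustrative placeholders, not the actual job:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("gdpr-hard-delete").getOrCreate()

# Placeholder path and table name for illustration only.
base_path = "s3://my-bucket/hudi/my_table"
table_name = "my_table"

# Select the records that must be hard-deleted (placeholder filter).
delete_df = (
    spark.read.format("hudi")
    .load(base_path)
    .where("record_key = 'id-to-forget'")
)

# Standard Hudi Spark datasource delete: operation=delete on an append write.
hudi_delete_options = {
    "hoodie.table.name": table_name,
    "hoodie.datasource.write.recordkey.field": "record_key",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.operation": "delete",
}

(
    delete_df.write.format("hudi")
    .options(**hudi_delete_options)
    .mode("append")
    .save(base_path)
)
```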
