noahtaite opened a new issue, #9805: URL: https://github.com/apache/hudi/issues/9805
**_Tips before filing an issue_**

- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at [email protected].
- If you have triaged this as a bug, then file an [issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.

**Describe the problem you faced**

I'm running Hudi 0.13.1 on AWS EMR 6.12. We recently ran a `delete_partition` operation to clean up specific partitions' data because bad data had been ingested, and then re-ingested the correct data. Now our sync to the Hive metastore using **AwsGlueCatalogSyncTool** is failing with the following:

```
partitionsToDelete' failed to satisfy constraint: Member must have length less than or equal to 25 (Service: AWSGlue; Status Code: 400; Error Code:
```

1 - How do we get around this validation constraint for valid deletes of 25+ partitions in Glue?

2 - These partitions should not be deleted from Glue at all. They were re-created by the good ingestion, and my users use Glue as a metastore. **When I run a Glue sync manually using ./hudi-sync-tool, those partitions are actually removed. It appears the `delete_partition` replacecommit overrides the later deltacommit that re-ingested those partitions.**

This appears to be a bug, unless I am missing something about how `delete_partition` is expected to behave.

**To Reproduce**

Steps to reproduce the behavior:

1. Generate a Hudi table with multiple partitions using bulk_insert, e.g. [datasource=1/year=2000/month=1].
2. Run a delete_partition operation to delete all partitions matching `datasource=1/*`.
3. Re-generate new partitions for datasource=1 with correct data.
4. Hive sync fails trying to delete 25+ partitions.
5. A manual hive sync leaves the Glue table with only 1 new partition (datasource=1/year=2023/month=10).

**Expected behavior**

I expect the following:

1 - AWS Glue sync should not fail when the total request contains 25+ partitions; it should batch them properly.
2 - My Glue table should not be deleting these partitions at all. The final state **should** contain all the partitions for datasource=1, but that is not being respected (the delete_partition replacecommit seems to take precedence)!

**Environment Description**

* Hudi version : 0.13.1-amzn-0
* Spark version : 3.4.0
* Hive version : 3.1.3
* Hadoop version : 3.3.3
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : No

**Additional context**

Running on AWS EMR 6.12. Many of my consumers use Glue as a data catalog. Will the missing partitions reduce performance or prevent any new data from being accessed via Glue directly?

**Stacktrace**

```
23/09/28 16:32:47 ERROR Client: Application diagnostics message: User class threw exception: org.apache.hudi.exception.HoodieException: Could not sync using the meta sync class org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
    at org.apache.hudi.sync.common.util.SyncUtilHelpers.runHoodieMetaSync(SyncUtilHelpers.java:61)
    at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$2(HoodieSparkSqlWriter.scala:888)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
    at org.apache.hudi.HoodieSparkSqlWriter$.metaSync(HoodieSparkSqlWriter.scala:886)
    at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:984)
    at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:381)
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:150)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:104)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:250)
    at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:123)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$9(SQLExecution.scala:160)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:250)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$8(SQLExecution.scala:160)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:271)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:159)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:69)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:101)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:554)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:107)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:554)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:530)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:97)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:84)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:82)
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:142)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:856)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:387)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:360)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
    at com.example.spark.datalake.hudi.HudiDatalake.persist(HudiDatalake.java:62)
    at com.example.spark.datalake.hudi.HudiDatalake.save(HudiDatalake.java:39)
    at com.example.spark.datalake.FilteredDatalake.save(FilteredDatalake.java:24)
    at com.example.spark.tier2.datalake.HudiDatalakeUpdater.saveToHudi(HudiDatalakeUpdater.java:86)
    at com.example.spark.tier2.datalake.HudiDatalakeUpdater.upsert(HudiDatalakeUpdater.java:61)
    at com.example.spark.tier2.extractor.BaseExtractor.extract(BaseExtractor.java:58)
    at java.util.ArrayList.forEach(ArrayList.java:1259)
    at com.example.spark.tier2.extractor.KViewsExtractor.extract(KViewsExtractor.java:35)
    at com.example.spark.tier2.DmsTierTwoExtractorRunner.run(DmsTierTwoExtractorRunner.java:239)
    at com.example.spark.tier2.DmsTierTwoExtractorRunner.main(DmsTierTwoExtractorRunner.java:138)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:760)
Caused by: org.apache.hudi.exception.HoodieException: Got runtime exception when hive syncing table_all
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:165)
    at org.apache.hudi.sync.common.util.SyncUtilHelpers.runHoodieMetaSync(SyncUtilHelpers.java:59)
    ... 56 more
Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync partitions for table table_all
    at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:429)
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:280)
    at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:188)
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:162)
    ... 57 more
Caused by: org.apache.hudi.aws.sync.HoodieGlueSyncException: Fail to drop partitions to dms_hudi_db.table_all
    at org.apache.hudi.aws.sync.AWSGlueCatalogSyncClient.dropPartitions(AWSGlueCatalogSyncClient.java:222)
    at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:457)
    at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:424)
    ...
60 more
Caused by: org.apache.hudi.com.amazonaws.services.glue.model.ValidationException: 1 validation error detected:
Value '[PartitionValueList(values=[[email protected], 2021, 11]), PartitionValueList(values=[[email protected], 2021, 12]), PartitionValueList(values=[[email protected], 2021, 10]), PartitionValueList(values=[[email protected], 2021, 9]),
PartitionValueList(values=[[email protected], 2021, 8]), PartitionValueList(values=[[email protected], 2021, 7]), PartitionValueList(values=[[email protected], 2021, 6]), PartitionValueList(values=[[email protected], 2019, 2]),
PartitionValueList(values=[[email protected], 2021, 5]), PartitionValueList(values=[[email protected], 2021, 4]), PartitionValueList(values=[[email protected], 2019, 1]), PartitionValueList(values=[[email protected], 2018, 10]),
PartitionValueList(values=[[email protected], 2019, 4]), PartitionValueList(values=[[email protected], 2019, 3]), PartitionValueList(values=[[email protected], 2019, 6]), PartitionValueList(values=[[email protected], 2019, 5]),
PartitionValueList(values=[[email protected], 2019, 8]), PartitionValueList(values=[[email protected], 2019, 7]), PartitionValueList(values=[[email protected], 2018, 12]), PartitionValueList(values=[[email protected], 2018, 11]),
PartitionValueList(values=[[email protected], 2019, 9]), PartitionValueList(values=[[email protected], 2017, 12]), PartitionValueList(values=[[email protected], 2017, 10]), PartitionValueList(values=[[email protected], 2017, 11]),
PartitionValueList(values=[[email protected], 2021, 3]), PartitionValueList(values=[[email protected], 2021, 2]), PartitionValueList(values=[[email protected], 2021, 1]), PartitionValueList(values=[[email protected], 2016, 2]),
PartitionValueList(values=[[email protected], 2016, 3]), PartitionValueList(values=[[email protected], 2016, 4]), PartitionValueList(values=[[email protected], 2016, 5]), PartitionValueList(values=[[email protected], 2016, 1]),
PartitionValueList(values=[[email protected], 2016, 6]), PartitionValueList(values=[[email protected], 2016, 7]), PartitionValueList(values=[[email protected], 2016, 8]), PartitionValueList(values=[[email protected], 2016, 9]),
PartitionValueList(values=[[email protected], 2022, 6]), PartitionValueList(values=[[email protected], 2022, 5]), PartitionValueList(values=[[email protected], 2022, 4]), PartitionValueList(values=[[email protected], 2022, 3]),
PartitionValueList(values=[[email protected], 2022, 9]), PartitionValueList(values=[[email protected], 2022, 8]), PartitionValueList(values=[[email protected], 2022, 7]), PartitionValueList(values=[[email protected], 2022, 2]),
PartitionValueList(values=[[email protected], 2022, 1]), PartitionValueList(values=[[email protected], 2001, 1]), PartitionValueList(values=[[email protected], 2017, 5]), PartitionValueList(values=[[email protected], 2017, 6]),
PartitionValueList(values=[[email protected], 2017, 7]), PartitionValueList(values=[[email protected], 2017, 8]), PartitionValueList(values=[[email protected], 2017, 9]), PartitionValueList(values=[[email protected], 2017, 1]),
PartitionValueList(values=[[email protected], 2017, 2]), PartitionValueList(values=[[email protected], 2017, 3]), PartitionValueList(values=[[email protected], 2017, 4]), PartitionValueList(values=[[email protected], 2014, 9]),
PartitionValueList(values=[[email protected], 2018, 9]), PartitionValueList(values=[[email protected], 2018, 8]), PartitionValueList(values=[[email protected], 2018, 5]), PartitionValueList(values=[[email protected], 2018, 4]),
PartitionValueList(values=[[email protected], 2018, 7]), PartitionValueList(values=[[email protected], 2018, 6]), PartitionValueList(values=[[email protected], 2018, 1]), PartitionValueList(values=[[email protected], 2018, 3]),
PartitionValueList(values=[[email protected], 2018, 2]), PartitionValueList(values=[[email protected], 2016, 10]), PartitionValueList(values=[[email protected], 2016, 12]), PartitionValueList(values=[[email protected], 2016, 11]),
PartitionValueList(values=[[email protected], 2020, 8]), PartitionValueList(values=[[email protected], 2020, 7]), PartitionValueList(values=[[email protected], 2020, 6]), PartitionValueList(values=[[email protected], 2022, 10]),
PartitionValueList(values=[[email protected], 2022, 11]), PartitionValueList(values=[[email protected], 2020, 5]), PartitionValueList(values=[[email protected], 2022, 12]), PartitionValueList(values=[[email protected], __HIVE_DEFAULT_PARTITION__, __HIVE_DEFAULT_PARTITION__]),
PartitionValueList(values=[[email protected], 2020, 9]), PartitionValueList(values=[[email protected], 2023, 1]), PartitionValueList(values=[[email protected], 2023, 2]), PartitionValueList(values=[[email protected], 2023, 3]),
PartitionValueList(values=[[email protected], 2023, 4]), PartitionValueList(values=[[email protected], 2023, 5]), PartitionValueList(values=[[email protected], 2023, 6]), PartitionValueList(values=[[email protected], 2023, 7]),
PartitionValueList(values=[[email protected], 2023, 8]), PartitionValueList(values=[[email protected], 2023, 9]), PartitionValueList(values=[[email protected], 2020, 4]), PartitionValueList(values=[[email protected], 2020, 3]),
PartitionValueList(values=[[email protected], 2020, 2]), PartitionValueList(values=[[email protected], 2020, 1]), PartitionValueList(values=[[email protected], 2019, 12]), PartitionValueList(values=[[email protected], 2019, 11]),
PartitionValueList(values=[[email protected], 2019, 10]), PartitionValueList(values=[[email protected], 2020, 11]), PartitionValueList(values=[[email protected], 2020, 10]), PartitionValueList(values=[[email protected], 2015, 4]),
PartitionValueList(values=[[email protected], 2020, 12]), PartitionValueList(values=[[email protected], 2015, 1]), PartitionValueList(values=[[email protected], 2027, 10]), PartitionValueList(values=[[email protected], 2012, 12])]'
at 'partitionsToDelete' failed to satisfy constraint: Member must have length less than or equal to 25 (Service: AWSGlue; Status Code: 400; Error Code: ValidationException; Request ID: xxx; Proxy: null)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at org.apache.hudi.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at org.apache.hudi.com.amazonaws.services.glue.AWSGlueClient.doInvoke(AWSGlueClient.java:13784)
    at org.apache.hudi.com.amazonaws.services.glue.AWSGlueClient.invoke(AWSGlueClient.java:13751)
    at org.apache.hudi.com.amazonaws.services.glue.AWSGlueClient.invoke(AWSGlueClient.java:13740)
    at org.apache.hudi.com.amazonaws.services.glue.AWSGlueClient.executeBatchDeletePartition(AWSGlueClient.java:406)
    at org.apache.hudi.com.amazonaws.services.glue.AWSGlueClient.batchDeletePartition(AWSGlueClient.java:375)
    at org.apache.hudi.aws.sync.AWSGlueCatalogSyncClient.dropPartitions(AWSGlueCatalogSyncClient.java:214)
    ...
62 more
Exception in thread "main" org.apache.spark.SparkException: Application application_1695917956184_0001 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1337)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1770)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1066)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1158)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1167)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
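Editor's note on expected behavior (1): the failure above comes from a single BatchDeletePartition request carrying every partition at once, while Glue rejects requests with more than 25 entries. The batching the reporter expects can be sketched independently of Hudi; the helper names and the stand-in client callable below are illustrative, not Hudi's or Glue's actual API:

```python
# Sketch: drop partitions in chunks of <= 25, the BatchDeletePartition per-request limit.
# `glue_batch_delete` is a stand-in for a real Glue client call; all names are illustrative.

GLUE_BATCH_DELETE_LIMIT = 25  # AWS Glue rejects partitionsToDelete lists longer than 25


def chunked(items, size):
    """Yield successive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def drop_partitions(glue_batch_delete, partitions_to_delete):
    """Issue one BatchDeletePartition-style call per chunk instead of one giant request."""
    for batch in chunked(partitions_to_delete, GLUE_BATCH_DELETE_LIMIT):
        glue_batch_delete(batch)


if __name__ == "__main__":
    calls = []  # record each simulated request instead of calling AWS
    parts = [f"datasource=1/year=2000/month={m}" for m in range(1, 101)]  # 100 partitions
    drop_partitions(calls.append, parts)
    print([len(b) for b in calls])  # four requests of 25 partitions each
```

With 100 stale partitions this issues four requests of 25 rather than one request of 100, which is what the ValidationException suggests the sync client should be doing.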
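Editor's note on expected behavior (2): the reported symptom is that partitions dropped by the `delete_partition` replacecommit are removed from Glue even though a later deltacommit re-created them. Conceptually, a catalog sync should only drop partitions that are absent from the table's latest state; a hedged, Hudi-independent sketch of that reconciliation (all names are illustrative):

```python
# Sketch: reconcile catalog partitions against the table's current partition set,
# so partitions that were deleted and later re-ingested are never dropped.
# This is not Hudi's actual sync logic; names are illustrative.

def plan_partition_sync(catalog_partitions, table_partitions):
    """Return (to_add, to_drop) so the catalog converges on the table's latest state."""
    catalog, table = set(catalog_partitions), set(table_partitions)
    to_add = sorted(table - catalog)   # present on storage, missing from the catalog
    to_drop = sorted(catalog - table)  # still in the catalog, but gone from storage
    return to_add, to_drop


if __name__ == "__main__":
    catalog = ["datasource=1/year=2021/month=1", "datasource=1/year=2021/month=2"]
    # month=2 was deleted and never re-ingested; month=3 was newly re-ingested
    table = ["datasource=1/year=2021/month=1", "datasource=1/year=2021/month=3"]
    print(plan_partition_sync(catalog, table))
```

Under this scheme a re-ingested partition appears in `table_partitions` and therefore never lands in `to_drop`, regardless of the order in which the replacecommit and the later deltacommit are replayed.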
