Ethan Guo created HUDI-3635:
-------------------------------
Summary: Fix HoodieMetadataTableValidator around comparison of
partition path listing
Key: HUDI-3635
URL: https://issues.apache.org/jira/browse/HUDI-3635
Project: Apache Hudi
Issue Type: Task
Reporter: Ethan Guo
Scenario: a multi-writer test. One writer ingests with Deltastreamer in
continuous mode (COW, inserts, async clustering and cleaning) into partitions
under 2022/1 and 2022/2. Another writer uses the Spark datasource to backfill
different partitions (2021/12). When the backfill to 2021/12/3 failed midway,
the partition listings from the file system (FS) and from the metadata table no
longer matched: the FS listing includes 2021/12/3, while the metadata table
listing does not. This mismatch is expected and should not fail the validation.
{code:java}
22/03/14 10:24:24 ERROR HoodieMetadataTableValidator: Compare Partitions
Failed! AllPartitionPathsFromFS : [2021/12/1, 2021/12/2, 2021/12/3, 2022/1/24,
2022/1/25, 2022/1/26, 2022/1/27, 2022/1/28, 2022/1/29, 2022/1/30, 2022/1/31,
2022/2/1, 2022/2/2] and allPartitionPathsMeta : [2021/12/1, 2021/12/2,
2022/1/24, 2022/1/25, 2022/1/26, 2022/1/27, 2022/1/28, 2022/1/29, 2022/1/30,
2022/1/31, 2022/2/1, 2022/2/2]
22/03/14 10:24:24 ERROR HoodieMetadataTableValidator: Metadata table validation
failed to HoodieValidationException
org.apache.hudi.exception.HoodieValidationException: Compare Partitions Failed!
AllPartitionPathsFromFS : [2021/12/1, 2021/12/2, 2021/12/3, 2022/1/24,
2022/1/25, 2022/1/26, 2022/1/27, 2022/1/28, 2022/1/29, 2022/1/30, 2022/1/31,
2022/2/1, 2022/2/2] and allPartitionPathsMeta : [2021/12/1, 2021/12/2,
2022/1/24, 2022/1/25, 2022/1/26, 2022/1/27, 2022/1/28, 2022/1/29, 2022/1/30,
2022/1/31, 2022/2/1, 2022/2/2]
at org.apache.hudi.utilities.HoodieMetadataTableValidator.validatePartitions(HoodieMetadataTableValidator.java:395)
at org.apache.hudi.utilities.HoodieMetadataTableValidator.doMetadataTableValidation(HoodieMetadataTableValidator.java:349)
at org.apache.hudi.utilities.HoodieMetadataTableValidator.doHoodieMetadataTableValidationOnce(HoodieMetadataTableValidator.java:324)
at org.apache.hudi.utilities.HoodieMetadataTableValidator.run(HoodieMetadataTableValidator.java:310)
at org.apache.hudi.utilities.HoodieMetadataTableValidator.main(HoodieMetadataTableValidator.java:294)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{code}
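One possible shape for the expected behavior, as a minimal sketch only and not
taken from the Hudi codebase: before comparing, drop from the FS listing any
partition that exists only because of a failed (uncommitted) write. The class
name, the method names, and the uncommittedOnlyPartitions input are all
hypothetical. How the validator would determine that set is an open question
and is not shown here.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names come from the Hudi codebase.
final class PartitionListingCheck {

  private PartitionListingCheck() {
  }

  // Returns the FS partitions, sorted, without those that are assumed to
  // contain only files left behind by failed writes.
  static List<String> committedPartitions(List<String> fsPartitions,
                                          Set<String> uncommittedOnlyPartitions) {
    return fsPartitions.stream()
        .filter(p -> !uncommittedOnlyPartitions.contains(p))
        .sorted()
        .collect(Collectors.toList());
  }

  // Compares the filtered FS listing with the metadata table listing,
  // ignoring the order in which either listing was produced.
  static boolean listingsMatch(List<String> fsPartitions,
                               List<String> metadataPartitions,
                               Set<String> uncommittedOnlyPartitions) {
    List<String> fromFs = committedPartitions(fsPartitions, uncommittedOnlyPartitions);
    List<String> fromMetadata = metadataPartitions.stream()
        .sorted()
        .collect(Collectors.toList());
    return fromFs.equals(fromMetadata);
  }
}
```

With the two listings from the log above and 2021/12/3 passed as the only
uncommitted-only partition, listingsMatch would return true. With an empty set
it would return false, which corresponds to the failure reported here.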
--
This message was sent by Atlassian Jira
(v8.20.1#820001)