[
https://issues.apache.org/jira/browse/HUDI-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-6670:
----------------------------
Description:
Metadata table validator (`HoodieMetadataTableValidator`) throws the following
exception when the data table has a completed rollback but no completed commits
and the metadata table (MDT) has no completed instants. In this case, both the
data table and the MDT are empty, but a bug in the timeline check causes the
validation to fail, which is a false positive.
{code:java}
23/08/08 22:48:13 WARN HoodieMetadataTableValidator: Metadata table is not available to read for now,
org.apache.hudi.exception.HoodieValidationException: There is no completed instant for metadata table.
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.checkMetadataTableIsAvailable(HoodieMetadataTableValidator.java:500)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.doMetadataTableValidation(HoodieMetadataTableValidator.java:405)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.doHoodieMetadataTableValidationOnce(HoodieMetadataTableValidator.java:377)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.run(HoodieMetadataTableValidator.java:362)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.main(HoodieMetadataTableValidator.java:342)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) {code}
Data table timeline
{code:java}
Aug 6 22:57 20230807055713708.deltacommit.requested
Aug 6 22:57 20230807055743816.rollback
Aug 6 22:57 20230807055743816.rollback.inflight
Aug 6 22:57 20230807055743816.rollback.requested {code}
MDT timeline
{code:java}
Aug 6 22:56 00000000000000010.deltacommit.inflight
Aug 6 22:56 00000000000000010.deltacommit.requested
Aug 6 22:57 20230807055748276.restore.inflight
Aug 6 22:57 20230807055748276.restore.requested
Aug 6 22:57 20230807055749482.rollback.inflight
Aug 6 22:57 20230807055749482.rollback.requested {code}
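The condition described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the validator's code or the actual fix. It assumes that a timeline file named `<timestamp>.<action>` with no state suffix is a completed instant (consistent with the completed rollback listed above) and that `commit`, `deltacommit`, and `replacecommit` are the actions that count as commits. The names `parse_instant`, `has_completed_commit`, and `should_skip_validation` are hypothetical.

```python
from typing import Iterable, List, NamedTuple

# Assumption: actions treated as "commits"; the validator may use a different set.
COMMIT_ACTIONS = {"commit", "deltacommit", "replacecommit"}
PENDING_STATES = {"requested", "inflight"}


class Instant(NamedTuple):
    timestamp: str
    action: str
    state: str  # "requested", "inflight", or "completed"


def parse_instant(file_name: str) -> Instant:
    """Parse '<timestamp>.<action>[.<state>]'; no state suffix means completed."""
    parts = file_name.split(".")
    if len(parts) == 2:
        return Instant(parts[0], parts[1], "completed")
    if len(parts) == 3 and parts[2] in PENDING_STATES:
        return Instant(parts[0], parts[1], parts[2])
    raise ValueError(f"unrecognized instant file name: {file_name}")


def has_completed_commit(instants: Iterable[Instant]) -> bool:
    """True if any completed instant is a commit-like action."""
    return any(i.state == "completed" and i.action in COMMIT_ACTIONS for i in instants)


def has_completed_instant(instants: Iterable[Instant]) -> bool:
    """True if any instant, of any action, is completed."""
    return any(i.state == "completed" for i in instants)


def should_skip_validation(data_files: List[str], mdt_files: List[str]) -> bool:
    """Both tables are effectively empty: nothing to compare, so no failure."""
    data = [parse_instant(f) for f in data_files]
    mdt = [parse_instant(f) for f in mdt_files]
    return not has_completed_commit(data) and not has_completed_instant(mdt)


# File names copied from the two timelines listed in this issue.
DATA_TIMELINE = [
    "20230807055713708.deltacommit.requested",
    "20230807055743816.rollback",
    "20230807055743816.rollback.inflight",
    "20230807055743816.rollback.requested",
]
MDT_TIMELINE = [
    "00000000000000010.deltacommit.inflight",
    "00000000000000010.deltacommit.requested",
    "20230807055748276.restore.inflight",
    "20230807055748276.restore.requested",
    "20230807055749482.rollback.inflight",
    "20230807055749482.rollback.requested",
]
```

With these inputs, `should_skip_validation(DATA_TIMELINE, MDT_TIMELINE)` returns `True`: the only completed instant in the data table is a rollback, and the MDT has none. Under this reading the validator would have nothing to compare and should not report a failure.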
was:
Metadata table validator throws the following exception when there is completed
rollback and no completed commits in the data table and there is no completed
instants in the MDT. In this case, both data table and MDT are empty, but the
{code:java}
23/08/08 22:48:13 WARN HoodieMetadataTableValidator: Metadata table is not available to read for now,
org.apache.hudi.exception.HoodieValidationException: There is no completed instant for metadata table.
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.checkMetadataTableIsAvailable(HoodieMetadataTableValidator.java:500)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.doMetadataTableValidation(HoodieMetadataTableValidator.java:405)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.doHoodieMetadataTableValidationOnce(HoodieMetadataTableValidator.java:377)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.run(HoodieMetadataTableValidator.java:362)
    at org.apache.hudi.utilities.HoodieMetadataTableValidator.main(HoodieMetadataTableValidator.java:342)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) {code}
Data table timeline
{code:java}
Aug 6 22:57 20230807055713708.deltacommit.requested
Aug 6 22:57 20230807055743816.rollback
Aug 6 22:57 20230807055743816.rollback.inflight
Aug 6 22:57 20230807055743816.rollback.requested {code}
MDT timeline
{code:java}
Aug 6 22:56 00000000000000010.deltacommit.inflight
Aug 6 22:56 00000000000000010.deltacommit.requested
Aug 6 22:57 20230807055748276.restore.inflight
Aug 6 22:57 20230807055748276.restore.requested
Aug 6 22:57 20230807055749482.rollback.inflight
Aug 6 22:57 20230807055749482.rollback.requested {code}
> Fix timeline check in metadata table validator
> ----------------------------------------------
>
> Key: HUDI-6670
> URL: https://issues.apache.org/jira/browse/HUDI-6670
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Ethan Guo
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)