ganczarek opened a new issue #4666:
URL: https://github.com/apache/hudi/issues/4666


   When downgrading a v2 table to v1 with hudi-cli, Hudi fails while trying to delete a file 
that doesn't exist (see the stack trace below). I suspect it would be safe to simply ignore 
files that don't exist during deletion.
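   To illustrate the "ignore missing paths" idea: the failure happens because the marker directory is listed without first checking that it exists. A minimal sketch of a tolerant cleanup, using plain `java.nio.file` rather than Hudi's actual `FSUtils`/Hadoop `FileSystem` code (the class and method names here are made up for illustration):
   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.stream.Stream;

   public class TolerantMarkerCleanup {

       // Delete every entry in markerDir and then the directory itself,
       // but treat a missing directory as "nothing to do" instead of letting
       // the listing call throw FileNotFoundException, as in the trace below.
       public static void deleteMarkerDirIfPresent(Path markerDir) throws IOException {
           if (!Files.isDirectory(markerDir)) {
               return; // already gone: the situation the downgrade hits on S3
           }
           try (Stream<Path> entries = Files.list(markerDir)) {
               for (Path p : (Iterable<Path>) entries::iterator) {
                   Files.deleteIfExists(p);
               }
           }
           Files.deleteIfExists(markerDir);
       }

       public static void main(String[] args) throws IOException {
           // Calling it on a path that does not exist is a no-op, not an error.
           deleteMarkerDirIfPresent(Path.of("/tmp/no-such-marker-dir-4666"));

           // It still removes a marker directory that does exist.
           Path dir = Files.createTempDirectory("markers");
           Files.createFile(dir.resolve("MARKERS0"));
           deleteMarkerDirIfPresent(dir);
           System.out.println(Files.exists(dir)); // prints "false"
       }
   }
   ```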
   
   I don't know how I ended up with only 
`.hoodie/20220121153537.commit.requested` and no corresponding 
`.hoodie/.temp/20220121153537`, but manually deleting 
`.hoodie/20220121153537.commit.requested` fixed the problem, though that may have 
dire consequences later.
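   For anyone hitting the same state: before deleting anything by hand, a quick consistency check can list every instant that has a `*.commit.requested` file without a matching `.temp/<instant>` marker directory. This is a hypothetical sketch in plain Java, meant to be run against a locally synced copy of `.hoodie` (e.g. via `aws s3 sync`), not against S3 directly:
   ```java
   import java.io.IOException;
   import java.nio.file.DirectoryStream;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.ArrayList;
   import java.util.List;

   public class OrphanedRequestedScan {

       // List instants that have an <instant>.commit.requested file under
       // hoodieDir but no matching .temp/<instant> marker directory.
       public static List<String> findOrphans(Path hoodieDir) throws IOException {
           List<String> orphans = new ArrayList<>();
           try (DirectoryStream<Path> files =
                    Files.newDirectoryStream(hoodieDir, "*.commit.requested")) {
               for (Path f : files) {
                   String name = f.getFileName().toString();
                   String instant =
                       name.substring(0, name.length() - ".commit.requested".length());
                   if (!Files.isDirectory(hoodieDir.resolve(".temp").resolve(instant))) {
                       orphans.add(instant);
                   }
               }
           }
           return orphans;
       }

       public static void main(String[] args) throws IOException {
           // Fixture mimicking the reported state: one healthy instant, one orphan.
           Path hoodie = Files.createTempDirectory("hoodie");
           Files.createDirectories(hoodie.resolve(".temp").resolve("111"));
           Files.createFile(hoodie.resolve("111.commit.requested"));
           Files.createFile(hoodie.resolve("20220121153537.commit.requested"));
           System.out.println(findOrphans(hoodie)); // prints "[20220121153537]"
       }
   }
   ```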
   
   **Environment Description**
   
   * Hudi version : 0.10.0
   * Spark version : 3.1.1
   * Hadoop version : 3.2.1
   * Storage : S3
   * Running on Docker? : no
   
   **Additional context**
   
   I built hudi-cli myself from `tags/release-0.10.0-rc2` with the following 
flags:
   ```
   mvn clean package -DskipTests -Dscala-2.12 -Dspark3
   ```
   
   Downgrade command in hudi-cli:
   ```
   downgrade table --toVersion ONE --sparkProperties 
/etc/spark/conf/spark-defaults.conf --sparkMaster local
   ```
   
   **Stacktrace**
   
   ```
   22/01/21 16:25:31 WARN SparkMain: Failed: Could not upgrade/downgrade table at "s3://bucket/table" to version "ONE".
   org.apache.hudi.exception.HoodieIOException: File s3://bucket/table/.hoodie/.temp/20220121153537 does not exist.
       at org.apache.hudi.common.fs.FSUtils.parallelizeSubPathProcess(FSUtils.java:684)
       at org.apache.hudi.table.upgrade.TwoToOneDowngradeHandler.deleteTimelineBasedMarkerFiles(TwoToOneDowngradeHandler.java:129)
       at org.apache.hudi.table.upgrade.TwoToOneDowngradeHandler.convertToDirectMarkers(TwoToOneDowngradeHandler.java:120)
       at org.apache.hudi.table.upgrade.TwoToOneDowngradeHandler.downgrade(TwoToOneDowngradeHandler.java:67)
       at org.apache.hudi.table.upgrade.UpgradeDowngrade.downgrade(UpgradeDowngrade.java:155)
       at org.apache.hudi.table.upgrade.UpgradeDowngrade.run(UpgradeDowngrade.java:125)
       at org.apache.hudi.cli.commands.SparkMain.upgradeOrDowngradeTable(SparkMain.java:462)
       at org.apache.hudi.cli.commands.SparkMain.main(SparkMain.java:230)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
       at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
       at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
       at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
       at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
       at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1038)
       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1047)
       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: java.io.FileNotFoundException: File s3://bucket/table/.hoodie/.temp/20220121153537 does not exist.
       at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:709)
       at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:636)
       at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.listStatus(EmrFileSystem.java:473)
       at org.apache.hudi.common.fs.FSUtils.parallelizeSubPathProcess(FSUtils.java:677)
       ... 19 more
   ```
   
   

