Alexander Filipchik created HUDI-716:
----------------------------------------

             Summary: Exception: Not an Avro data file when running HoodieCleanClient.runClean
                 Key: HUDI-716
                 URL: https://issues.apache.org/jira/browse/HUDI-716
             Project: Apache Hudi (incubating)
          Issue Type: Bug
          Components: DeltaStreamer
            Reporter: Alexander Filipchik


I just upgraded from 0.5 to upstream master and am seeing an error at the end 
of the delta sync run: 

20/03/17 02:13:49 ERROR HoodieDeltaStreamer: Got error running delta sync once. Shutting down
org.apache.hudi.exception.HoodieIOException: Not an Avro data file
    at org.apache.hudi.client.HoodieCleanClient.runClean(HoodieCleanClient.java:144)
    at org.apache.hudi.client.HoodieCleanClient.lambda$clean$0(HoodieCleanClient.java:88)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at org.apache.hudi.client.HoodieCleanClient.clean(HoodieCleanClient.java:86)
    at org.apache.hudi.client.HoodieWriteClient.clean(HoodieWriteClient.java:843)
    at org.apache.hudi.client.HoodieWriteClient.postCommit(HoodieWriteClient.java:520)
    at org.apache.hudi.client.AbstractHoodieWriteClient.commit(AbstractHoodieWriteClient.java:168)
    at org.apache.hudi.client.AbstractHoodieWriteClient.commit(AbstractHoodieWriteClient.java:111)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:395)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:237)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:121)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:294)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Not an Avro data file
    at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:50)
    at org.apache.hudi.common.util.AvroUtils.deserializeAvroMetadata(AvroUtils.java:147)
    at org.apache.hudi.common.util.CleanerUtils.getCleanerPlan(CleanerUtils.java:87)
    at org.apache.hudi.client.HoodieCleanClient.runClean(HoodieCleanClient.java:141)
    ... 24 more

 

The cleaner is attempting to read an old cleanup file (2 months old) and crashing.
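
One way to narrow down which file triggers this: an Avro object container file starts with the four magic bytes 'O', 'b', 'j', 0x01, and a file that is empty or lacks that header cannot be opened as an Avro data file. The sketch below scans a local copy of the timeline directory for such files. It is a diagnostic aid only, not part of Hudi. The helper names, the directory argument, and the "*.clean*" file pattern are assumptions made for illustration, and the check is simplified (it does not accept the legacy Avro 1.2 header).

```python
from pathlib import Path

# Avro object container files begin with these four bytes: 'O', 'b', 'j', 0x01.
AVRO_MAGIC = b"Obj\x01"


def has_avro_magic(path):
    """Return True if the file at `path` starts with the Avro container magic."""
    with open(path, "rb") as f:
        # An empty or truncated file yields fewer than 4 bytes and fails the check.
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC


def find_non_avro_files(timeline_dir, pattern="*.clean*"):
    """List files in `timeline_dir` matching `pattern` that lack the Avro magic.

    Hypothetical helper: `timeline_dir` is assumed to be a local copy of the
    table's metadata directory, and `pattern` is an assumed naming convention
    for clean-related files. Adjust both to match the actual layout.
    """
    return sorted(
        p
        for p in Path(timeline_dir).glob(pattern)
        if p.is_file() and not has_avro_magic(p)
    )
```

For example, calling find_non_avro_files on a downloaded copy of the metadata directory returns the paths whose contents would be rejected by this header check, which can then be compared against the instant the cleaner is trying to read.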

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
