[GitHub] [hudi] zyclove commented on issue #9016: [SUPPORT]spark-sql MOR query error with org.apache.avro.SchemaParseException: Cannot parse schema

via GitHub Tue, 27 Jun 2023 05:21:23 -0700


zyclove commented on issue #9016:
URL: https://github.com/apache/hudi/issues/9016#issuecomment-1609389487


   @xushiyan @ad1happy2go 
   After upgrading to Hudi 0.13.1 and adding the two configurations mentioned 
above, the query ran normally, but it failed again today with the same error.
   Could you please take a look at this issue first? Thank you.
   
   ```java
   23/06/27 12:16:18 ERROR Executor: Exception in task 3.0 in stage 7.0 (TID 99)
   org.apache.avro.SchemaParseException: Cannot parse <null> schema
           at org.apache.avro.Schema.parse(Schema.java:1633)
           at org.apache.avro.Schema$Parser.parse(Schema.java:1430)
           at org.apache.avro.Schema$Parser.parse(Schema.java:1418)
           at org.apache.hudi.common.util.InternalSchemaCache.getInternalSchemaByVersionId(InternalSchemaCache.java:225)
           at org.apache.spark.sql.execution.datasources.parquet.Spark32PlusHoodieParquetFileFormat.$anonfun$buildReaderWithPartitionValues$2(Spark32PlusHoodieParquetFileFormat.scala:159)
           at org.apache.hudi.HoodieDataSourceHelper$.$anonfun$buildHoodieParquetReader$1(HoodieDataSourceHelper.scala:71)
           at org.apache.hudi.HoodieBaseRelation.$anonfun$createBaseFileReader$2(HoodieBaseRelation.scala:569)
           at org.apache.hudi.HoodieBaseRelation$BaseFileReader.apply(HoodieBaseRelation.scala:637)
           at org.apache.hudi.RecordMergingFileIterator.<init>(Iterators.scala:188)
           at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:100)
           at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
           at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
           at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
           at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
           at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
           at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
           at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
           at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
           at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
           at org.apache.spark.scheduler.Task.run(Task.scala:133)
           at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
           at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1474)
           at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   ```


