[GitHub] [hudi] hseagle commented on issue #2538: [SUPPORT]MOR doesn't work on version 0.7.0

GitBox Fri, 05 Feb 2021 20:35:13 -0800


hseagle commented on issue #2538:
URL: https://github.com/apache/hudi/issues/2538#issuecomment-774397457



   After set  **spark.sql.hive.convertMetastoreParquet=false**, the duplicated 
key can be removed. 
   
   But another issue is encounted, the column with timestamp data type cann't 
be fetched.  The error message is showed as below
   
   ```log
   Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable 
cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
     at 
org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.getPrimitiveJavaObject(WritableTimestampObjectInspector.java:39)
     at 
org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$14(TableReader.scala:468)
     at 
org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$14$adapted(TableReader.scala:467)
     at 
org.apache.spark.sql.hive.HadoopTableReader$.$anonfun$fillObject$18(TableReader.scala:493)
     at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
     at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
     at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
     at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
     at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
     at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340)
     at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872)
     at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872)
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
     at org.apache.spark.scheduler.Task.run(Task.scala:127)
     at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
     at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
     at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
     at java.lang.Thread.run(Thread.java:748)
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [hudi] hseagle commented on issue #2538: [SUPPORT]MOR doesn't work on version 0.7.0

Reply via email to