tandonraghav opened a new issue #2165:
URL: https://github.com/apache/hudi/issues/2165
**Describe the problem you faced**
I am using a Spark DataFrame to persist a Hudi table, with Hive sync enabled. Querying the `*_ro` table works fine, but querying the `*_rt` table fails with an exception.
- I am using a custom payload class for `preCombine` and `combineAndGetUpdateValue`, so I have included my jar file in the `${HIVE_HOME}/lib` folder.
- I also tried setting `set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;` and `set hive.fetch.task.conversion=none;` in the Hive session.

Environment:
- Hive: 2.3.7
- Spark: 2.x
- Hudi: 0.6.0 (hudi-hadoop-mr-bundle-0.6.0.jar)

Actual exception: **Caused by: java.lang.ClassCastException: org.apache.hudi.org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.avro.generic.GenericRecord**
The `*_ro` table DDL (from `show create table`):
````
CREATE EXTERNAL TABLE `bhuvan_123_ro`(
`_hoodie_commit_time` string,
`_hoodie_commit_seqno` string,
`_hoodie_record_key` string,
`_hoodie_partition_path` string,
`_hoodie_file_name` string,
`ts_ms` bigint,
`pincode` double,
`image_link` string,
`_id` string,
`op` string,
`a` string,
`b` string,
`c` string,
`d` string,
`e` double)
PARTITIONED BY (
`db_name` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hudi.hadoop.HoodieParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
'file:/tmp/test/hudi-user-data/MOE_PRODUCT_INFO.bhuvan_123'
TBLPROPERTIES (
'last_commit_time_sync'='20201010202918',
'transient_lastDdlTime'='1602341935')
````
Exception:
````
org.apache.hudi.exception.HoodieException: Unable to instantiate payload class
	at org.apache.hudi.common.util.ReflectionUtils.loadPayload(ReflectionUtils.java:78) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.common.util.SpillableMapUtils.convertToHoodieRecordPayload(SpillableMapUtils.java:116) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processDataBlock(AbstractHoodieLogRecordScanner.java:277) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:306) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:239) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:81) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.getMergedLogRecordScanner(RealtimeCompactedRecordReader.java:76) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.<init>(RealtimeCompactedRecordReader.java:55) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.constructRecordReader(HoodieRealtimeRecordReader.java:70) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.<init>(HoodieRealtimeRecordReader.java:47) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat.getRecordReader(HoodieParquetRealtimeInputFormat.java:186) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:376) ~[hive-exec-2.3.7.jar:2.3.7]
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169) ~[hadoop-mapreduce-client-core-2.10.0.jar:?]
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:438) ~[hadoop-mapreduce-client-core-2.10.0.jar:?]
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) ~[hadoop-mapreduce-client-core-2.10.0.jar:?]
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270) ~[hadoop-mapreduce-client-common-2.10.0.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_222]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_222]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_222]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_222]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_222]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_222]
	at org.apache.hudi.common.util.ReflectionUtils.loadPayload(ReflectionUtils.java:76) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
	... 20 more
Caused by: java.lang.ClassCastException: org.apache.hudi.org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.avro.generic.GenericRecord
	at com.moengage.dpm.jobs.MergeHudiPayload.<init>(MergeHudiPayload.java:41) ~[dpm-feed-spark-jobs-1.0.10-rc0.jar:?]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_222]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_222]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_222]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_222]
	at org.apache.hudi.common.util.ReflectionUtils.loadPayload(ReflectionUtils.java:76) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
````
Line where the exception is thrown:
````
public MergeHudiPayload(Option<GenericRecord> record) {
    this(record.isPresent() ? record.get() : null, (record1) -> 0); // natural order
}
````
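For context, my understanding (an assumption on my part, not verified against the bundle sources) is that hudi-hadoop-mr-bundle relocates Avro under `org.apache.hudi.org.apache.avro`, so the record handed to the payload constructor inside Hive is a different `GenericData$Record` class than the unshaded one my jar was compiled against. A minimal self-contained sketch of why such a cast must fail (the class names below are placeholders, not Hudi's):

```java
// Hypothetical stand-ins: ShadedAvro.Record plays the role of the relocated
// org.apache.hudi.org.apache.avro.generic.GenericData$Record, and
// PlainAvro.Record the unshaded org.apache.avro.generic.GenericData$Record.
class ShadedAvro { static class Record {} }
class PlainAvro { static class Record {} }

public class ShadeCastDemo {
    public static void main(String[] args) {
        // What the log record scanner hands to the payload constructor:
        Object fromScanner = new ShadedAvro.Record();
        try {
            // The payload constructor effectively performs this cast; the JVM
            // sees two unrelated types (different fully-qualified names), so
            // it throws ClassCastException even though the classes look alike.
            PlainAvro.Record rec = (PlainAvro.Record) fromScanner;
            System.out.println("cast ok");
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

If that reading is right, the payload jar would need to be built against the same (relocated) Avro classes the bundle uses at runtime, though I have not confirmed which relocation the 0.6.0 bundle applies.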
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]