praneethh opened a new issue, #7475:
URL: https://github.com/apache/hudi/issues/7475
I created a Hudi table in Hive, and reading its timestamp column from Hive
fails with the exception below:
`java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
cast to org.apache.hadoop.hive.serde2.io.TimestampWritableV2`
The same column reads without error from spark-shell. How can I resolve the
error when reading from Hive?
Steps to reproduce the behavior:
```
// Run in spark-shell, where spark.implicits._ is already in scope for toDF().
import java.sql.{Date, Timestamp}

case class SimpleData(ts: Timestamp, name: String, email: String,
                      src_recv_dt: Date, recvd_dt: Date)

val df1 = List(
  SimpleData(Timestamp.valueOf("2022-12-02 09:47:00"), "Fake Name 5", "[email protected]",
    Date.valueOf("2022-12-02"), Date.valueOf("2022-12-03")),
  SimpleData(Timestamp.valueOf("2022-12-29 09:47:00"), "Fake Name 4", "[email protected]",
    Date.valueOf("2022-12-29"), Date.valueOf("2022-12-03"))
).toDF().as[SimpleData]
df1.show
+-------------------+-----------+-------------------+-----------+----------+
|                 ts|       name|              email|src_recv_dt|  recvd_dt|
+-------------------+-----------+-------------------+-----------+----------+
|2022-12-02 09:47:00|Fake Name 5|  [email protected]| 2022-12-02|2022-12-03|
|2022-12-29 09:47:00|Fake Name 4|  [email protected]| 2022-12-29|2022-12-03|
+-------------------+-----------+-------------------+-----------+----------+
df1.write.format("hudi").options(Map(
  "hoodie.table.name" -> "rx",
  "hoodie.datasource.write.recordkey.field" -> "name",
  "hoodie.datasource.write.partitionpath.field" -> "recvd_dt",
  "hoodie.datasource.write.operation" -> "upsert",
  "hoodie.payload.ordering.field" -> "ts",
  "hoodie.index.type" -> "GLOBAL_SIMPLE",
  "hoodie.upsert.shuffle.parallelism" -> "1",
  "hoodie.simple.index.update.partition.path" -> "false",
  "hoodie.datasource.write.hive_style_partitioning" -> "true",
  "hoodie.datasource.write.payload.class" -> "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
  "hoodie.datasource.hive_sync.database" -> "stg_ww",
  "hoodie.datasource.hive_sync.table" -> "rx",
  "hoodie.datasource.hive_sync.enable" -> "true",
  "hoodie.datasource.hive_sync.partition_fields" -> "recvd_dt",
  "hoodie.datasource.hive_sync.mode" -> "hms",
  "hoodie.datasource.hive_sync.use_jdbc" -> "false",
  "hoodie.datasource.write.precombine.field" -> "ts",
  "hoodie.schema.on.read.enable" -> "true",
  "hoodie.datasource.hive_sync.support_timestamp" -> "true",
  "hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled" -> "true"
)).mode("append").save("gs://....rx")
hive> select ts from stg_ww.rx;
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritableV2
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.getPrimitiveWritableObject(WritableTimestampObjectInspector.java:34)
    at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:308)
    at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
    at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
    at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
    at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:951)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
    at org.apache.hadoop.hive.ql.exec.LimitOperator.process(LimitOperator.java:63)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
```
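One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the `ts` values reach Hive as raw 64-bit integers (`LongWritable`), which would be consistent with the column being stored as microseconds since the Unix epoch while the synced Hive schema declares it as `timestamp`. The sketch below only illustrates that encoding in plain Scala. `TimestampMicros`, `toMicros`, and `fromMicros` are hypothetical helper names, not Hudi or Hive APIs, and UTC is assumed for the sample value.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

// Hypothetical helpers for illustration only; not part of Hudi or Hive.
object TimestampMicros {
  // Encode an instant as whole microseconds since the Unix epoch.
  def toMicros(instant: Instant): Long =
    ChronoUnit.MICROS.between(Instant.EPOCH, instant)

  // Decode microseconds since the Unix epoch back into an instant.
  def fromMicros(micros: Long): Instant =
    Instant.EPOCH.plus(micros, ChronoUnit.MICROS)
}

// Sample value from the first row above, interpreted as UTC (an assumption).
val instant = Instant.parse("2022-12-02T09:47:00Z")
val micros = TimestampMicros.toMicros(instant)
val roundTrip = TimestampMicros.fromMicros(micros)
```

If the assumption holds, a reader that hands back `micros` unchanged would produce exactly the kind of long value that the Hive timestamp object inspector in the trace cannot cast.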
**Expected behavior**
The query should return the timestamp values of the `ts` column.
**Environment Description**
* Hudi version : 0.12.0
* Spark version : 3.1.3
* Hive version : 3.1.2
* Hadoop version : 3.2.3
* Storage (HDFS/S3/GCS..) : GCS
* Running on Docker? (yes/no) : No