[ 
https://issues.apache.org/jira/browse/HUDI-7955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

voon updated HUDI-7955:
-----------------------
    Summary: Account for 
WritableTimestampObjectInspector#getPrimitiveJavaObject Hive3 and Hive2 
discrepancies  (was: Account for Hive3 and Hive2 getPrimitiveJavaObject 
discrepancy for WritableTimestampObjectInspector )

> Account for WritableTimestampObjectInspector#getPrimitiveJavaObject Hive3 and 
> Hive2 discrepancies
> -------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-7955
>                 URL: https://issues.apache.org/jira/browse/HUDI-7955
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: voon
>            Assignee: voon
>            Priority: Major
>         Attachments: image-2024-07-05-18-11-33-420.png, 
> image-2024-07-05-18-13-28-135.png
>
>
> Invoking *getPrimitiveJavaObject* returns a different timestamp type in Hive3 
> than in Hive2:
>  - Hive2: *java.sql.Timestamp*
>  - Hive3: *org.apache.hadoop.hive.common.type.Timestamp*
> Hudi common is compiled against Hive2, but Trino runs with Hive3, so the 
> method signature resolved at compile time does not exist at runtime. The 
> failure occurs when all of the following conditions hold:
> 1. A MOR table is used
> 2. The user is querying the _rt table
> 3. The table has a *TIMESTAMP* column and the query reads it
> 4. A merge is required because the record is present in both the Parquet file and the log file
> When all four hold, the error below is thrown:
> {code:java}
> Query 20240704_075218_05052_yfmfc failed: 'java.sql.Timestamp 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.getPrimitiveJavaObject(java.lang.Object)'
> java.lang.NoSuchMethodError: 'java.sql.Timestamp 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.getPrimitiveJavaObject(java.lang.Object)'
>         at 
> org.apache.hudi.hadoop.utils.HiveAvroSerializer.serializePrimitive(HiveAvroSerializer.java:304)
>         at 
> org.apache.hudi.hadoop.utils.HiveAvroSerializer.serialize(HiveAvroSerializer.java:212)
>         at 
> org.apache.hudi.hadoop.utils.HiveAvroSerializer.setUpRecordFieldFromWritable(HiveAvroSerializer.java:121)
>         at 
> org.apache.hudi.hadoop.utils.HiveAvroSerializer.serialize(HiveAvroSerializer.java:108)
>         at 
> org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.convertArrayWritableToHoodieRecord(RealtimeCompactedRecordReader.java:185)
>         at 
> org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.mergeRecord(RealtimeCompactedRecordReader.java:172)
>         at 
> org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:114)
>         at 
> org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.next(RealtimeCompactedRecordReader.java:49)
>         at 
> org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:88)
>         at 
> org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.next(HoodieRealtimeRecordReader.java:36)
>         at 
> io.trino.plugin.hive.GenericHiveRecordCursor.advanceNextPosition(GenericHiveRecordCursor.java:215)
>         at 
> io.trino.spi.connector.RecordPageSource.getNextPage(RecordPageSource.java:88)
>         at 
> io.trino.plugin.hudi.HudiPageSource.getNextPage(HudiPageSource.java:120){code}
> h1. Hive3
> !image-2024-07-05-18-11-33-420.png|width=509,height=572!
> h1. Hive2
> !image-2024-07-05-18-13-28-135.png|width=507,height=501!
>  
> h1. Solution:
> Hive shimming should also be applied to the call to *getPrimitiveJavaObject*.
>  
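
One possible shape for such a shim is sketched below. It is an illustration only, not the change made in Hudi. The JVM links a compiled call site by method name, parameter types, and return type, which is why the Hive2 signature cannot be found on a Hive3 classpath. A reflective lookup matches on name and parameter types only, so it works with either version. The class TimestampShim and the FakeHive3* stand-ins are hypothetical names introduced here so the sketch runs without Hive on the classpath. The toEpochMilli() call assumes the Hive3 timestamp class exposes a method of that name.

```java
import java.lang.reflect.Method;

/** Hypothetical helper (not part of Hudi): reads a timestamp without linking to a Hive-specific return type. */
public final class TimestampShim {

    private TimestampShim() {}

    /** Stand-in for the Hive3 timestamp type, used only to keep this sketch self-contained. */
    public static final class FakeHive3Timestamp {
        private final long millis;
        public FakeHive3Timestamp(long millis) { this.millis = millis; }
        public long toEpochMilli() { return millis; }
    }

    /** Stand-in for a Hive3 inspector whose method returns the non-java.sql timestamp type. */
    public static final class FakeHive3Inspector {
        public FakeHive3Timestamp getPrimitiveJavaObject(Object writable) {
            return new FakeHive3Timestamp((Long) writable);
        }
    }

    /**
     * Calls getPrimitiveJavaObject(Object) reflectively. The lookup uses the method name and
     * parameter types only, so a differing return type cannot raise NoSuchMethodError.
     */
    public static Object getPrimitiveJavaObject(Object inspector, Object writable) {
        try {
            Method m = inspector.getClass().getMethod("getPrimitiveJavaObject", Object.class);
            return m.invoke(inspector, writable);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not call getPrimitiveJavaObject", e);
        }
    }

    /** Normalizes either timestamp flavor to epoch milliseconds. */
    public static long toEpochMillis(Object timestamp) {
        if (timestamp instanceof java.sql.Timestamp) {
            // Hive2 flavor: a plain java.sql.Timestamp.
            return ((java.sql.Timestamp) timestamp).getTime();
        }
        try {
            // Assumed Hive3 flavor: a type exposing toEpochMilli().
            Method m = timestamp.getClass().getMethod("toEpochMilli");
            return (Long) m.invoke(timestamp);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Unsupported timestamp type: " + timestamp.getClass().getName(), e);
        }
    }

    public static void main(String[] args) {
        Object hive3Style = getPrimitiveJavaObject(new FakeHive3Inspector(), 1720173093000L);
        System.out.println(toEpochMillis(hive3Style));
        System.out.println(toEpochMillis(new java.sql.Timestamp(1720173093000L)));
    }
}
```

In a real serializer the Method lookup would likely be cached, since resolving it for every record adds overhead.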



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
