imhy opened a new pull request, #11898:
URL: https://github.com/apache/hudi/pull/11898
For a Hudi MOR table with a timestamp column, reads can in some cases fail with the exception:
`org.apache.hadoop.io.LongWritable cannot be cast to
org.apache.hadoop.hive.serde2.io.TimestampWritable`
This happens because the timestamp's long value in the log file is converted to a
`LongWritable`, when it must be converted to a `TimestampWritable`.
Reason: the field `AbstractRealtimeRecordReader.supportTimestamp` is not initialized
when the schema evolution context contains an internal schema.
This example reproduces the problem (tested on Spark 3.3, Hive 3.1):
spark-sql>
set hoodie.schema.on.read.enable=true;
create table hudi_test1 (col0 int, col1 float, col2 string, col3 timestamp)
using hudi
tblproperties (
type='mor',
primaryKey='col0',
preCombineField='col1',
'hoodie.compaction.payload.class'='org.apache.hudi.common.model.OverwriteWithLatestAvroPayload');
insert into hudi_test1 values(1, 1.1, 'text', timestamp('2021-12-25 12:01:01'));
update hudi_test1 set col2 = 'text2' where col0 = 1;
alter table hudi_test1 rename column col2 to col2_new;
hive>
set hoodie.schema.on.read.enable=true;
select * from hudi_test1;
Failed with exception java.io.IOException:java.lang.ClassCastException:
org.apache.hadoop.io.LongWritable cannot be cast to
org.apache.hadoop.hive.serde2.io.TimestampWritableV2
### Change Logs
Fixed the initialization of the `supportTimestamp` property so it is set even when the schema evolution context contains an internal schema.
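To illustrate the nature of the bug (a minimal, self-contained sketch; the class and method names below are hypothetical and simplified, not Hudi's actual code): when initialization branches on whether an evolved internal schema is present, a feature flag set only on one branch silently keeps its default on the other, which is the pattern behind the uninitialized `supportTimestamp` field.

```java
import java.util.Optional;

// Hypothetical sketch of the bug pattern: a flag that should be derived
// from configuration is only assigned on the non-schema-evolution path.
class RecordReaderSketch {

    // Buggy variant: when an internal (evolved) schema is present, the
    // branch returns early and the flag is never read from the config,
    // so it stays at its default (false).
    static boolean initBuggy(Optional<String> internalSchema,
                             boolean confSupportsTimestamp) {
        if (internalSchema.isPresent()) {
            // schema-evolution path: flag never initialized
            return false;
        }
        return confSupportsTimestamp;
    }

    // Fixed variant: derive the flag from the configuration
    // unconditionally, before any schema-evolution branching.
    static boolean initFixed(Optional<String> internalSchema,
                             boolean confSupportsTimestamp) {
        return confSupportsTimestamp;
    }

    public static void main(String[] args) {
        // With schema evolution in play, the buggy path loses the flag.
        System.out.println(initBuggy(Optional.of("evolved"), true));  // false
        System.out.println(initFixed(Optional.of("evolved"), true));  // true
    }
}
```

With the flag correctly initialized, the reader can convert the log file's long value to a `TimestampWritable` instead of a `LongWritable`.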
### Impact
none
### Risk level (write none, low medium or high below)
low
### Documentation Update
none
### Contributor's checklist
- [ ] Read through [contributor's
guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]