[GitHub] [hudi] Zouxxyy commented on a diff in pull request #8955: [HUDI-6367] Fix NPE in HoodieAvroParquetReader and support complex schema with timestamp

via GitHub Wed, 14 Jun 2023 18:39:45 -0700


Zouxxyy commented on code in PR #8955:
URL: https://github.com/apache/hudi/pull/8955#discussion_r1230323622



##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/avro/HoodieAvroParquetReader.java:
##########
@@ -50,24 +51,20 @@ public class HoodieAvroParquetReader extends RecordReader<Void, ArrayWritable> {
   private Schema baseSchema;

Review Comment:
   @cdmikechen Have you tested `select id, ts1 from test_ts_1`? It returns null if we don't use `baseSchema`.
   Below is my full test; feel free to try it.
   
   ```sql
   -- spark-sql
   create table test_ts_1(
     id int, 
     ts1 timestamp)
   using hudi
   tblproperties(
     type='mor', 
     primaryKey='id'
   );
   
   INSERT INTO test_ts_1
   SELECT 1,
   cast ('2021-12-25 12:01:01' as timestamp);
   
   create table test_ts_2(
     id int, 
     ts1 array<timestamp>, 
     ts2 map<string, timestamp>, 
     ts3 struct<province:timestamp, city:string>)
   using hudi
   tblproperties(
     type='mor', 
     primaryKey='id'
   );
   
   INSERT INTO test_ts_2
   SELECT 1,
   array(cast ('2021-12-25 12:01:01' as timestamp)),
   map('key', cast ('2021-12-25 12:01:01' as timestamp)),
   struct(cast ('2021-12-25 12:01:01' as timestamp), 'test');
   
   -- hive
   select * from test_ts_1;
   select id from test_ts_1;
   select ts1 from test_ts_1;
   select id, ts1 from test_ts_1;
   select count(*) from test_ts_1;
   
   select * from test_ts_2;
   select id from test_ts_2;
   select ts1 from test_ts_2;
   select id, ts1 from test_ts_2;
   select count(*) from test_ts_2;
   ```
   
   CC @danny0405 @xicm 



