linfey90 commented on PR #6456:
URL: https://github.com/apache/hudi/pull/6456#issuecomment-1221973194
> Hi, can we explain in detail what are we trying to fix here ?
When we create a MOR table, for example a table named test1 created through
Spark SQL, the Hive client shows its inputFormat as
HoodieParquetRealtimeInputFormat, because that is the default inputFormat
value. If we then run Hive sync to sync the metadata with the _ro suffix
skipped, we end up with two tables, test1 and test1_rt, and both have
inputFormat set to HoodieParquetRealtimeInputFormat. test1 was created before
the sync, and meta sync does not change the inputFormat of an existing table,
so I changed the default value this time. Of course, we could also fix the
meta sync code in a follow-up. I think we should change the default value of
inputFormat, just as for a COW table.
hive> show create table test1;
CREATE EXTERNAL TABLE `test1`(
……
STORED AS INPUTFORMAT
'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'
……);
hive> show create table test1_rt;
CREATE EXTERNAL TABLE `test1_rt`(
……
STORED AS INPUTFORMAT
'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'
……);
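
To make the mismatch concrete, here is a minimal Python sketch (not part of
this PR) that reads the inputFormat out of `show create table` output and
compares it with what I would expect after syncing with the _ro suffix
skipped. The helper names and the abbreviated DDL strings are hypothetical,
and the expectation that the un-suffixed table should use
org.apache.hudi.hadoop.HoodieParquetInputFormat is my assumption about the
intended behavior.

```python
import re

# Input format class names; REALTIME is the one shown in the Hive output above.
REALTIME = "org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat"
READ_OPTIMIZED = "org.apache.hudi.hadoop.HoodieParquetInputFormat"


def input_format(ddl):
    """Return the class named after STORED AS INPUTFORMAT, or None if absent."""
    match = re.search(r"STORED\s+AS\s+INPUTFORMAT\s+'([^']+)'", ddl, re.IGNORECASE)
    return match.group(1) if match else None


def expected_input_format(table_name):
    # Assumption: with the _ro suffix skipped, the un-suffixed table is the
    # read-optimized view, and only the *_rt table is the real-time view.
    return REALTIME if table_name.endswith("_rt") else READ_OPTIMIZED


def mismatches(ddls):
    """Return the table names whose inputFormat differs from the expected one."""
    return [
        name
        for name, ddl in ddls.items()
        if input_format(ddl) != expected_input_format(name)
    ]


# Abbreviated DDL, shaped like the Hive output above (columns elided).
ddls = {
    "test1": (
        "CREATE EXTERNAL TABLE `test1`(\n"
        "STORED AS INPUTFORMAT\n"
        "'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'\n"
    ),
    "test1_rt": (
        "CREATE EXTERNAL TABLE `test1_rt`(\n"
        "STORED AS INPUTFORMAT\n"
        "'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'\n"
    ),
}

print(mismatches(ddls))  # prints ['test1']
```

Only test1 is reported: test1_rt is expected to use the real-time format,
while test1 still carries the default from table creation.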