[ 
https://issues.apache.org/jira/browse/HUDI-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yuehanwang updated HUDI-3962:
-----------------------------
    Description: 
h1. Flink CDC sink to Hudi fails to add Hive partition fields during Hive sync

 

Steps to reproduce the behavior:

1. Create a MySQL table like:
```
CREATE TABLE `timeTypeTest` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `datetime1` datetime DEFAULT NULL,
  `date1` date DEFAULT NULL,
  `datetime16` datetime(6) DEFAULT NULL,
  `time16` time DEFAULT NULL,
  `timestamp16` timestamp(6) NULL DEFAULT NULL,
  `timestamp16Partition` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_UNIQUE` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1
```

2. Insert a row:
`insert into mydb.timeTypeTest values ('2', '2020-07-30 10:08:22', '2020-07-30', '2020-07-30 10:08:22.000000', '10:08:22', '2020-07-30 10:08:22.000000', '2020-07-30')`

3. Start a Flink CDC job that sinks to Hudi with the following config properties:

```
--hive-sync-enable=true
--hive-sync-jdbc-url=jdbc:hive2://localhost:10000
--hive-sync-db=testDb
--hive-sync-table=testTable
--record-key-field=id
--partition-path-field=timestamp16
--hive-sync-partition-fields=inc_day
--hive-style-partitioning=true
--hive-sync-mode=jdbc
--hive-sync-username=hive
--hive-sync-password=hive

hoodie.deltastreamer.keygen.timebased.timestamp.type=EPOCHMILLISECONDS
hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy-MM-dd
hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled=true
hive_sync.partition_extractor_class=org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator
```
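
To make the expected partition value concrete, here is a minimal sketch of what the `EPOCHMILLISECONDS` / `yyyy-MM-dd` settings above are meant to produce for the inserted row. It is not Hudi's implementation: the helper names `partition_value` and `hive_style_path` are hypothetical, the UTC time zone is an assumption, and the `field=value` path layout is my understanding of hive-style partitioning rather than something confirmed in this report.

```python
from datetime import datetime, timezone


def partition_value(epoch_millis: int, output_format: str = "%Y-%m-%d") -> str:
    """Format an epoch-millisecond timestamp as a date string (UTC assumed)."""
    # Integer split avoids float rounding on large millisecond values.
    seconds, millis = divmod(epoch_millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    moment = moment.replace(microsecond=millis * 1000)
    return moment.strftime(output_format)


def hive_style_path(field: str, value: str) -> str:
    """Build a hive-style partition path segment of the form field=value."""
    return f"{field}={value}"


# timestamp16 from the inserted row, interpreted as UTC (an assumption).
row_ts = datetime(2020, 7, 30, 10, 8, 22, tzinfo=timezone.utc)
epoch_millis = int(row_ts.timestamp()) * 1000

value = partition_value(epoch_millis)
print(value)
print(hive_style_path("timestamp16", value))
```

Under these assumptions the date string matches the partition shown by `show partitions`, while the path segment is keyed by the source column name rather than by _inc_day_.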


**Expected behavior**
A Hive table testTable is created with a string partition field _inc_day_, and 
the partition "2020-07-30" is added. Instead, the partition field is actually 
_timestamp16_ with bigint type, and querying that column returns null:

```
show partitions testTable;  ---- "2020-07-30"
select timestamp16 from testTable; ----- null
```

  was:
h1. https://github.com/apache/hudi/issues/5394#issue-1211958235

 

 


> flink cdc sink hudi failed to add hive partition fields for hive sync
> ---------------------------------------------------------------------
>
>                 Key: HUDI-3962
>                 URL: https://issues.apache.org/jira/browse/HUDI-3962
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: yuehanwang
>            Priority: Major
>   Original Estimate: 1h
>  Remaining Estimate: 1h



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
