[ 
https://issues.apache.org/jira/browse/FLINK-27777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Small Wong updated FLINK-27777:
-------------------------------
    Description: 
After setting {*}table.exec.hive.fallback-mapred-writer=false{*}, Flink writes 
to Hive with the native Parquet and ORC writers, but *`parquet.compression`* is 
missing from `{*}formatConf{*}` in class `{*}HiveTableSink{*}`.

`parquet.compression` is present in neither `jobConf` nor 
`sd.getSerdeInfo().getParameters()`. It exists only in the 
`{*}hive table properties{*}`, as the following DDL shows. 

 
{code:sql}
CREATE TABLE `hive_table`(
  `user_id` int,
  `order_amount` double)
PARTITIONED BY (
  `dt` string,
  `hr` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'hdfs://xxxx'
TBLPROPERTIES (
  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
  'sink.partition-commit.delay'='1 h',
  'sink.partition-commit.policy.kind'='metastore,success-file',
  'sink.partition-commit.trigger'='partition-time',
  'transient_lastDdlTime'='1614740641',
  'parquet.compression'='snappy')
{code}
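
One possible direction, sketched here only as an illustration and not as the actual Flink change: build the writer's format configuration from the table properties in addition to `jobConf` and the SerDe parameters. The class `FormatConfSketch`, the method `buildFormatConf`, and the `parquet.`/`orc.` prefix filter are hypothetical names and assumptions, and plain `java.util.Map` stands in for the Hadoop and Hive types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not code from HiveTableSink.
final class FormatConfSketch {

    // Assumption: only format-related keys are copied from the table properties.
    private static final String[] FORMAT_PREFIXES = {"parquet.", "orc."};

    // Merges the three sources; later sources override earlier ones (an assumed precedence).
    static Map<String, String> buildFormatConf(
            Map<String, String> jobConf,
            Map<String, String> serdeParameters,
            Map<String, String> tableProperties) {
        Map<String, String> formatConf = new HashMap<>(jobConf);
        formatConf.putAll(serdeParameters);
        for (Map.Entry<String, String> entry : tableProperties.entrySet()) {
            for (String prefix : FORMAT_PREFIXES) {
                if (entry.getKey().startsWith(prefix)) {
                    formatConf.put(entry.getKey(), entry.getValue());
                    break;
                }
            }
        }
        return formatConf;
    }

    public static void main(String[] args) {
        // Two of the table properties from the DDL above.
        Map<String, String> tableProperties = new HashMap<>();
        tableProperties.put("sink.partition-commit.delay", "1 h");
        tableProperties.put("parquet.compression", "snappy");

        Map<String, String> formatConf =
                buildFormatConf(new HashMap<>(), new HashMap<>(), tableProperties);
        System.out.println(formatConf.get("parquet.compression")); // prints "snappy"
    }
}
```

With this merge order, a value in the table properties would override the same key in `jobConf`; whether that precedence is desirable is a design decision for the real fix.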
 

!image-2022-05-25-20-53-20-412.png!

!image-2022-05-25-20-57-13-241.png!

  was:
After setting {*}table.exec.hive.fallback-mapred-writer=false{*}, Flink writes 
to Hive with the native Parquet and ORC writers, but *`parquet.compression`* is 
missing from `{*}formatConf{*}` in class `{*}HiveTableSink{*}`.

`parquet.compression` is present in neither `jobConf` nor 
`sd.getSerdeInfo().getParameters()`. It exists only in the 
`{*}hive table properties{*}`. 

 

!image-2022-05-25-20-53-20-412.png!

!image-2022-05-25-20-57-13-241.png!


> Can not get the parquet.compression when using native parquet&orc writer to 
> sink hive
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-27777
>                 URL: https://issues.apache.org/jira/browse/FLINK-27777
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Small Wong
>            Priority: Major
>         Attachments: image-2022-05-25-20-53-20-412.png, 
> image-2022-05-25-20-57-13-241.png



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
