[
https://issues.apache.org/jira/browse/FLINK-30951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
luoyuxia updated FLINK-30951:
-----------------------------
Description:
The issue aims to verify FLINK-29635.
Please verify in batch mode; the documentation is at
[https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/hive/hive_read_write/#file-compaction]:
1. Enable auto-compaction and write some data to a Hive table such that the
average file size is less than compaction.small-files.avg-size (16MB by
default); verify these files are merged.
2. Enable auto-compaction, set compaction.small-files.avg-size to a smaller
value, then write some data to a Hive table such that the average file size
is greater than compaction.small-files.avg-size; verify these files are not
merged.
3. Set sink.parallelism manually and check that the parallelism of the compact
operator equals sink.parallelism.
4. Set compaction.parallelism manually and check that the parallelism of the
compact operator equals compaction.parallelism.
5. Set compaction.file-size and check that the size of each merged target file
is approximately `compaction.file-size`.
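The options named in the steps above can be set as Hive table properties or as per-statement dynamic table options. A minimal sketch, assuming a Hive catalog with the Hive dialect; the table and column names are illustrative, only the option keys come from the Flink documentation referenced above:

```sql
-- Illustrative sketch: table/column names are made up; option keys are
-- the ones listed in the Flink Hive file-compaction documentation.
CREATE TABLE hive_sink_table (
  id INT,
  name STRING
) TBLPROPERTIES (
  'auto-compaction' = 'true',                  -- enable file compaction (step 1/2)
  'compaction.small-files.avg-size' = '16MB',  -- merge threshold (default 16MB)
  'compaction.file-size' = '128MB',            -- target size of merged files (step 5)
  'sink.parallelism' = '4',                    -- step 3
  'compaction.parallelism' = '2'               -- step 4
);

-- The same options may also be supplied per statement via a hint:
INSERT INTO hive_sink_table /*+ OPTIONS('auto-compaction' = 'true') */
SELECT id, name FROM source_table;
```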
We should verify this by writing to a non-partitioned table, a
static-partition table, and a dynamic-partition table.
Example SQL for creating and writing to a Hive table can be found in the
codebase in
[HiveTableCompactSinkITCase]([https://github.com/apache/flink/search?q=HiveTableCompactSinkITCase]).
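Static and dynamic partition writes differ only in whether the partition value is fixed in the INSERT statement. A sketch in Hive-dialect SQL, with illustrative table and column names:

```sql
-- Illustrative sketch; table/column names are made up.
CREATE TABLE hive_part_table (
  id INT,
  name STRING
) PARTITIONED BY (dt STRING) TBLPROPERTIES (
  'auto-compaction' = 'true'
);

-- Static partition: the partition value is fixed in the statement.
INSERT INTO hive_part_table PARTITION (dt = '2023-02-08')
SELECT id, name FROM source_table;

-- Dynamic partition: the partition value comes from the query result.
INSERT INTO hive_part_table
SELECT id, name, dt FROM source_table;
```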
> Release Testing: Verify FLINK-29635 Hive sink should support merge files in
> batch mode
> --------------------------------------------------------------------------------------
>
> Key: FLINK-30951
> URL: https://issues.apache.org/jira/browse/FLINK-30951
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Hive
> Reporter: luoyuxia
> Priority: Blocker
> Fix For: 1.17.0
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)