[
https://issues.apache.org/jira/browse/FLINK-30951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
luoyuxia updated FLINK-30951:
-----------------------------
Description:
The issue aims to verify FLINK-29635.
Please verify in batch mode; the documentation is at
[https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/hive/hive_read_write/#file-compaction]:
1. Enable auto-compaction and write some data to a Hive table such that the
average file size is less than compaction.small-files.avg-size (16MB by
default); verify these files are merged.
2. Enable auto-compaction, set compaction.small-files.avg-size to a smaller
value, then write some data to a Hive table such that the average file size
is greater than compaction.small-files.avg-size; verify these files are not
merged.
3. Set sink.parallelism manually and check that the parallelism of the compact
operator equals sink.parallelism.
4. Set compaction.parallelism manually and check that the parallelism of the
compact operator equals compaction.parallelism.
5. Set compaction.file-size and check that the size of each merged target file
is approximately `compaction.file-size`.
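The options named in the steps above can be set as Hive table properties or as per-statement dynamic table options. A minimal sketch, assuming a Hive catalog with the Hive dialect; the table and column names are illustrative, only the option keys come from the Flink documentation referenced above:

```sql
-- Illustrative sketch: table/column names are made up; option keys are
-- the ones listed in the Flink Hive file-compaction documentation.
CREATE TABLE hive_sink_table (
  id INT,
  name STRING
) TBLPROPERTIES (
  'auto-compaction' = 'true',                  -- enable file compaction (step 1/2)
  'compaction.small-files.avg-size' = '16MB',  -- merge threshold (default 16MB)
  'compaction.file-size' = '128MB',            -- target size of merged files (step 5)
  'sink.parallelism' = '4',                    -- step 3
  'compaction.parallelism' = '2'               -- step 4
);

-- The same options may also be supplied per statement via a hint:
INSERT INTO hive_sink_table /*+ OPTIONS('auto-compaction' = 'true') */
SELECT id, name FROM source_table;
```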
We should verify this by writing to a non-partitioned table, a
static-partition table, and a dynamic-partition table.
Example SQL for creating and writing to a Hive table can be found in the
codebase in
[HiveTableCompactSinkITCase]([https://github.com/apache/flink/search?q=HiveTableCompactSinkITCase]).
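Static and dynamic partition writes differ only in whether the partition value is fixed in the INSERT statement. A sketch in Hive-dialect SQL, with illustrative table and column names:

```sql
-- Illustrative sketch; table/column names are made up.
CREATE TABLE hive_part_table (
  id INT,
  name STRING
) PARTITIONED BY (dt STRING) TBLPROPERTIES (
  'auto-compaction' = 'true'
);

-- Static partition: the partition value is fixed in the statement.
INSERT INTO hive_part_table PARTITION (dt = '2023-02-08')
SELECT id, name FROM source_table;

-- Dynamic partition: the partition value comes from the query result.
INSERT INTO hive_part_table
SELECT id, name, dt FROM source_table;
```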
> Release Testing: Verify FLINK-29635 Hive sink should support merge files in
> batch mode
> --------------------------------------------------------------------------------------
>
> Key: FLINK-30951
> URL: https://issues.apache.org/jira/browse/FLINK-30951
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Hive
> Reporter: luoyuxia
> Priority: Blocker
> Fix For: 1.17.0
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)