[
https://issues.apache.org/jira/browse/FLINK-27526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jingsong Lee closed FLINK-27526.
--------------------------------
Resolution: Fixed
> Support scaling bucket number for FileStore
> -------------------------------------------
>
> Key: FLINK-27526
> URL: https://issues.apache.org/jira/browse/FLINK-27526
> Project: Flink
> Issue Type: New Feature
> Components: Table Store
> Affects Versions: table-store-0.2.0
> Reporter: Jane Chan
> Assignee: Jane Chan
> Priority: Major
> Fix For: table-store-0.2.0
>
>
> Currently, TableStore does not support changing the number of buckets
> (set by the config option {{table-storage.bucket}}) once the managed
> table is created. The reason is that the LSM tree is built at the bucket
> level, so writing with a different bucket number can hash the same
> record to a different bucket, which leads to data corruption. In
> release-0.1, TableStore detects this change and throws an exception when
> scanning. See FLINK-27316.
> However, this is neither flexible nor user-friendly. Specifically, if the
> bucket number remains unchanged, the number of files under each bucket
> grows quickly over time, which slows down the scan that restores the LSM
> tree and eventually hurts read and write latency.
> In this ticket, we aim to support changing the bucket number via
> {{INSERT OVERWRITE(...)}} to give users a way to reorganize the existing
> data layout.
>
> {code:sql}
> -- alter catalog metadata
> ALTER TABLE table_identifier SET('bucket' = 'bucket-num');
> -- rescale partition by overwrite
> INSERT OVERWRITE table_identifier [PARTITION (partition_spec)] SELECT ...;
> -- rescale whole table
> INSERT OVERWRITE table_identifier SELECT ...;
> {code}
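
To illustrate why the description says that writing with a different bucket number sends the same record to another bucket, here is a minimal sketch of hash-based bucket assignment. It is not Table Store's implementation: the helper names (bucket_of, moved_keys), the sample keys, and the use of CRC32 as the hash are all assumptions made for illustration only.

```python
import zlib


def bucket_of(key: str, num_buckets: int) -> int:
    """Map a key to a bucket index in [0, num_buckets).

    CRC32 is only a stand-in hash; the real hash function is not
    specified in this ticket.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def moved_keys(keys, old_buckets: int, new_buckets: int):
    """Return the keys whose bucket differs between the two bucket counts."""
    return [
        k for k in keys
        if bucket_of(k, old_buckets) != bucket_of(k, new_buckets)
    ]


# Hypothetical record keys, used only for this demonstration.
keys = [f"order-{i}" for i in range(1000)]
moved = moved_keys(keys, old_buckets=4, new_buckets=8)
print(f"{len(moved)} of {len(keys)} keys change bucket going from 4 to 8 buckets")
```

Any key that lands in a different bucket after the change would be written to a different LSM tree than the one holding its earlier versions, which is why the existing data has to be rewritten (here, via INSERT OVERWRITE) rather than only updating the bucket option.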