[ 
https://issues.apache.org/jira/browse/FLINK-27526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-27526.
--------------------------------
    Resolution: Fixed

> Support scaling bucket number for FileStore
> -------------------------------------------
>
>                 Key: FLINK-27526
>                 URL: https://issues.apache.org/jira/browse/FLINK-27526
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table Store
>    Affects Versions: table-store-0.2.0
>            Reporter: Jane Chan
>            Assignee: Jane Chan
>            Priority: Major
>             Fix For: table-store-0.2.0
>
>
> Currently, TableStore does not support changing the number of buckets 
> (set by the config option {{table-storage.bucket}}) once the managed 
> table is created. The reason is that the LSM tree is built at the bucket 
> level, so writing with a different bucket number causes the same record 
> to be hashed to another bucket, which leads to data corruption. In 
> release-0.1, TableStore detects this change and throws an exception when 
> scanning. See FLINK-27316.
> However, this is neither flexible nor user-friendly. Specifically, if the 
> bucket number remains unchanged, the number of files under each bucket 
> grows quickly over time, which slows down the scan that restores the LSM 
> tree and eventually increases read and write latency.
> In this ticket, we aim to support changing the bucket number via {{INSERT 
> OVERWRITE(...)}} to provide a way for users to reorganize the existing data 
> layout.
>  
> {code:sql}
> -- alter catalog metadata
> ALTER TABLE table_identifier SET('bucket' = 'bucket-num');
> -- rescale partition by overwrite
> INSERT OVERWRITE table_identifier [PARTITION (partition_spec)] SELECT ...;
> -- rescale whole table
> INSERT OVERWRITE table_identifier SELECT ...;
> {code}
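
To illustrate why records written under one bucket number cannot simply be read under another, here is a minimal Python sketch. It assumes a bucket is chosen as hash(key) modulo the bucket number, and it uses CRC32 purely as a stand-in hash. The function names and keys are hypothetical and are not Table Store's actual implementation.

```python
import zlib


def bucket_of(key: str, num_buckets: int) -> int:
    """Map a key to a bucket using a stand-in hash (CRC32).

    Assumption: bucket = hash(key) % num_buckets. Table Store's real
    hash function is not reproduced here.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def moved_keys(keys, old_buckets: int, new_buckets: int):
    """Return the keys whose bucket differs between the two bucket numbers."""
    return [
        k for k in keys
        if bucket_of(k, old_buckets) != bucket_of(k, new_buckets)
    ]


# Hypothetical record keys, for illustration only.
keys = [f"order-{i}" for i in range(1000)]
moved = moved_keys(keys, old_buckets=2, new_buckets=4)
print(f"{len(moved)} of {len(keys)} keys change bucket when going from 2 to 4 buckets")
```

Under this assumption, any key in the moved list would be looked up in a bucket whose LSM tree never received it, which is why the existing data has to be rewritten (for example via INSERT OVERWRITE) after the bucket number changes.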



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
