[ https://issues.apache.org/jira/browse/FLINK-27526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jane Chan updated FLINK-27526:
------------------------------
    Description: 
Currently, TableStore does not support changing the number of buckets 
(set by the config option {{table-storage.bucket}}) once the managed table is 
created. The reason is that the LSM tree is built at the bucket level, so 
writing with a different bucket number will cause the same record to be hashed 
to a different bucket, which leads to data corruption. In release-0.1, 
TableStore detects this change and throws an exception when scanning. See 
FLINK-27316.
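
The following is a minimal sketch of why a changed bucket number relocates 
records. It assumes a bucket is chosen as a stable hash of the record key 
modulo the bucket count. The names assign_bucket and moved_keys and the use of 
CRC32 are hypothetical and for illustration only; TableStore's actual hash 
function and bucket assignment may differ.

```python
import zlib


def assign_bucket(key: str, num_buckets: int) -> int:
    """Map a record key to a bucket: stable hash modulo the bucket count.

    Assumption: CRC32 stands in for whatever hash the store really uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def moved_keys(keys, old_buckets: int, new_buckets: int):
    """Return the keys whose bucket differs between the two bucket counts."""
    return [
        k for k in keys
        if assign_bucket(k, old_buckets) != assign_bucket(k, new_buckets)
    ]


# Hypothetical record keys, written first with 4 buckets and later with 8.
keys = [f"user-{i}" for i in range(1000)]
moved = moved_keys(keys, old_buckets=4, new_buckets=8)
print(f"{len(moved)} of {len(keys)} keys land in a different bucket")
```

Under this assumption, any key in the moved list would have its older versions 
in one bucket's LSM tree and its newer versions in another, so a bucket-local 
merge could no longer see all versions of that record.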

However, this is neither flexible nor user-friendly. Specifically, if the 
bucket number remains unchanged, the number of files under each bucket will 
grow quickly over time, slowing down the scan that restores the LSM tree 
and ultimately affecting read and write latency.

In this ticket, we aim to support changing the bucket number and provide a way 
for users to reorganize the existing data layout.

  was:
Currently, TableStore does not support changing the number of the bucket 
(denoted by config option 'table-storage.bucket') once the managed table is 
created. The reason is that the LSM tree is built under bucket level, and thus 
writing with different bucket numbers will cause the same record to be hashed 
to another bucket and leads to data corruption. In the release-0.1, TableStore 
will detect this change and throw an exception when scanning. See FLINK-27316.

However, this is not flexible and user-friendly. To be more specific, if the 
bucket number remains unchanged, the number of files under each bucket will 
grow fast as time passes, slowing down the scan speed to restore the LSM tree 
and finally influencing read and write latency.


> Support scaling bucket number for FileStore
> -------------------------------------------
>
>                 Key: FLINK-27526
>                 URL: https://issues.apache.org/jira/browse/FLINK-27526
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table Store
>    Affects Versions: table-store-0.2.0
>            Reporter: Jane Chan
>            Assignee: Jane Chan
>            Priority: Major
>             Fix For: table-store-0.2.0
>
>
> Currently, TableStore does not support changing the number of buckets 
> (set by the config option {{table-storage.bucket}}) once the managed table is 
> created. The reason is that the LSM tree is built at the bucket level, so 
> writing with a different bucket number will cause the same record to be 
> hashed to a different bucket, which leads to data corruption. In release-0.1, 
> TableStore detects this change and throws an exception when scanning. See 
> FLINK-27316.
>
> However, this is neither flexible nor user-friendly. Specifically, if the 
> bucket number remains unchanged, the number of files under each bucket will 
> grow quickly over time, slowing down the scan that restores the LSM tree 
> and ultimately affecting read and write latency.
>
> In this ticket, we aim to support changing the bucket number and provide a 
> way for users to reorganize the existing data layout.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
