[
https://issues.apache.org/jira/browse/KUDU-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xixu Wang updated KUDU-3447:
----------------------------
Attachment: image-2023-02-13-17-16-50-491.png
> Limit the usage of network bandwidth of tablet copying
> -------------------------------------------------------
>
> Key: KUDU-3447
> URL: https://issues.apache.org/jira/browse/KUDU-3447
> Project: Kudu
> Issue Type: Improvement
> Reporter: Xixu Wang
> Priority: Minor
> Attachments: image-2023-02-09-10-38-50-512.png,
> image-2023-02-09-10-47-58-370.png, image-2023-02-13-17-08-37-256.png,
> image-2023-02-13-17-16-50-491.png
>
>
> Copying tablets from an old cluster to a new cluster with the command
> kudu local_replica copy_from_remote is a resource-intensive operation.
> As the following picture shows, memory usage is as high as 75%, and the
> network is almost fully occupied (the overall network bandwidth is 2Gb/s).
> Disk read throughput is also very high (the overall disk bandwidth is 200MB/s).
> !image-2023-02-09-10-47-58-370.png|width=996,height=369!
> If the data size is very large, the copying process will last for a long
> time. Other services may be impacted and become unavailable. Therefore it
> is better to limit the tablet copying speed to keep the system stable.
> The goal is to balance the tablet copying speed against the impact on other
> services.
> As copy_from_remote mainly downloads data from the remote cluster and
> writes it to the local file system, limiting the download speed is a
> direct way to control the resource consumption. There are several
> algorithms for implementing a rate limiter. This patch will use the token
> bucket algorithm implemented by the Facebook Folly library:
> [https://github.com/facebook/folly/blob/main/folly/TokenBucket.h]
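
For readers unfamiliar with the token bucket algorithm mentioned above, here is a minimal sketch of the idea. It is not Folly's TokenBucket and not the code in the patch: the class name SimpleTokenBucket, its methods, and the explicit now_sec parameter are all hypothetical, chosen only to keep the example self-contained and deterministic. Tokens stand for bytes; the bucket refills at a fixed rate up to a burst capacity, and a download chunk may proceed only if enough tokens are available.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified token bucket for illustration only.
// It is NOT folly::TokenBucket and NOT the Kudu patch itself.
class SimpleTokenBucket {
 public:
  // rate:    tokens (bytes) added per second.
  // burst:   maximum number of tokens the bucket can hold.
  // now_sec: current time in seconds (passed in so the sketch is deterministic).
  SimpleTokenBucket(double rate, double burst, double now_sec)
      : rate_(rate), burst_(burst), tokens_(burst), last_sec_(now_sec) {
    assert(rate > 0.0 && burst > 0.0);
  }

  // Tries to take `bytes` tokens at time `now_sec`.
  // Returns true and deducts the tokens on success; returns false otherwise.
  // A request larger than `burst` can never succeed in this sketch.
  bool TryConsume(double bytes, double now_sec) {
    Refill(now_sec);
    if (bytes > tokens_) return false;
    tokens_ -= bytes;
    return true;
  }

  // Seconds the caller would have to wait until `bytes` tokens are available.
  double SecondsUntilAvailable(double bytes, double now_sec) {
    Refill(now_sec);
    if (bytes <= tokens_) return 0.0;
    return (bytes - tokens_) / rate_;
  }

 private:
  // Adds the tokens earned since the last call, capped at the burst size.
  void Refill(double now_sec) {
    const double elapsed = std::max(0.0, now_sec - last_sec_);
    tokens_ = std::min(burst_, tokens_ + elapsed * rate_);
    last_sec_ = std::max(last_sec_, now_sec);
  }

  double rate_;
  double burst_;
  double tokens_;
  double last_sec_;
};
```

In a copy loop, the downloader would call TryConsume with the size of the next chunk before fetching it, and sleep for roughly SecondsUntilAvailable when the call fails. The rate then bounds the average download bandwidth, while the burst size bounds how far the copy can briefly exceed it.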