[
https://issues.apache.org/jira/browse/HDFS-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
xiaojunxiang updated HDFS-17216:
--------------------------------
Description:
When distcp copies small files (file size slightly smaller than the configured
bandwidth), the throttler only starts to throttle after 1 second, and the
throttling is specific to a single file. So the throttler has no effect, causing
distcp to fill the cluster bandwidth and crush production traffic, which is a
terrible thing.
Also, it takes time to set up the IO pipeline for each file, so you shouldn't
test with very small files: the setup overhead slows the transfer, and once the
bandwidth limit kicks in it amplifies the impact of small files on the rate.
was: When distcp copies small files (file size slightly smaller than the
configured bandwidth), the throttler only starts to throttle after 1 second,
and the throttling is specific to a single file. So the throttler has no effect,
causing distcp to fill the cluster bandwidth and crush production traffic,
which is a terrible thing.
> When distcp handle the small files, the bandwidth parameter will be invalid,
> resulting in serious overspeed behavior
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-17216
> URL: https://issues.apache.org/jira/browse/HDFS-17216
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: distcp
> Affects Versions: 3.3.4
> Reporter: xiaojunxiang
> Priority: Major
> Labels: pull-request-available
> Attachments: DiscpAnalyze.jpg
>
>
> When distcp copies small files (file size slightly smaller than the
> configured bandwidth), the throttler only starts to throttle after 1 second,
> and the throttling is specific to a single file. So the throttler has no
> effect, causing distcp to fill the cluster bandwidth and crush production
> traffic, which is a terrible thing.
> Also, it takes time to set up the IO pipeline for each file, so you shouldn't
> test with very small files: the setup overhead slows the transfer, and once
> the bandwidth limit kicks in it amplifies the impact of small files on the
> rate.
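
The behavior described above can be reproduced with a small model. This is only
a sketch, not the distcp source: it assumes the throttler measures the rate as
bytes read divided by whole elapsed seconds since the file's stream was opened,
and that a new throttler is created for every file. The class name, method
names, the 50 ms polling interval, and all sizes and timings are hypothetical.

```java
public class PerFileThrottleSketch {
    static final long MB = 1024L * 1024L;
    // Assumed polling interval while the throttler waits (hypothetical value).
    static final long SLEEP_MS = 50;

    /** Rate seen by a per-file throttler: bytes / whole elapsed seconds. */
    static long perFileRate(long bytesRead, long elapsedMillis) {
        long elapsedSec = elapsedMillis / 1000; // sub-second time truncates to 0
        return elapsedSec == 0 ? bytesRead : bytesRead / elapsedSec;
    }

    /** True if the throttler would pause the copy at this point. */
    static boolean wouldThrottle(long bytesRead, long elapsedMillis,
                                 long maxBytesPerSec) {
        return perFileRate(bytesRead, elapsedMillis) > maxBytesPerSec;
    }

    /**
     * Simulated wall time for one file: the unthrottled copy time plus
     * whatever sleeping is needed until the measured rate is within the limit.
     * Simplification: the check is modeled once, after all bytes are read.
     */
    static long copyMillis(long fileSize, long rawMillis, long maxBytesPerSec) {
        long elapsed = rawMillis;
        while (wouldThrottle(fileSize, elapsed, maxBytesPerSec)) {
            elapsed += SLEEP_MS; // simulated sleep, no real waiting
        }
        return elapsed;
    }

    /** Real aggregate rate when every file gets a fresh throttler. */
    static long aggregateBytesPerSec(int fileCount, long fileSize,
                                     long rawMillis, long maxBytesPerSec) {
        long totalBytes = 0;
        long totalMillis = 0;
        for (int i = 0; i < fileCount; i++) {
            totalBytes += fileSize;
            totalMillis += copyMillis(fileSize, rawMillis, maxBytesPerSec);
        }
        return totalBytes * 1000 / totalMillis;
    }

    public static void main(String[] args) {
        long limit = 10 * MB; // assumed 10 MB/s limit
        // 100 files of 9 MB, each copied in 100 ms when unthrottled.
        long small = aggregateBytesPerSec(100, 9 * MB, 100, limit);
        // One 100 MB file that would take 1 s when unthrottled.
        long large = aggregateBytesPerSec(1, 100 * MB, 1000, limit);
        System.out.println("limit       (MB/s): " + limit / MB);
        System.out.println("small files (MB/s): " + small / MB);
        System.out.println("large file  (MB/s): " + large / MB);
    }
}
```

Under these assumptions a 9 MB file never exceeds the 10 MB/s limit inside its
own first second, so it is never paused, and 100 such files move at nine times
the limit in aggregate. The single 100 MB file is held to the limit as intended.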