[ 
https://issues.apache.org/jira/browse/HDFS-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17216:
--------------------------------
    Description: 
When distcp copies small files (each slightly smaller than the bandwidth limit), 
the throttler only starts to throttle after 1 second, and throttling is applied 
per file. The throttler therefore never takes effect, so distcp fills the 
cluster bandwidth and crowds out production traffic. 

Also, setting up the IO pipeline for each file takes time, so you shouldn't test 
with very small files: the setup cost slows the transfer, and once the bandwidth 
limit kicks in it further amplifies the impact of small files on the rate.



  was:When distcp copies small files (each slightly smaller than the bandwidth 
limit), the throttler only starts to throttle after 1 second, and throttling is 
applied per file. The throttler therefore never takes effect, so distcp fills 
the cluster bandwidth and crowds out production traffic. 


> When distcp handles small files, the bandwidth parameter becomes ineffective, 
> resulting in serious overspeed behavior
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17216
>                 URL: https://issues.apache.org/jira/browse/HDFS-17216
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 3.3.4
>            Reporter: xiaojunxiang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: DiscpAnalyze.jpg
>
>
> When distcp copies small files (each slightly smaller than the bandwidth 
> limit), the throttler only starts to throttle after 1 second, and throttling 
> is applied per file. The throttler therefore never takes effect, so distcp 
> fills the cluster bandwidth and crowds out production traffic. 
> Also, setting up the IO pipeline for each file takes time, so you shouldn't 
> test with very small files: the setup cost slows the transfer, and once the 
> bandwidth limit kicks in it further amplifies the impact of small files on 
> the rate.
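
The behavior described above can be illustrated with a small model. This is 
only a sketch under stated assumptions, not Hadoop's actual code: the class 
PerFileThrottleSketch and all of its methods are hypothetical, and it assumes 
that the rate check divides bytes read by whole elapsed seconds and that every 
file starts with fresh throttle state.

```java
// Hypothetical model of a per-file throttle; NOT the real distcp implementation.
class PerFileThrottleSketch {

    // Assumed rate check: bytes read divided by whole elapsed seconds.
    // During the first second the divisor is 0, so the raw byte count is used.
    static long bytesPerSec(long bytesRead, long elapsedMillis) {
        long elapsedSec = elapsedMillis / 1000;
        return elapsedSec == 0 ? bytesRead : bytesRead / elapsedSec;
    }

    // True if a stream in this state would be forced to sleep.
    static boolean wouldThrottle(long bytesRead, long elapsedMillis, long maxBytesPerSec) {
        return bytesPerSec(bytesRead, elapsedMillis) > maxBytesPerSec;
    }

    // Aggregate rate over a sequence of files, each with fresh throttle state.
    // copyMillisPerFile is the time one file takes when it is not throttled.
    static long aggregateBytesPerSec(int fileCount, long fileSize,
                                     long copyMillisPerFile, long maxBytesPerSec) {
        long totalBytes = 0;
        long totalMillis = 0;
        for (int i = 0; i < fileCount; i++) {
            long millis = copyMillisPerFile;
            if (wouldThrottle(fileSize, copyMillisPerFile, maxBytesPerSec)) {
                // A throttled copy is stretched until it matches the limit.
                millis = Math.max(millis, fileSize * 1000 / maxBytesPerSec);
            }
            totalBytes += fileSize;
            totalMillis += millis;
        }
        return totalBytes * 1000 / totalMillis;
    }

    public static void main(String[] args) {
        long limit = 10L * 1024 * 1024;      // assumed limit: 10 MB/s
        long smallFile = 9L * 1024 * 1024;   // slightly smaller than the limit
        long largeFile = 100L * 1024 * 1024; // much larger than the limit

        // 50 small files, each copied in 100 ms when unthrottled.
        System.out.println(aggregateBytesPerSec(50, smallFile, 100, limit));
        // One large file that would take 1 s when unthrottled.
        System.out.println(aggregateBytesPerSec(1, largeFile, 1000, limit));
    }
}
```

In this model each 9 MB file finishes before the check can exceed the 10 MB/s 
limit, so the 50 small files move at about nine times the limit in aggregate, 
while the single large file is held to the limit. A throttle whose state is 
shared across all files of a task would not have this gap.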



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
