[ https://issues.apache.org/jira/browse/HDFS-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuze Yang updated HDFS-17274:
-----------------------------
    Description: 
When dataNodePeerStats and excludeSlowNodes are enabled, HDFS identifies slow 
datanodes and excludes them when choosing targets for block placement. Avoiding 
slow datanodes gives better performance. However, writes may fail once slow 
datanodes are excluded. Consider the following scenarios:
 * Cluster A has 4 datanodes, named dn0, dn1, dn2, dn3. At some point, dn0 is 
detected as slow, and dn1, dn2, dn3 become unavailable due to some errors. 
File writes will then fail.
 * Cluster A has 4 datanodes, named dn0, dn1, dn2, dn3. dn0 has both SSD and 
HDD disks, while dn1, dn2, dn3 have only SSD disks. At some point, dn0 is 
detected as slow. File writes will then fail when using the default storage 
policy "HOT", which places replicas on DISK (HDD) storage.

In these situations, I think slow datanodes should still be eligible for 
selection; that is more reasonable than failing the write.
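
The fallback proposed here could look roughly like the Java sketch below. It 
is only an illustration of the idea and not the actual HDFS block placement 
code: the class SlowNodeFallback, the method chooseTargets, and the use of 
plain strings for datanodes are all hypothetical. It assumes the candidate 
list has already been filtered down to datanodes that are reachable and have 
a matching storage type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a two-pass target choice; not real HDFS code. */
public class SlowNodeFallback {

    /**
     * Picks up to {@code wanted} targets from {@code candidates}.
     * Slow nodes are skipped in the first pass and are used only when
     * the first pass cannot supply enough targets.
     */
    public static List<String> chooseTargets(List<String> candidates,
                                             Set<String> slowNodes,
                                             int wanted) {
        List<String> chosen = new ArrayList<>();

        // Pass 1: prefer datanodes that are not reported as slow.
        for (String dn : candidates) {
            if (chosen.size() == wanted) {
                break;
            }
            if (!slowNodes.contains(dn)) {
                chosen.add(dn);
            }
        }

        // Pass 2: if still short, fall back to the slow datanodes
        // instead of failing the write.
        for (String dn : candidates) {
            if (chosen.size() == wanted) {
                break;
            }
            if (slowNodes.contains(dn)) {
                chosen.add(dn);
            }
        }
        return chosen;
    }
}
```

In both scenarios above, dn0 is the only usable candidate, so the first pass 
finds nothing and the second pass returns dn0. When enough normal datanodes 
exist, the slow one is never picked.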

> slow datanodes should be chosen when no more normal datanodes are available
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-17274
>                 URL: https://issues.apache.org/jira/browse/HDFS-17274
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Xuze Yang
>            Priority: Major


