[ https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453551#comment-13453551 ]
Jing Zhao commented on HDFS-3912:
---------------------------------
Suresh's comments in HDFS-3703:
bq. However, for the write side, not picking the stale node could result in an
issue, especially for small clusters. That is the reason why I think we should
do the write side changes in a related jira. We should consider making stale
timeout adaptive to the number of nodes marked stale in the cluster as
discussed in the previous comments. Additionally we should consider having a
separate configuration for write skipping the stale nodes.
The more detailed proposal for handling writes is:
For writes, do not use stale datanodes (if possible). To avoid the scenario
where a small T for judging the stale state creates new hotspots in the
cluster, T is proposed to be calculated as:
T = t_c + (number of nodes already marked as stale) / (total number of nodes) *
(T_d - t_c),
where t_c is a constant value initially set in the configuration, and T_d is
the timeout for marking a node as dead (i.e., 10.5 min).
E.g., t_c can be set to 30s. Then, when no or only a few nodes are marked as
stale, we have a small T that satisfies the HBase requirement. If a large
number of nodes are marked as stale, e.g., close to the total number of
nodes, T will be almost T_d (i.e., ~10 min), and the workload can still be
distributed across all live nodes.
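
To make the formula concrete, here is a minimal sketch in Python. The function name and parameter names are hypothetical and chosen only for illustration; they are not HDFS identifiers. The values 30s and 630s (10.5 min) are the example values from this comment.

```python
def adaptive_stale_timeout(t_c, t_d, stale_nodes, total_nodes):
    """Return T = t_c + (stale_nodes / total_nodes) * (t_d - t_c).

    t_c: configured base stale timeout, in seconds
    t_d: timeout for marking a node as dead, in seconds
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= stale_nodes <= total_nodes:
        raise ValueError("stale_nodes must be between 0 and total_nodes")
    # Multiply before dividing; algebraically the same as the formula above.
    return t_c + stale_nodes * (t_d - t_c) / total_nodes


T_C = 30.0   # example base timeout from the comment (30s)
T_D = 630.0  # dead-node timeout from the comment (10.5 min)

# Hypothetical 100-node cluster with a growing number of stale nodes.
for stale in (0, 10, 50, 100):
    print(stale, adaptive_stale_timeout(T_C, T_D, stale, 100))
```

With no stale nodes T equals t_c, and as the stale fraction approaches 1, T approaches T_d, so fewer additional nodes get classified as stale.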
When almost all nodes are marked as stale, the number of remaining non-stale
live nodes may be less than the replication factor. In that case, include
stale nodes as write target candidates.
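
The fallback rule can be sketched as follows. This is only an illustration of the proposal, not the HDFS block placement code; the function name and the (name, is_stale) tuple representation are assumptions made for the example.

```python
def choose_write_candidates(live_nodes, replication):
    """Pick candidate datanodes for a write.

    live_nodes: list of (name, is_stale) tuples for nodes not marked dead
    replication: number of replicas the write needs
    """
    fresh = [name for name, is_stale in live_nodes if not is_stale]
    if len(fresh) >= replication:
        # Enough non-stale nodes: skip the stale ones entirely.
        return fresh
    # Too few non-stale nodes: fall back to including stale nodes,
    # keeping the non-stale ones first in the candidate list.
    stale = [name for name, is_stale in live_nodes if is_stale]
    return fresh + stale


# Hypothetical three-node cluster with one stale node.
nodes = [("dn1", False), ("dn2", False), ("dn3", True)]
print(choose_write_candidates(nodes, 2))
print(choose_write_candidates(nodes, 3))
```

Listing non-stale nodes first is a design choice of this sketch, so that a caller picking targets in order would still prefer them.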
> Detecting and avoiding stale datanodes for writing
> --------------------------------------------------
>
> Key: HDFS-3912
> URL: https://issues.apache.org/jira/browse/HDFS-3912
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Jing Zhao
>
> 1. Make stale timeout adaptive to the number of nodes marked stale in the
> cluster.
> 2. Consider having a separate configuration for write skipping the stale
> nodes.