[ 
https://issues.apache.org/jira/browse/HDFS-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085577#comment-15085577
 ] 

Kihwal Lee commented on HDFS-9279:
----------------------------------

bq. Because the data present in the decommissioning nodes would eventually be 
transferred over to the live nodes. Is this understanding correct?
The replicas on decommissioning nodes are not invalidated even after they 
have been re-replicated, so the capacity tracking was not accurate either. It 
ended up double counting the used space toward the end of decommissioning, 
the point at which the process seems to stall more frequently nowadays (this 
is another topic). If a significant portion of a cluster is decommissioned, 
the stat will look very strange and confuse people. That actually happened to 
us multiple times. The free/total ratio will look considerably smaller than 
the actual value. Monitoring tools 
cannot easily dismiss it as 'Nah.. it's a temporary discrepancy caused by 
decommissioning.'

With this change, the storage capacity stat behaves more like a regular 
under-replication scenario caused by node/disk outages. Additional space will 
be used for re-replicating those blocks, but it is not yet allocated to those 
blocks. That's the actual state of used/usable storage, and the stat reflects 
that now. If we want the stat to reflect what would be used in the future, we 
are talking about a space reservation feature.
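
To make the accounting concrete, here is a rough sketch of the idea, not the 
actual patch. The class, enum, and method names are all hypothetical, and the 
numbers are made up. It sums capacity and used space either over every node 
or over in-service nodes only, so the double counting of used space shows up 
as the difference between the two totals.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration only; none of these types are HDFS classes.
class CapacitySketch {
    enum AdminState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED }

    static final class Node {
        final AdminState state;
        final long capacity; // configured capacity, arbitrary units
        final long used;     // space occupied by replicas, same units

        Node(AdminState state, long capacity, long used) {
            this.state = state;
            this.capacity = capacity;
            this.used = used;
        }
    }

    // A node contributes to the stats if it is in service, or if the caller
    // explicitly asks to include nodes that are leaving service.
    static boolean counts(Node n, boolean includeLeaving) {
        return includeLeaving || n.state == AdminState.IN_SERVICE;
    }

    static long capacity(List<Node> nodes, boolean includeLeaving) {
        long sum = 0;
        for (Node n : nodes) {
            if (counts(n, includeLeaving)) {
                sum += n.capacity;
            }
        }
        return sum;
    }

    static long used(List<Node> nodes, boolean includeLeaving) {
        long sum = 0;
        for (Node n : nodes) {
            if (counts(n, includeLeaving)) {
                sum += n.used;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up scenario: the two in-service nodes already hold the
        // re-replicated copies, while the leaving nodes still hold the
        // original replicas because those are not invalidated.
        List<Node> nodes = Arrays.asList(
            new Node(AdminState.IN_SERVICE, 100, 80),
            new Node(AdminState.IN_SERVICE, 100, 80),
            new Node(AdminState.DECOMMISSIONING, 100, 40),
            new Node(AdminState.DECOMMISSIONED, 100, 40));

        System.out.println("used, leaving nodes included: " + used(nodes, true));
        System.out.println("used, in-service nodes only:  " + used(nodes, false));
        System.out.println("capacity, in-service only:    " + capacity(nodes, false));
    }
}
```

With these made-up numbers, used space is 240 when the leaving nodes are 
included and 160 when only in-service nodes are counted. The extra 80 is the 
same data counted twice.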


> Decomissioned capacity should not be considered for configured/used capacity
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-9279
>                 URL: https://issues.apache.org/jira/browse/HDFS-9279
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.1
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>             Fix For: 3.0.0, 2.8.0
>
>         Attachments: HDFS-9279-v1.patch, HDFS-9279-v2.patch, 
> HDFS-9279-v3.patch, HDFS-9279-v4.patch
>
>
> Capacity of a decommissioned node is being counted in the configured and used 
> capacity metrics. This gives an incorrect perception of cluster usage.
> Once a node is decommissioned, its capacity should be treated like that of a 
> dead node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
