[
https://issues.apache.org/jira/browse/HDFS-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617249#comment-16617249
]
Sunil Govindan commented on HDFS-12049:
---------------------------------------
As the code freeze for 3.2 has passed, I am moving this Jira to 3.3. Please feel
free to revert if anyone has concerns. Thank you.
> Recommissioning live nodes stalls the NN
> ----------------------------------------
>
> Key: HDFS-12049
> URL: https://issues.apache.org/jira/browse/HDFS-12049
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Daryn Sharp
> Priority: Critical
>
> A node refresh will recommission included nodes that are alive and in the
> decommissioning or decommissioned state. The recommission will scan all
> blocks on the node, find over-replicated blocks, choose an excess replica,
> and queue an invalidation.
> The process is expensive and is worsened by the overhead of storage types
> (even when not in use). It can be especially devastating because the write
> lock is held for the entire node refresh. _Recommissioning 67 nodes with ~500k
> blocks/node stalled RPC services for over 4 minutes._
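For readers unfamiliar with the pattern described above, here is a minimal sketch of why the refresh stalls RPC traffic. It is an illustrative model only, not the NameNode code: the class RecommissionSketch, the Block type, the LOCK field, and the methods scanNode, refreshNodes, and totalBlocksScanned are all hypothetical names. It assumes a single read/write lock standing in for the namesystem lock, held across the scan of every recommissioned node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative model only; none of these names come from the HDFS code base.
final class RecommissionSketch {
    // Hypothetical block: an id plus live and expected replica counts.
    static final class Block {
        final long id;
        final int liveReplicas;
        final int expectedReplicas;

        Block(long id, int liveReplicas, int expectedReplicas) {
            this.id = id;
            this.liveReplicas = liveReplicas;
            this.expectedReplicas = expectedReplicas;
        }
    }

    // Stand-in for the lock that RPC handlers would also need to acquire.
    static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();

    // Scan every block on one node and collect ids of over-replicated blocks.
    static List<Long> scanNode(List<Block> blocks) {
        List<Long> invalidateQueue = new ArrayList<>();
        for (Block b : blocks) {
            if (b.liveReplicas > b.expectedReplicas) {
                invalidateQueue.add(b.id); // queue one excess replica for invalidation
            }
        }
        return invalidateQueue;
    }

    // The write lock is taken once and held until every node has been scanned,
    // so anything else that needs the lock waits for the whole refresh.
    static int refreshNodes(List<List<Block>> nodes) {
        int queued = 0;
        LOCK.writeLock().lock();
        try {
            for (List<Block> node : nodes) {
                queued += scanNode(node).size();
            }
        } finally {
            LOCK.writeLock().unlock();
        }
        return queued;
    }

    // Work done under the lock grows with nodes * blocks per node.
    static long totalBlocksScanned(int nodeCount, long blocksPerNode) {
        return (long) nodeCount * blocksPerNode;
    }

    public static void main(String[] args) {
        List<Block> node = new ArrayList<>();
        node.add(new Block(1L, 4, 3)); // over-replicated
        node.add(new Block(2L, 3, 3)); // correctly replicated
        List<List<Block>> nodes = new ArrayList<>();
        nodes.add(node);
        System.out.println("queued invalidations: " + refreshNodes(nodes));

        long total = totalBlocksScanned(67, 500_000L);
        System.out.println("blocks scanned: " + total);
        System.out.println("blocks/sec if the scan took 240 s: " + total / 240);
    }
}
```

With the figures from the report, 67 nodes at roughly 500k blocks each means about 33.5 million blocks visited while the lock is held. Since the stall lasted over 4 minutes (240 seconds), the scan ran at less than about 140,000 blocks per second.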
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)