[
https://issues.apache.org/jira/browse/HDFS-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878771#comment-16878771
]
Stephen O'Donnell commented on HDFS-14624:
------------------------------------------
I made a change to one more log message. There is a setting
"dfs.namenode.decommission.max.concurrent.tracked.nodes" that determines the
max number of nodes which can be in maintenance or transitioning to
decommission state at any given time. Others are queued until there is a free
slot.
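As a rough illustration of that queueing behaviour, a cap on tracked nodes with a pending queue might look like the sketch below. This is not the actual DatanodeAdminManager code: the class TrackedNodeLimiter and its methods are hypothetical, and the limit is passed in as a plain constructor argument rather than read from the configuration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch only: NOT the real DatanodeAdminManager implementation.
public final class TrackedNodeLimiter {
    // Stands in for dfs.namenode.decommission.max.concurrent.tracked.nodes.
    private final int maxTracked;
    private final List<String> tracked = new ArrayList<>();
    private final Queue<String> pending = new ArrayDeque<>();

    public TrackedNodeLimiter(int maxTracked) {
        this.maxTracked = maxTracked;
    }

    /** Queue a node; it only becomes tracked once a slot is free. */
    public void submit(String node) {
        pending.add(node);
        fillSlots();
    }

    /** A tracked node has finished: free its slot and promote a queued node. */
    public void complete(String node) {
        if (tracked.remove(node)) {
            fillSlots();
        }
    }

    // Move queued nodes into the tracked set while there is spare capacity.
    private void fillSlots() {
        while (tracked.size() < maxTracked && !pending.isEmpty()) {
            tracked.add(pending.remove());
        }
    }

    public int trackedCount() {
        return tracked.size();
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

With a limit of 2 and three submitted nodes, two are tracked and one waits in the queue until one of the tracked nodes completes.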
The existing log message, which is printed on each 'tick' of the monitor,
would be more useful if it also reported the total number of nodes being
monitored and how many are queued. That would help debug why some nodes are
not making any progress when too many are already transitioning. That means
this message:
{code:java}
2019-07-04 16:44:21,728 INFO blockmanagement.DatanodeAdminManager Checked 499730 blocks and 1 nodes this tick{code}
Would become:
{code:java}
2019-07-04 16:44:21,728 INFO blockmanagement.DatanodeAdminManager Checked 499730 blocks and 1 nodes this tick. 5 nodes are now in maintenance or transitioning state. 0 nodes pending.{code}
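For illustration, the extended message could be assembled roughly as in the sketch below. The class TickMessageSketch, the method tickMessage and its parameter names are hypothetical and are not taken from the patch; the real code would log through the existing logger rather than build a String by hand.

```java
import java.util.Locale;

// Hypothetical sketch: names are illustrative, not taken from the patch.
public final class TickMessageSketch {

    /** Builds the extended per-tick message from the four counters. */
    static String tickMessage(long blocksChecked, int nodesChecked,
                              int trackedNodes, int pendingNodes) {
        return String.format(Locale.ROOT,
            "Checked %d blocks and %d nodes this tick. "
                + "%d nodes are now in maintenance or transitioning state. "
                + "%d nodes pending.",
            blocksChecked, nodesChecked, trackedNodes, pendingNodes);
    }

    public static void main(String[] args) {
        // Same counters as the example log line above.
        System.out.println(tickMessage(499730, 1, 5, 0));
    }
}
```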
I cannot think of any further messages to add at this time, so that should be
all for this Jira.
> When decommissioning a node, log remaining blocks to replicate periodically
> ---------------------------------------------------------------------------
>
> Key: HDFS-14624
> URL: https://issues.apache.org/jira/browse/HDFS-14624
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.3.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-14624.001.patch, HDFS-14624.002.patch
>
>
> When a node is marked for decommission, there is a monitor thread which runs
> every 30 seconds by default, and checks if the node still has pending blocks
> to be replicated before the node can complete decommission.
> There are two existing debug level messages logged in the monitor thread,
> DatanodeAdminManager$Monitor.check(), which log the correct information
> already, first as the pending blocks are replicated:
> {code:java}
> LOG.debug("Node {} still has {} blocks to replicate "
>     + "before it is a candidate to finish {}.",
>     dn, blocks.size(), dn.getAdminState());{code}
> And then after the initial set of blocks has completed and a rescan happens:
> {code:java}
> LOG.debug("Node {} {} healthy."
>     + " It needs to replicate {} more blocks."
>     + " {} is still in progress.", dn,
>     isHealthy ? "is" : "isn't", blocks.size(), dn.getAdminState());{code}
> I would like to propose moving these messages to INFO level so it is easier
> to monitor decommission progress over time from the Namenode log.
> Based on the default settings, this would result in at most 1 log message per
> node being decommissioned every 30 seconds. It is "at most" because the
> monitor thread stops each run after checking 500K blocks, so in practice
> there could be as few as 1 log message per 30 seconds, even if many DNs are
> being decommissioned at the same time.
> Note that the namenode webUI does display the above information, but having
> this in the NN logs would allow progress to be tracked more easily.