[
https://issues.apache.org/jira/browse/HDFS-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630365#comment-15630365
]
Andrew Wang commented on HDFS-11090:
------------------------------------
Thanks for the comments, everyone!
To provide a little more context, this is something we ran into for an
ephemeral cluster use case. We're starting a new cluster for the first time. The
% blocks threshold is the default, and the min datanodes threshold is 1. Our
management scripts wait for the NN to leave safemode before setting up
directories and populating HDFS with files like Oozie's sharelib. This is why
the min datanodes threshold is set to 1: that way, the cluster is ready to
receive writes when it leaves safemode.
Even though there are no blocks in the cluster, the NameNode still enters
safemode extension because the min datanodes threshold is set. This adds an
additional 30s to startup. With no blocks, 100% of replicas are trivially
reported, so in this case I'd like to leave safemode as soon as
the min datanodes threshold is met.
Setting the safemode extension to 0 for the first run would work, but that
pushes additional configuration burden onto the user. I was hoping to avoid
that with this JIRA.
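To make the proposal concrete, here is a rough sketch of the decision I have in
mind. It is only an illustration, not the NameNode's actual code or the attached
patch: the class SafeModeSketch, the Decision enum, and the decide() method are
all hypothetical names, and the parameters simply stand in for the % blocks
threshold, the min datanodes threshold, and the safemode extension.

```java
/** Hypothetical sketch of the proposed startup-safemode decision. */
final class SafeModeSketch {
    enum Decision { STAY, EXTEND, LEAVE }

    static Decision decide(long totalBlocks, long reportedBlocks,
                           int liveDatanodes, double thresholdPct,
                           int minDatanodes, long extensionMs) {
        // Assumption: a threshold above 1 means the operator never wants
        // to leave safemode automatically, so keep that behavior.
        if (thresholdPct > 1.0) {
            return Decision.STAY;
        }
        // Number of reported blocks needed to satisfy the % blocks threshold.
        long neededBlocks = (long) Math.ceil(totalBlocks * thresholdPct);
        boolean thresholdsMet = reportedBlocks >= neededBlocks
                && liveDatanodes >= minDatanodes;
        if (!thresholdsMet) {
            return Decision.STAY;
        }
        // Proposed change: if every block has reported in (trivially true
        // for an empty cluster), skip the extension and leave right away.
        boolean allBlocksReported = reportedBlocks >= totalBlocks;
        if (allBlocksReported || extensionMs <= 0) {
            return Decision.LEAVE;
        }
        // Otherwise give the remaining DNs the usual extension period.
        return Decision.EXTEND;
    }

    public static void main(String[] args) {
        // Brand-new cluster: no blocks, one datanode, 30s extension.
        System.out.println(decide(0, 0, 1, 0.999, 1, 30_000L));
        // Cluster with data where some blocks are still unreported.
        System.out.println(decide(10_000, 9_995, 3, 0.999, 1, 30_000L));
    }
}
```

In the ephemeral cluster case above, the first call is the interesting one: with
zero blocks and one live datanode, both thresholds are met and nothing is left
to wait for, so the sketch leaves safemode without the 30s extension.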
> Leave safemode immediately if all blocks have reported in
> ---------------------------------------------------------
>
> Key: HDFS-11090
> URL: https://issues.apache.org/jira/browse/HDFS-11090
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.7.3
> Reporter: Andrew Wang
> Assignee: Yiqun Lin
> Attachments: HDFS-11090.001.patch
>
>
> Startup safemode is governed by two thresholds: % blocks reported in, and
> min # datanodes. Once these two thresholds are met, safemode is extended by
> an additional interval (default 30s) before the NameNode leaves it.
> Safemode extension is helpful when the cluster has data, and the default %
> blocks threshold (0.999) is used. It gives DNs a little extra time to report
> in and thus avoids unnecessary replication work.
> However, we can leave startup safemode early if 100% of blocks have reported
> in.
> Note that operators sometimes change the % blocks threshold to > 1 to never
> automatically leave safemode. We should maintain this behavior.