[ https://issues.apache.org/jira/browse/HDFS-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385559#comment-14385559 ]
Allen Wittenauer commented on HDFS-8003:
----------------------------------------
bq. However, since the saving namespace process can take minutes or even throw
exception, there is no way to guarantee the NN can correctly verify/do
checkpoint before getting stopped.
You can always catch the exceptions and trap the signal from within the
NameNode.
In fact, given the above scenario...
bq. Instead if we add this functionality outside of NN (i.e., into the stopping
NN shell), we can make sure the checkpoint verification happens/finishes before
stopping NameNode, and the RPC timeout can provide a time bound of the
operation.
... doing it in the shell program is even worse. If it gets an exception, the
script is just going to assume that the previous command worked and continue
on its way. The very thing that this hack is supposed to prevent is going to
happen anyway, because there is no way for the shell code to guarantee that
the checkpoint is valid.
Here's a key question: if the checkpoint isn't valid, what is the shell code
supposed to do about it?
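To make the failure mode concrete, here is a minimal sketch in plain shell. Both verify_checkpoint and stop_namenode are hypothetical stand-ins, not functions from the actual hdfs script; the first simply fails the way a command that hit an exception would.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins; neither function exists in the real hdfs script.
verify_checkpoint() {
  # Simulate a verification command that died with an exception.
  echo "verify: exception" >&2
  return 1
}

stop_namenode() {
  echo "namenode stopped"
}

# Unchecked: the failure above is ignored and the stop happens anyway.
unchecked_stop() {
  verify_checkpoint
  stop_namenode
}

# Checked: the script notices the failure, but all it can do is refuse to stop.
checked_stop() {
  if ! verify_checkpoint; then
    echo "checkpoint not verified; not stopping"
    return 1
  fi
  stop_namenode
}
```

In this sketch, even the checked variant only learns that something failed. It cannot tell whether a checkpoint is valid, and refusing to stop is about the only response it has.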
> hdfs has 3 new shellcheck warnings and the related code change is questionable
> ------------------------------------------------------------------------------
>
> Key: HDFS-8003
> URL: https://issues.apache.org/jira/browse/HDFS-8003
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Allen Wittenauer
>
> HDFS-6353 introduced three new shell check warnings due to an unprotected
> ${HADOOP_OPTS}.
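
For readers unfamiliar with this class of warning, the sketch below shows what an unquoted expansion does in general. It assumes the warnings are of the unquoted-variable kind (shellcheck's SC2086); the issue text does not name the warning code, and the HADOOP_OPTS value and count_args helper are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical value for illustration only.
HADOOP_OPTS="-Dexample.a=1 -Dexample.b=2"

# Helper (hypothetical): print how many arguments were received.
count_args() {
  echo "$#"
}

count_args ${HADOOP_OPTS}     # unquoted: the value is split into separate words
count_args "${HADOOP_OPTS}"   # quoted: the value is passed as a single word
```

With the default IFS the first call prints 2 and the second prints 1. Whether quoting is the right fix depends on whether the splitting is intentional, which the issue text here does not say.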