[ 
https://issues.apache.org/jira/browse/HDFS-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385559#comment-14385559
 ] 

Allen Wittenauer commented on HDFS-8003:
----------------------------------------

bq. However, since the saving namespace process can take minutes or even throw 
exception, there is no way to guarantee the NN can correctly verify/do 
checkpoint before getting stopped.

You can always catch the exceptions and trap the signal from within the 
NameNode itself.

In fact, given the above scenario... 

bq. Instead if we add this functionality outside of NN (i.e., into the stopping 
NN shell), we can make sure the checkpoint verification happens/finishes before 
stopping NameNode, and the RPC timeout can provide a time bound of the 
operation.

... doing it in the shell program is even worse.  If the command hits an 
exception, the script is just going to assume that the previous command worked 
and continue on its way.  The very thing that this hack is supposed to prevent 
is going to happen anyway, because the shell code has no way to guarantee that 
the checkpoint is valid.
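
To make the failure mode concrete, here is a minimal sketch. The function names are hypothetical stand-ins and are not taken from the HDFS-6353 patch; `verify_checkpoint` simply simulates a checkpoint command that fails.

```shell
# Hypothetical stand-in for a checkpoint/verification command that fails
# (for example, because the RPC threw an exception or timed out).
verify_checkpoint() {
  return 1
}

# Unchecked: the failure is silently ignored and the stop proceeds.
stop_unchecked() {
  verify_checkpoint
  echo "stopping namenode"
}

# Checked: the script notices the failure, but all it can do is bail out.
# It still has no way to make the checkpoint valid.
stop_checked() {
  if ! verify_checkpoint; then
    echo "checkpoint verification failed" >&2
    return 1
  fi
  echo "stopping namenode"
}

unchecked_out="$(stop_unchecked)"
checked_out="$(stop_checked 2>/dev/null)" && checked_rc=0 || checked_rc=$?
```

Even the checked variant only turns the problem into a refusal to stop, which is why the question of what the shell code should do next matters.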

Here's a key question: if the checkpoint isn't valid, what is the shell code 
supposed to do about it?

> hdfs has 3 new shellcheck warnings and the related code change is questionable
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-8003
>                 URL: https://issues.apache.org/jira/browse/HDFS-8003
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Allen Wittenauer
>
> HDFS-6353 introduced three new shell check warnings due to an unprotected 
> ${HADOOP_OPTS}.
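
For readers unfamiliar with this class of warning, the sketch below shows what an unprotected expansion does. It assumes the warnings are of the word-splitting kind that shellcheck reports as SC2086; the exact flagged lines are not quoted in this message, and the option values and the `count_args` helper are made up for illustration.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() {
  echo "$#"
}

# Example value only; real HADOOP_OPTS contents will differ.
HADOOP_OPTS="-Dfoo=bar -Dbaz=qux"

# Unquoted: the shell word-splits (and glob-expands) the value.
unquoted_count="$(count_args ${HADOOP_OPTS})"

# Quoted: the value is passed through as a single argument.
quoted_count="$(count_args "${HADOOP_OPTS}")"

echo "unquoted=${unquoted_count} quoted=${quoted_count}"
```

Where splitting into separate JVM options is actually intended, quoting changes behavior, so the usual alternatives are an array or an explicit `# shellcheck disable=SC2086` directive on the affected line.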



