[ 
https://issues.apache.org/jira/browse/HDFS-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645247#comment-14645247
 ] 

Jian Fang commented on HDFS-3743:
---------------------------------

I was working on other things and have now come back to this JIRA. 

In my use case, I care more about bringing up a replacement JN when the EC2 
instance that a JN was running on is gone. I looked at the format() API, and it 
seems the information required to format a JN is NamespaceInfo. However, that 
information can only be obtained from a running name node; it cannot be 
obtained by running a separate command line, because the directory is locked 
by the name node. Also, the list of IPCLoggerChannels in QJM needs to be 
updated if we don't restart the name node. This makes me think of using the 
HADOOP-7001 support so that QJM calls the format() API when it becomes aware 
that new JNs have been introduced in the hadoop configuration. The running QJM 
has the NamespaceInfo object in memory, and it could update the list of 
IPCLoggerChannels as well if the new JNs are formatted successfully. 
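
To make the idea a bit more concrete, here is a rough, self-contained sketch of 
the flow I have in mind. Every type and method name below (NsInfo, 
JournalChannel, ChannelFactory, reconfigure) is a hypothetical stand-in that I 
made up for illustration; none of it is the actual QJM, IPCLoggerChannel, or 
HADOOP-7001 code, and the real signatures will differ. 

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: these are NOT the real Hadoop classes.
public class QjmReconfigSketch {

    /** Stand-in for the NamespaceInfo the running QJM already holds in memory. */
    public static final class NsInfo {
        public final int namespaceId;
        public final String clusterId;
        public NsInfo(int namespaceId, String clusterId) {
            this.namespaceId = namespaceId;
            this.clusterId = clusterId;
        }
    }

    /** Stand-in for a logger channel to a single JN. */
    public interface JournalChannel {
        void format(NsInfo nsInfo) throws Exception;
    }

    /** Stand-in for whatever creates a channel for a JN address. */
    public interface ChannelFactory {
        JournalChannel create(String jnAddress);
    }

    private final NsInfo nsInfo;
    private final ChannelFactory factory;
    // Insertion-ordered map of JN address -> channel currently in use.
    private final Map<String, JournalChannel> channels = new LinkedHashMap<>();

    public QjmReconfigSketch(NsInfo nsInfo, ChannelFactory factory,
                             List<String> initialJns) {
        this.nsInfo = nsInfo;
        this.factory = factory;
        for (String addr : initialJns) {
            channels.put(addr, factory.create(addr));
        }
    }

    /**
     * Assumed to be invoked when the configuration is reloaded. Formats each
     * JN that is new in the configuration and starts using it only if the
     * format call succeeded. Returns the addresses that were added.
     */
    public synchronized List<String> reconfigure(List<String> configuredJns) {
        List<String> added = new ArrayList<>();
        for (String addr : configuredJns) {
            if (channels.containsKey(addr)) {
                continue; // already known, never re-format an existing JN
            }
            JournalChannel ch = factory.create(addr);
            try {
                ch.format(nsInfo);       // uses the in-memory namespace info
                channels.put(addr, ch);  // update the channel list on success
                added.add(addr);
            } catch (Exception e) {
                // Format failed: leave this JN out of the channel list.
            }
        }
        return added;
    }

    /** Addresses of the JNs currently in the channel list. */
    public synchronized List<String> currentJns() {
        return new ArrayList<>(channels.keySet());
    }
}
```

The sketch only covers adding new JNs. Dropping the channel for the JN that 
went away, and making sure the new JN catches up on segments, are left out. 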

Does this idea make sense at all? 

Thanks.

> QJM: improve formatting behavior for JNs
> ----------------------------------------
>
>                 Key: HDFS-3743
>                 URL: https://issues.apache.org/jira/browse/HDFS-3743
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: QuorumJournalManager (HDFS-3077)
>            Reporter: Todd Lipcon
>
> Currently, the JournalNodes automatically format themselves when a new writer 
> takes over, if they don't have any data for that namespace. However, this has 
> a few problems:
> 1) if the administrator accidentally points a new NN at the wrong quorum (eg 
> corresponding to another cluster), it will auto-format a directory on those 
> nodes. This doesn't cause any data loss, but would be better to bail out with 
> an error indicating that they need to be formatted.
> 2) if a journal node crashes and needs to be reformatted, it should be able 
> to re-join the cluster and start storing new segments without having to fail 
> over to a new NN.
> 3) if 2/3 JNs get accidentally reformatted (eg the mount point becomes 
> undone), and the user starts the NN, it should fail to start, because it may 
> end up missing edits. If it auto-formats in this case, the user might have 
> silent "rollback" of the most recent edits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
