[jira] [Updated] (HDFS-8913) Documentation correction regarding Secondary node, Checkpoint node & Backup node

Ravindra Babu (JIRA) Tue, 18 Aug 2015 05:22:01 -0700

     [ 
https://issues.apache.org/jira/browse/HDFS-8913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ravindra Babu updated HDFS-8913:
--------------------------------
    Assignee:     (was: Ravindra Babu)

> Documentation correction regarding Secondary node, Checkpoint node & Backup 
> node
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-8913
>                 URL: https://issues.apache.org/jira/browse/HDFS-8913
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 2.7.1
>         Environment: Content in documentation
>            Reporter: Ravindra Babu
>            Priority: Minor
>             Fix For: 3.0.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I checked with many people, and almost all of them are confused about the 
> responsibilities of the Secondary NameNode, Checkpoint node and Backup node.
> Link:
> http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
> Confusion:
> Secondary NameNode
> The NameNode stores modifications to the file system as a log appended to a 
> native file system file, edits. When a NameNode starts up, it reads HDFS 
> state from an image file, fsimage, and then applies edits from the edits log 
> file. It then writes new HDFS state to the fsimage and starts normal 
> operation with an empty edits file. Since NameNode merges fsimage and edits 
> files only during start up, the edits log file could get very large over time 
> on a busy cluster. Another side effect of a larger edits file is that next 
> restart of NameNode takes longer.
> Checkpoint Node
> NameNode persists its namespace using two files: fsimage, which is the latest 
> checkpoint of the namespace and edits, a journal (log) of changes to the 
> namespace since the checkpoint. When a NameNode starts up, it merges the 
> fsimage and edits journal to provide an up-to-date view of the file system 
> metadata. The NameNode then overwrites fsimage with the new HDFS state and 
> begins a new edits journal.
> Backup Node
> The Backup node provides the same checkpointing functionality as the 
> Checkpoint node, as well as maintaining an in-memory, up-to-date copy of the 
> file system namespace that is always synchronized with the active NameNode 
> state. Along with accepting a journal stream of file system edits from the 
> NameNode and persisting this to disk, the Backup node also applies those 
> edits into its own copy of the namespace in memory, thus creating a backup of 
> the namespace.
> As described, all three nodes have overlapping functionality. To add to the 
> confusion, 
> http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> states that the NameNode never initiates RPC calls to other nodes.
> The Communication Protocols
> All HDFS communication protocols are layered on top of the TCP/IP protocol. A 
> client establishes a connection to a configurable TCP port on the NameNode 
> machine. It talks the ClientProtocol with the NameNode. The DataNodes talk to 
> the NameNode using the DataNode Protocol. A Remote Procedure Call (RPC) 
> abstraction wraps both the Client Protocol and the DataNode Protocol. By 
> design, the NameNode never initiates any RPCs. Instead, it only responds to 
> RPC requests issued by DataNodes or clients.
> We need clarification on these points. Please improve the documentation to 
> avoid confusing readers:
> 1) Secondary NameNode, Checkpoint node and Backup node: a clear separation 
> of roles.
> 2) For High Availability, do we need only one of them, two of them, or 
> all of them? If not all of them, which combinations are allowed?
> 3) If the NameNode never initiates RPCs to DataNodes, how do reads and 
> writes happen?
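
For readers following the quoted passages, the merge of fsimage and edits that both the Secondary NameNode and Checkpoint Node sections describe can be sketched roughly as below. This is only a toy model under stated assumptions: the namespace is a plain dict, the journal is a list of (op, path, value) tuples, and the names apply_edits and checkpoint are hypothetical, not HDFS APIs or on-disk formats.

```python
# Toy model (assumption): namespace = dict of path -> metadata,
# edits journal = list of (op, path, value) records.
# These are illustrative structures, not the real HDFS file formats.

def apply_edits(fsimage, edits):
    """Replay journal records on top of a checkpoint and return the new namespace."""
    namespace = dict(fsimage)  # copy so the old checkpoint is left untouched
    for op, path, value in edits:
        if op == "create":
            namespace[path] = value
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")
    return namespace


def checkpoint(fsimage, edits):
    """Merge fsimage and edits; return the new fsimage and an empty journal."""
    return apply_edits(fsimage, edits), []


fsimage = {"/a": {"replication": 3}}
edits = [("create", "/b", {"replication": 2}), ("delete", "/a", None)]
new_fsimage, new_edits = checkpoint(fsimage, edits)
```

In this sketch the merge is the same operation regardless of which process runs it; the quoted sections differ mainly in where and when that merge happens.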

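On question 3, one way to picture a NameNode that "only responds to RPC requests" is that clients ask it for block locations and DataNodes pick up instructions in the replies to their own periodic calls. The sketch below illustrates that request/response direction only; the class and method names (NameNode, get_block_locations, queue_command, heartbeat) are hypothetical stand-ins, not the actual ClientProtocol or DataNode Protocol interfaces.

```python
class NameNode:
    """Toy NameNode (assumption): it only answers calls, it never calls anyone."""

    def __init__(self):
        self.block_locations = {}   # path -> list of DataNode ids
        self.pending_commands = {}  # DataNode id -> queued commands

    def get_block_locations(self, path):
        # Client-initiated request: report which DataNodes hold the file.
        return self.block_locations.get(path, [])

    def queue_command(self, datanode_id, command):
        # Internal bookkeeping; nothing is sent until the DataNode asks.
        self.pending_commands.setdefault(datanode_id, []).append(command)

    def heartbeat(self, datanode_id):
        # DataNode-initiated request: the reply carries any queued commands.
        return self.pending_commands.pop(datanode_id, [])


nn = NameNode()
nn.block_locations["/a"] = ["dn1", "dn2"]
nn.queue_command("dn1", "replicate block_1 to dn3")

client_view = nn.get_block_locations("/a")  # the client then contacts DataNodes itself
first_reply = nn.heartbeat("dn1")           # queued command is delivered in the reply
second_reply = nn.heartbeat("dn1")          # nothing left to deliver
```

Here the NameNode object never holds a reference to a client or a DataNode, which is the point of the sketch: every exchange starts from the other side.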


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
