KrzyCube wrote:

Thanks Doug, thanks Konstantin.

I still have some questions about that.

1. Is there any implementation that lets me specify a remote directory as one of the "dfs.name.dir" values?
Or can I map a remote fs to a local path and point the values at that? I think that would be better for backing up the namespace data in case the NameNode's hardware crashes.

Yes, you can mount a remote hard drive and specify a directory on it as an entry of "dfs.name.dir". The name-node will then use it as a backup for the namespace. The drawback is that NFS can get slow, so writing into "edits" can take more time than writing to a local disk.
We used to do this before the secondary name-node was implemented. Now the secondary actually keeps the new image locally, so there is no need to back up through NFS.

2. Yes, we use the SecondaryNameNode to process the edits periodically. Then the "fsimage" on disk grows big. Now assume there is a very big fsimage file on the local disk and we try to start the NameNode. It tries to load the fsimage, and I found there is a HashMap called "activeBlocks" in FSDirectory; I think this is the map used to find and delete files in HDFS.
But if the fsimage is so large that memory cannot hold it, what then? I have not found code that deals with that, but I did find code that deals with edits files larger than 2G.

fsimage cannot grow without bound the way the edits file can, because it is a reflection of what you currently have in memory. If you saved the name-space image to fsimage, then you should be able to load it back into the same amount of memory.
If you use a smaller amount of memory, then yes, loading can fail.
Another thing I can think of is that the image is somehow corrupted.

Konstantin Shvachko wrote:
Hi Martin,

"fsimage" is the file that contains all name space information.
Name-node stores this information in memory and periodically checkpoints it to disk. "edits" contains a log of name space operations that occurred since the last checkpoint. That way, if the name-node fails or otherwise exits between two checkpoints, the name-space can be restored by merging the latest image and the edits.
Property "dfs.name.dir" specifies the directory where fsimage, edits and other name-node
files are stored.
You can specify multiple entries (comma delimited) in "dfs.name.dir". Then all name-space files will be duplicated in these directories. This is the right way to back up the name-space image as a local copy.
Periodic checkpointing is performed by the secondary name-node. A side effect of the checkpoint is that it empties the edits file. Yes, you are right, the edits file can grow big, which is why periodic checkpointing is important.
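
To make the image/edits relationship concrete, here is a minimal toy sketch in Python. It is not Hadoop code: the set-of-paths namespace, the two operation names, and the functions `apply_edit`, `restore` and `checkpoint` are all hypothetical, chosen only to show that replaying the edits over the last image rebuilds the name-space, and that a checkpoint folds the edits into a new image and leaves the log empty.

```python
# Toy model only: the real fsimage and edits files are binary and record
# many more operation types than the two assumed here.

def apply_edit(namespace, edit):
    """Apply one logged operation to an in-memory namespace (a set of paths)."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def restore(image, edits):
    """Rebuild the namespace by replaying the edits on top of the last image."""
    namespace = set(image)  # copy, so the saved image is left untouched
    for edit in edits:
        apply_edit(namespace, edit)
    return namespace


def checkpoint(image, edits):
    """Merge the edits into a new image and return it with an emptied log."""
    return restore(image, edits), []


image = {"/a", "/b"}                          # state at the last checkpoint
edits = [("create", "/c"), ("delete", "/a")]  # operations logged since then

recovered = restore(image, edits)               # what a restart would rebuild
new_image, new_edits = checkpoint(image, edits) # what a checkpoint would keep
```

In this model the image never holds more than the live namespace, while the edits list grows with every operation until a checkpoint resets it.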
Does that answer your questions?

--Konstantin

Doug Cutting wrote:

KrzyCube wrote:

   I found "File[] editFiles" in FSEditLog.java, then traced the call stack and found that it can be configured through multiple entries of "dfs.name.dir". Does this mean the NameNode data can be split into pieces, or does it just set the replication to the number of directories configured? Is this the right way to keep several backup copies of the editlog + fsimage in the local filesystem?
According to the documentation for dfs.name.dir:

http://lucene.apache.org/hadoop/hadoop-default.html#dfs.name.dir

 Determines where on the local filesystem the DFS name node should
 store the name table. If this is a comma-delimited list of directories
 then the name table is replicated in all of the directories, for
 redundancy.
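
The replication behaviour described above (full copies in every directory, not a split of the data) can be illustrated with a small Python sketch. It is an assumption-laden toy, not the FSEditLog implementation: `parse_name_dirs`, `append_to_all` and `read_file`, the directory names, and the text record format are hypothetical.

```python
import os
import tempfile


def parse_name_dirs(value):
    """Split a comma-delimited, dfs.name.dir style value into directories."""
    return [d.strip() for d in value.split(",") if d.strip()]


def append_to_all(name_dirs, file_name, record):
    """Append the same record to file_name in every directory.

    Each directory receives a complete copy; nothing is sharded.
    """
    for d in name_dirs:
        os.makedirs(d, exist_ok=True)
        with open(os.path.join(d, file_name), "a", encoding="utf-8") as f:
            f.write(record + "\n")


def read_file(path):
    """Return the full text content of a file."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Two hypothetical entries: a local disk and a directory on a mounted remote fs.
base = tempfile.mkdtemp()
value = os.path.join(base, "local") + "," + os.path.join(base, "nfs_mount")

name_dirs = parse_name_dirs(value)
append_to_all(name_dirs, "edits", "create /user/data")
append_to_all(name_dirs, "edits", "delete /user/tmp")

copies = [read_file(os.path.join(d, "edits")) for d in name_dirs]
```

Every write goes to each listed directory in turn, which is also why a slow entry such as an NFS mount would slow down every edit in a scheme like this.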

Note that as of release 0.11 there's support for a secondary namenode, started with 'bin/hadoop secondarynamenode'.

http://issues.apache.org/jira/browse/HADOOP-227#action_12458332

This documentation should probably be added to http://lucene.apache.org/hadoop/hdfs_design.html, or at least to the wiki...

Doug



