[ 
http://issues.apache.org/jira/browse/HADOOP-227?page=comments#action_12455497 ] 
            
Konstantin Shvachko commented on HADOOP-227:
--------------------------------------------

Merging fsimage with the edits can be done using O(sqrt(number of files)) 
memory.

Suppose the number of files in fsimage (sorted by path name) is N.
I divide fsimage into blocks so that each block has B=sqrt(N) namespace entries.
The number of such blocks will then also be M=sqrt(N).
For each block we store in memory the path name of the first entry of the 
block and the block offset.
I then start reading the edits file. For every operation in edits I use the 
in-memory table to read the appropriate block from fsimage, look up the 
matching entry, and perform the operation on the corresponding file.
Update operations are performed in place; a remove just leaves free space 
in the block. When a new entry needs to be added, the current block is split 
into two new blocks, each containing half of the records of the original 
block, and both are stored at the end of the fsimage file.
The in-memory table is also updated to reflect the new keys and new block 
offsets.
This algorithm needs to keep in memory the table of size M and one block of 
size B.
The total size of memory used is M + B = O(sqrt(N)).
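
A minimal sketch of the scheme described above, written as a toy Python 
model rather than actual name-node code. The class name BlockedImage, the 
method names, and the "set"/"remove" operation names are hypothetical. 
Assumptions not stated in the comment: the "disk" is a Python list whose 
index stands in for the block offset, a block is split only when an add 
pushes it past B entries, and the first table key is the empty string so 
that paths sorting before the first entry still map to the first block.

```python
import bisect
import math


class BlockedImage:
    """Toy model: a sorted image split into blocks of about sqrt(N) entries."""

    def __init__(self, entries):
        entries = sorted(entries)                          # (path, value) pairs
        self.capacity = max(1, math.isqrt(len(entries)))   # B = sqrt(N)
        b = self.capacity
        # "Disk": list of blocks; a list index plays the role of a block offset.
        self.disk = [entries[i:i + b] for i in range(0, len(entries), b)] or [[]]
        # In-memory table: first key of each block, plus the block offset.
        # Assumption: the first key is "" so every path maps to some block.
        self.first_keys = [""] + [blk[0][0] for blk in self.disk[1:]]
        self.offsets = list(range(len(self.disk)))

    def apply(self, op, path, value=None):
        """Apply one edit; op is "set" (add or update) or "remove"."""
        slot = bisect.bisect_right(self.first_keys, path) - 1
        block = self.disk[self.offsets[slot]]   # the single block held in memory
        keys = [k for k, _ in block]
        pos = bisect.bisect_left(keys, path)
        found = pos < len(block) and keys[pos] == path
        if op == "remove":
            if found:
                del block[pos]                  # just leaves free space
        elif found:
            block[pos] = (path, value)          # update in place
        else:
            block.insert(pos, (path, value))    # add, using free space if any
            if len(block) > self.capacity:
                self._split(slot, block)

    def _split(self, slot, block):
        """Split a full block in half and append both halves at the end."""
        mid = len(block) // 2
        left, right = block[:mid], block[mid:]
        self.disk[self.offsets[slot]] = None    # old location is abandoned
        self.disk += [left, right]
        self.offsets[slot] = len(self.disk) - 2
        self.first_keys.insert(slot + 1, right[0][0])
        self.offsets.insert(slot + 1, len(self.disk) - 1)

    def entries(self):
        """Yield all entries in path order by walking the in-memory table."""
        for off in self.offsets:
            yield from self.disk[off]


# Usage: 16 files give B = 4 and M = 4.
image = BlockedImage([(f"/f{i:02d}", i) for i in range(16)])
image.apply("set", "/f03", 99)      # update in place
image.apply("remove", "/f05")       # leaves free space in its block
image.apply("set", "/f051", 7)      # add, fits into the freed space
image.apply("set", "/f052", 8)      # add, forces a split
```

Only the two table lists and one block are touched per operation, which is 
where the M + B bound comes from. A real implementation would also have to 
deal with on-disk record layout and with reclaiming abandoned blocks, which 
this sketch ignores.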

If we need to tighten the memory requirement, we can divide the N entries 
into a smaller number of blocks (reduce M) and read only a part of a block 
at a time (reduce B).
The price is more disk IOs, which seems acceptable, since name-node disk 
usage is not critical.
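
To make the trade-off concrete, here is a rough back-of-the-envelope 
calculation in Python. The function name and the cost model are 
illustrative assumptions: memory is counted in entries (table size plus 
one chunk), and a lookup is assumed to scan a block chunk by chunk, so the 
worst case reads every chunk of one block.

```python
import math


def memory_and_reads(n, num_blocks, chunk):
    """Return (entries held in memory, worst-case chunk reads per lookup)."""
    block_len = math.ceil(n / num_blocks)      # entries per block
    memory = num_blocks + chunk                # table of size M plus one chunk
    worst_reads = math.ceil(block_len / chunk)
    return memory, worst_reads


# N = 1,000,000 files with M = B = sqrt(N): 2000 entries in memory, 1 read.
print(memory_and_reads(1_000_000, 1000, 1000))   # (2000, 1)

# Fewer blocks and smaller chunks: 200 entries in memory, up to 100 reads.
print(memory_and_reads(1_000_000, 100, 100))     # (200, 100)
```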


> Namespace check pointing is not performed until the namenode restarts.
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-227
>                 URL: http://issues.apache.org/jira/browse/HADOOP-227
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.2.0
>            Reporter: Konstantin Shvachko
>         Assigned To: Milind Bhandarkar
>
> In the current implementation, when the name node starts, it reads its 
> image file, then the edits file, and then saves the updated image back 
> into the image file.
> The image file is never updated after that.
> In order to provide system reliability, the namespace information should
> be checkpointed periodically, and the edits file should be kept relatively 
> small.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
