[ 
https://issues.apache.org/jira/browse/HDFS-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015701#comment-13015701
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1070:
----------------------------------------------

Here is an orthogonal idea for reducing image size:

In the NameNode, we keep internal maps from user names and group names to serial 
numbers (see {{SerialNumberManager}}) in order to save memory.  How about we 
do the same for {{FSImage}}?  I.e. write the maps at the beginning of 
{{FSImage}} and then use the serial numbers in the {{INode}} entries.
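
To make the idea concrete, here is a minimal sketch of such a table. It is not the real {{SerialNumberManager}} or the real {{FSImage}} layout: the class {{NameTableSketch}}, its methods, and the byte layout (a count, the names, then one int per entry) are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not the real SerialNumberManager or FSImage format. */
final class NameTableSketch {
    private final Map<String, Integer> toSerial = new HashMap<>();
    private final List<String> toName = new ArrayList<>();

    /** Returns the serial number for a name, assigning the next free one if new. */
    int serialOf(String name) {
        Integer existing = toSerial.get(name);
        if (existing != null) {
            return existing;
        }
        int serial = toName.size();
        toSerial.put(name, serial);
        toName.add(name);
        return serial;
    }

    String nameOf(int serial) {
        return toName.get(serial);
    }

    /** Writes the table once, at the head of the (assumed) image stream. */
    void write(DataOutputStream out) throws IOException {
        out.writeInt(toName.size());
        for (String name : toName) {
            out.writeUTF(name);
        }
    }

    /** Reads the table back; serial numbers follow the order of the names. */
    static NameTableSketch read(DataInputStream in) throws IOException {
        NameTableSketch table = new NameTableSketch();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            table.serialOf(in.readUTF());
        }
        return table;
    }

    /** Round trip: the table first, then one int serial per entry. */
    static String[] roundTrip(String[] owners) {
        try {
            NameTableSketch table = new NameTableSketch();
            int[] serials = new int[owners.length];
            for (int i = 0; i < owners.length; i++) {
                serials[i] = table.serialOf(owners[i]);
            }
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            table.write(out);
            for (int serial : serials) {
                out.writeInt(serial);
            }
            out.flush();

            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            NameTableSketch loaded = read(in);
            String[] result = new String[owners.length];
            for (int i = 0; i < owners.length; i++) {
                result[i] = loaded.nameOf(in.readInt());
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each distinct name is written once in the table, and every entry after that costs a fixed 4 bytes in this sketch, however long the name is.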

Suppose the saving is 10 bytes per name; that is 20 bytes per {{INode}} (one 
user name and one group name).  Then the total is about 1.2 billion bytes 
(roughly 1.1 GB in binary units) for a namespace with 60 million 
files/directories.
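
The estimate can be checked with a few lines of arithmetic. The helper names below are hypothetical, and the 10 bytes per name is the supposition from the paragraph above, not a measured value.

```java
/** Hypothetical helper that reproduces the back-of-the-envelope estimate. */
final class ImageSavingEstimate {
    /** Total bytes saved = inodes * names per inode * bytes saved per name. */
    static long savedBytes(long inodes, long namesPerInode, long bytesSavedPerName) {
        return inodes * namesPerInode * bytesSavedPerName;
    }

    /** Converts a byte count to binary gigabytes (2^30 bytes each). */
    static double toBinaryGigabytes(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }
}
```

With 60 million inodes, 2 names per inode and 10 bytes per name, {{savedBytes}} gives 1,200,000,000 bytes, which is just under 1.12 binary gigabytes.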

> Speedup NameNode image loading and saving by storing local file names
> ---------------------------------------------------------------------
>
>                 Key: HDFS-1070
>                 URL: https://issues.apache.org/jira/browse/HDFS-1070
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>         Attachments: trunkLocalNameImage.patch, trunkLocalNameImage1.patch, 
> trunkLocalNameImage3.patch, trunkLocalNameImage4.patch, 
> trunkLocalNameImage5.patch
>
>
> Currently each inode stores its full path in the fsimage. I'd propose to 
> store the local name instead. In order for each inode to identify its parent, 
> all inodes in a directory tree are stored in the image in in-order. This 
> proposal also requires each directory to store the number of its children in 
> the image.
> This proposal would bring a few benefits, as listed below, and therefore 
> speed up the image loading and saving.
> # Remove the overhead of converting java-UTF8 encoded local name to 
> string-represented full path then to UTF8 encoded full path when saving to an 
> image and vice versa when loading the image.
> # Remove the overhead of traversing the full path when inserting the inode to 
> its parent inode.
> # Reduce the number of temporary java objects during the process of image 
> loading or saving and therefore reduce the GC overhead.
> # Reduce the size of an image.
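
The local-name layout described in the issue can be sketched as follows. This is an illustration only, not the attached patch: the class {{LocalNameImageSketch}}, the record layout, and the parent-before-children traversal are assumptions (one possible reading of the ordering the description asks for). Each directory writes its child count, so the loader knows which entries belong to it without any full path being stored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an image that stores local names plus child counts. */
final class LocalNameImageSketch {
    static final class Node {
        final String localName;
        final boolean isDir;
        final List<Node> children = new ArrayList<>();

        Node(String localName, boolean isDir) {
            this.localName = localName;
            this.isDir = isDir;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Writes a node, then (for directories) its child count and its children. */
    static void save(Node node, DataOutputStream out) throws IOException {
        out.writeUTF(node.localName);
        out.writeBoolean(node.isDir);
        if (node.isDir) {
            out.writeInt(node.children.size());
            for (Node child : node.children) {
                save(child, out);
            }
        }
    }

    /** Reads a node; the child count tells us how many entries to attach. */
    static Node load(DataInputStream in) throws IOException {
        String localName = in.readUTF();
        boolean isDir = in.readBoolean();
        Node node = new Node(localName, isDir);
        if (isDir) {
            int childCount = in.readInt();
            for (int i = 0; i < childCount; i++) {
                node.children.add(load(in));
            }
        }
        return node;
    }

    /** Rebuilds full paths from local names, only to check the round trip. */
    static void collectPaths(Node node, String prefix, List<String> paths) {
        String path = prefix + node.localName;
        paths.add(path.isEmpty() ? "/" : path);
        for (Node child : node.children) {
            collectPaths(child, path + "/", paths);
        }
    }

    /** Saves the tree to memory, loads it back, and lists the resulting paths. */
    static List<String> roundTripPaths(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            save(root, out);
            out.flush();
            Node loaded = load(
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            List<String> paths = new ArrayList<>();
            collectPaths(loaded, "", paths);
            return paths;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch uses recursion for brevity; a very deep directory tree would call for an explicit stack instead.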

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
