[
https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592623#action_12592623
]
girish vaitheeswaran commented on HADOOP-3248:
----------------------------------------------
Konstantin,
Could you provide me with the hadoop-core.jar file so I can test the effect of
your patch on the benchmarking environment?
Thanks
-girish
> Improve Namenode startup performance
> ------------------------------------
>
> Key: HADOOP-3248
> URL: https://issues.apache.org/jira/browse/HADOOP-3248
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Reporter: girish vaitheeswaran
> Assignee: dhruba borthakur
> Attachments: fastRestarts.patch, fastRestarts.patch, FSImage.patch
>
>
> One of the things that needs to be addressed as part of Namenode
> scalability is HDFS recovery performance, especially in scenarios where
> the number of files is large. There are instances where the number of files
> is in the vicinity of 20 million, and in such cases the time taken for
> namenode startup is prohibitive. Here are some benchmark numbers on the time
> taken for namenode startup. These times do not include the time to process
> block reports.
> Default scenario for 20 million files with the max Java heap size set to
> 14 GB: 40 minutes
> Tuning various Java options such as young generation size, parallel garbage
> collection, and initial Java heap size: 14 minutes
> As can be seen, 14 minutes is still a long time for the namenode to recover,
> and code changes are required to bring this time down further. To this end,
> some prototype optimizations were done to reduce this time. Based on some
> timing analysis, saveImage and loadFSImage were the primary methods
> consuming most of the time. Most of the time was being spent on object
> allocations. The goal of the optimizations is to reduce the number of memory
> allocations as much as possible.
> Optimization 1: saveImage()
> ======================
> Avoid allocation of the UTF8 object.
> Old code
> =======
> new UTF8(fullName).write(out);
> New Code
> ========
> out.writeUTF(fullName);
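
The difference between the two forms above can be sketched without Hadoop on the classpath. This is only an illustration: NameWrapper is a hypothetical stand-in for a wrapper that is allocated once per name, not the real UTF8 class, and the byte layout it writes is an assumption. Only the java.io calls are real. For ASCII names the two paths produce the same bytes here, but the second one creates no wrapper object per file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WriteNameSketch {
    /** Hypothetical stand-in for a wrapper that is allocated once per name. */
    static final class NameWrapper {
        private final byte[] bytes;

        NameWrapper(String s) {
            bytes = s.getBytes(StandardCharsets.UTF_8);
        }

        void write(DataOutputStream out) throws IOException {
            out.writeShort(bytes.length); // 2-byte length prefix (assumed layout)
            out.write(bytes);
        }
    }

    /** Old style: one wrapper object (plus its byte array) per name. */
    static byte[] viaWrapper(String... names) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (String name : names) {
                new NameWrapper(name).write(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** New style: the stream encodes the string itself, no wrapper object. */
    static byte[] direct(String... names) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (String name : names) {
                out.writeUTF(name); // 2-byte length prefix + modified UTF-8
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] a = viaWrapper("/user/a/file1", "/user/a/file2");
        byte[] b = direct("/user/a/file1", "/user/a/file2");
        System.out.println("same bytes: " + java.util.Arrays.equals(a, b));
    }
}
```

Note that writeUTF uses Java's modified UTF-8, so a reader must use the matching readUTF. Whether that is compatible with the existing image format is not something this sketch addresses.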
> Optimization 2: saveImage()
> ======================
> Avoid object allocation of the PermissionStatus Object and the FsPermission
> object. This is to be done for Directories and for files.
> Old code
> =======
> fileINode.getPermissionStatus().write(out);
> New Code
> =========
> out.writeBytes(fileINode.getUserName());
> out.writeBytes(fileINode.getGroupName());
> out.writeShort(fileINode.getFsPermission().toShort());
> Optimization 3
> ============
> loadImage() could use the same mechanism where we would avoid allocating the
> PermissionStatus object and the FsPermission object.
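
A rough sketch of how Optimizations 2 and 3 could fit together on the save and load side, using plain java.io only. The Node class and the method names are hypothetical and are not Hadoop's INode API. One deliberate deviation: DataOutput.writeBytes writes no length prefix, so a reader cannot tell where the user name ends. The sketch therefore uses writeUTF/readUTF for the two strings, which is an assumption and not what the prototype above shows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class PermissionFieldsSketch {
    /** Hypothetical inode-like holder; fields are set directly on load. */
    static final class Node {
        String user;
        String group;
        short mode;
    }

    /** Save side: write the three fields without a temporary status object. */
    static void writeFields(DataOutput out, Node n) throws IOException {
        out.writeUTF(n.user);   // length-prefixed, so it can be read back
        out.writeUTF(n.group);
        out.writeShort(n.mode);
    }

    /** Load side: read straight into the node, again with no temporary object. */
    static void readFields(DataInput in, Node n) throws IOException {
        n.user = in.readUTF();
        n.group = in.readUTF();
        n.mode = in.readShort();
    }

    /** Writes a node to a byte array and reads it back into a fresh node. */
    static Node roundTrip(String user, String group, short mode) {
        Node src = new Node();
        src.user = user;
        src.group = group;
        src.mode = mode;
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buf)) {
                writeFields(out, src);
            }
            Node dst = new Node();
            readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())), dst);
            return dst;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Node n = roundTrip("hdfs", "supergroup", (short) 0644);
        System.out.println(n.user + ":" + n.group + ":" + Integer.toOctalString(n.mode));
    }
}
```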
> Optimization 4
> ============
> A hack was tried out to avoid the cost of object allocation from saveImage()
> where the fullName was being constructed using string concatenation. This
> optimization also helped improve performance.
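
The issue does not say what the hack in Optimization 4 was. One common way to avoid building a new string by concatenation at every directory level is to reuse a single StringBuilder during the tree walk and truncate it on the way back up. The sketch below shows that idea only. The Node class and method names are hypothetical, and collecting paths into a list merely stands in for writing each name to the image.

```java
import java.util.ArrayList;
import java.util.List;

public class PathBuilderSketch {
    /** Hypothetical tree node; not Hadoop's INode. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Depth-first walk that extends and then truncates one shared buffer. */
    static void collect(Node node, StringBuilder path, List<String> out) {
        int mark = path.length();            // remember the parent's path length
        path.append('/').append(node.name);  // extend in place, no parent + "/" + name
        out.add(path.toString());            // stand-in for writing the name out
        for (Node child : node.children) {
            collect(child, path, out);
        }
        path.setLength(mark);                // restore the parent's path
    }

    static List<String> paths(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, new StringBuilder(), out);
        return out;
    }

    public static void main(String[] args) {
        Node root = new Node("user")
                .add(new Node("a").add(new Node("f1")))
                .add(new Node("b"));
        for (String p : paths(root)) {
            System.out.println(p);
        }
    }
}
```

A real saver would write the buffer contents to the output stream directly and skip the toString() call, which still allocates one string per entry in this sketch.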
> Overall, these optimizations brought the startup time down to slightly
> over 7 minutes. Most of the remaining time is now spent in
> loadFSImage(), since we allocate the INode and INodeDirectory objects. Any
> further optimizations will need to focus on loadFSImage().