[
https://issues.apache.org/jira/browse/HADOOP-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595920#action_12595920
]
Konstantin Shvachko commented on HADOOP-3022:
---------------------------------------------
After the two optimizations, HADOOP-3364 and HADOOP-3369, the total startup
time improved by a factor of 2.
The biggest gains are in image saving and block processing, each of
which is almost 4 times faster:
- image saving: 70 sec -> 18 sec
- block processing: 320 sec -> 87 sec
The table below summarizes sizes and compares new and old time measurements.
|| ||new||old||
|objects| 10 mln| |
|files & dirs| 4 mln| |
|blocks| 6 mln| |
|heap size| 3.275 GB| |
|image size| 0.6 GB| |
|edits size per day| 0.27 GB| |
|# data-nodes| 500| |
|blocks per node| 36,000| |
|image load time| 111 sec| 132 sec|
|edits load time| 75 sec| 84 sec|
|image save time| 18 sec| 70 sec|
|block processing| 87 sec| 320 sec|
|total startup time| 291 sec (about 5 min)| 606 sec (about 10 min)|
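As a quick sanity check, the per-phase speedups and the totals can be re-derived from the table. The sketch below is only an illustration: the variable and function names are made up here, and the numbers are copied from the table above.

```python
# Phase timings in seconds, copied from the table (new vs. old measurements).
new_times = {"image load": 111, "edits load": 75, "image save": 18, "block processing": 87}
old_times = {"image load": 132, "edits load": 84, "image save": 70, "block processing": 320}


def speedups(new, old):
    """Return the old/new time ratio per phase, rounded to one decimal."""
    return {phase: round(old[phase] / new[phase], 1) for phase in new}


total_new = sum(new_times.values())
total_old = sum(old_times.values())
phase_speedups = speedups(new_times, old_times)
overall_speedup = round(total_old / total_new, 1)

for phase, ratio in phase_speedups.items():
    print(f"{phase}: {ratio}x")
print(f"total: {total_old} sec -> {total_new} sec ({overall_speedup}x)")
```

The four phases add up to the reported totals, and image saving and block processing come out just under 4x, while the two loading phases improve only slightly.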
The optimized startup time is about 5 minutes, broken down as follows:
|load fsimage| 38%|
|load edits| 26%|
|save new fsimage| 6%|
|process block reports| 30%|
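The percentages follow directly from the new timings in the measurements table. A minimal sketch of that arithmetic, with illustrative names that are not taken from the Hadoop code base:

```python
# New (optimized) phase timings in seconds, from the measurements table.
phases = {
    "load fsimage": 111,
    "load edits": 75,
    "save new fsimage": 18,
    "process block reports": 87,
}


def shares(timings):
    """Return each phase's share of the total time as a whole percentage."""
    total = sum(timings.values())
    return {name: round(100 * t / total) for name, t in timings.items()}


breakdown = shares(phases)
for name, pct in breakdown.items():
    print(f"{name}: {pct}%")
```

Loading the image and the edits together account for roughly two thirds of the startup time, which is why the loading part is the natural next target.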
I think more improvements can be made here, especially in the loading part.
For the edits log we should optimize ADD and CLOSE transactions, as noted in
HADOOP-3364.
For image loading the part to optimize is probably block processing, but that
needs to be evaluated.
Leaving this issue open for now.
> Fast Cluster Restart
> --------------------
>
> Key: HADOOP-3022
> URL: https://issues.apache.org/jira/browse/HADOOP-3022
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: Robert Chansler
> Assignee: Konstantin Shvachko
> Fix For: 0.18.0
>
>
> This item introduces a discussion of how to reduce the time necessary to
> start a large cluster from tens of minutes to a handful of minutes.