[
https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066949#comment-15066949
]
Colin Patrick McCabe commented on HDFS-7784:
--------------------------------------------
Thanks, [~kihwal]. Unfortunately, that's what we've seen as well... protobuf
seems to generate a lot of garbage during startup, causing many full GCs which
really consume a lot of time. It used to be you could ignore temporary objects
as long as you didn't create tenured objects, but it turns out that if there
are too many temporaries, HotSpot promotes them prematurely into the old
(tenured) generation rather than the PermGen, which only holds class metadata.
At this point,
it's not clear that parallelization is a win for fsimage loading unless we can
mitigate that GC problem.
Have you guys looked into using the "javanano" version of protocol buffers?
See here: https://github.com/google/protobuf/tree/master/javanano
It seems like this would generate a lot less garbage than the "official" PB
library because it avoids builders in favor of mutable state, uses ints instead
of enums, uses arrays instead of ArrayList, etc. I think we should
probably adopt this on the server-side, even if we keep the client-side with
the existing PB library. This would help with RPC as well, of course.
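To make the allocation difference concrete, here is a rough hand-written
sketch of that style. It is not javanano-generated code and does not use the
javanano API; NanoStyleInode, NanoStyleDemo and all field names are made up
for illustration. It only shows the three ideas above: public mutable fields
instead of a builder, int constants instead of an enum, and a primitive array
instead of an ArrayList.

```java
// Hand-written illustration only: not protoc output and not the javanano API.
final class NanoStyleInode {
    // int constants stand in for a Java enum (no enum lookup per message)
    static final int TYPE_FILE = 1;
    static final int TYPE_DIRECTORY = 2;

    private static final long[] EMPTY = new long[0];

    // Public mutable fields: no builder and no intermediate immutable copy
    int type = TYPE_FILE;
    long id;
    long[] blockIds = EMPTY; // primitive array instead of ArrayList<Long>

    // Reset to defaults so one instance can be reused for the next record
    NanoStyleInode clear() {
        type = TYPE_FILE;
        id = 0L;
        blockIds = EMPTY;
        return this;
    }
}

final class NanoStyleDemo {
    // Counts blocks across records while reusing a single scratch message,
    // so the loop allocates no per-record message or builder objects.
    static long countBlocks(long[][] records) {
        NanoStyleInode scratch = new NanoStyleInode();
        long total = 0;
        for (long[] blocks : records) {
            scratch.clear();
            scratch.blockIds = blocks;
            total += scratch.blockIds.length;
        }
        return total;
    }
}
```

The trade-off is the usual one for mutable messages: callers have to be
careful not to hold on to a scratch instance after it has been cleared.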
> load fsimage in parallel
> ------------------------
>
> Key: HDFS-7784
> URL: https://issues.apache.org/jira/browse/HDFS-7784
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Walter Su
> Assignee: Walter Su
> Priority: Minor
> Labels: BB2015-05-TBR
> Attachments: HDFS-7784.001.patch, test-20150213.pdf
>
>
> When a single NameNode holds a huge number of files and federation is not
> in use, startup/restart is slow. The fsimage loading step takes most of the
> time. fsimage loading can be separated into two parts: deserialization and
> object construction (mostly map insertion). Deserialization takes most of the
> CPU time, so we can deserialize in parallel and insert into the hash map
> serially. This will significantly reduce the NN start time.
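The scheme in the description above can be sketched roughly as follows. This
is a minimal illustration and not the attached patch: ParallelLoadSketch, the
Inode class and the "id:name" record format are all hypothetical stand-ins,
and a real loader would hand each task a batch of records rather than a
single one. Worker threads do the deserialization; only the calling thread
touches the HashMap, so the map needs no locking.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelLoadSketch {
    // Hypothetical record type standing in for a deserialized inode
    static final class Inode {
        final long id;
        final String name;
        Inode(long id, String name) { this.id = id; this.name = name; }
    }

    // Stand-in for the CPU-heavy step: parses "id:name" from raw bytes
    static Inode deserialize(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        int sep = s.indexOf(':');
        return new Inode(Long.parseLong(s.substring(0, sep)), s.substring(sep + 1));
    }

    static Map<Long, Inode> load(List<byte[]> rawRecords, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Phase 1: deserialize in parallel on the pool
            List<Future<Inode>> pending = new ArrayList<>(rawRecords.size());
            for (byte[] raw : rawRecords) {
                Callable<Inode> task = () -> deserialize(raw);
                pending.add(pool.submit(task));
            }
            // Phase 2: insert serially, in input order, on the calling thread
            Map<Long, Inode> inodes = new HashMap<>();
            for (Future<Inode> f : pending) {
                Inode inode = f.get();
                inodes.put(inode.id, inode);
            }
            return inodes;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("deserialization failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling ParallelLoadSketch.load(records, 4) would deserialize on four threads.
Note that every record is still materialized as a fresh object, so this
layout by itself does nothing to reduce the amount of garbage created.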