[
https://issues.apache.org/jira/browse/HDFS-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358005#comment-14358005
]
Hadoop QA commented on HDFS-7876:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12703613/HDFS-7876.001.patch
against trunk revision 7a346bc.
{color:red}-1 patch{color}. Trunk compilation may be broken.
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9848//console
This message is automatically generated.
> DataNodes start to scan blocks earlier
> --------------------------------------
>
> Key: HDFS-7876
> URL: https://issues.apache.org/jira/browse/HDFS-7876
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode
> Affects Versions: 3.0.0
> Reporter: Xinwei Qin
> Assignee: Xinwei Qin
> Attachments: HDFS-7876.001.patch
>
>
> When a Hadoop cluster restarts, DataNodes scan their local blocks and report
> this information to the NameNode. DataNodes start scanning local blocks only
> after obtaining the NamespaceInfo from the NameNode via the RPC call
> versionRequest(), which requires the NameNode RPC server to be running.
> Currently, the RPC server is not created and started until the FsImage has
> finished loading. So DataNodes cannot start scanning blocks immediately and
> must wait for the NameNode to load the FsImage. When the FsImage is very
> large, the DataNodes spend this time idle.
> Since the RPC server depends very little on the FsImage, and the
> NamespaceInfo (namespaceID, clusterID, blockpoolID, cTime, etc.) can be
> constructed from the VERSION file, we can create and start the RPC server
> before loading the FsImage. DataNodes can then get the NamespaceInfo from the
> NameNode via RPC as soon as possible and start scanning blocks earlier, which
> will shorten restart time.
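
For illustration only, a minimal sketch of the idea that the NamespaceInfo
fields can be read from the VERSION file without touching the FsImage. This is
not the code in HDFS-7876.001.patch. It assumes the VERSION file is a Java
properties file with the keys namespaceID, clusterID, blockpoolID and cTime.
The class NamespaceInfoSketch, the method fromVersionText() and all sample
values are hypothetical and do not correspond to real Hadoop classes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class VersionFileSketch {

    /** Hypothetical holder for the fields named in the description; not Hadoop's NamespaceInfo. */
    static final class NamespaceInfoSketch {
        final int namespaceID;
        final String clusterID;
        final String blockpoolID;
        final long cTime;

        NamespaceInfoSketch(int namespaceID, String clusterID,
                            String blockpoolID, long cTime) {
            this.namespaceID = namespaceID;
            this.clusterID = clusterID;
            this.blockpoolID = blockpoolID;
            this.cTime = cTime;
        }
    }

    /** Parses properties-style VERSION text; the FsImage is never read. */
    static NamespaceInfoSketch fromVersionText(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return new NamespaceInfoSketch(
                Integer.parseInt(require(props, "namespaceID")),
                require(props, "clusterID"),
                require(props, "blockpoolID"),
                Long.parseLong(require(props, "cTime")));
    }

    /** Returns the trimmed value for key, or fails loudly if the key is absent. */
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("missing key in VERSION text: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample content; a real VERSION file carries more keys.
        String sample = String.join("\n",
                "namespaceID=12345",
                "clusterID=CID-example",
                "cTime=0",
                "blockpoolID=BP-example",
                "");
        NamespaceInfoSketch info = fromVersionText(sample);
        System.out.println(info.namespaceID + " " + info.clusterID + " "
                + info.blockpoolID + " " + info.cTime);
    }
}
```

In this sketch everything a versionRequest() reply would need is available as
soon as the small VERSION text has been parsed, which is why the RPC server
could in principle be started before the FsImage is loaded.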