[ 
https://issues.apache.org/jira/browse/HDFS-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872890#comment-13872890
 ] 

Haohui Mai commented on HDFS-5783:
----------------------------------

A preliminary experiment shows little impact on loading time. I loaded a 
512 MB fsimage on my laptop; here are the numbers:

# Loading the FSImage, computing the digest with {{DigestInputStream}}: 9920ms
# Loading the FSImage, without computing the digest: 7467ms
# Calculating MD5 independently: 1231ms

(2) + (3), 8698ms combined, is slightly faster than (1) because loading the 
fsimage currently does not consume all of the available I/O bandwidth.
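
To make the comparison concrete, here is a minimal sketch of the two strategies. It is not the actual FSImage loader code: the class {{DigestSketch}} and both method names are hypothetical, and the "loader" just drains the stream. Only {{DigestInputStream}} and {{MessageDigest}} are real JDK classes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration only; names are not taken from the HDFS code base.
final class DigestSketch {

    // Case (1): the digest is updated as a side effect of the loader's reads.
    // The result is only the file's MD5 if every byte is read exactly once,
    // in order, which is the limitation described in the issue.
    static byte[] digestWhileLoading(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (DigestInputStream dis = new DigestInputStream(in, md5)) {
            byte[] buf = new byte[8192];
            while (dis.read(buf) != -1) {
                // A real loader would parse the bytes here.
            }
        }
        return md5.digest();
    }

    // Case (3): a separate sequential pass computes the digest up front,
    // so the loader that runs afterwards (case (2)) may read sections
    // in any order.
    static byte[] digestBeforeLoading(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md5.update(buf, 0, n);
        }
        return md5.digest();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in bytes; a real run would open the fsimage file instead.
        byte[] data = "not a real fsimage".getBytes(StandardCharsets.UTF_8);
        byte[] a = digestWhileLoading(new ByteArrayInputStream(data));
        byte[] b = digestBeforeLoading(new ByteArrayInputStream(data));
        System.out.println("digests match: " + Arrays.equals(a, b));
    }
}
```

When the input is read strictly sequentially, both methods produce the same digest; they diverge only once the loader skips or reorders reads, which is the case the separate pass is meant to handle.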

> Compute the digest before loading FSImage
> -----------------------------------------
>
>                 Key: HDFS-5783
>                 URL: https://issues.apache.org/jira/browse/HDFS-5783
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-5698 (FSImage in protobuf)
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>
> When loading the fsimage, the current code computes its MD5 digest 
> on-the-fly. It does not work when the code does not read all the sections in 
> strictly sequential order.
> This jira proposes to compute the MD5 digest before loading fsimage.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
