[
https://issues.apache.org/jira/browse/HDFS-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874756#comment-16874756
]
He Xiaoqiao commented on HDFS-13694:
------------------------------------
Thanks [~leosun08] for your report and patch; this is a very interesting
improvement.
I see that you uploaded a patch here and received some comments from
[~jojochuang], and meanwhile submitted a separate PR on GitHub, where
[~elgoiri] has also given some review comments. Some of the suggestions may be
duplicated. IMO, we should focus on one place; I prefer to discuss here until
the GitHub repo is fully ready. As far as I know, only the Ozone subproject has
moved to GitHub for code reviews. [~elgoiri], [~jojochuang], please correct me
if I am wrong.
Some minor comments on [^HDFS-13694-005.patch]:
a. Is it intended to convert every Throwable into an IOException? Will that break something?
{code:java}
+    @Override
+    public void run() {
+      try {
+        digest = MD5FileUtils.computeMd5ForFile(file);
+      } catch (Throwable t) {
+        if (t instanceof IOException) {
+          ioe = (IOException) t;
+        } else {
+          ioe = new IOException(t);
+        }
+      }
+    }
{code}
b. Do we need a configuration item to switch this feature on or off, and
should it be enabled by default?
c. I believe this is great work and will reduce restart time. I think it
would be friendlier to watchers/reviewers if you attached a simple benchmark
test report.
d. It seems that the patch is based on branch-2.7. Would you rebase it onto
trunk?
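To make points a and b concrete, here is a minimal sketch of one possible shape, not the code from the patch. All names in it (ParallelDigestSketch, DigestThread, loadImage, the `parallel` flag) are hypothetical; `parallel` stands in for a configuration switch, and loadImage is only a placeholder for the real image loader. For point a, it keeps an IOException as is, wraps RuntimeExceptions, and lets Errors escape instead of catching Throwable; whether that is the right trade-off for the namenode is an open question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: none of these names come from the patch.
class ParallelDigestSketch {

  /** Computes the MD5 on its own thread and remembers any failure. */
  static final class DigestThread extends Thread {
    private final Path file;
    private byte[] digest;    // read by the caller only after join()
    private IOException ioe;  // read by the caller only after join()

    DigestThread(Path file) {
      this.file = file;
    }

    @Override
    public void run() {
      try {
        digest = md5(file);
      } catch (IOException e) {
        ioe = e;                   // keep the original IOException
      } catch (RuntimeException e) {
        ioe = new IOException(e);  // wrap unexpected runtime failures
      }
      // Errors (e.g. OutOfMemoryError) are not caught; getDigest() then
      // reports that no digest was produced.
    }

    byte[] getDigest() throws IOException {
      if (ioe != null) {
        throw ioe;
      }
      if (digest == null) {
        throw new IOException("digest thread ended without a result");
      }
      return digest;
    }
  }

  /** Streams the file through an MD5 MessageDigest. */
  static byte[] md5(Path file) throws IOException {
    MessageDigest md;
    try {
      md = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      throw new IOException(e);
    }
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        md.update(buf, 0, n);
      }
    }
    return md.digest();
  }

  /** Placeholder for the real image loading; here it just reads the file. */
  static long loadImage(Path file) throws IOException {
    return Files.readAllBytes(file).length;
  }

  /** Loads the image and returns its MD5 as hex; `parallel` is the switch. */
  static String loadWithDigest(Path image, boolean parallel) throws IOException {
    byte[] digest;
    if (parallel) {
      DigestThread t = new DigestThread(image);
      t.start();
      loadImage(image);  // runs while the digest is being computed
      try {
        t.join();        // join() makes the thread's writes visible here
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new InterruptedIOException("interrupted while waiting for digest");
      }
      digest = t.getDigest();
    } else {
      digest = md5(image);  // sequential behaviour: digest first, then load
      loadImage(image);
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b & 0xff));
    }
    return hex.toString();
  }
}
```

With a switch like this, both paths can be timed on the same fsimage, which would also be a simple way to produce the benchmark numbers asked for in c.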
Thanks again, [~leosun08], for your great work.
> Making md5 computing being in parallel with image loading
> ---------------------------------------------------------
>
> Key: HDFS-13694
> URL: https://issues.apache.org/jira/browse/HDFS-13694
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: zhouyingchao
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-13694-001.patch, HDFS-13694-002.patch,
> HDFS-13694-003.patch, HDFS-13694-004.patch, HDFS-13694-005.patch
>
>
> During namenode image loading, the MD5 is computed first and then the image
> is loaded. These two steps can actually run in parallel.
> Testing this patch against the fsimage of a 70PB 2.4 cluster (200 million
> files and 300 million blocks), the image loading time was reduced from 1210
> seconds to 1105 seconds, a reduction of roughly 9%.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)