[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062798#comment-14062798 ]
Andrew Wang commented on HDFS-5464:
-----------------------------------
Considering we'll have 8 or 10 TB disks in a few years, we could be seeing a lot
more than just 500k blocks per DN. Storage-dense nodes with 24+ disks are also
out there. Memory accesses and conditional branches are not free either. If we
were just adding 500k integers together it wouldn't be a big deal, but this loop
does more work per block than that.

I'm not opposed in principle to this change, since it is simpler and has the same
time complexity, but I'd like to see some microbenchmark results before
committing it. Maybe rig something up with JMH?
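To make the ask concrete, here is a minimal sketch of the kind of JMH harness I
have in mind. ReportDiffBench, FakeBlock, and the loop body are made-up stand-ins
for the per-block work in reportDiff(..) (a lookup, a couple of branches, and
appending to one of several result lists), not the real BlockManager code:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class ReportDiffBench {

  // Hypothetical stand-in for a reported block; the real loop walks
  // BlockInfo objects and the per-storage block list.
  static final class FakeBlock {
    final long id;
    final int state;
    FakeBlock(long id, int state) { this.id = id; this.state = state; }
  }

  // 500k is today's ballpark; 2M approximates a future storage-dense DN.
  @Param({"500000", "2000000"})
  int numBlocks;

  List<FakeBlock> report;

  @Setup
  public void setup() {
    Random rand = new Random(42);
    report = new ArrayList<>(numBlocks);
    for (int i = 0; i < numBlocks; i++) {
      report.add(new FakeBlock(i, rand.nextInt(3)));
    }
  }

  @Benchmark
  public void diffLoop(Blackhole bh) {
    // Placeholder for the branchy per-block work in reportDiff(..):
    // classify each block and add it to one of several result lists.
    List<FakeBlock> toAdd = new ArrayList<>();
    List<FakeBlock> toInvalidate = new ArrayList<>();
    for (FakeBlock b : report) {
      if (b.state == 0) {
        toAdd.add(b);
      } else if (b.state == 1) {
        toInvalidate.add(b);
      } else {
        bh.consume(b.id);
      }
    }
    bh.consume(toAdd);
    bh.consume(toInvalidate);
  }
}
{code}

With the usual JMH uberjar setup this runs as java -jar benchmarks.jar
ReportDiffBench; comparing the two @Param sizes should show whether the
per-block constant factor matters at realistic block counts.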
> Simplify block report diff calculation
> --------------------------------------
>
> Key: HDFS-5464
> URL: https://issues.apache.org/jira/browse/HDFS-5464
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Tsz Wo Nicholas Sze
> Priority: Minor
> Attachments: h5464_20131105.patch, h5464_20131105b.patch,
> h5464_20131105c.patch, h5464_20140715.patch, h5464_20140715b.patch
>
>
> The current calculation in BlockManager.reportDiff(..) is unnecessarily
> complicated. We could simplify it.
--
This message was sent by Atlassian JIRA
(v6.2#6252)