[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223815#comment-14223815 ]
Suresh Srinivas edited comment on HDFS-7435 at 11/25/14 12:22 AM:
------------------------------------------------------------------
bq. While I agree it's a questionably nice to have feature
Not sure why you think it is a questionably nice-to-have feature...
bq. If 20-30MB is going to cause a promotion failure in a namenode servicing a
hundred millions of blocks - it's already game over. 2.x easily generates over
1GB garbage/sec at a mere ~20k ops/sec.
Just so that we are on the same page, Java arrays require contiguous space in
memory. In many installs, when the namenode becomes unresponsive and datanodes
end up sending block reports, these block reports get promoted to the old
generation (because the namenode is processing them slowly). Since the old
generation may be fragmented, promotions can fail when large arrays need to be
promoted.
That said, I am fine with doing it in a subsequent jira. It will end up touching
the same parts or replacing the code that you are adding.
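To make the contiguity point concrete, here is a minimal sketch of holding the
longs in fixed-size chunks so that no single array grows with the size of the
report. This is only an illustration: {{ChunkedLongList}} and the chunk size
are hypothetical names and values, not existing HDFS code and not necessarily
what the follow-up jira would do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: stores longs in fixed-size chunks so the largest
// single array allocation is bounded by chunkSize, not by the total count.
final class ChunkedLongList {
    private final int chunkSize;
    private final List<long[]> chunks = new ArrayList<>();
    private int size = 0;

    ChunkedLongList(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    void add(long value) {
        int offset = size % chunkSize;
        if (offset == 0) {
            // The current chunk is full (or there is none yet): start a new one.
            chunks.add(new long[chunkSize]);
        }
        chunks.get(chunks.size() - 1)[offset] = value;
        size++;
    }

    long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return chunks.get(index / chunkSize)[index % chunkSize];
    }

    int size() {
        return size;
    }

    int chunkCount() {
        return chunks.size();
    }

    public static void main(String[] args) {
        // Assumed numbers for illustration: 100k replicas, 3 longs each.
        ChunkedLongList report = new ChunkedLongList(4096);
        for (int i = 0; i < 3 * 100_000; i++) {
            report.add(i);
        }
        System.out.println("longs: " + report.size()
            + ", chunks: " + report.chunkCount());
    }
}
```

The list of chunk references still grows, but it holds one reference per chunk
rather than one slot per long, so it stays small compared to the data itself.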
was (Author: sureshms):
bq. While I agree it's a questionably nice to have feature
Not sure why you think it is questionably nice to have feature...
bq. If 20-30MB is going to cause a promotion failure in a namenode servicing a
hundred millions of blocks - it's already game over. 2.x easily generates over
1GB garbage/sec at a mere ~20k ops/sec.
Just so that we are on the same page, java arrays requires contiguous arrays in
memory. In many installs, when namenode becomes unresponsive and datanodes end
up sending block reports, these block reports get promoted to older generation
(because namenode is processing them slowly). Since old generation may be
fragmented, promotions can fail when large arrays need to be promoted.
That said, I am fine doing it in a subsequent jira. It will end up touching the
same parts or replacing the code that you are adding.
> PB encoding of block reports is very inefficient
> ------------------------------------------------
>
> Key: HDFS-7435
> URL: https://issues.apache.org/jira/browse/HDFS-7435
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode
> Affects Versions: 2.0.0-alpha, 3.0.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
> Attachments: HDFS-7435.patch
>
>
> Block reports are encoded as a PB repeated long. Repeated fields use an
> {{ArrayList}} with a default capacity of 10. A block report containing tens or
> hundreds of thousands of longs (3 for each replica) is extremely expensive
> since the {{ArrayList}} must reallocate many times. Also, decoding repeated
> fields boxes the primitive longs, which must then be unboxed.
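
The two costs described above can be sketched roughly as follows. This is a
rough illustration, not code from the patch: {{BlockReportGrowth}} and its
methods are hypothetical, and the growth policy (start at 10, grow by half of
the current capacity) is assumed to match OpenJDK's {{ArrayList}}.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the two costs: repeated growth of the backing
// array, and boxing/unboxing of every long.
final class BlockReportGrowth {

    // Counts how many times the backing array would be reallocated while
    // appending n elements one at a time, assuming an initial capacity of 10
    // and growth by half of the current capacity on each resize.
    static int countReallocations(int n) {
        int capacity = 10;
        int reallocations = 0;
        while (capacity < n) {
            capacity += capacity >> 1;
            reallocations++;
        }
        return reallocations;
    }

    // Boxed path: every element is a Long object that is unboxed on read.
    static long sumBoxed(List<Long> values) {
        long sum = 0;
        for (Long v : values) {
            sum += v; // implicit unboxing via Long.longValue()
        }
        return sum;
    }

    // Primitive path: no per-element objects at all.
    static long sumPrimitive(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int longs = 3 * 100_000; // assumed: 100k replicas, 3 longs each

        List<Long> boxed = new ArrayList<>(); // default capacity, grows as needed
        long[] primitive = new long[longs];   // sized once up front
        for (int i = 0; i < longs; i++) {
            boxed.add((long) i);              // implicit boxing via Long.valueOf()
            primitive[i] = i;
        }

        System.out.println("reallocations: " + countReallocations(longs));
        System.out.println("sums match: "
            + (sumBoxed(boxed) == sumPrimitive(primitive)));
    }
}
```

Each reallocation also copies the elements stored so far, so the copying work
adds to the cost of the allocations themselves.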