[ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233242#comment-14233242
 ] 

Daryn Sharp commented on HDFS-7435:
-----------------------------------

I agree that it makes sense for both the DN and NN to chunk the block list for 
their internal data structures.  All I'm saying is that the wire protocol may 
not need to be built around that implementation detail - but I think a middle 
ground is { block-count, chunk-count, blocks[], chunk-count, blocks[], ... }.  
However I think it can be done in a simpler and less invasive fashion.  I'll 
toss another patch up this afternoon.
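
To make the middle-ground layout concrete, here is a rough sketch. It is not the patch. It assumes "chunk-count" means the number of blocks in the chunk that follows, and that each block is 3 longs as stated in the issue description. The class and method names are hypothetical.

```java
final class ChunkedBlockListSketch {
    // Assumption from the issue description: each replica is reported as 3 longs.
    static final int LONGS_PER_BLOCK = 3;

    /** Flattens blocks into { block-count, chunk-count, blocks[], chunk-count, blocks[], ... }. */
    static long[] encode(long[] blockLongs, int blocksPerChunk) {
        if (blocksPerChunk <= 0 || blockLongs.length % LONGS_PER_BLOCK != 0) {
            throw new IllegalArgumentException("bad chunk size or block array length");
        }
        int totalBlocks = blockLongs.length / LONGS_PER_BLOCK;
        int chunks = (totalBlocks + blocksPerChunk - 1) / blocksPerChunk;
        // One slot for block-count, one chunk-count per chunk, plus the payload.
        long[] out = new long[1 + chunks + blockLongs.length];
        int o = 0;
        out[o++] = totalBlocks;
        for (int b = 0; b < totalBlocks; b += blocksPerChunk) {
            int n = Math.min(blocksPerChunk, totalBlocks - b);
            out[o++] = n;
            System.arraycopy(blockLongs, b * LONGS_PER_BLOCK, out, o, n * LONGS_PER_BLOCK);
            o += n * LONGS_PER_BLOCK;
        }
        return out;
    }

    /** Reads the layout back; the leading block-count lets the receiver size its array once. */
    static long[] decode(long[] encoded) {
        int i = 0;
        int totalBlocks = (int) encoded[i++];
        long[] blockLongs = new long[totalBlocks * LONGS_PER_BLOCK];
        int o = 0;
        while (o < blockLongs.length) {
            int n = (int) encoded[i++];
            if (n <= 0) {
                throw new IllegalArgumentException("empty or negative chunk");
            }
            System.arraycopy(encoded, i, blockLongs, o, n * LONGS_PER_BLOCK);
            i += n * LONGS_PER_BLOCK;
            o += n * LONGS_PER_BLOCK;
        }
        return blockLongs;
    }
}
```

The point of the leading block-count is that the receiver can allocate once up front, while the per-chunk counts still let either side process the list a chunk at a time.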

On a side note: we're having some DN OOM issues too during block reports + 
heavy load.  In your use case, I'm presuming each disk is a storage.   If yes, 
each unencoded storage report will be ~4MB, ~1.5MB encoded.  The DN heap must 
be >64GB, easily more for HA, so a handful of MB appears to pale in comparison 
to how much memory the DN wastes building the reports.  Again, there are 
default-capacity {{ArrayList}}s to build the finalized & UC lists!  I'm going 
to make minimal changes to reduce the gross inefficiency here too.
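
As a rough illustration of why the default-capacity lists hurt, the sketch below estimates how many times a list starting at 10 slots must regrow, and shows the pre-sized alternative. It assumes the 1.5x growth step that OpenJDK's {{ArrayList}} uses, which is an implementation detail and not part of the API contract. The class and method names are hypothetical and this is not the DN code.

```java
import java.util.ArrayList;
import java.util.List;

final class ReportListSizingSketch {
    /**
     * Estimates regrowths beyond the initial 10-slot array, assuming the
     * capacity grows by half each time (OpenJDK behavior, not guaranteed).
     */
    static int estimateRegrowths(int elements) {
        int capacity = 10;
        int regrowths = 0;
        while (capacity < elements) {
            capacity += capacity >> 1;
            regrowths++;
        }
        return regrowths;
    }

    /** Sizes the list once when the replica count is known, so add() never reallocates. */
    static List<Long> collectPresized(long[] blockIds) {
        List<Long> list = new ArrayList<>(blockIds.length);
        for (long id : blockIds) {
            list.add(id);
        }
        return list;
    }
}
```

Each regrowth copies the whole backing array, so the old and new arrays are live at the same time; that transient garbage is what adds up under heavy load.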

> PB encoding of block reports is very inefficient
> ------------------------------------------------
>
>                 Key: HDFS-7435
>                 URL: https://issues.apache.org/jira/browse/HDFS-7435
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.patch
>
>
> Block reports are encoded as a PB repeating long.  Repeating fields use an 
> {{ArrayList}} with default capacity of 10.  A block report containing tens or 
> hundreds of thousands of longs (3 for each replica) is extremely expensive 
> since the {{ArrayList}} must realloc many times.  Also, decoding repeating 
> fields will box the primitive longs which must then be unboxed.
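
The boxing cost described above can be sketched as follows. This is only an illustration with hypothetical names, not the generated PB code: one path decodes into a {{List<Long>}}, the other keeps the values in a primitive array.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxingSketch {
    /** Boxed path: each long becomes a Long object on add and is unboxed again on read. */
    static long sumBoxed(long[] wire) {
        List<Long> decoded = new ArrayList<>(); // default capacity, regrows as it fills
        for (long v : wire) {
            decoded.add(v); // autoboxing: long -> Long
        }
        long sum = 0;
        for (Long v : decoded) {
            sum += v; // unboxing: Long -> long
        }
        return sum;
    }

    /** Primitive path: one array allocation, no per-element objects. */
    static long sumPrimitive(long[] wire) {
        long[] decoded = new long[wire.length];
        System.arraycopy(wire, 0, decoded, 0, wire.length);
        long sum = 0;
        for (long v : decoded) {
            sum += v;
        }
        return sum;
    }
}
```

Both paths compute the same result; the difference is the per-element object allocation and the repeated array copies on the boxed path.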



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
