[ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226567#comment-14226567
 ] 

Jing Zhao commented on HDFS-7435:
---------------------------------

In the current demo patch the max chunk size is not encoded in the buffer; the 
DataNode simply determines it from its configuration. The DataNode does the 
segmentation, so the NameNode does not need to know the max chunk size. For 
simplicity, each chunk still follows the same format as the original long[] in 
BlockListAsLongs (i.e., it still encodes the number of finalized blocks and the 
number of under-construction (uc) replicas in its first two elements).
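
A rough sketch of the per-chunk layout described above, not the code in the 
patch. {{BlockReportChunker}} and {{maxReplicasPerChunk}} are hypothetical 
names, and the sketch assumes every replica takes exactly three longs, which 
ignores any extra per-replica state the real BlockListAsLongs may carry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the HDFS-7435 patch.
final class BlockReportChunker {
  static final int HEADER_LONGS = 2;      // [0] = finalized count, [1] = uc count
  static final int LONGS_PER_REPLICA = 3; // assumption: three longs per replica

  /**
   * Splits a report into chunks of at most maxReplicasPerChunk replicas.
   * Both inputs are flat arrays holding three longs per replica. Every chunk
   * starts with its own two-element header, so each one can be read alone.
   */
  static List<long[]> chunk(long[] finalized, long[] uc, int maxReplicasPerChunk) {
    if (maxReplicasPerChunk <= 0) {
      throw new IllegalArgumentException("maxReplicasPerChunk must be positive");
    }
    if (finalized.length % LONGS_PER_REPLICA != 0 || uc.length % LONGS_PER_REPLICA != 0) {
      throw new IllegalArgumentException("inputs must hold three longs per replica");
    }
    int numFinalized = finalized.length / LONGS_PER_REPLICA;
    int total = numFinalized + uc.length / LONGS_PER_REPLICA;
    List<long[]> chunks = new ArrayList<>();
    for (int start = 0; start < total; start += maxReplicasPerChunk) {
      int end = Math.min(start + maxReplicasPerChunk, total);
      // Replicas with a global index below numFinalized are finalized ones.
      int finInChunk = Math.max(0, Math.min(end, numFinalized) - start);
      int ucInChunk = (end - start) - finInChunk;
      long[] c = new long[HEADER_LONGS + (end - start) * LONGS_PER_REPLICA];
      c[0] = finInChunk;
      c[1] = ucInChunk;
      if (finInChunk > 0) {
        System.arraycopy(finalized, start * LONGS_PER_REPLICA, c, HEADER_LONGS,
            finInChunk * LONGS_PER_REPLICA);
      }
      if (ucInChunk > 0) {
        int ucStart = Math.max(start, numFinalized) - numFinalized;
        System.arraycopy(uc, ucStart * LONGS_PER_REPLICA, c,
            HEADER_LONGS + finInChunk * LONGS_PER_REPLICA,
            ucInChunk * LONGS_PER_REPLICA);
      }
      chunks.add(c);
    }
    return chunks;
  }

  public static void main(String[] args) {
    long[] finalized = new long[5 * LONGS_PER_REPLICA];
    long[] uc = new long[2 * LONGS_PER_REPLICA];
    for (long[] c : chunk(finalized, uc, 3)) {
      System.out.println("finalized=" + c[0] + " uc=" + c[1] + " longs=" + c.length);
    }
  }
}
```

Because the chunk size never appears in the data, the receiver only needs the 
two header elements of each chunk to interpret it.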

I think letting the DataNode be the only side that does segmentation avoids 
having the NameNode allocate a big contiguous array before chunking. That is 
why I had to change {{optional bytes blocksBuffer}} into 
{{repeated bytes blocksBuffers}}. Maybe we can use 
{{repeated bytes blocksBuffers}} here but assume the number of buffers is 
always 1, and then move the real segmentation change into a separate jira?
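
To illustrate why a repeated field helps on the receiving side, here is a 
small sketch that walks a list of buffers one at a time without ever joining 
them. {{BlocksBuffersReader}} is a hypothetical name, and the fixed-width 
big-endian encoding of longs is an assumption made for the example, not 
necessarily what the patch puts in the bytes. A report with a single buffer is 
just the one-element case of the same loop.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for illustration only; not part of the HDFS-7435 patch.
final class BlocksBuffersReader {

  /** Encodes one chunk as fixed-width big-endian longs (assumed encoding). */
  static byte[] encode(long[] chunk) {
    ByteBuffer bb = ByteBuffer.allocate(chunk.length * Long.BYTES);
    for (long v : chunk) {
      bb.putLong(v);
    }
    return bb.array();
  }

  /**
   * Sums the per-buffer headers. Each buffer is wrapped and read in place,
   * so no contiguous array covering the whole report is allocated.
   * Returns {totalFinalized, totalUc}.
   */
  static long[] totals(List<byte[]> blocksBuffers) {
    long finalized = 0;
    long uc = 0;
    for (byte[] buf : blocksBuffers) {
      ByteBuffer bb = ByteBuffer.wrap(buf); // ByteBuffer is big-endian by default
      finalized += bb.getLong();            // header element 0
      uc += bb.getLong();                   // header element 1
    }
    return new long[] {finalized, uc};
  }

  public static void main(String[] args) {
    List<byte[]> buffers = new ArrayList<>();
    buffers.add(encode(new long[] {1, 0, 10, 11, 12}));
    buffers.add(encode(new long[] {0, 1, 20, 21, 22}));
    long[] t = totals(buffers);
    System.out.println("finalized=" + t[0] + " uc=" + t[1]);
  }
}
```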

> PB encoding of block reports is very inefficient
> ------------------------------------------------
>
>                 Key: HDFS-7435
>                 URL: https://issues.apache.org/jira/browse/HDFS-7435
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7435.000.patch, HDFS-7435.patch
>
>
> Block reports are encoded as a PB repeating long.  Repeating fields use an 
> {{ArrayList}} with default capacity of 10.  A block report containing tens or 
> hundreds of thousands of longs (3 for each replica) is extremely expensive 
> since the {{ArrayList}} must realloc many times.  Also, decoding repeating 
> fields will box the primitive longs which must then be unboxed.
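
A back-of-the-envelope sketch of the reallocation cost described in the issue. 
It assumes the list grows by roughly 1.5x each time it fills, as 
{{java.util.ArrayList}} does in OpenJDK; other runtimes may differ. 
{{RepeatedLongCost}} is a hypothetical name, and the sketch only simulates 
capacities rather than measuring a real protobuf decode.

```java
// Hypothetical estimator for illustration only; it simulates capacity growth
// and does not touch protobuf or a real ArrayList.
final class RepeatedLongCost {

  /** Number of grow-and-copy steps needed to reach the required capacity. */
  static int growthSteps(int initialCapacity, int needed) {
    int capacity = initialCapacity;
    int steps = 0;
    while (capacity < needed) {
      // Assumed growth policy: old + old/2, but always at least one more slot.
      capacity = Math.max(capacity + (capacity >> 1), capacity + 1);
      steps++;
    }
    return steps;
  }

  /** Total elements copied across all grow steps while filling the list. */
  static long elementsCopied(int initialCapacity, int needed) {
    int capacity = initialCapacity;
    long copied = 0;
    while (capacity < needed) {
      copied += capacity; // a grow happens when the list is full, copying it all
      capacity = Math.max(capacity + (capacity >> 1), capacity + 1);
    }
    return copied;
  }

  public static void main(String[] args) {
    int longs = 100_000 * 3; // 100,000 replicas at 3 longs each
    System.out.println("grow steps: " + growthSteps(10, longs));
    System.out.println("elements copied: " + elementsCopied(10, longs));
  }
}
```

A pre-sized primitive array needs no grow steps and no boxing, which is the 
gap the issue is pointing at.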



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)