[ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587324#comment-14587324 ]

Ajith S commented on HDFS-8574:
-------------------------------

Hi [~arpitagarwal]

Thanks for the input. Yes, you are right, HDFS was not designed for tiny blocks. 
My scenario was this: I wanted to test the NameNode limits, so I inserted 10 million 
files of ~10KB each (10KB because I had a small disk). My DataNode had a single 
{{data.dir}} directory when I hit this exception, but after I increased the number of 
{{data.dir}} directories to 5, the issue was resolved. Later I checked and came across 
this piece of code, where the block report is sent per DataNode volume. My question is: 
if we check for overflow based on the total number of blocks, why do we then split per 
report? A single report can still overflow the limit set by 
{{dfs.blockreport.split.threshold}}.
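
To make the question concrete, here is a rough sketch of the split logic as I 
understand it. This is only an illustration, not the actual {{BPServiceActor}} code; 
the names, values and print statements are placeholders:

{code}
// Illustration only (placeholder names/values), not the real BPServiceActor code.
long[] blocksPerStorage = { 10_000_000L };   // my case: one volume holding all the blocks
long splitThreshold = 1_000_000L;            // e.g. dfs.blockreport.split.threshold

long totalBlocks = 0;
for (long count : blocksPerStorage) {
  totalBlocks += count;
}

if (totalBlocks <= splitThreshold) {
  // Under the threshold: one RPC carries the reports for all storages.
  System.out.println("send one RPC with all " + totalBlocks + " blocks");
} else {
  // Over the threshold: one RPC per storage. But nothing re-checks whether a
  // single storage's report is itself over the threshold, so with one volume
  // holding 10 million blocks that per-storage RPC is still too large.
  for (long count : blocksPerStorage) {
    System.out.println("send one RPC with " + count + " blocks");
  }
}
{code}

With only one {{data.dir}}, the else branch still produces a single 10-million-block 
RPC, which is why adding more directories worked around it in my case.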

Please correct me if I am wrong.

> When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8574
>                 URL: https://issues.apache.org/jira/browse/HDFS-8574
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Ajith S
>            Assignee: Ajith S
>
> This piece of code in {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}}
> {code}
> // Send one block report per message.
>         for (int r = 0; r < reports.length; r++) {
>           StorageBlockReport singleReport[] = { reports[r] };
>           DatanodeCommand cmd = bpNamenode.blockReport(
>               bpRegistration, bpos.getBlockPoolId(), singleReport,
>               new BlockReportContext(reports.length, r, reportId));
>           numReportsSent++;
>           numRPCs++;
>           if (cmd != null) {
>             cmds.add(cmd);
>           }
> {code}
> when a single volume contains many blocks, i.e. more than the threshold, it still 
> tries to send the entire block report for that volume in one RPC, causing this exception:
> {code}
> java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369)
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347)
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473)
> {code}


