[ 
https://issues.apache.org/jira/browse/HDFS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069805#comment-13069805
 ] 

Suresh Srinivas commented on HDFS-395:
--------------------------------------

Tomasz, cool stuff.

Some comments:
# ReceivedDeletedBlockInfo.java is missing in the patch.
# DataNodeProtocol.java
#* Please update the DataNodeProtocol#blockReceivedAndDeleted() javadoc to 
reflect the new functionality.
# DataNode.java
#* Minor: fix the indentation of receivedAndDeletedBlockList.wait().
#* I am not clear on why you are setting deleteReportInterval to 
DFS_BLOCKREPORT_INTERVAL_MSEC_KEY and blockReportInterval to 
2*DFS_BLOCKREPORT_INTERVAL_MSEC_KEY. Can you keep blockReportInterval the same 
as before and use a suitable value for deleteReportInterval? This could be 100 
times the heartbeat interval. The reason to be aggressive here is that the NN 
keeps the replica in a data structure until it gets the delete ack. I know the 
delete report is piggybacked on blockReceived, but I still see no downside to 
sending it more frequently.
#* reportReceivedDeletedBlocks() - save pendingReceivedRequests in 
currentReceived inside synchronized(receivedAndDeletedBlockList)?
#* pendingReceivedRequests is accessed without a lock in OfferService - you can 
move this check into reportReceivedDeletedBlocks().
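
To make the last two points concrete, here is a minimal sketch of the locking I have in mind. It is not the patch's code: the class name, notifyReceived(), notifyDeleted(), pendingReceived(), the String block entries and the 3-second heartbeat are assumptions for illustration. Only receivedAndDeletedBlockList, pendingReceivedRequests, currentReceived and reportReceivedDeletedBlocks() are names taken from the comments above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the DataNode reporting state; not the patch's code.
class IncrementalReportSketch {
    // Assumed heartbeat of 3 seconds, for illustration only.
    static final long HEARTBEAT_INTERVAL_MS = 3_000L;
    // "100 times heartbeat", as suggested above.
    static final long DELETE_REPORT_INTERVAL_MS = 100 * HEARTBEAT_INTERVAL_MS;

    private final List<String> receivedAndDeletedBlockList = new ArrayList<>();
    private int pendingReceivedRequests = 0;

    void notifyReceived(String blockId) {
        synchronized (receivedAndDeletedBlockList) {
            receivedAndDeletedBlockList.add("received:" + blockId);
            pendingReceivedRequests++;
            receivedAndDeletedBlockList.notifyAll(); // wake the reporting thread
        }
    }

    void notifyDeleted(String blockId) {
        synchronized (receivedAndDeletedBlockList) {
            receivedAndDeletedBlockList.add("deleted:" + blockId);
        }
    }

    int pendingReceived() {
        synchronized (receivedAndDeletedBlockList) {
            return pendingReceivedRequests;
        }
    }

    // Snapshots the list and the counter under one lock and returns the batch
    // that would be sent to the NN.
    List<String> reportReceivedDeletedBlocks() {
        List<String> batch;
        synchronized (receivedAndDeletedBlockList) {
            if (receivedAndDeletedBlockList.isEmpty()) {
                return new ArrayList<>(); // the "anything pending?" check lives here
            }
            int currentReceived = pendingReceivedRequests;
            batch = new ArrayList<>(receivedAndDeletedBlockList);
            receivedAndDeletedBlockList.clear();
            pendingReceivedRequests -= currentReceived;
        }
        return batch; // the RPC to the NN would go here, outside the lock
    }

    public static void main(String[] args) {
        IncrementalReportSketch s = new IncrementalReportSketch();
        s.notifyReceived("blk_1");
        s.notifyDeleted("blk_2");
        System.out.println(s.reportReceivedDeletedBlocks());
        System.out.println(DELETE_REPORT_INTERVAL_MS);
    }
}
```

With the check and the snapshot both inside reportReceivedDeletedBlocks(), the caller never has to read pendingReceivedRequests without holding the lock.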


> DFS Scalability: Incremental block reports
> ------------------------------------------
>
>                 Key: HDFS-395
>                 URL: https://issues.apache.org/jira/browse/HDFS-395
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: dhruba borthakur
>            Assignee: Tomasz Nykiel
>         Attachments: blockReportPeriod.patch, explicitDeleteAcks.patch
>
>
> I have a cluster that has 1800 datanodes. Each datanode has around 50000 
> blocks and sends a block report to the namenode once every hour. This means 
> that the namenode processes a block report once every 2 seconds. Each block 
> report contains all blocks that the datanode currently hosts. This makes the 
> namenode compare a huge number of blocks that practically remain the same 
> between two consecutive reports. This wastes CPU on the namenode.
> The problem becomes worse when the number of datanodes increases.
> One proposal is to make succeeding block reports (after a successful send of 
> a full block report) be incremental. This will make the namenode process only 
> those blocks that were added/deleted in the last period.
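
For scale, the figures in the description above can be checked with a quick back-of-the-envelope sketch. The class and method names are made up for illustration; the inputs (1800 datanodes, 50000 blocks each, one report per hour) are the ones quoted in the description.

```java
// Hypothetical helper; only the input numbers come from the issue description.
class BlockReportLoad {
    // Average gap between two full block reports arriving at the namenode.
    static double secondsBetweenReports(int datanodes, long reportPeriodSeconds) {
        return (double) reportPeriodSeconds / datanodes;
    }

    // Block entries the namenode compares per reporting period with full reports.
    static long blocksComparedPerPeriod(int datanodes, long blocksPerDatanode) {
        return datanodes * blocksPerDatanode;
    }

    public static void main(String[] args) {
        System.out.println(secondsBetweenReports(1800, 3600));    // prints 2.0
        System.out.println(blocksComparedPerPeriod(1800, 50_000)); // prints 90000000
    }
}
```

So full reports mean roughly 90 million block entries compared per hour, most of them unchanged, which is the cost incremental reports are meant to avoid.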

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
