Hey Ayush,

Thanks a lot for your proposal.

Do you mean the Full Block Report (FBR) that each DataNode sends every
6 hours?
Someone told me they reduced the FBR frequency to once every 24 hours, and
it seems to work fine.

One of the purposes of the FBR was to guard against bugs in the incremental
block report (IBR) implementation. In other words, it's a fail-safe
mechanism: any inconsistency caused by an IBR bug gets corrected by the next
FBR, which refreshes the state of the blocks at the NameNode. At least,
that's my understanding of FBRs in their early days.
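
To make the fail-safe idea concrete, here is a rough sketch of what a
full report lets the NameNode do. It is only an illustration under my own
assumptions, not the actual NameNode code: the class, the method names and
the use of plain block IDs are all made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in HDFS.
class FullBlockReportSketch {

    /** Difference between the NameNode's view and a full block report. */
    static final class Diff {
        final Set<Long> toAdd;     // on the DataNode, but unknown to the NameNode
        final Set<Long> toRemove;  // known to the NameNode, but not on the DataNode

        Diff(Set<Long> toAdd, Set<Long> toRemove) {
            this.toAdd = toAdd;
            this.toRemove = toRemove;
        }
    }

    /** Compare what the NameNode believes with what the DataNode reports. */
    static Diff reconcile(Set<Long> nameNodeView, Set<Long> fullReport) {
        Set<Long> toAdd = new TreeSet<>(fullReport);
        toAdd.removeAll(nameNodeView);

        Set<Long> toRemove = new TreeSet<>(nameNodeView);
        toRemove.removeAll(fullReport);

        return new Diff(toAdd, toRemove);
    }

    public static void main(String[] args) {
        // Suppose an IBR bug left the NameNode with a stale block 3
        // and without the newly written block 4.
        Set<Long> nameNodeView = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> fullReport = new HashSet<>(Arrays.asList(1L, 2L, 4L));

        Diff diff = reconcile(nameNodeView, fullReport);
        System.out.println("add=" + diff.toAdd + " remove=" + diff.toRemove);
    }
}
```

Whatever drift the incremental reports introduced, the next full report
brings the two views back in line. That is the property we would lose
without a periodic FBR.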

On Tue, Feb 4, 2020 at 12:21 AM Ayush Saxena <ayush...@gmail.com> wrote:

> Hi All,
> Surendra and I have lately been trying to minimise the impact of Block
> Reports on the Namenode in huge clusters. We observed that in a huge
> cluster of about 10k datanodes, the periodic block reports adversely
> affect Namenode performance.
> We have been thinking of restricting block reports so that they are
> triggered only during Namenode startup or in case of failover, and of
> eliminating the periodic block report.
> The main purpose of the block report is to get corrupt blocks recognised,
> so as a follow-up we can maintain a service at the datanode that runs
> periodically to check whether the block size in memory is the same as that
> reported to the namenode, and the datanode can alert the namenode if
> anything looks suspect. (We still need to plan this.)
>
> On the datanode side, a datanode can send a BlockReport, or restore its
> usual frequency, if during the configured time period the datanode was
> shut down or lost connection with the namenode. For example, if the
> datanode was supposed to send a BR at 2100 hrs and during the last 6 hrs
> there has been any failover or loss of connection between the namenode and
> datanode, it will trigger the BR normally; otherwise it will skip sending
> the BR.
>
> Let us know your thoughts on this, including any challenges or
> improvements.
>
> -Ayush
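
If I read the DataNode-side rule above correctly, the decision at each
scheduled report time could be sketched roughly as follows. This is only my
interpretation of the proposal: the class, the method and the idea of
tracking a single "last disruption" timestamp (failover, restart or lost
connection) are assumptions made for the example, not existing HDFS code.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the proposed rule; names are illustrative only.
class BlockReportSkipSketch {

    /**
     * Decide whether the periodic full block report should be sent.
     *
     * @param scheduled      when the report is due (e.g. 21:00)
     * @param interval       the configured report interval (e.g. 6 hours)
     * @param lastDisruption time of the most recent failover, restart or
     *                       lost connection; null if there has been none
     */
    static boolean shouldSendFullReport(Instant scheduled,
                                        Duration interval,
                                        Instant lastDisruption) {
        if (lastDisruption == null) {
            return false; // nothing happened, so skip the report
        }
        Instant windowStart = scheduled.minus(interval);
        // Send only if the disruption falls inside [windowStart, scheduled].
        return !lastDisruption.isBefore(windowStart)
                && !lastDisruption.isAfter(scheduled);
    }

    public static void main(String[] args) {
        Instant scheduled = Instant.parse("2020-02-04T21:00:00Z");
        Duration interval = Duration.ofHours(6);

        // Connection lost at 17:30, inside the last 6 hours: send the report.
        System.out.println(shouldSendFullReport(
                scheduled, interval, Instant.parse("2020-02-04T17:30:00Z")));

        // Last disruption at 09:00, outside the window: skip the report.
        System.out.println(shouldSendFullReport(
                scheduled, interval, Instant.parse("2020-02-04T09:00:00Z")));
    }
}
```

The sketch covers only the periodic check. Reports triggered directly by
Namenode startup or failover would be handled separately.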
