[
https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186699#comment-15186699
]
Brahma Reddy Battula commented on HDFS-9917:
--------------------------------------------
*{color:blue}As the current intention is not to overload the NN{color}, I am
planning to fix it as follows:*
- *{color:green}Clear the IBRs on re-registering with the NameNode.{color}*
{code}
void reRegister() throws IOException {
  if (shouldRun()) {
    // re-retrieve namespace info to make sure that, if the NN
    // was restarted, we still match its version (HDFS-2120)
    NamespaceInfo nsInfo = retrieveNamespaceInfo();
    // and re-register
    register(nsInfo);
    scheduler.scheduleHeartbeat();
    // HDFS-9917: the pending IBRs for the standby NN can grow very
    // large if the standby NameNode is down for some time.
    if (state == HAServiceState.STANDBY) {
      ibrManager.clearIBRs();
    }
  }
}
{code}
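To make the intended behaviour easier to discuss, here is a minimal
standalone sketch of the same idea. It is not Hadoop code: the names
IbrClearSketch, PendingIbrs, Actor and HaState are hypothetical stand-ins
for the per-NameNode IBR buffer and the actor state. It also assumes that
the standby NN resynchronises from a full block report after
re-registration, so dropping the queued IBRs loses no information.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class IbrClearSketch {
    // Hypothetical stand-in for the HA state of the NameNode an actor talks to.
    enum HaState { ACTIVE, STANDBY }

    // Hypothetical stand-in for the DataNode's buffer of pending IBRs.
    static class PendingIbrs {
        private final Queue<String> pending = new ArrayDeque<>();

        synchronized void add(String blockEvent) { pending.add(blockEvent); }
        synchronized void clear() { pending.clear(); }
        synchronized int size() { return pending.size(); }
    }

    // Hypothetical per-NameNode actor; registration and heartbeats are omitted.
    static class Actor {
        final PendingIbrs ibrs = new PendingIbrs();
        final HaState state;

        Actor(HaState state) { this.state = state; }

        void reRegister() {
            // Assumption: a full block report follows re-registration, so the
            // IBRs queued while the standby NN was down can be dropped.
            if (state == HaState.STANDBY) {
                ibrs.clear();
            }
        }
    }

    public static void main(String[] args) {
        Actor standby = new Actor(HaState.STANDBY);
        Actor active = new Actor(HaState.ACTIVE);

        // Simulate delete IBRs piling up while the NN is unreachable.
        for (int i = 0; i < 5; i++) {
            standby.ibrs.add("deleted blk_" + i);
            active.ibrs.add("deleted blk_" + i);
        }

        standby.reRegister();
        active.reRegister();

        System.out.println("standby pending IBRs: " + standby.ibrs.size()); // 0
        System.out.println("active pending IBRs: " + active.ibrs.size());   // 5
    }
}
```

Only the standby queue is cleared; the active NN's pending IBRs are kept,
since the active NN still depends on them.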
Any thoughts on this?
> IBR accumulate more objects when SNN was down for sometime.
> -----------------------------------------------------------
>
> Key: HDFS-9917
> URL: https://issues.apache.org/jira/browse/HDFS-9917
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Brahma Reddy Battula
> Assignee: Brahma Reddy Battula
>
> The SNN was down for some time for some reason. After the SNN was
> restarted, it became unresponsive because:
> - 29 DNs were each sending 5 million IBRs (most of them delete IBRs),
> whereas each datanode had only ~2.5 million blocks.
> - GC could not reclaim these objects, since all of them were held in the
> RPC queue.
> To recover (to clear these objects), all the DNs were restarted one by
> one. This issue happened in 2.4.1, where splitting of block reports was
> not available.