[ 
https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142713#comment-17142713
 ] 

Akira Ajisaka commented on HDFS-15421:
--------------------------------------

Thank you [~kihwal] for the detailed report. I read your report and the 
discussion in HDFS-14941.

In an append operation, the ANN first logs {{OP_SET_GENSTAMP_V2}} and then logs 
{{OP_APPEND}}. After HDFS-14941, when the SNN replays {{OP_SET_GENSTAMP_V2}}, it 
only sets the impending genstamp and does not update the global genstamp. The 
SNN then replays {{OP_APPEND}}, but the global genstamp is still not updated. 
That's why the genstamp is never updated and the IBR always appears to come 
from the future. I think we need to update the genstamp when replaying 
{{OP_APPEND}}. The same applies to {{OP_TRUNCATE}}.
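
To illustrate the sequence above, here is a minimal toy model in Java. It is 
not Hadoop code: the class {{StandbySketch}}, its fields, and its method names 
are all hypothetical, and it only assumes that an IBR whose genstamp is greater 
than the global genstamp is re-queued rather than processed. The 
{{updateOnAppend}} flag stands in for the fix proposed in this comment.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of genstamp handling on a standby namenode. Hypothetical, not Hadoop code. */
class StandbySketch {
    long globalGenStamp;
    long impendingGenStamp;
    final boolean updateOnAppend;                 // models the proposed fix
    final Queue<Long> pendingDnMessages = new ArrayDeque<>();
    int processed = 0;

    StandbySketch(long initialGenStamp, boolean updateOnAppend) {
        this.globalGenStamp = initialGenStamp;
        this.impendingGenStamp = initialGenStamp;
        this.updateOnAppend = updateOnAppend;
    }

    /** Replaying OP_SET_GENSTAMP_V2: only the impending genstamp is recorded. */
    void applySetGenStamp(long genStamp) {
        impendingGenStamp = genStamp;
    }

    /** Replaying OP_APPEND: with the fix, the global genstamp catches up here. */
    void applyAppend() {
        if (updateOnAppend) {
            globalGenStamp = Math.max(globalGenStamp, impendingGenStamp);
        }
    }

    /** An IBR "from the future" is queued; otherwise it is processed. */
    void receiveIbr(long blockGenStamp) {
        if (blockGenStamp > globalGenStamp) {
            pendingDnMessages.add(blockGenStamp);
        } else {
            processed++;
        }
    }

    /** Retry each queued IBR once; those still from the future are re-queued. */
    void reprocessPending() {
        int n = pendingDnMessages.size();
        for (int i = 0; i < n; i++) {
            receiveIbr(pendingDnMessages.poll());
        }
    }
}

class GenStampSketch {
    static StandbySketch run(boolean updateOnAppend) {
        StandbySketch snn = new StandbySketch(1000L, updateOnAppend);
        snn.applySetGenStamp(1001L);  // OP_SET_GENSTAMP_V2
        snn.applyAppend();            // OP_APPEND
        snn.receiveIbr(1001L);        // IBR for the appended block
        snn.reprocessPending();
        return snn;
    }

    public static void main(String[] args) {
        System.out.println("without fix, pending: " + run(false).pendingDnMessages.size());
        System.out.println("with fix, pending:    " + run(true).pendingDnMessages.size());
    }
}
```

In this sketch, the IBR stays in the pending queue indefinitely when the 
global genstamp is not advanced on {{OP_APPEND}}, and is processed once it is.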

> IBR leak causes standby NN to be stuck in safe mode
> ---------------------------------------------------
>
>                 Key: HDFS-15421
>                 URL: https://issues.apache.org/jira/browse/HDFS-15421
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Kihwal Lee
>            Priority: Blocker
>
> After HDFS-14941, update of the global gen stamp is delayed in certain 
> situations.  This makes the last set of incremental block reports from an 
> append appear to be "from the future", which causes them to be simply 
> re-queued to the pending DN message queue, rather than processed to complete 
> the block.  The last set of IBRs will leak and never be cleaned up until the 
> namenode transitions to active.  The size of 
> {{pendingDNMessages}} constantly grows until then.
> If a leak happens while in a startup safe mode, the namenode will never be 
> able to come out of safe mode on its own.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
