[ https://issues.apache.org/jira/browse/HDFS-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970372#comment-13970372 ]
Lohit Vijayarenu commented on HDFS-6248:
----------------------------------------
One corner case I suspect is the quota check done in FSDirectory::addChild on
the active vs. standby NameNodes. When a file is created by the active NameNode
and synced to the edit log, the directory's usage on the active NN might be
close to its quota limit. By the time the standby NN replays this edit, the
space counted against the quota could have increased because of other files in
the directory, so a valid edit log entry might hit QuotaExceededException. I
feel that when the standby NameNode replays edits, it should skip the quota
check, since the quota is already enforced by the active NameNode anyway. This
should solve the race condition and prevent the standby NameNode from crashing.
What do others think about this approach?
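
To make the proposal concrete, here is a minimal Java sketch of the idea. It is
not the actual FSDirectory code: the class Dir, the checkQuota flag, the
unchecked exception, and all the numbers are hypothetical and only illustrate
how the same edit can pass on the active NN yet fail on replay unless the
check is skipped.

```java
public class QuotaReplaySketch {

    /** Hypothetical stand-in for HDFS's QuotaExceededException (unchecked here for brevity). */
    static class QuotaExceededException extends RuntimeException {
        QuotaExceededException(String msg) { super(msg); }
    }

    /** Toy directory that tracks only a space quota and the space consumed. */
    static class Dir {
        final long spaceQuota;
        long spaceConsumed;

        Dir(long spaceQuota) { this.spaceQuota = spaceQuota; }

        /**
         * Adds a child of the given size.
         * checkQuota = true  : client operation on the active NN, quota enforced.
         * checkQuota = false : edit log replay, the active NN already validated it.
         */
        void addChild(long size, boolean checkQuota) {
            if (checkQuota && spaceConsumed + size > spaceQuota) {
                throw new QuotaExceededException(
                    "quota=" + spaceQuota + " consumed=" + spaceConsumed + " adding=" + size);
            }
            spaceConsumed += size;
        }
    }

    public static void main(String[] args) {
        // Active NN: usage was 85 when the create was checked, so 85 + 10 fits in 100.
        Dir active = new Dir(100);
        active.addChild(85, true);
        active.addChild(10, true);

        // Standby NN: assume its view of usage has already grown to 95 by the time
        // it replays the same create (the suspected race).
        Dir standby = new Dir(100);
        standby.addChild(95, false);

        try {
            standby.addChild(10, true);   // replay WITH quota check
        } catch (QuotaExceededException e) {
            System.out.println("replay with check failed: " + e.getMessage());
        }

        standby.addChild(10, false);      // replay WITHOUT quota check
        System.out.println("replay without check ok, consumed=" + standby.spaceConsumed);
    }
}
```

The trade-off in this sketch is that the standby's counted usage can temporarily
exceed the quota, which seems acceptable if the active NN is the only node that
accepts client writes.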
> SNN crash during replay of FSEditLog of files inside directories having
> QuotaExceeded directories
> --------------------------------------------------------------------------------------------------
>
> Key: HDFS-6248
> URL: https://issues.apache.org/jira/browse/HDFS-6248
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.6-alpha, 2.4.0
> Environment: NameNode HA setup with Active/Standby using QJM
> Reporter: Lohit Vijayarenu
>
> We are seeing cases where the Secondary NameNode crashes without recovering
> when it tries to replay edit log entries for files inside directories that
> have exceeded their quota. While debugging we got a stack trace, but we are
> still trying to reproduce this and wanted to note it here to see whether
> anyone else has already seen this issue.
--
This message was sent by Atlassian JIRA
(v6.2#6252)