Boris Bondarenko created HDFS-16289:
---------------------------------------

             Summary: Hadoop HA checkpointer issue 
                 Key: HDFS-16289
                 URL: https://issues.apache.org/jira/browse/HDFS-16289
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: dfs
    Affects Versions: 3.2.2
            Reporter: Boris Bondarenko


In an HA setup, the active namenode rejects the fsimage sync from one of the
two standby namenodes all the time. This may be an edge case; in our
environment it primarily affects the standby cluster. What we experienced was
a memory problem on the standby namenodes when a standby node was unable to
complete a sync cycle for a long time.

It is my understanding that the loop is exited only when the doCheckpoint
call succeeds; otherwise the call throws an exception and the loop continues.
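
To make the control flow I am describing concrete, here is a minimal,
self-contained sketch. It is not the Hadoop source: the class
CheckpointLoopSketch, the Checkpoint interface, runLoop, and the maxCycles cap
are all hypothetical names introduced for illustration, and the cap exists
only so the sketch terminates. It shows a loop whose only exit is a successful
call, so a standby that is always rejected never leaves it.

```java
import java.io.IOException;

// Illustrative sketch only; names here are hypothetical, not Hadoop APIs.
final class CheckpointLoopSketch {

    /** Stand-in for the checkpoint call that may be rejected by the active node. */
    interface Checkpoint {
        void doCheckpoint() throws IOException;
    }

    /**
     * Retries the checkpoint until it succeeds or maxCycles is reached.
     * Returns the number of failed attempts. The maxCycles cap is an
     * assumption added so that this sketch always terminates.
     */
    static int runLoop(Checkpoint checkpoint, int maxCycles) {
        int failures = 0;
        for (int cycle = 0; cycle < maxCycles; cycle++) {
            try {
                checkpoint.doCheckpoint();
                break; // the only exit from the loop: the call succeeded
            } catch (IOException e) {
                failures++; // rejected: record the failure and go around again
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // A standby whose upload is always rejected fails on every cycle.
        int rejected = runLoop(() -> {
            throw new IOException("fsimage upload rejected");
        }, 5);
        System.out.println("always rejected: " + rejected + " failed attempts");

        // A standby whose upload is accepted leaves the loop immediately.
        int accepted = runLoop(() -> { }, 5);
        System.out.println("accepted: " + accepted + " failed attempts");
    }
}
```

Without a cap like maxCycles, the first case would keep retrying
indefinitely, which matches the behaviour we observed on the standby that is
always rejected.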

I can provide more details on my findings with code references if necessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
