[
https://issues.apache.org/jira/browse/HDFS-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177367#comment-13177367
]
Todd Lipcon commented on HDFS-2692:
-----------------------------------
bq. In FSEditLogLoader#loadFSEdits, should we really be unconditionally calling
FSNamesystem#notifyGenStampUpdate in the finally block? What if an error occurs
and maxGenStamp is never updated in FSEditLogLoader#loadEditRecords?
This should be OK -- we'll just call it with the argument 0, which won't cause
any problems (0 is lower than any possible queued gen stamp).
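To illustrate why a call with 0 is harmless, here is a minimal sketch. It is not the actual FSNamesystem code: the class {{PendingByGenStamp}}, its methods, and the message strings are hypothetical, and it assumes queued messages are keyed by a positive gen stamp and released once the notified gen stamp reaches theirs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a queue of messages waiting on a gen stamp.
// Assumption: every queued gen stamp is strictly greater than 0.
final class PendingByGenStamp {
    private final TreeMap<Long, List<String>> queued = new TreeMap<>();

    void enqueue(long genStamp, String msg) {
        queued.computeIfAbsent(genStamp, k -> new ArrayList<>()).add(msg);
    }

    // Removes and returns every message whose gen stamp is <= gs.
    List<String> notifyGenStampUpdate(long gs) {
        NavigableMap<Long, List<String>> head = queued.headMap(gs, true);
        List<String> ready = new ArrayList<>();
        for (List<String> msgs : head.values()) {
            ready.addAll(msgs);
        }
        head.clear(); // the view is backed by the map, so this dequeues them
        return ready;
    }

    int queuedGenStamps() {
        return queued.size();
    }

    public static void main(String[] args) {
        PendingByGenStamp pending = new PendingByGenStamp();
        pending.enqueue(1001L, "block report A");
        pending.enqueue(1005L, "block report B");
        // A notification with 0 matches nothing, so the queue is untouched.
        System.out.println(pending.notifyGenStampUpdate(0L));
        System.out.println(pending.notifyGenStampUpdate(1001L));
    }
}
```

Under that assumption, the unconditional call in the finally block is a no-op whenever maxGenStamp was never updated.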
bq. sp. "Initiatling" in TestHASafeMode#testComplexFailoverIntoSafemode
Fixed
bq. In FSNamesystem#notifyGenStampUpdate, could be a better log message, and
the log level should probably not be info: LOG.info("=> notified of genstamp
update for: " + gs);
Fixed and changed to DEBUG level
bq. Why is SafeModeInfo#doConsistencyCheck costly? It doesn't seem like it
should be. If it's not in fact expensive, we might as well make it run
regardless of whether or not asserts are enabled
You're right that it's not super expensive, but this code gets called for every
block reported during startup, which adds up, so I chose to keep the current
behavior of running the checks only when asserts are enabled.
bq. Is there really no better way to check if assertions are enabled?
Not that I've ever found! :(
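For reference, the usual idiom relies on a side effect inside an assert statement. The sketch below is illustrative only and is not taken from the patch; the class name {{AssertCheck}} and both method names are hypothetical.

```java
// Hypothetical helper showing the side-effect idiom for detecting -ea.
final class AssertCheck {
    static boolean assertsEnabled() {
        boolean enabled = false;
        // The assignment only runs when assertions are enabled for this
        // class; its value (true) also makes the assertion itself pass.
        assert enabled = true;
        return enabled;
    }

    // Example of gating a per-block check on the result.
    static void maybeCheck(Runnable consistencyCheck) {
        if (assertsEnabled()) {
            consistencyCheck.run();
        }
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertsEnabled());
    }
}
```

Running with {{java -ea AssertCheck}} should report true, and without {{-ea}} it should report false. {{Class#desiredAssertionStatus()}} is a standard alternative, though it reports the status the class would get if it were initialized at the time of the call rather than the status it is actually running with.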
bq. seems like they should all be made member methods and moved to
MiniDFSCluster... Also seems like TestEditLogTailer#waitForStandbyToCatchUp
should be moved to MiniDFSCluster.
I'd like to move a bunch of these methods into a new {{HATestUtil}} class...
can I do that in a follow-up JIRA?
Eli said:
bq. Nice change and tests. Nit, I'd add a comment in
TestHASafeMode#restartStandby where the safemode extension is set indicating
the rationale, it looked like the asserts at the end were racy because I missed
this
Fixed
> HA: Bugs related to failover from/into safe-mode
> ------------------------------------------------
>
> Key: HDFS-2692
> URL: https://issues.apache.org/jira/browse/HDFS-2692
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha, name-node
> Affects Versions: HA branch (HDFS-1623)
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
> Attachments: hdfs-2692.txt, hdfs-2692.txt
>
>
> In testing I saw an AssertionError come up several times when I was trying to
> do failover between two NNs where one or the other was in safe-mode. Need to
> write some unit tests to try to trigger this -- hunch is it has something to
> do with the treatment of "safe block count" while tailing edits in safemode.