[
https://issues.apache.org/jira/browse/AMBARI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780583#comment-13780583
]
Siddharth Wagle commented on AMBARI-3368:
-----------------------------------------
The patch does the following:
1. Saves a local stub file listing all dirs the NN created successfully (after
mkdir, chown, chmod).
2. The second time the NN starts up, all of the above commands are
short-circuited by a check for the dir in the stub.
3. The only gotcha is that this is a local file, not shared by the 2 NNs, so the
first failover will be slow.
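
To illustrate the idea in the list above, here is a minimal sketch in Python. It is not the patch itself: the names load_stub and ensure_dirs, the stub file format (one path per line), and the create_dir callable standing in for the mkdir/chown/chmod sequence are all assumptions made for illustration.

```python
import os


def load_stub(stub_path):
    """Return the set of directories recorded in the local stub file."""
    if not os.path.exists(stub_path):
        return set()
    with open(stub_path) as f:
        return {line.strip() for line in f if line.strip()}


def ensure_dirs(dirs, stub_path, create_dir):
    """Run create_dir for each directory not yet recorded in the stub.

    create_dir is a hypothetical callable standing in for the expensive
    mkdir/chown/chmod sequence; it is expected to raise on failure.
    Returns the list of directories that were actually processed.
    """
    done = load_stub(stub_path)
    processed = []
    for d in dirs:
        if d in done:
            continue  # short-circuit: already created on an earlier start
        create_dir(d)
        # Record the directory only after all commands succeeded.
        with open(stub_path, "a") as f:
            f.write(d + "\n")
        done.add(d)
        processed.append(d)
    return processed
```

Because a directory is recorded only after create_dir returns, a failed step is retried on the next start. Since the stub lives on the local disk, the other NN starts with an empty stub and pays the full cost once.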
> NameNode start hangs with HA config'd
> -------------------------------------
>
> Key: AMBARI-3368
> URL: https://issues.apache.org/jira/browse/AMBARI-3368
> Project: Ambari
> Issue Type: Bug
> Components: agent
> Affects Versions: 1.4.1
> Reporter: Siddharth Wagle
> Assignee: Siddharth Wagle
> Fix For: 1.4.1
>
> Attachments: AMBARI-3368.patch
>
>
> After configuring NameNode HA, I found that starting a NameNode hangs and
> fails with "Puppet has been killed due to timeout".
> 1) Install cluster
> 2) Enable NameNode HA
> 3) Stop standby NameNode on Hosts details page
> 4) Stop active NameNode on Hosts details page
> 5) Start NameNode on Hosts details page
> 6) Start hangs at 35% complete. After ~10 minutes, Puppet is killed due to
> timeout.