[ 
https://issues.apache.org/jira/browse/AMBARI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780583#comment-13780583
 ] 

Siddharth Wagle commented on AMBARI-3368:
-----------------------------------------

Patch does the following:

1. Saves a local stub file listing every dir the NN created successfully
(after mkdir, chown, chmod).
2. The second time the NN starts up, all of the above commands are short
circuited by a check for the dir in the stub.
3. The only gotcha is that this is a local file, not shared by the 2 NNs, so
the first failover will be slow.
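
The short-circuit described above can be sketched roughly as follows. This is
only an illustration of the idea, not the code in AMBARI-3368.patch: the
function names, the JSON stub format, and the create_dir callback (standing in
for the slow mkdir/chown/chmod sequence) are all assumptions.

```python
import json
import os
import tempfile


def load_stub(stub_path):
    """Return the set of directories recorded as already set up."""
    try:
        with open(stub_path) as f:
            return set(json.load(f))
    except (FileNotFoundError, ValueError):
        # Missing or unreadable stub: behave as on a first start.
        return set()


def save_stub(stub_path, done):
    """Write the stub via a temp file so a killed run cannot truncate it."""
    stub_dir = os.path.dirname(stub_path) or "."
    fd, tmp = tempfile.mkstemp(dir=stub_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(sorted(done), f)
    os.replace(tmp, stub_path)


def ensure_dirs(dirs, stub_path, create_dir):
    """Run create_dir only for directories missing from the stub.

    create_dir is a hypothetical callback for mkdir + chown + chmod and
    must raise on failure, so only successful directories are recorded.
    Returns the directories that were actually processed on this call.
    """
    done = load_stub(stub_path)
    processed = []
    for d in dirs:
        if d in done:
            continue  # short circuit: created on an earlier start
        create_dir(d)
        done.add(d)
        save_stub(stub_path, done)  # record progress after each success
        processed.append(d)
    return processed
```

Because the stub lives on the local filesystem, a second NN on another host
starts with an empty set and runs every command once, which matches the slow
first failover noted in point 3.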
                
> NameNode start hangs with HA config'd
> -------------------------------------
>
>                 Key: AMBARI-3368
>                 URL: https://issues.apache.org/jira/browse/AMBARI-3368
>             Project: Ambari
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.4.1
>            Reporter: Siddharth Wagle
>            Assignee: Siddharth Wagle
>             Fix For: 1.4.1
>
>         Attachments: AMBARI-3368.patch
>
>
> After configuring NameNode HA, I found starting a namenode hangs and fails 
> with "Puppet has been killed due to timeout"
> 1) Install cluster
> 2) Enable NameNode HA
> 3) Stop standby namenode on Hosts details page
> 4) Stop active namenode on Hosts details page
> 5) Start namenode on Hosts details page
> 6) Hangs on start and stops at 35% complete. Then, after ~10 minutes, Puppet
> is killed due to timeout

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
