Andrew Onischuk created AMBARI-12230:
----------------------------------------

             Summary: During HDP 2.1 to 2.2.6 upgrade dfs.journalnode.edits.dir is incorrectly changed
                 Key: AMBARI-12230
                 URL: https://issues.apache.org/jira/browse/AMBARI-12230
             Project: Ambari
          Issue Type: Bug
            Reporter: Andrew Onischuk
            Assignee: Andrew Onischuk
             Fix For: 2.1.0


PROBLEM: The customer was following the Ambari 2.0.1 instructions for upgrading
the stack from HDP 2.1 to 2.2.6, found here:

<http://docs.hortonworks.com/HDPDocuments/Ambari-2.0.1.0/bk_upgrading_Ambari/content/_upgrading_the_hdp_stack_from_21_to_22.html>

When they tried to start the NN in section 3 (Complete the Upgrade), step 12
of those instructions, it failed with the error:

    
    
    2015-06-17 23:00:32,926 WARN ha.EditLogTailer (EditLogTailer.java:doWork(339)) - Edit log tailer interrupted
    java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:337)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
    2015-06-17 23:00:32,930 INFO namenode.FSNamesystem (FSNamesystem.java:startActiveServices(1152)) - Starting services required for active state
    2015-06-17 23:00:32,946 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(435)) - Starting recovery process for unclosed journal segments...
    2015-06-17 23:00:32,963 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [10.222.32.220:8485, 10.222.32.214:8485, 10.222.32.216:8485], stream=null))
    org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
    10.222.32.220:8485: Journal Storage Directory /hadoop/hdfs/journalnode/preprod not formatted
    

BUSINESS IMPACT: Customer stuck during upgrade process. Attempting to roll
back will not work either.

SUPPORT ANALYSIS: The issue was caused by section 3, step 4, where they had to
run:

    
    
    python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME \
        --password $PASSWORD --clustername $CLUSTERNAME \
        --fromStack=2.1 --toStack=2.2.x \
        --upgradeCatalog=UpgradeCatalog_2.1_to_2.2.x.json update-configs
    

They had a custom path for dfs.journalnode.edits.dir set to
/data/hadoop/hdfs/journal. The command above changed it to
/hadoop/hdfs/journalnode, so the JNs looked for their edits in the wrong
directory and reported it as not formatted. There were no warnings in Ambari
to indicate an issue when they started the JNs.
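
The kind of check that would have flagged this can be sketched in a few lines.
This is only an illustration, not part of upgradeHelper.py: the function name
changed_properties and the two dictionaries are hypothetical, and it assumes
the hdfs-site properties were captured by some means before and after
update-configs ran. The two paths are the ones from this report.

```python
def changed_properties(before, after, watched):
    """Return {name: (old, new)} for watched properties whose value differs."""
    changes = {}
    for name in watched:
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical snapshots of hdfs-site taken before and after update-configs.
before = {"dfs.journalnode.edits.dir": "/data/hadoop/hdfs/journal"}
after = {"dfs.journalnode.edits.dir": "/hadoop/hdfs/journalnode"}

watched = ["dfs.journalnode.edits.dir"]
for name, (old, new) in changed_properties(before, after, watched).items():
    print("WARNING: %s changed from %s to %s" % (name, old, new))
```

Any property that holds an on-disk location could be added to the watched
list in the same way.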

STEPS TO REPRODUCE:
Starting with an HDP 2.1 Ambari-installed cluster, change
dfs.journalnode.edits.dir from the default and set up NN HA. Then attempt to
follow the upgrade instructions

<http://docs.hortonworks.com/HDPDocuments/Ambari-2.0.1.0/bk_upgrading_Ambari/content/_upgrading_the_hdp_stack_from_21_to_22.html>

to upgrade the HDP stack from 2.1 to 2.2.6.
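
When reproducing, one way to confirm on a JN host which directory actually
holds the journal is sketched below. It rests on an assumption not stated in
this report: that a formatted journal keeps a current/VERSION file under
<dfs.journalnode.edits.dir>/<journal id>. The helper name looks_formatted is
hypothetical; the journal id "preprod" and both paths are taken from the
error and analysis above.

```python
import os


def looks_formatted(edits_dir, journal_id):
    """Assumption: a formatted journal has <edits_dir>/<journal_id>/current/VERSION."""
    version_file = os.path.join(edits_dir, journal_id, "current", "VERSION")
    return os.path.isfile(version_file)


# Compare the customer's original path with the one written by update-configs.
for edits_dir in ("/data/hadoop/hdfs/journal", "/hadoop/hdfs/journalnode"):
    status = "formatted" if looks_formatted(edits_dir, "preprod") else "NOT formatted"
    print("%s: %s" % (edits_dir, status))
```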





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
