Alejandro Fernandez created AMBARI-11605:
--------------------------------------------

             Summary: Restarting HistoryServer fails during RU because NameNode is in safemode
                 Key: AMBARI-11605
                 URL: https://issues.apache.org/jira/browse/AMBARI-11605
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.1.0
            Reporter: Alejandro Fernandez
            Assignee: Alejandro Fernandez
             Fix For: 2.1.0


When the MapReduce HistoryServer is restarted for the first time during the 
Core Masters stage of a rolling upgrade (RU), the restart fails with the following:

{noformat}
2015-05-28 20:03:32,540 - HdfsResource['/hdp/apps/2.3.0.0-2112/mapreduce'] 
{'security_enabled': False, 'hadoop_bin_dir': 
'/usr/hdp/2.3.0.0-2112/hadoop/bin', 'keytab': [EMPTY], 'default_fs': 
'hdfs://c1ha', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': 
[EMPTY], 'user': 'hdfs', 'owner': 'hdfs', 'hadoop_conf_dir': 
'/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': 
['create_on_execute'], 'mode': 0555}
2015-05-28 20:03:32,600 - checked_call['curl -L -w '%{http_code}' -X GET 
'http://jhurley-ru-2.c.pramod-thangali.internal:50070/webhdfs/v1/hdp/apps/2.3.0.0-2112/mapreduce?op=GETFILESTATUS&user.name=hdfs'']
 {'logoutput': None, 'user': 'hdfs', 'quiet': False}
2015-05-28 20:03:37,862 - checked_call returned (0, 
'{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
 does not exist: /hdp/apps/2.3.0.0-2112/mapreduce"}}404')
2015-05-28 20:03:37,866 - checked_call['curl -L -w '%{http_code}' -X PUT 
'http://jhurley-ru-2.c.pramod-thangali.internal:50070/webhdfs/v1/hdp/apps/2.3.0.0-2112/mapreduce?op=MKDIRS&user.name=hdfs'']
 {'logoutput': None, 'user': 'hdfs', 'quiet': False}
2015-05-28 20:03:37,993 - checked_call returned (0, 
'{"RemoteException":{"exception":"RetriableException","javaClassName":"org.apache.hadoop.ipc.RetriableException","message":"org.apache.hadoop.hdfs.server.namenode.SafeModeException:
 Cannot create directory /hdp/apps/2.3.0.0-2112/mapreduce. Name node is in safe 
mode.\\nThe reported blocks 414 needs additional 77 blocks to reach the 
threshold 0.9900 of total blocks 495.\\nThe number of live datanodes 4 has 
reached the minimum number 0. Safe mode will be turned off automatically once 
the thresholds have been reached."}}403')
{noformat}

Retrying after this error fixes the problem.
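
As a rough illustration of why a retry helps, the failing response in the log above can be recognized mechanically: curl is invoked with -w '%{http_code}', so the three-digit HTTP status is appended directly to the JSON body. The sketch below splits the two and checks for a RetriableException caused by safemode. The function names are hypothetical and are not taken from the Ambari code base.

```python
import json


def split_curl_output(raw):
    """Split the output of `curl -w '%{http_code}'` into (JSON body, status).

    The last three characters are the HTTP status code; the rest is the body.
    """
    body, status = raw[:-3], int(raw[-3:])
    return (json.loads(body) if body else None), status


def is_safemode_error(raw):
    """Return True if a WebHDFS reply is a retriable safemode failure."""
    body, status = split_curl_output(raw)
    if status != 403 or not isinstance(body, dict):
        return False
    exc = body.get("RemoteException", {})
    return (exc.get("exception") == "RetriableException"
            and "SafeModeException" in exc.get("message", ""))
```

A caller could retry the MKDIRS request while is_safemode_error() returns True, and fail immediately on any other error.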

It turns out that, now that the HDFS commands run faster, the standby NameNode 
may still be in safemode by the time the HistoryServer is restarted.
For this reason, we must wait for both NameNodes to come out of safemode before 
proceeding to any other services or Service Checks.
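
A minimal sketch of such a wait, assuming each NameNode can be queried with `hdfs dfsadmin -fs hdfs://<host:port> -safemode get` and that its output contains "Safe mode is OFF" once the NameNode has left safemode. The function names, timeout, polling interval, and addresses are illustrative assumptions, not the actual fix.

```python
import subprocess
import time


def safemode_is_off(namenode_address):
    """Ask a single NameNode (host:port, assumed RPC address) for its state."""
    out = subprocess.check_output(
        ["hdfs", "dfsadmin", "-fs", "hdfs://" + namenode_address,
         "-safemode", "get"],
        universal_newlines=True,
    )
    return "Safe mode is OFF" in out


def wait_for_safemode_off(namenodes, check=safemode_is_off, timeout=600,
                          interval=10, sleep=time.sleep, clock=time.time):
    """Poll until every NameNode reports safemode OFF.

    Returns True once all NameNodes have left safemode, or False if
    `timeout` seconds elapse first. `check`, `sleep`, and `clock` are
    injectable so the loop can be exercised without a cluster.
    """
    pending = list(namenodes)
    deadline = clock() + timeout
    while True:
        # Keep only the NameNodes that are still in safemode.
        pending = [nn for nn in pending if not check(nn)]
        if not pending:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, wait_for_safemode_off(["nn1.example.com:8020", "nn2.example.com:8020"]) would block until both (hypothetical) NameNodes are out of safemode; only then would the upgrade move on to the HistoryServer restart.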



