[
https://issues.apache.org/jira/browse/AMBARI-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alejandro Fernandez updated AMBARI-11605:
-----------------------------------------
Attachment: AMBARI-11605.patch
> Restarting HistoryServer fails during RU because NameNode is in safemode
> ------------------------------------------------------------------------
>
> Key: AMBARI-11605
> URL: https://issues.apache.org/jira/browse/AMBARI-11605
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.1.0
> Reporter: Alejandro Fernandez
> Assignee: Alejandro Fernandez
> Fix For: 2.1.0
>
> Attachments: AMBARI-11605.patch
>
>
> When restarting the MapReduce HistoryServer for the first time during the Core
> Masters rolling upgrade, the restart fails with the following:
> {noformat}
> 2015-05-28 20:03:32,540 - HdfsResource['/hdp/apps/2.3.0.0-2112/mapreduce']
> {'security_enabled': False, 'hadoop_bin_dir':
> '/usr/hdp/2.3.0.0-2112/hadoop/bin', 'keytab': [EMPTY], 'default_fs':
> 'hdfs://c1ha', 'hdfs_site': ..., 'kinit_path_local': 'kinit',
> 'principal_name': [EMPTY], 'user': 'hdfs', 'owner': 'hdfs',
> 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type':
> 'directory', 'action': ['create_on_execute'], 'mode': 0555}
> 2015-05-28 20:03:32,600 - checked_call['curl -L -w '%{http_code}' -X GET
> 'http://jhurley-ru-2.c.pramod-thangali.internal:50070/webhdfs/v1/hdp/apps/2.3.0.0-2112/mapreduce?op=GETFILESTATUS&user.name=hdfs'']
> {'logoutput': None, 'user': 'hdfs', 'quiet': False}
> 2015-05-28 20:03:37,862 - checked_call returned (0,
> '{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
> does not exist: /hdp/apps/2.3.0.0-2112/mapreduce"}}404')
> 2015-05-28 20:03:37,866 - checked_call['curl -L -w '%{http_code}' -X PUT
> 'http://jhurley-ru-2.c.pramod-thangali.internal:50070/webhdfs/v1/hdp/apps/2.3.0.0-2112/mapreduce?op=MKDIRS&user.name=hdfs'']
> {'logoutput': None, 'user': 'hdfs', 'quiet': False}
> 2015-05-28 20:03:37,993 - checked_call returned (0,
> '{"RemoteException":{"exception":"RetriableException","javaClassName":"org.apache.hadoop.ipc.RetriableException","message":"org.apache.hadoop.hdfs.server.namenode.SafeModeException:
> Cannot create directory /hdp/apps/2.3.0.0-2112/mapreduce. Name node is in
> safe mode.\\nThe reported blocks 414 needs additional 77 blocks to reach the
> threshold 0.9900 of total blocks 495.\\nThe number of live datanodes 4 has
> reached the minimum number 0. Safe mode will be turned off automatically once
> the thresholds have been reached."}}403')
> {noformat}
> Retrying after this error fixes the problem.
> It turns out that, now that the HDFS commands run faster, the standby
> NameNode can still be in safemode by the time the HistoryServer is
> restarted.
> For this reason, we must wait for both NameNodes to come out of safemode
> before proceeding to any other services or Service Checks.
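
A minimal sketch of the wait described above, not the code in the attached patch. It polls each NameNode with `hdfs dfsadmin -fs <uri> -safemode get` and only reports success once every one of them says safemode is OFF. The function names, the retry count, the delay, and the example addresses are assumptions made for illustration.

```python
import subprocess
import time


def safemode_is_off(output):
    """Return True if `hdfs dfsadmin -safemode get` output reports safemode OFF."""
    return "Safe mode is OFF" in output


def query_safemode(namenode_address):
    """Ask one specific NameNode (host:port) for its safemode state."""
    # Assumption: the `hdfs` CLI is on PATH and the current user may run dfsadmin.
    # `-fs` targets a single NameNode instead of the HA nameservice.
    cmd = ["hdfs", "dfsadmin", "-fs", "hdfs://" + namenode_address,
           "-safemode", "get"]
    try:
        return subprocess.check_output(cmd, universal_newlines=True)
    except subprocess.CalledProcessError:
        # Treat a failed query (e.g. NameNode not up yet) as "still pending".
        return ""


def wait_for_safemode_off(namenodes, query=query_safemode,
                          retries=30, delay=10, sleep=time.sleep):
    """Poll until every NameNode has left safemode; return True on success."""
    pending = list(namenodes)
    for attempt in range(retries):
        # Keep only the NameNodes that have not yet reported OFF.
        pending = [nn for nn in pending if not safemode_is_off(query(nn))]
        if not pending:
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False
```

A caller would pass both the active and the standby address, for example `wait_for_safemode_off(["nn1.example.com:8020", "nn2.example.com:8020"])` (hypothetical hosts), and stop the upgrade step if the result is `False`. The `query` and `sleep` parameters exist only so the loop can be exercised without a cluster.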