Siddharth Wagle created AMBARI-21527:
----------------------------------------
Summary: Restart of MR2 History Server failed due to wrong
NameNode RPC address
Key: AMBARI-21527
URL: https://issues.apache.org/jira/browse/AMBARI-21527
Project: Ambari
Issue Type: Bug
Components: ambari-server
Affects Versions: 2.5.2
Reporter: Siddharth Wagle
Assignee: Siddharth Wagle
Priority: Critical
Fix For: 2.5.2
Steps:
* Installed BI 4.2 cluster on Ambari 2.2 with Slider and the services it requires
* Upgraded Ambari to 2.5.2.0-146
* Registered HDP 2.6.1.0 repo, installed packages
* Restarted services that needed restart
* Ran service checks
* Started upgrade
Result: the _Restarting History Server_ step failed with:
{noformat:title=errors-87.txt}
Traceback (most recent call last):
File
"/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py",
line 134, in <module>
HistoryServer().execute()
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 329, in execute
method(env)
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 841, in restart
self.pre_upgrade_restart(env, upgrade_type=upgrade_type)
File
"/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py",
line 85, in pre_upgrade_restart
copy_to_hdfs("mapreduce", params.user_group, params.hdfs_user,
skip=params.sysprep_skip_copy_tarballs_hdfs)
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/functions/copy_tarball.py",
line 267, in copy_to_hdfs
replace_existing_files=replace_existing_files,
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py",
line 155, in __init__
self.env.run()
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 160, in run
self.run_action(resource, action)
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 124, in run_action
provider_action()
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py",
line 560, in action_create_on_execute
self.action_delayed("create")
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py",
line 557, in action_delayed
self.get_hdfs_resource_executor().action_delayed(action_name, self)
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py",
line 292, in action_delayed
self._create_resource()
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py",
line 308, in _create_resource
self._create_file(self.main_resource.resource.target,
source=self.main_resource.resource.source, mode=self.mode)
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py",
line 423, in _create_file
self.util.run_command(target, 'CREATE', method='PUT', overwrite=True,
assertable_result=False, file_to_put=source, **kwargs)
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py",
line 204, in run_command
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w
'%{http_code}' -X PUT --data-binary
@/usr/hdp/2.6.1.0-129/hadoop/mapreduce.tar.gz -H 'Content-Type:
application/octet-stream'
'http://c7301.ambari.apache.org:50070/webhdfs/v1/hdp/apps/2.6.1.0-129/mapreduce/mapreduce.tar.gz?op=CREATE&user.name=hdfs&overwrite=True&permission=444''
returned status_code=403.
{
"RemoteException": {
"exception": "ConnectException",
"javaClassName": "java.net.ConnectException",
"message": "Call From c7301.ambari.apache.org/192.168.73.101 to
c7301.ambari.apache.org:8020 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused"
}
}
{noformat}
{noformat:title=NameNode log, pre-upgrade restart}
2017-07-18 07:48:05,435 INFO namenode.NameNode
(NameNode.java:setClientNamenodeAddress(397)) - fs.defaultFS is
hdfs://c7301.ambari.apache.org:8020
2017-07-18 07:48:05,436 INFO namenode.NameNode
(NameNode.java:setClientNamenodeAddress(417)) - Clients are to use
c7301.ambari.apache.org:8020 to access this namenode/service.
2017-07-18 07:48:07,343 INFO namenode.NameNode
(NameNodeRpcServer.java:<init>(342)) - RPC server is binding to
c7301.ambari.apache.org:8020
2017-07-18 07:48:07,434 INFO namenode.NameNode
(NameNode.java:startCommonServices(695)) - NameNode RPC up at:
c7301.ambari.apache.org/192.168.73.101:8020
{noformat}
{noformat:title=NameNode log, in-upgrade restart}
2017-07-18 09:03:42,336 INFO namenode.NameNode
(NameNode.java:setClientNamenodeAddress(450)) - fs.defaultFS is
hdfs://c7301.ambari.apache.org:8020
2017-07-18 09:03:42,337 INFO namenode.NameNode
(NameNode.java:setClientNamenodeAddress(470)) - Clients are to use
c7301.ambari.apache.org:8020 to access this namenode/service.
2017-07-18 09:03:44,686 INFO namenode.NameNode
(NameNodeRpcServer.java:<init>(428)) - RPC server is binding to localhost:8020
2017-07-18 09:03:44,995 INFO namenode.NameNode
(NameNode.java:startCommonServices(876)) - NameNode RPC up at:
localhost/127.0.0.1:8020
{noformat}
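The difference between the two NameNode logs can be pulled out mechanically. The snippet below is only a rough sketch for triage, not part of Ambari: the helper names (`rpc_bind_addresses`, `binds_to_loopback`) are made up for illustration, and the regex assumes the "RPC server is binding to" message format seen in the logs above.

```python
import re

# Matches the NameNodeRpcServer startup message shown in the logs above.
# \s+ tolerates the hard line wrapping added by mail clients.
BIND_RE = re.compile(r"RPC\s+server\s+is\s+binding\s+to\s+(\S+):(\d+)")

# Addresses treated as loopback for this check (illustrative, not exhaustive).
LOOPBACK_HOSTS = ("localhost", "127.0.0.1")


def rpc_bind_addresses(log_text):
    """Return a (host, port) pair for every RPC bind message in the log."""
    return [(host, int(port)) for host, port in BIND_RE.findall(log_text)]


def binds_to_loopback(log_text):
    """True if any RPC bind message in the log names a loopback host."""
    return any(host in LOOPBACK_HOSTS
               for host, _ in rpc_bind_addresses(log_text))


# Excerpts copied from the two logs in this report.
pre_upgrade = """2017-07-18 07:48:07,343 INFO namenode.NameNode
(NameNodeRpcServer.java:<init>(342)) - RPC server is binding to
c7301.ambari.apache.org:8020"""

in_upgrade = """2017-07-18 09:03:44,686 INFO namenode.NameNode
(NameNodeRpcServer.java:<init>(428)) - RPC server is binding to localhost:8020"""

print("pre-upgrade:", rpc_bind_addresses(pre_upgrade))
print("in-upgrade: ", rpc_bind_addresses(in_upgrade))
print("loopback after upgrade restart:", binds_to_loopback(in_upgrade))
```

Run against a full NameNode log, this lists every bind address in order, which makes it easy to spot the restart where the address changed.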
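It looks like something during the upgrade reconfigures the NameNode RPC server
to bind only to localhost (127.0.0.1:8020). Clients are still directed to
c7301.ambari.apache.org:8020, so the WebHDFS CREATE call issued by the History
Server restart fails with "Connection refused".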
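One possible next step is to check the NameNode's effective hdfs-site.xml for RPC settings that point at loopback. The sketch below assumes the standard HDFS properties `dfs.namenode.rpc-address` and `dfs.namenode.rpc-bind-host` are the relevant ones; this report does not confirm which property (if any) the upgrade changed. The helper names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: these standard HDFS properties control the RPC endpoint.
# The report itself does not identify the property that was changed.
RPC_KEYS = ("dfs.namenode.rpc-address", "dfs.namenode.rpc-bind-host")
LOOPBACK_HOSTS = ("localhost", "127.0.0.1")


def rpc_settings(hdfs_site_xml):
    """Extract the RPC-related properties from hdfs-site.xml content."""
    root = ET.fromstring(hdfs_site_xml)
    found = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name in RPC_KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def loopback_settings(settings):
    """Keep only the settings whose host part is a loopback address."""
    return {key: value for key, value in settings.items()
            if value.split(":")[0] in LOOPBACK_HOSTS}


# Hypothetical sample, NOT taken from the affected cluster.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.rpc-address</name>
    <value>localhost:8020</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>"""

print(loopback_settings(rpc_settings(SAMPLE)))
```

On a real host, the input would be the contents of the NameNode's hdfs-site.xml (commonly under /etc/hadoop/conf, though the location depends on the installation). An empty result would suggest the localhost binding comes from somewhere else.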