[ https://issues.apache.org/jira/browse/AMBARI-21527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Di Li reassigned AMBARI-21527:
------------------------------

    Assignee: Di Li  (was: Doroszlai, Attila)

> Restart of MR2 History Server failed due to wrong NameNode RPC address
> ----------------------------------------------------------------------
>
>                 Key: AMBARI-21527
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21527
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.5.2
>            Reporter: Siddharth Wagle
>            Assignee: Di Li
>            Priority: Critical
>             Fix For: 2.5.2
>
>         Attachments: AMBARI-21527-HA_and_NonHA.patch, AMBARI-21527.patch
>
>
> Steps:
> * Installed a BI 4.2 cluster on Ambari 2.2 with Slider and the services it required
> * Upgraded Ambari to 2.5.2.0-146
> * Registered the HDP 2.6.1.0 repo, installed packages
> * Restarted services that needed restart
> * Ran service checks
> * Started upgrade
> Result: the _Restarting History Server_ step failed with
> {noformat:title=errors-87.txt}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py", line 134, in <module>
>     HistoryServer().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 841, in restart
>     self.pre_upgrade_restart(env, upgrade_type=upgrade_type)
>   File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py", line 85, in pre_upgrade_restart
>     copy_to_hdfs("mapreduce", params.user_group, params.hdfs_user, skip=params.sysprep_skip_copy_tarballs_hdfs)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/copy_tarball.py", line 267, in copy_to_hdfs
>     replace_existing_files=replace_existing_files,
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 560, in action_create_on_execute
>     self.action_delayed("create")
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 557, in action_delayed
>     self.get_hdfs_resource_executor().action_delayed(action_name, self)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 292, in action_delayed
>     self._create_resource()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 308, in _create_resource
>     self._create_file(self.main_resource.resource.target, source=self.main_resource.resource.source, mode=self.mode)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 423, in _create_file
>     self.util.run_command(target, 'CREATE', method='PUT', overwrite=True, assertable_result=False, file_to_put=source, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 204, in run_command
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w '%{http_code}' -X PUT --data-binary @/usr/hdp/2.6.1.0-129/hadoop/mapreduce.tar.gz -H 'Content-Type: application/octet-stream' 'http://c7301.ambari.apache.org:50070/webhdfs/v1/hdp/apps/2.6.1.0-129/mapreduce/mapreduce.tar.gz?op=CREATE&user.name=hdfs&overwrite=True&permission=444'' returned status_code=403.
> {
>   "RemoteException": {
>     "exception": "ConnectException",
>     "javaClassName": "java.net.ConnectException",
>     "message": "Call From c7301.ambari.apache.org/192.168.73.101 to c7301.ambari.apache.org:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused"
>   }
> }
> {noformat}
> {noformat:title=NameNode log, pre-upgrade restart}
> 2017-07-18 07:48:05,435 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(397)) - fs.defaultFS is hdfs://c7301.ambari.apache.org:8020
> 2017-07-18 07:48:05,436 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(417)) - Clients are to use c7301.ambari.apache.org:8020 to access this namenode/service.
> 2017-07-18 07:48:07,343 INFO namenode.NameNode (NameNodeRpcServer.java:<init>(342)) - RPC server is binding to c7301.ambari.apache.org:8020
> 2017-07-18 07:48:07,434 INFO namenode.NameNode (NameNode.java:startCommonServices(695)) - NameNode RPC up at: c7301.ambari.apache.org/192.168.73.101:8020
> {noformat}
> {noformat:title=NameNode log, in-upgrade restart}
> 2017-07-18 09:03:42,336 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(450)) - fs.defaultFS is hdfs://c7301.ambari.apache.org:8020
> 2017-07-18 09:03:42,337 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(470)) - Clients are to use c7301.ambari.apache.org:8020 to access this namenode/service.
> 2017-07-18 09:03:44,686 INFO namenode.NameNode (NameNodeRpcServer.java:<init>(428)) - RPC server is binding to localhost:8020
> 2017-07-18 09:03:44,995 INFO namenode.NameNode (NameNode.java:startCommonServices(876)) - NameNode RPC up at: localhost/127.0.0.1:8020
> {noformat}
> It looks like something during the upgrade reconfigures the NameNode RPC server to listen only on localhost.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
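For reference, the NameNode's RPC listen address is controlled by `dfs.namenode.rpc-address` in `hdfs-site.xml`, optionally overridden for binding purposes by `dfs.namenode.rpc-bind-host`; a `localhost` value landing in either of these during the upgrade would produce exactly the in-upgrade log above. A sketch of a working configuration for this cluster's hostname (illustrative values, not the cluster's actual files):

```xml
<!-- hdfs-site.xml -->
<!-- The address clients use, and the address the NameNode binds by default. -->
<property>
  <name>dfs.namenode.rpc-address</name>
  <value>c7301.ambari.apache.org:8020</value>
</property>
<!-- Optional: bind all interfaces while still advertising the address above. -->
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
</property>
```

Comparing these properties (and `fs.defaultFS` in `core-site.xml`) between the pre-upgrade and in-upgrade configuration versions would be a reasonable first diagnostic step.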
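The key evidence is in the RemoteException payload returned by the WebHDFS CREATE call: Hadoop's "Call From X to Y" message names both the client and the RPC endpoint it tried to reach. A small triage sketch (not Ambari's actual error handling; the payload is copied verbatim from the report) that pulls those two pieces out:

```python
import json
import re

# RemoteException payload copied from the failed WebHDFS CREATE call above.
payload = """{
  "RemoteException": {
    "exception": "ConnectException",
    "javaClassName": "java.net.ConnectException",
    "message": "Call From c7301.ambari.apache.org/192.168.73.101 to c7301.ambari.apache.org:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused"
  }
}"""

err = json.loads(payload)["RemoteException"]
assert err["exception"] == "ConnectException"

# Hadoop's "Call From <client> to <endpoint> failed" phrasing identifies the
# NameNode RPC address the client actually dialed.
m = re.search(r"Call From (\S+) to (\S+) failed", err["message"])
client, endpoint = m.group(1), m.group(2)
print(endpoint)  # c7301.ambari.apache.org:8020
```

The client dialed the advertised address `c7301.ambari.apache.org:8020`, but per the in-upgrade log the RPC server was only listening on `localhost:8020`, so the kernel refused the connection.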