David McWhorter created AMBARI-10317:
----------------------------------------
Summary: Knox gateway fails to restart on Ubuntu 12.04 after
system restart
Key: AMBARI-10317
URL: https://issues.apache.org/jira/browse/AMBARI-10317
Project: Ambari
Issue Type: Bug
Components: ambari-server
Affects Versions: 2.0.0
Environment: ubuntu 12.04
Reporter: David McWhorter
Fix For: 2.0.0
We are testing deployment of an HDP 2.2 cluster using Ambari 2.0.0-rc2 on
Ubuntu 12.04. I’ve been able to set up a cluster running HDFS, MapReduce2,
YARN, ZooKeeper, Knox, Ranger, and Ambari Metrics. When I shut down the whole
cluster using Actions -> Stop All in Ambari, reboot the hosts, and then try to
restart the cluster, the Knox gateway fails to start with the error below. The
directory /var/run/knox is indeed missing on the master host.
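To confirm which of the directories Knox expects are gone after a reboot, something like the following can be run on the affected host. This is only a diagnostic sketch: the path list is copied from the failing chown command in the startup log, and the helper name missing_paths is hypothetical, not part of Ambari. On Ubuntu 12.04, /var/run is normally a symlink to /run, which is a tmpfs and is emptied at boot, so that is the likely reason the directory disappears.

```python
import os

# Paths taken from the failing chown command in the startup log.
KNOX_PATHS = [
    "/var/lib/knox/data",
    "/var/log/knox",
    "/var/run/knox",
    "/etc/knox/conf",
]


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on this host."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Show where /var/run actually points (often /run, a tmpfs mount).
    print("/var/run resolves to: %s" % os.path.realpath("/var/run"))
    for path in missing_paths(KNOX_PATHS):
        print("missing: %s" % path)
```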
Knox Gateway startup log:
2015-04-01 16:17:12,075 - Error while executing command 'start':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 80, in start
    self.configure(env)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 64, in configure
    knox()
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py", line 99, in knox
    sudo = True,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 274, in action_run
    raise ex
Fail: Execution of 'chown -R knox:knox /var/lib/knox/data /var/log/knox /var/log/knox /var/run/knox /etc/knox/conf' returned 1. chown: cannot access `/var/run/knox': No such file or directory
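The chown fails because nothing recreates /var/run/knox before it runs. By contrast, the stdout log shows /var/run/hadoop being recreated ("Creating directory") on the same start. A possible workaround, sketched here in plain Python rather than with Ambari's own resource classes, is to create any missing directory before ownership is changed. The helper ensure_directories and its default mode are assumptions for illustration, not the actual fix.

```python
import os


def ensure_directories(paths, mode=0o755):
    """Create any directory in `paths` that is missing; return those created.

    Hypothetical helper, not part of Ambari. Ownership is left to the
    existing chown step, which would run afterwards.
    """
    created = []
    for path in paths:
        if not os.path.isdir(path):
            # makedirs also creates missing parent directories.
            os.makedirs(path, mode)
            created.append(path)
    return created
```

Running ensure_directories(["/var/run/knox"]) as root ahead of the chown should let the start proceed. Because /var/run is cleared on reboot, the directory would need to be recreated on every start, not just at install time.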
stdout: /var/lib/ambari-agent/data/output-107.txt
2015-04-01 16:17:06,744 - u"Group['hadoop']" {'ignore_failures': False}
2015-04-01 16:17:06,744 - Modifying group hadoop
2015-04-01 16:17:06,797 - u"Group['users']" {'ignore_failures': False}
2015-04-01 16:17:06,797 - Modifying group users
2015-04-01 16:17:06,839 - u"Group['knox']" {'ignore_failures': False}
2015-04-01 16:17:06,839 - Modifying group knox
2015-04-01 16:17:06,886 - u"Group['ranger']" {'ignore_failures': False}
2015-04-01 16:17:06,886 - Modifying group ranger
2015-04-01 16:17:06,930 - u"User['mapred']" {'gid': 'hadoop',
'ignore_failures': False, 'groups': [u'hadoop']}
2015-04-01 16:17:06,930 - Modifying user mapred
2015-04-01 16:17:06,976 - u"User['root']" {'gid': 'hadoop', 'ignore_failures':
False, 'groups': [u'hadoop']}
2015-04-01 16:17:06,977 - Modifying user root
2015-04-01 16:17:07,019 - u"User['ambari-qa']" {'gid': 'hadoop',
'ignore_failures': False, 'groups': [u'users']}
2015-04-01 16:17:07,020 - Modifying user ambari-qa
2015-04-01 16:17:07,066 - u"User['zookeeper']" {'gid': 'hadoop',
'ignore_failures': False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,066 - Modifying user zookeeper
2015-04-01 16:17:07,109 - u"User['rangerlogger']" {'gid': 'hadoop',
'ignore_failures': False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,110 - Modifying user rangerlogger
2015-04-01 16:17:07,152 - u"User['hdfs']" {'gid': 'hadoop', 'ignore_failures':
False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,152 - Modifying user hdfs
2015-04-01 16:17:07,195 - u"User['knox']" {'gid': 'hadoop', 'ignore_failures':
False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,195 - Modifying user knox
2015-04-01 16:17:07,238 - u"User['ranger']" {'gid': 'hadoop',
'ignore_failures': False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,238 - Modifying user ranger
2015-04-01 16:17:07,282 - u"User['yarn']" {'gid': 'hadoop', 'ignore_failures':
False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,283 - Modifying user yarn
2015-04-01 16:17:07,326 - u"User['ams']" {'gid': 'hadoop', 'ignore_failures':
False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,327 - Modifying user ams
2015-04-01 16:17:07,370 - u"User['rangeradmin']" {'gid': 'hadoop',
'ignore_failures': False, 'groups': [u'hadoop']}
2015-04-01 16:17:07,370 - Modifying user rangeradmin
2015-04-01 16:17:07,413 -
u"File['/var/lib/ambari-agent/data/tmp/changeUid.sh']" {'content':
StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2015-04-01 16:17:07,686 -
u"Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']"
{'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2015-04-01 16:17:07,728 - Skipping
u"Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa
/tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa']"
due to not_if
2015-04-01 16:17:07,728 - u"Group['hdfs']" {'ignore_failures': False}
2015-04-01 16:17:07,728 - Modifying group hdfs
2015-04-01 16:17:07,774 - u"User['hdfs']" {'ignore_failures': False, 'groups':
[u'hadoop', 'hadoop', 'hdfs', u'hdfs']}
2015-04-01 16:17:07,775 - Modifying user hdfs
2015-04-01 16:17:07,818 - u"Directory['/etc/hadoop']" {'mode': 0755}
2015-04-01 16:17:07,974 - u"Directory['/etc/hadoop/conf.empty']" {'owner':
'root', 'group': 'hadoop', 'recursive': True}
2015-04-01 16:17:08,110 - u"Link['/etc/hadoop/conf']" {'not_if': 'ls
/etc/hadoop/conf', 'to': '/etc/hadoop/conf.empty'}
2015-04-01 16:17:08,153 - Skipping u"Link['/etc/hadoop/conf']" due to not_if
2015-04-01 16:17:08,160 - u"File['/etc/hadoop/conf/hadoop-env.sh']" {'content':
InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2015-04-01 16:17:08,396 - u"Execute['('setenforce', '0')']" {'sudo': True,
'only_if': 'test -f /selinux/enforce'}
2015-04-01 16:17:08,448 - Skipping u"Execute['('setenforce', '0')']" due to
only_if
2015-04-01 16:17:08,448 - u"Directory['/var/log/hadoop']" {'owner': 'root',
'mode': 0775, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2015-04-01 16:17:08,843 - u"Directory['/var/run/hadoop']" {'owner': 'root',
'group': 'root', 'recursive': True, 'cd_access': 'a'}
2015-04-01 16:17:08,886 - Creating directory u"Directory['/var/run/hadoop']"
2015-04-01 16:17:09,066 - Changing group for /var/run/hadoop from 1000 to root
2015-04-01 16:17:09,364 - u"Directory['/tmp/hadoop-hdfs']" {'owner': 'hdfs',
'recursive': True, 'cd_access': 'a'}
2015-04-01 16:17:09,407 - Creating directory u"Directory['/tmp/hadoop-hdfs']"
2015-04-01 16:17:09,587 - Changing owner for /tmp/hadoop-hdfs from 0 to hdfs
2015-04-01 16:17:09,820 -
u"File['/etc/hadoop/conf/commons-logging.properties']" {'content':
Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2015-04-01 16:17:10,049 - u"File['/etc/hadoop/conf/health_check']" {'content':
Template('health_check-v2.j2'), 'owner': 'hdfs'}
2015-04-01 16:17:10,272 - u"File['/etc/hadoop/conf/log4j.properties']"
{'content': '...', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2015-04-01 16:17:10,506 -
u"File['/etc/hadoop/conf/hadoop-metrics2.properties']" {'content':
Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
2015-04-01 16:17:10,732 - u"File['/etc/hadoop/conf/task-log4j.properties']"
{'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2015-04-01 16:17:11,085 - u"Directory['/etc/knox/conf']" {'owner': 'knox',
'group': 'knox', 'recursive': True}
2015-04-01 16:17:11,231 - u"XmlConfig['gateway-site.xml']" {'owner': 'knox',
'group': 'knox', 'conf_dir': '/etc/knox/conf', 'configuration_attributes': {},
'configurations': ...}
2015-04-01 16:17:11,239 - Generating config: /etc/knox/conf/gateway-site.xml
2015-04-01 16:17:11,239 - u"File['/etc/knox/conf/gateway-site.xml']" {'owner':
'knox', 'content': InlineTemplate(...), 'group': 'knox', 'mode': None,
'encoding': 'UTF-8'}
2015-04-01 16:17:11,422 - Writing u"File['/etc/knox/conf/gateway-site.xml']"
because contents don't match
2015-04-01 16:17:11,561 - u"File['/etc/knox/conf/gateway-log4j.properties']"
{'content': '...', 'owner': 'knox', 'group': 'knox', 'mode': 0644}
2015-04-01 16:17:11,790 - u"File['/etc/knox/conf/topologies/default.xml']"
{'content': InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
2015-04-01 16:17:12,014 - u"Execute['('chown', '-R', u'knox:knox',
'/var/lib/knox/data', '/var/log/knox', '/var/log/knox', u'/var/run/knox',
'/etc/knox/conf')']" {'sudo': True}
2015-04-01 16:17:12,119 - Command: /usr/bin/hdp-select status knox-server >
/tmp/tmp7GgVe1
Output: knox-server - 2.2.0.0-2041