Dmitry Lysnichenko created AMBARI-13294:
-------------------------------------------
Summary: ACCUMULO_TRACER START failed after enabling Kerberos
Key: AMBARI-13294
URL: https://issues.apache.org/jira/browse/AMBARI-13294
Project: Ambari
Issue Type: Bug
Reporter: Dmitry Lysnichenko
Assignee: Dmitry Lysnichenko
After enabling Kerberos, ACCUMULO_TRACER START failed on the "Start and Test
Services" step.
{code}
"stderr" : "Python script has been killed due to timeout after waiting 180 secs",
{code}
{code}
"stdout" : "2015-09-25 14:42:53,963 - Group['custom-spark'] {}\n2015-09-25
14:42:53,964 - Group['hadoop'] {}\n2015-09-25 14:42:53,965 -
Group['custom-users'] {}\n2015-09-25 14:42:53,965 - Group['custom-knox-group']
{}\n2015-09-25 14:42:53,965 - User['custom-sqoop'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,966 - User['custom-knox'] {'gid': 'hadoop',
'groups': [u'hadoop']}\n2015-09-25 14:42:53,967 - User['custom-hdfs'] {'gid':
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,968 -
User['custom-oozie'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25
14:42:53,969 - User['custom-smoke'] {'gid': 'hadoop', 'groups':
[u'custom-users']}\n2015-09-25 14:42:53,970 - User['custom-hbase'] {'gid':
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,971 - User['custom-tez']
{'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 14:42:53,972 -
User['custom-hive'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25
14:42:53,973 - User['custom-mr'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,973 - User['custom-accumulo'] {'gid':
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,974 - User['custom-hcat']
{'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,975 -
User['custom-ams'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25
14:42:53,976 - User['custom-yarn'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,977 - User['custom-falcon'] {'gid': 'hadoop',
'groups': [u'custom-users']}\n2015-09-25 14:42:53,977 - User['custom-spark']
{'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,978 -
User['custom-atlas'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25
14:42:53,979 - User['custom-flume'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,980 - User['custom-kafka'] {'gid': 'hadoop',
'groups': [u'hadoop']}\n2015-09-25 14:42:53,981 - User['custom-zookeeper']
{'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,982 -
User['custom-mahout'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25
14:42:53,982 - User['custom-storm'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,983 -
File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content':
StaticFile('changeToSecureUid.sh'), 'mode': 0555}\n2015-09-25 14:42:53,985 -
Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-smoke
/tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke']
{'not_if': '(test $(id -u custom-smoke) -gt 1000) || (false)'}\n2015-09-25
14:42:53,991 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh
custom-smoke
/tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke']
due to not_if\n2015-09-25 14:42:53,991 - Directory['/tmp/hbase-hbase']
{'owner': 'custom-hbase', 'recursive': True, 'mode': 0775, 'cd_access':
'a'}\n2015-09-25 14:42:53,992 - File['/var/lib/ambari-agent/tmp/changeUid.sh']
{'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}\n2015-09-25
14:42:53,993 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-hbase
/home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase']
{'not_if': '(test $(id -u custom-hbase) -gt 1000) || (false)'}\n2015-09-25
14:42:53,999 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh
custom-hbase
/home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase']
due to not_if\n2015-09-25 14:42:54,000 - Group['custom-hdfs']
{'ignore_failures': False}\n2015-09-25 14:42:54,000 - User['custom-hdfs']
{'ignore_failures': False, 'groups': [u'hadoop', u'custom-hdfs']}\n2015-09-25
14:42:54,001 - Directory['/etc/hadoop'] {'mode': 0755}\n2015-09-25 14:42:54,019
- File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content':
InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}\n2015-09-25
14:42:54,019 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir']
{'owner': 'custom-hdfs', 'group': 'hadoop', 'mode': 0777}\n2015-09-25
14:42:54,032 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce )
|| (which getenforce && getenforce | grep -q Disabled)', 'sudo': True,
'only_if': 'test -f /selinux/enforce'}\n2015-09-25 14:42:54,039 - Skipping
Execute[('setenforce', '0')] due to not_if\n2015-09-25 14:42:54,040 -
Directory['/grid/0/log/hadoop'] {'owner': 'root', 'mode': 0775, 'group':
'hadoop', 'recursive': True, 'cd_access': 'a'}\n2015-09-25 14:42:54,043 -
Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive':
True, 'cd_access': 'a'}\n2015-09-25 14:42:54,043 -
Directory['/tmp/hadoop-custom-hdfs'] {'owner': 'custom-hdfs', 'recursive':
True, 'cd_access': 'a'}\n2015-09-25 14:42:54,048 -
File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties']
{'content': Template('commons-logging.properties.j2'), 'owner':
'root'}\n2015-09-25 14:42:54,051 -
File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content':
Template('health_check.j2'), 'owner': 'root'}\n2015-09-25 14:42:54,051 -
File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ...,
'owner': 'custom-hdfs', 'group': 'hadoop', 'mode': 0644}\n2015-09-25
14:42:54,074 -
File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties']
{'content': Template('hadoop-metrics2.properties.j2'), 'owner':
'custom-hdfs'}\n2015-09-25 14:42:54,075 -
File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content':
StaticFile('task-log4j.properties'), 'mode': 0755}\n2015-09-25 14:42:54,076 -
File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner':
'custom-hdfs', 'group': 'hadoop'}\n2015-09-25 14:42:54,083 -
File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'custom-hdfs',
'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d
/etc/hadoop/conf', 'group': 'hadoop'}\n2015-09-25 14:42:54,089 -
File['/etc/hadoop/conf/topology_script.py'] {'content':
StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf',
'mode': 0755}\n2015-09-25 14:42:54,275 -
Directory['/usr/hdp/current/accumulo-tracer/conf'] {'owner': 'custom-accumulo',
'group': 'hadoop', 'recursive': True, 'mode': 0755}\n2015-09-25 14:42:54,277 -
Directory['/usr/hdp/current/accumulo-tracer/conf/server'] {'owner':
'custom-accumulo', 'group': 'hadoop', 'recursive': True, 'mode':
0700}\n2015-09-25 14:42:54,278 - XmlConfig['accumulo-site.xml'] {'group':
'hadoop', 'conf_dir': '/usr/hdp/current/accumulo-tracer/conf/server', 'mode':
0600, 'configuration_attributes': {}, 'owner': 'custom-accumulo',
'configurations': ...}\n2015-09-25 14:42:54,292 - Generating config:
/usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml\n2015-09-25
14:42:54,293 -
File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml']
{'owner': 'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop',
'mode': 0600, 'encoding': 'UTF-8'}\n2015-09-25 14:42:54,317 -
Directory['/var/run/accumulo'] {'owner': 'custom-accumulo', 'group': 'hadoop',
'recursive': True}\n2015-09-25 14:42:54,318 - Directory['/grid/0/log/accumulo']
{'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True}\n2015-09-25
14:42:54,323 -
File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-env.sh']
{'content': InlineTemplate(...), 'owner': 'custom-accumulo', 'group': 'hadoop',
'mode': 0644}\n2015-09-25 14:42:54,324 -
PropertiesFile['/usr/hdp/current/accumulo-tracer/conf/server/client.conf']
{'owner': 'custom-accumulo', 'group': 'hadoop', 'properties':
{'instance.zookeeper.host':
u'ambari-ooziehive-r1-2.novalocal:2181,ambari-ooziehive-r1-3.novalocal:2181,ambari-ooziehive-r1-5.novalocal:2181',
'instance.name': u'hdp-accumulo-instance', 'instance.rpc.sasl.enabled': True,
'instance.zookeeper.timeout': u'30s'}}\n2015-09-25 14:42:54,329 - Generating
properties file:
/usr/hdp/current/accumulo-tracer/conf/server/client.conf\n2015-09-25
14:42:54,329 - File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf']
{'owner': 'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop',
'mode': None}\n2015-09-25 14:42:54,332 - Writing
File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] because
contents don't match\n2015-09-25 14:42:54,333 -
File['/usr/hdp/current/accumulo-tracer/conf/server/log4j.properties']
{'content': ..., 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode':
0644}\n2015-09-25 14:42:54,333 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml']
{'owner': 'custom-accumulo', 'template_tag': None, 'group':
'hadoop'}\n2015-09-25 14:42:54,337 -
File['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml'] {'content':
Template('auditLog.xml.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop',
'mode': None}\n2015-09-25 14:42:54,337 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml']
{'owner': 'custom-accumulo', 'template_tag': None, 'group':
'hadoop'}\n2015-09-25 14:42:54,341 -
File['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml']
{'content': Template('generic_logger.xml.j2'), 'owner': 'custom-accumulo',
'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,342 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml']
{'owner': 'custom-accumulo', 'template_tag': None, 'group':
'hadoop'}\n2015-09-25 14:42:54,344 -
File['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml']
{'content': Template('monitor_logger.xml.j2'), 'owner': 'custom-accumulo',
'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,345 -
File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-metrics.xml']
{'content': StaticFile('accumulo-metrics.xml'), 'owner': 'custom-accumulo',
'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,346 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/tracers']
{'owner': 'custom-accumulo', 'template_tag': None, 'group':
'hadoop'}\n2015-09-25 14:42:54,348 -
File['/usr/hdp/current/accumulo-tracer/conf/server/tracers'] {'content':
Template('tracers.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode':
None}\n2015-09-25 14:42:54,349 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/gc'] {'owner':
'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25
14:42:54,351 - File['/usr/hdp/current/accumulo-tracer/conf/server/gc']
{'content': Template('gc.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop',
'mode': None}\n2015-09-25 14:42:54,352 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor']
{'owner': 'custom-accumulo', 'template_tag': None, 'group':
'hadoop'}\n2015-09-25 14:42:54,354 -
File['/usr/hdp/current/accumulo-tracer/conf/server/monitor'] {'content':
Template('monitor.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode':
None}\n2015-09-25 14:42:54,355 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/slaves'] {'owner':
'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25
14:42:54,357 - File['/usr/hdp/current/accumulo-tracer/conf/server/slaves']
{'content': Template('slaves.j2'), 'owner': 'custom-accumulo', 'group':
'hadoop', 'mode': None}\n2015-09-25 14:42:54,357 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/masters']
{'owner': 'custom-accumulo', 'template_tag': None, 'group':
'hadoop'}\n2015-09-25 14:42:54,359 -
File['/usr/hdp/current/accumulo-tracer/conf/server/masters'] {'content':
Template('masters.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode':
None}\n2015-09-25 14:42:54,360 -
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties']
{'owner': 'custom-accumulo', 'template_tag': None, 'group':
'hadoop'}\n2015-09-25 14:42:54,368 -
File['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties']
{'content': Template('hadoop-metrics2-accumulo.properties.j2'), 'owner':
'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,369 -
Execute['/usr/bin/kinit -kt /etc/security/keytabs/accumulo.headless.keytab
[email protected];
ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server
/usr/hdp/current/accumulo-client/bin/accumulo init --reset-security --user
[email protected] --password NA
>/grid/0/log/accumulo/accumulo-reset.out
2>/grid/0/log/accumulo/accumulo-reset.err'] {'not_if': 'ambari-sudo.sh su
custom-accumulo -l -s /bin/bash -c \\'/usr/bin/kinit -kt
/etc/security/keytabs/accumulo.headless.keytab [email protected];
ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server
/usr/hdp/current/accumulo-client/bin/accumulo shell -e \"userpermissions -u
[email protected]\" | grep System.CREATE_TABLE\\'', 'user':
'custom-accumulo'}",
{code}
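The stdout value above is a single escaped string that the mail client has also re-wrapped, so individual agent log entries are hard to pick out. The following is a minimal sketch for splitting such a paste back into one entry per timestamp. It assumes that every entry starts with a `YYYY-MM-DD HH:MM:SS,mmm - ` prefix, as in the log above. `split_entries` and `ENTRY_START` are hypothetical names used for illustration and are not part of Ambari.

```python
import re

# Timestamp prefix that starts each agent log entry,
# e.g. "2015-09-25 14:42:53,963 - " (format assumed from the log above).
ENTRY_START = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - ")


def split_entries(stdout_text):
    """Split a flattened agent stdout string into one string per log entry.

    Handles both literal backslash-n sequences (as they appear in the raw
    JSON string) and real line breaks added by mail wrapping. Any text
    before the first timestamp is dropped.
    """
    # Turn literal "\n" escapes into whitespace, then collapse all runs of
    # whitespace so that wrapped timestamps are rejoined.
    flattened = " ".join(stdout_text.replace("\\n", "\n").split())
    starts = [m.start() for m in ENTRY_START.finditer(flattened)]
    ends = starts[1:] + [len(flattened)]
    return [flattened[a:b].strip() for a, b in zip(starts, ends)]


def last_entry(stdout_text):
    """Return the final log entry, or None if no timestamp was found."""
    entries = split_entries(stdout_text)
    return entries[-1] if entries else None
```

Calling `last_entry` on the pasted stdout returns the last resource the agent logged before the 180-second timeout. In the log above that is the `Execute` resource that runs `kinit` followed by `accumulo init --reset-security`.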
The tserver log contains the following exceptions:
{code}
2015-09-25 14:29:38,821 [tserver.TabletServer] INFO : Started replication service on ambari-ooziehive-r1-2.novalocal:10002
2015-09-25 14:29:55,489 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
    at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
    at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
    ... 11 more
2015-09-25 14:30:01,812 [tserver.TabletServer] INFO : Loading tablet !0<;~
2015-09-25 14:30:01,894 [tserver.TabletServer] INFO : ambari-ooziehive-r1-2.novalocal:9997: got assignment from master: !0<;~
2015-09-25 14:30:02,833 [util.MetadataTableUtil] INFO : Scanning logging entries for !0<;~
2015-09-25 14:30:02,862 [util.MetadataTableUtil] INFO : Scanning metadata for logs used for tablet !0<;~
2015-09-25 14:30:02,924 [util.MetadataTableUtil] INFO : Returning logs [] for extent !0<;~
2015-09-25 14:30:34,637 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
    at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
    at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
    ... 11 more
{code}
Live cluster where the failure occurred (available for another 48 hours):
172.22.90.201 ambari-ooziehive-r1-5.novalocal ambari-ooziehive-r1-5
172.22.90.200 ambari-ooziehive-r1-2.novalocal ambari-ooziehive-r1-2
172.22.90.198 ambari-ooziehive-r1-3.novalocal ambari-ooziehive-r1-3
172.22.90.197 ambari-ooziehive-r1-4.novalocal ambari-ooziehive-r1-4
172.22.90.199 ambari-ooziehive-r1-1.novalocal ambari-ooziehive-r1-1
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)