Dmitry Lysnichenko created AMBARI-13294:
-------------------------------------------

             Summary: ACCUMULO_TRACER START failed after enabling Kerberos
                 Key: AMBARI-13294
                 URL: https://issues.apache.org/jira/browse/AMBARI-13294
             Project: Ambari
          Issue Type: Bug
            Reporter: Dmitry Lysnichenko
            Assignee: Dmitry Lysnichenko


After Kerberos was enabled, the ACCUMULO_TRACER START command failed at the 
"Start and Test Services" step with a timeout.

{code}
"stderr" : "Python script has been killed due to timeout after waiting 180 
secs",
{code}

{code}
"stdout" : "2015-09-25 14:42:53,963 - Group['custom-spark'] {}\n2015-09-25 
14:42:53,964 - Group['hadoop'] {}\n2015-09-25 14:42:53,965 - 
Group['custom-users'] {}\n2015-09-25 14:42:53,965 - Group['custom-knox-group'] 
{}\n2015-09-25 14:42:53,965 - User['custom-sqoop'] {'gid': 'hadoop', 'groups': 
[u'hadoop']}\n2015-09-25 14:42:53,966 - User['custom-knox'] {'gid': 'hadoop', 
'groups': [u'hadoop']}\n2015-09-25 14:42:53,967 - User['custom-hdfs'] {'gid': 
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,968 - 
User['custom-oozie'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 
14:42:53,969 - User['custom-smoke'] {'gid': 'hadoop', 'groups': 
[u'custom-users']}\n2015-09-25 14:42:53,970 - User['custom-hbase'] {'gid': 
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,971 - User['custom-tez'] 
{'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 14:42:53,972 - 
User['custom-hive'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 
14:42:53,973 - User['custom-mr'] {'gid': 'hadoop', 'groups': 
[u'hadoop']}\n2015-09-25 14:42:53,973 - User['custom-accumulo'] {'gid': 
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,974 - User['custom-hcat'] 
{'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,975 - 
User['custom-ams'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 
14:42:53,976 - User['custom-yarn'] {'gid': 'hadoop', 'groups': 
[u'hadoop']}\n2015-09-25 14:42:53,977 - User['custom-falcon'] {'gid': 'hadoop', 
'groups': [u'custom-users']}\n2015-09-25 14:42:53,977 - User['custom-spark'] 
{'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,978 - 
User['custom-atlas'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 
14:42:53,979 - User['custom-flume'] {'gid': 'hadoop', 'groups': 
[u'hadoop']}\n2015-09-25 14:42:53,980 - User['custom-kafka'] {'gid': 'hadoop', 
'groups': [u'hadoop']}\n2015-09-25 14:42:53,981 - User['custom-zookeeper'] 
{'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,982 - 
User['custom-mahout'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 
14:42:53,982 - User['custom-storm'] {'gid': 'hadoop', 'groups': 
[u'hadoop']}\n2015-09-25 14:42:53,983 - 
File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': 
StaticFile('changeToSecureUid.sh'), 'mode': 0555}\n2015-09-25 14:42:53,985 - 
Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-smoke 
/tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke']
 {'not_if': '(test $(id -u custom-smoke) -gt 1000) || (false)'}\n2015-09-25 
14:42:53,991 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh 
custom-smoke 
/tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke']
 due to not_if\n2015-09-25 14:42:53,991 - Directory['/tmp/hbase-hbase'] 
{'owner': 'custom-hbase', 'recursive': True, 'mode': 0775, 'cd_access': 
'a'}\n2015-09-25 14:42:53,992 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] 
{'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}\n2015-09-25 
14:42:53,993 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-hbase 
/home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase']
 {'not_if': '(test $(id -u custom-hbase) -gt 1000) || (false)'}\n2015-09-25 
14:42:53,999 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh 
custom-hbase 
/home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase']
 due to not_if\n2015-09-25 14:42:54,000 - Group['custom-hdfs'] 
{'ignore_failures': False}\n2015-09-25 14:42:54,000 - User['custom-hdfs'] 
{'ignore_failures': False, 'groups': [u'hadoop', u'custom-hdfs']}\n2015-09-25 
14:42:54,001 - Directory['/etc/hadoop'] {'mode': 0755}\n2015-09-25 14:42:54,019 
- File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': 
InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}\n2015-09-25 
14:42:54,019 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] 
{'owner': 'custom-hdfs', 'group': 'hadoop', 'mode': 0777}\n2015-09-25 
14:42:54,032 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) 
|| (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 
'only_if': 'test -f /selinux/enforce'}\n2015-09-25 14:42:54,039 - Skipping 
Execute[('setenforce', '0')] due to not_if\n2015-09-25 14:42:54,040 - 
Directory['/grid/0/log/hadoop'] {'owner': 'root', 'mode': 0775, 'group': 
'hadoop', 'recursive': True, 'cd_access': 'a'}\n2015-09-25 14:42:54,043 - 
Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': 
True, 'cd_access': 'a'}\n2015-09-25 14:42:54,043 - 
Directory['/tmp/hadoop-custom-hdfs'] {'owner': 'custom-hdfs', 'recursive': 
True, 'cd_access': 'a'}\n2015-09-25 14:42:54,048 - 
File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] 
{'content': Template('commons-logging.properties.j2'), 'owner': 
'root'}\n2015-09-25 14:42:54,051 - 
File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': 
Template('health_check.j2'), 'owner': 'root'}\n2015-09-25 14:42:54,051 - 
File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 
'owner': 'custom-hdfs', 'group': 'hadoop', 'mode': 0644}\n2015-09-25 
14:42:54,074 - 
File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] 
{'content': Template('hadoop-metrics2.properties.j2'), 'owner': 
'custom-hdfs'}\n2015-09-25 14:42:54,075 - 
File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': 
StaticFile('task-log4j.properties'), 'mode': 0755}\n2015-09-25 14:42:54,076 - 
File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 
'custom-hdfs', 'group': 'hadoop'}\n2015-09-25 14:42:54,083 - 
File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'custom-hdfs', 
'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d 
/etc/hadoop/conf', 'group': 'hadoop'}\n2015-09-25 14:42:54,089 - 
File['/etc/hadoop/conf/topology_script.py'] {'content': 
StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 
'mode': 0755}\n2015-09-25 14:42:54,275 - 
Directory['/usr/hdp/current/accumulo-tracer/conf'] {'owner': 'custom-accumulo', 
'group': 'hadoop', 'recursive': True, 'mode': 0755}\n2015-09-25 14:42:54,277 - 
Directory['/usr/hdp/current/accumulo-tracer/conf/server'] {'owner': 
'custom-accumulo', 'group': 'hadoop', 'recursive': True, 'mode': 
0700}\n2015-09-25 14:42:54,278 - XmlConfig['accumulo-site.xml'] {'group': 
'hadoop', 'conf_dir': '/usr/hdp/current/accumulo-tracer/conf/server', 'mode': 
0600, 'configuration_attributes': {}, 'owner': 'custom-accumulo', 
'configurations': ...}\n2015-09-25 14:42:54,292 - Generating config: 
/usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml\n2015-09-25 
14:42:54,293 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml'] 
{'owner': 'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop', 
'mode': 0600, 'encoding': 'UTF-8'}\n2015-09-25 14:42:54,317 - 
Directory['/var/run/accumulo'] {'owner': 'custom-accumulo', 'group': 'hadoop', 
'recursive': True}\n2015-09-25 14:42:54,318 - Directory['/grid/0/log/accumulo'] 
{'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True}\n2015-09-25 
14:42:54,323 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-env.sh'] 
{'content': InlineTemplate(...), 'owner': 'custom-accumulo', 'group': 'hadoop', 
'mode': 0644}\n2015-09-25 14:42:54,324 - 
PropertiesFile['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] 
{'owner': 'custom-accumulo', 'group': 'hadoop', 'properties': 
{'instance.zookeeper.host': 
u'ambari-ooziehive-r1-2.novalocal:2181,ambari-ooziehive-r1-3.novalocal:2181,ambari-ooziehive-r1-5.novalocal:2181',
 'instance.name': u'hdp-accumulo-instance', 'instance.rpc.sasl.enabled': True, 
'instance.zookeeper.timeout': u'30s'}}\n2015-09-25 14:42:54,329 - Generating 
properties file: 
/usr/hdp/current/accumulo-tracer/conf/server/client.conf\n2015-09-25 
14:42:54,329 - File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] 
{'owner': 'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop', 
'mode': None}\n2015-09-25 14:42:54,332 - Writing 
File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] because 
contents don't match\n2015-09-25 14:42:54,333 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/log4j.properties'] 
{'content': ..., 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 
0644}\n2015-09-25 14:42:54,333 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml'] 
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 
'hadoop'}\n2015-09-25 14:42:54,337 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml'] {'content': 
Template('auditLog.xml.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 
'mode': None}\n2015-09-25 14:42:54,337 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml']
 {'owner': 'custom-accumulo', 'template_tag': None, 'group': 
'hadoop'}\n2015-09-25 14:42:54,341 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml'] 
{'content': Template('generic_logger.xml.j2'), 'owner': 'custom-accumulo', 
'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,342 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml']
 {'owner': 'custom-accumulo', 'template_tag': None, 'group': 
'hadoop'}\n2015-09-25 14:42:54,344 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml'] 
{'content': Template('monitor_logger.xml.j2'), 'owner': 'custom-accumulo', 
'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,345 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-metrics.xml'] 
{'content': StaticFile('accumulo-metrics.xml'), 'owner': 'custom-accumulo', 
'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,346 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/tracers'] 
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 
'hadoop'}\n2015-09-25 14:42:54,348 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/tracers'] {'content': 
Template('tracers.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 
None}\n2015-09-25 14:42:54,349 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/gc'] {'owner': 
'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 
14:42:54,351 - File['/usr/hdp/current/accumulo-tracer/conf/server/gc'] 
{'content': Template('gc.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 
'mode': None}\n2015-09-25 14:42:54,352 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor'] 
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 
'hadoop'}\n2015-09-25 14:42:54,354 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/monitor'] {'content': 
Template('monitor.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 
None}\n2015-09-25 14:42:54,355 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/slaves'] {'owner': 
'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 
14:42:54,357 - File['/usr/hdp/current/accumulo-tracer/conf/server/slaves'] 
{'content': Template('slaves.j2'), 'owner': 'custom-accumulo', 'group': 
'hadoop', 'mode': None}\n2015-09-25 14:42:54,357 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/masters'] 
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 
'hadoop'}\n2015-09-25 14:42:54,359 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/masters'] {'content': 
Template('masters.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 
None}\n2015-09-25 14:42:54,360 - 
TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties']
 {'owner': 'custom-accumulo', 'template_tag': None, 'group': 
'hadoop'}\n2015-09-25 14:42:54,368 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties']
 {'content': Template('hadoop-metrics2-accumulo.properties.j2'), 'owner': 
'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,369 - 
Execute['/usr/bin/kinit -kt /etc/security/keytabs/accumulo.headless.keytab 
[email protected]; 
ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server 
/usr/hdp/current/accumulo-client/bin/accumulo init --reset-security --user 
[email protected] --password NA 
>/grid/0/log/accumulo/accumulo-reset.out 
2>/grid/0/log/accumulo/accumulo-reset.err'] {'not_if': 'ambari-sudo.sh su 
custom-accumulo -l -s /bin/bash -c \\'/usr/bin/kinit -kt 
/etc/security/keytabs/accumulo.headless.keytab [email protected]; 
ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server 
/usr/hdp/current/accumulo-client/bin/accumulo shell -e \"userpermissions -u 
[email protected]\" | grep System.CREATE_TABLE\\'', 'user': 
'custom-accumulo'}",
{code}
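
The stdout value above is a single JSON string with escaped newlines, further hard-wrapped by the mail client, which makes it hard to read. Below is a minimal Python sketch for splitting such a blob into one entry per timestamp. The function name split_agent_log and the timestamp pattern are illustrative assumptions based only on the output quoted here; they are not part of Ambari.

```python
import re

# Timestamp prefix as it appears in the quoted output,
# e.g. "2015-09-25 14:42:53,963 - " (assumed format, taken from this report).
ENTRY_START = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - ")


def split_agent_log(raw):
    """Return a list of (timestamp, message) pairs from a stdout blob."""
    # Replace literal backslash-n sequences with spaces, then collapse all
    # whitespace (including hard wraps added by the mail client).
    text = raw.replace("\\n", " ")
    text = re.sub(r"\s+", " ", text).strip()

    matches = list(ENTRY_START.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        # An entry runs until the next timestamp, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((match.group(1), text[match.end():end].strip()))
    return entries
```

In the output above, the last entry is the Execute resource that runs kinit followed by "accumulo init --reset-security", and no later entry appears before the 180-second timeout is reported.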

The tserver log contains the following exceptions:
{code}
2015-09-25 14:29:38,821 [tserver.TabletServer] INFO : Started replication 
service on ambari-ooziehive-r1-2.novalocal:10002
2015-09-25 14:29:55,489 [server.TThreadPoolServer] ERROR: Error occurred during 
processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
        at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
        at 
org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
        at 
org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
        at 
org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at 
org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException
        at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at 
org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
        at 
org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
        at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
        at 
org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
        at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
        ... 11 more
2015-09-25 14:30:01,812 [tserver.TabletServer] INFO : Loading tablet !0<;~
2015-09-25 14:30:01,894 [tserver.TabletServer] INFO : 
ambari-ooziehive-r1-2.novalocal:9997: got assignment from master: !0<;~
2015-09-25 14:30:02,833 [util.MetadataTableUtil] INFO : Scanning logging 
entries for !0<;~
2015-09-25 14:30:02,862 [util.MetadataTableUtil] INFO : Scanning metadata for 
logs used for tablet !0<;~
2015-09-25 14:30:02,924 [util.MetadataTableUtil] INFO : Returning logs [] for 
extent !0<;~
2015-09-25 14:30:34,637 [server.TThreadPoolServer] ERROR: Error occurred during 
processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: 
Peer indicated failure: GSS initiate failed
        at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
        at 
org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
        at 
org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
        at 
org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at 
org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated 
failure: GSS initiate failed
        at 
org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
        at 
org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
        at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
        at 
org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
        at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
        ... 11 more
{code}
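
To pick the ERROR entries out of a longer tserver log, the following Python sketch groups lines into entries and reports the last "Caused by" message of each ERROR entry. It assumes the log layout quoted above (timestamp, bracketed logger name, level), and it treats " at " as the end of the cause text, so a message that itself contains " at " would be cut short. The function name summarize_errors is hypothetical.

```python
import re

# An entry starts with a timestamp followed by a bracketed logger name,
# as in "2015-09-25 14:29:55,489 [server.TThreadPoolServer] ERROR: ...".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[")
# Cause text runs up to the first stack frame (" at ") or the end of the entry.
CAUSED_BY = re.compile(r"Caused by: (.*?)(?= at |$)")


def summarize_errors(log_text):
    """Return (timestamp, root cause) pairs for every ERROR entry."""
    entries = []
    for line in log_text.splitlines():
        if ENTRY_START.match(line):
            entries.append(line.strip())
        elif entries:
            # Continuation line (wrapped text or stack frame): join with a space.
            entries[-1] += " " + line.strip()

    summary = []
    for entry in entries:
        if "] ERROR:" not in entry:
            continue
        timestamp = entry[:23]  # "YYYY-MM-DD HH:MM:SS,mmm" is 23 characters
        causes = CAUSED_BY.findall(entry)
        root = causes[-1].strip() if causes else "(no cause recorded)"
        summary.append((timestamp, root))
    return summary
```

For the excerpt above this would list two ERROR entries, the second with the cause "Peer indicated failure: GSS initiate failed".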

Live cluster where the failure occurred (available for another 48 hours):
172.22.90.201   ambari-ooziehive-r1-5.novalocal ambari-ooziehive-r1-5
172.22.90.200   ambari-ooziehive-r1-2.novalocal ambari-ooziehive-r1-2
172.22.90.198   ambari-ooziehive-r1-3.novalocal ambari-ooziehive-r1-3
172.22.90.197   ambari-ooziehive-r1-4.novalocal ambari-ooziehive-r1-4
172.22.90.199   ambari-ooziehive-r1-1.novalocal ambari-ooziehive-r1-1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
