-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31483/
-----------------------------------------------------------
Review request for Ambari, Andrew Onischuk, Emil Anca, Jonathan Hurley, and
Vitalyi Brodetskyi.
Bugs: AMBARI-9785
https://issues.apache.org/jira/browse/AMBARI-9785
Repository: ambari
Description
-------
After enabling Kerberos, the root user's ticket cache ends up holding the
SPNEGO identity's ticket:
```
[root@c6501 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: HTTP/[email protected]
Valid starting     Expires            Service principal
02/18/15 22:14:51  02/19/15 22:14:51  krbtgt/[email protected]
        renew until 02/18/15 22:14:51
```
It appears that the issue is related to the agent-side scheduler and/or some
job that is scheduled to run periodically. Apparently some job is kinit-ing
with the SPNEGO identity as the running user (root in this case) without
changing the ticket cache. Thus whenever the job runs the root user's ticket
cache gets changed to contain the SPNEGO identity's ticket.
While investigating and solving the issue, it was found that other credentials
were also being written to this cache during background processing,
overwriting what was there.
Most of the issues were related to _alert_ checking on web-based UI endpoints,
where the environment is configured so that curl can use Kerberos
authentication. Another occurrence (in Oozie) was caused by a failure to run a
command as the local `oozie` user.
Solving this includes using an alternate credential cache when kinit-ing. While
at it, the cache is checked to see whether its tickets are present and
unexpired before kinit-ing.
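For reviewers who want the idea in one place, here is a minimal sketch of the
approach described above. It is not the code in this patch: the function name
`ensure_ticket`, its parameters, and the injectable `run` argument are
hypothetical and exist only for illustration. It assumes the MIT Kerberos
`klist` and `kinit` tools, where `klist -s` exits non-zero if the cache is
missing or expired and `-c` selects a specific cache file.

```python
import os
import subprocess


def ensure_ticket(principal, keytab, cache_path, run=subprocess.call,
                  klist="klist", kinit="kinit"):
    """Ensure cache_path holds a valid ticket, running kinit only if needed.

    Hypothetical helper, not part of Ambari. Returns an environment dict
    that points Kerberos-aware child processes (e.g. curl) at cache_path
    instead of the running user's default cache.
    """
    # KRB5CCNAME redirects Kerberos clients to the private cache, so the
    # running user's default cache (e.g. /tmp/krb5cc_0) is never touched.
    env = dict(os.environ, KRB5CCNAME=cache_path)

    # `klist -s` is silent and reports validity through its exit status.
    if run([klist, "-s", "-c", cache_path], env=env) != 0:
        # No usable ticket: obtain one from the keytab into the private cache.
        rc = run([kinit, "-c", cache_path, "-k", "-t", keytab, principal],
                 env=env)
        if rc != 0:
            raise RuntimeError("kinit failed with exit code %d" % rc)
    return env
```

The returned environment can then be passed to whatever subprocess performs the
web check, so that only the private cache is read or written. The `run`
parameter is there purely so the logic can be exercised without a KDC.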
Diffs
-----
ambari-agent/src/main/python/ambari_agent/alerts/web_alert.py 8ee6606
ambari-common/src/main/python/resource_management/libraries/functions/__init__.py
44d235c
ambari-common/src/main/python/resource_management/libraries/functions/get_klist_path.py
PRE-CREATION
ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/alerts/alert_webhcat_server.py
970ddde
ambari-server/src/main/resources/common-services/OOZIE/4.0.0.2.0/package/alerts/alert_check_oozie_server.py
a5a066b
ambari-server/src/main/resources/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_service.py
092149d
ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/OOZIE/package/files/alert_check_oozie_server.py
a5a066b
ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/WEBHCAT/package/files/alert_webhcat_server.py
970ddde
ambari-server/src/test/python/stacks/2.0.6/OOZIE/test_oozie_server.py 45e9dc4
Diff: https://reviews.apache.org/r/31483/diff/
Testing
-------
Manually tested all services in a test cluster to see which might have this
issue. Only OOZIE and HIVE were found to be affected, and tests showed both are
fixed and working as they should.
#Jenkins Test Results
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:12 h
[INFO] Finished at: 2015-02-26T06:35:45+00:00
[INFO] Final Memory: 44M/457M
[INFO] ------------------------------------------------------------------------
Thanks,
Robert Levas