Niels Basjes created PIG-4796:
---------------------------------

             Summary: Authenticate with Kerberos using a keytab file
                 Key: PIG-4796
                 URL: https://issues.apache.org/jira/browse/PIG-4796
             Project: Pig
          Issue Type: New Feature
            Reporter: Niels Basjes


When running in a Kerberos secured environment users are faced with the 
limitation that their jobs cannot run longer than the (remaining) ticket 
lifetime of their Kerberos tickets. The environment I work in these tickets 
expire after 10 hours, thus limiting the maximum job duration to at most 10 
hours (which is a problem).

In the Hadoop tooling there is a feature where you can authenticate using a 
Kerberos keytab file (essentially a file that contains the encrypted form of 
the kerberos principal and password). Using this the running application can 
request new tickets from the Kerberos server when the initial tickets expire.

In my Java/Hadoop applications I commonly include these two lines:
{code}
System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
UserGroupInformation.loginUserFromKeytab("[email protected]", 
"/home/nbasjes/.krb/nbasjes.keytab");
{code}

This way I have run an Apache Flink based application for more than 170 hours 
(about a week) on the kerberos secured Yarn cluster.

What I propose is to have a feature that I can set the relevant kerberos values 
in my pig script and from there be able to run a pig job for many days on the 
secured cluster.

Proposal how this can look in a pig script:
{code}
SET java.security.krb5.conf '/etc/krb5.conf'
SET job.security.krb5.principal '[email protected]'
SET job.security.krb5.keytab '/home/nbasjes/.krb/nbasjes.keytab'
{code}
So iff all of these are set (or at least the last two) then the aforementioned  
UserGroupInformation.loginUserFromKeytab method is called before submitting the 
job to the cluster.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to