[ 
https://issues.apache.org/jira/browse/KUDU-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated KUDU-2545:
-----------------------------
    Description: 
Impala recently switched over to using the Kerberos implementation in Kudu for 
kinit and periodically acquiring a new TGT.

A user ran into a situation in which {{krb5_get_init_creds_keytab()}} called 
from {{KinitContext::DoRenewal()}} failed because of an I/O failure on the 
credentials cache. Apparently, the credentials cache got deleted afterwards. 
It's unclear whether it was deleted by cleanup code in the Kerberos library or 
by something external to Kerberos. The error message of the I/O failure to 
store credentials suggests the credentials cache still existed when the 
failure occurred.

On the next call to {{KinitContext::DoRenewal()}}, the credentials cache was 
gone, so it failed right away on the call to {{krb5_cc_start_seq_get()}} at the 
beginning of the function. This failure kept happening until the user restarted 
the Impala service, at which point the credentials cache got recreated. It 
seems that {{KinitContext::DoRenewal()}} *cannot recover at all* once the 
credentials cache disappears. The code would be more robust if it handled 
this failure by falling back to {{KinitContext::Kinit()}} when the credentials 
cache is missing.
{noformat}
| W0815 10:04:01.132095 144773 init.cc:180] Kerberos reacquire error: : Runtime 
error: Reacquire error: unable to login from keytab: Failed to store 
credentials: Credentials cache I/O operation failed (filename: 
/tmp/krb5cc_impala_internal)
 | W0815 10:05:37.133210 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:10:01.133746 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:18:26.134222 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:41:30.134889 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:59:01.135460 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 12:24:18.135974 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 13:49:35.136660 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
{noformat}
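
The fallback suggested in the description could look roughly like the sketch below. It is only a self-contained model, not Kudu's actual code: {{FakeCcache}}, {{Result}}, {{KinitContextSketch}} and {{PeekIntoCcache()}} are hypothetical names, and the credentials cache is simulated with two flags instead of real libkrb5 calls. In the real function the "cache is missing" case would presumably be detected from the error returned by {{krb5_cc_start_seq_get()}} (likely {{KRB5_FCC_NOFILE}}, which matches the "No credentials cache found" message in the log).

```cpp
#include <cassert>

// Hypothetical stand-in for the credentials cache file
// (for example /tmp/krb5cc_impala_internal in the log above).
struct FakeCcache {
  bool exists = false;   // whether the cache file is present
  bool has_tgt = false;  // whether it currently holds a TGT
};

enum class Result { kOk, kCcacheMissing };

// Simplified model of the renewal logic. Assumption: this is NOT the
// real KinitContext, only an illustration of the proposed fallback.
class KinitContextSketch {
 public:
  explicit KinitContextSketch(FakeCcache* cc) : cc_(cc) {
    assert(cc_ != nullptr);
  }

  // Initial login: (re)creates the cache and stores a fresh TGT.
  Result Kinit() {
    cc_->exists = true;
    cc_->has_tgt = true;
    ++kinit_calls_;
    return Result::kOk;
  }

  // Periodic renewal with the proposed recovery path.
  Result DoRenewal() {
    if (PeekIntoCcache() == Result::kCcacheMissing) {
      // The cache disappeared: instead of failing on every call,
      // fall back to a full Kinit(), which recreates the cache.
      return Kinit();
    }
    // Normal path: reacquire the TGT and store it in the existing cache.
    cc_->has_tgt = true;
    return Result::kOk;
  }

  int kinit_calls() const { return kinit_calls_; }

 private:
  // Models the check done at the beginning of DoRenewal().
  Result PeekIntoCcache() const {
    return cc_->exists ? Result::kOk : Result::kCcacheMissing;
  }

  FakeCcache* cc_;
  int kinit_calls_ = 0;
};
```

With this shape, a deleted cache costs one extra {{Kinit()}} on the next renewal rather than a failure that persists until the service is restarted. A real implementation would presumably fall back only on the "cache not found" error and keep reporting every other failure as it does today.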

  was:
Impala recently switched over to using the Kerberos implementation in Kudu for 
kinit and periodically acquiring a new TGT.

A user ran into a situation in which {{krb5_get_init_creds_keytab()}} called 
from {{KinitContext::DoRenewal()}} failed because of an I/O failure on the 
credentials cache. Apparently, the credentials cache got deleted afterwards. 
It's unclear whether it was deleted by cleanup code in the Kerberos library or 
by something external to Kerberos. The I/O failure to store credentials 
suggests the credentials cache still existed when the failure occurred.

On the next call to {{KinitContext::DoRenewal()}}, the credentials cache was 
gone, so it failed right away on the call to {{krb5_cc_start_seq_get()}} at the 
beginning of the function. This failure kept happening until the user restarted 
the Impala service, at which point the credentials cache got recreated. It 
seems that {{KinitContext::DoRenewal()}} *cannot recover at all* once the 
credentials cache disappears. The code would be more robust if it handled 
this failure by falling back to {{KinitContext::Kinit()}} when the credentials 
cache is missing.

{noformat}
| W0815 10:04:01.132095 144773 init.cc:180] Kerberos reacquire error: : Runtime 
error: Reacquire error: unable to login from keytab: Failed to store 
credentials: Credentials cache I/O operation failed (filename: 
/tmp/krb5cc_impala_internal)
 | W0815 10:05:37.133210 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:10:01.133746 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:18:26.134222 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:41:30.134889 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 10:59:01.135460 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 12:24:18.135974 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
 | W0815 13:49:35.136660 144773 init.cc:180] Kerberos reacquire error: : 
Runtime error: Failed to peek into ccache: No credentials cache found 
(filename: /tmp/krb5cc_impala_internal)
{noformat}


> KinitContext::DoRenewal() is unable to recover after credentials cache gets 
> deleted
> -----------------------------------------------------------------------------------
>
>                 Key: KUDU-2545
>                 URL: https://issues.apache.org/jira/browse/KUDU-2545
>             Project: Kudu
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0
>            Reporter: Michael Ho
>            Priority: Critical
>
> Impala recently switched over to using the Kerberos implementation in Kudu 
> for kinit and periodically acquiring a new TGT.
> A user ran into a situation in which {{krb5_get_init_creds_keytab()}} called 
> from {{KinitContext::DoRenewal()}} failed because of an I/O failure on the 
> credentials cache. Apparently, the credentials cache got deleted afterwards. 
> It's unclear whether it was deleted by cleanup code in the Kerberos library 
> or by something external to Kerberos. The error message of the I/O failure 
> to store credentials suggests the credentials cache still existed when the 
> failure occurred.
> On the next call to {{KinitContext::DoRenewal()}}, the credentials cache was 
> gone, so it failed right away on the call to {{krb5_cc_start_seq_get()}} at 
> the beginning of the function. This failure kept happening until the user 
> restarted the Impala service, at which point the credentials cache got 
> recreated. It seems that {{KinitContext::DoRenewal()}} *cannot recover at 
> all* once the credentials cache disappears. The code would be more robust 
> if it handled this failure by falling back to {{KinitContext::Kinit()}} 
> when the credentials cache is missing.
> {noformat}
> | W0815 10:04:01.132095 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Reacquire error: unable to login from keytab: Failed to store 
> credentials: Credentials cache I/O operation failed (filename: 
> /tmp/krb5cc_impala_internal)
>  | W0815 10:05:37.133210 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Failed to peek into ccache: No credentials cache found 
> (filename: /tmp/krb5cc_impala_internal)
>  | W0815 10:10:01.133746 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Failed to peek into ccache: No credentials cache found 
> (filename: /tmp/krb5cc_impala_internal)
>  | W0815 10:18:26.134222 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Failed to peek into ccache: No credentials cache found 
> (filename: /tmp/krb5cc_impala_internal)
>  | W0815 10:41:30.134889 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Failed to peek into ccache: No credentials cache found 
> (filename: /tmp/krb5cc_impala_internal)
>  | W0815 10:59:01.135460 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Failed to peek into ccache: No credentials cache found 
> (filename: /tmp/krb5cc_impala_internal)
>  | W0815 12:24:18.135974 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Failed to peek into ccache: No credentials cache found 
> (filename: /tmp/krb5cc_impala_internal)
>  | W0815 13:49:35.136660 144773 init.cc:180] Kerberos reacquire error: : 
> Runtime error: Failed to peek into ccache: No credentials cache found 
> (filename: /tmp/krb5cc_impala_internal)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
