[
https://issues.apache.org/jira/browse/KUDU-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Henke updated KUDU-2549:
------------------------------
Labels: trivial (was: )
> Kudu kinit renewal thread's exponential backoff may need an upper bound
> -----------------------------------------------------------------------
>
> Key: KUDU-2549
> URL: https://issues.apache.org/jira/browse/KUDU-2549
> Project: Kudu
> Issue Type: Improvement
> Components: security
> Affects Versions: 1.7.0
> Reporter: Michael Ho
> Priority: Critical
> Labels: trivial
>
> An Impala instance (which recently adopted the Kudu Kerberos implementation)
> ran into a temporary DNS outage. The user had configured Kerberos with a
> very short ticket lifetime (30 minutes). During the couple of hours in which
> DNS was down, the renewal thread quickly racked up many renewal failures,
> leading to a very long backoff time (eventually up to *5 hours*). Even after
> DNS recovered, the Impala process still failed to communicate with other
> nodes because of the expired TGT. In some cases the renewal thread did not
> wake up for more than 3 hours after DNS recovered.
> This makes for a rather bad user experience, so it may be worth considering
> a configurable upper bound on the exponential backoff applied when ticket
> renewal fails. At a minimum, it would help to log the backoff time to make
> the issue easier to diagnose.
> {noformat}
> W0822 23:15:35.960669 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0822 23:19:21.016465 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0822 23:25:48.059895 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0822 23:38:14.100435 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0822 23:59:26.152209 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0823 00:42:28.194363 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0823 01:58:41.240950 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0823 03:28:54.285295 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> W0823 08:42:57.335754 10964 init.cc:188] Kerberos reacquire error: : Runtime
> error: Reacquire error: unable to login from keytab: Cannot contact any KDC
> for realm '---redacted---'
> I0823 13:58:11.337008 10964 init.cc:283] Successfully reacquired a new
> kerberos TGT
> I0823 14:08:46.362918 10964 init.cc:283] Successfully reacquired a new
> kerberos TGT
> {noformat}
>
>
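For illustration only, the upper bound suggested in the description could be
sketched as below. This is not Kudu's code: the helper names CappedBackoff and
DescribeBackoff, the 60-second base delay, and the 30-minute cap used in the
note afterwards are all assumptions chosen to make the idea concrete. The
delay doubles after each consecutive failure and stops growing once it reaches
the cap, and a second helper builds the log message that would expose the
chosen delay.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Kudu): delay before the next renewal
// attempt after `num_failures` consecutive failures. The delay starts at
// `base`, doubles per failure, and never exceeds `max_backoff`.
inline std::chrono::seconds CappedBackoff(std::chrono::seconds base,
                                          std::chrono::seconds max_backoff,
                                          uint32_t num_failures) {
  assert(base.count() > 0);
  assert(max_backoff >= base);
  std::chrono::seconds delay = base;
  // The loop exits as soon as the cap is reached, so a very large failure
  // count costs only a handful of iterations.
  for (uint32_t i = 0; i < num_failures && delay < max_backoff; ++i) {
    // Compare against half the cap before doubling so the multiplication
    // can neither overflow nor overshoot the cap.
    delay = (delay > max_backoff / 2) ? max_backoff : delay * 2;
  }
  return delay;
}

// Hypothetical helper: text for a log line that reports the backoff, so an
// operator can see when the next attempt is due.
inline std::string DescribeBackoff(uint32_t num_failures,
                                   std::chrono::seconds delay) {
  std::ostringstream os;
  os << "Kerberos renewal failed " << num_failures
     << " time(s) in a row; next attempt in " << delay.count() << "s";
  return os.str();
}
```

With an assumed 30-minute cap matching the ticket lifetime in this report, the
thread would retry at least every 30 minutes however long the outage lasted,
rather than sleeping for hours after DNS came back. A real implementation
would likely add jitter and read the cap from a flag.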