[
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16549472#comment-16549472
]
Eric Yang commented on HADOOP-15593:
------------------------------------
[~gabor.bota] I know you are trying to retain the existing behavior, but I think
there are bugs in the existing logic. The calculation of nextRefresh is based on:
{code}
nextRefresh = Math.max(getRefreshTime(tgt),
now + kerberosMinSecondsBeforeRelogin);
{code}
Most of the time nextRefresh = getRefreshTime(tgt). If renewal happens exactly
at refreshTime while parallel operations are still using the expired ticket,
there is a time gap during which some operations might not succeed until the
next tgt is obtained. Ideally, we want to keep the service uninterrupted, so
getNextTgtRenewalTime is supposed to return a time a few minutes before the
Kerberos tgt expires, and that time should determine nextRefresh. It looks like
we are not using getNextTgtRenewalTime to calculate nextRefresh and instead use
the ticket expiration time as the baseline. I think the patch 2 approach can
create a time gap and then put strain on the KDC server when the ticket cannot
be renewed. It would be better to calculate nextRefresh based on
getNextTgtRenewalTime.
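To illustrate the difference, here is a minimal sketch, not the actual
UserGroupInformation code. It assumes, as described above, that the current
base time is the ticket expiry. The class name, the method names and the
five-minute margin are all hypothetical and chosen only for illustration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names exist in Hadoop.
final class NextRefreshSketch {
    // Assumed safety margin before ticket expiry (illustrative value only).
    static final long RENEW_MARGIN_MS = TimeUnit.MINUTES.toMillis(5);

    // Baseline is the ticket end time: renewal can start no earlier than expiry.
    static long nextRefreshAtExpiry(long tgtEndMs, long nowMs, long minBeforeReloginMs) {
        return Math.max(tgtEndMs, nowMs + minBeforeReloginMs);
    }

    // Baseline is a renewal time shortly before expiry, still bounded below
    // by the minimum wait between relogin attempts.
    static long nextRefreshBeforeExpiry(long tgtEndMs, long nowMs, long minBeforeReloginMs) {
        long renewalTimeMs = tgtEndMs - RENEW_MARGIN_MS;
        return Math.max(renewalTimeMs, nowMs + minBeforeReloginMs);
    }

    public static void main(String[] args) {
        long now = 0L;
        long tgtEnd = TimeUnit.HOURS.toMillis(1);      // ticket expires in one hour
        long minWait = TimeUnit.SECONDS.toMillis(60);  // minimum wait before relogin

        System.out.println("at expiry:     " + nextRefreshAtExpiry(tgtEnd, now, minWait));
        System.out.println("before expiry: " + nextRefreshBeforeExpiry(tgtEnd, now, minWait));
    }
}
```

With the second variant the renewer wakes up while the ticket is still valid,
so parallel operations are less likely to hit an expired tgt. The Math.max
floor is kept so that retries do not hit the KDC more often than the configured
minimum interval.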
> UserGroupInformation TGT renewer throws NPE
> -------------------------------------------
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
> Issue Type: Bug
> Components: security
> Affects Versions: 3.0.0
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Critical
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within
> an exception handler so the original exception was hidden, though it's likely
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[TGT Renewer for [email protected],5,main]
> java.lang.NullPointerException
> at javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the
> exception better.