[
https://issues.apache.org/jira/browse/HADOOP-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Bota updated HADOOP-15622:
--------------------------------
Description:
The calculation of nextRefresh in
UserGroupInformation#spawnAutoRenewalThreadForUserCreds is currently based on:
{code:java}
nextRefresh = Math.max(getRefreshTime(tgt),
    now + kerberosMinSecondsBeforeRelogin);
{code}
Most of the time, nextRefresh equals getRefreshTime(tgt). If renewal happens
exactly at refreshTime while parallel operations are still using the expired
ticket, there is a time gap during which some operations might not proceed
until the next TGT is obtained. Ideally, we want to keep the service
uninterrupted, so getNextTgtRenewalTime is supposed to calculate a time a few
minutes before the Kerberos TGT expires and use it as the nextRefresh time. It
looks like we are not using the getNextTgtRenewalTime method to calculate
nextRefresh and instead opt to use the ticket expiration time as the baseline
for nextRefresh. I think the patch 2 approach can create a time gap and then
put strain on the KDC server when the ticket cannot be renewed. It would be
better to calculate nextRefresh based on getNextTgtRenewalTime.
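To illustrate the concern, here is a minimal, self-contained sketch, not the
actual UserGroupInformation code. The class name, the method names
currentNextRefresh and marginNextRefresh, the retry delay, and all timestamps
are assumptions made up for this example. It compares the Math.max formula
quoted above with a hypothetical calculation that caps the next refresh at a
safety margin before ticket expiry.

```java
class NextRefreshSketch {
    // Formula quoted in the description: the later of the refresh time and
    // "now plus the minimum relogin interval".
    static long currentNextRefresh(long refreshTime, long now,
                                   long minBeforeReloginMs) {
        return Math.max(refreshTime, now + minBeforeReloginMs);
    }

    // Hypothetical alternative (an assumption, not the real
    // getNextTgtRenewalTime): never schedule later than a safety margin
    // before the ticket end time.
    static long marginNextRefresh(long tgtEndTime, long now,
                                  long retryDelayMs, long marginMs) {
        return Math.min(tgtEndTime - marginMs, now + retryDelayMs);
    }

    public static void main(String[] args) {
        long tgtEnd = 600_000L;          // ticket expires at t = 10 min
        long refreshTime = 480_000L;     // assumed refresh time for this ticket
        long now = 550_000L;             // e.g. after an earlier failed renewal
        long minBeforeRelogin = 60_000L; // assumed 60 s minimum interval
        long retryDelay = 30_000L;       // assumed 30 s retry delay

        long current = currentNextRefresh(refreshTime, now, minBeforeRelogin);
        long capped = marginNextRefresh(tgtEnd, now, retryDelay,
            minBeforeRelogin);

        System.out.println("current formula:  " + current
            + " (after expiry: " + (current > tgtEnd) + ")");
        System.out.println("margin-based cap: " + capped
            + " (after expiry: " + (capped > tgtEnd) + ")");
    }
}
```

With these assumed numbers the Math.max formula schedules the next attempt
after the ticket has already expired, while the capped variant returns a time
at or before now, which a caller would treat as "retry immediately".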
was:
The calculation of nextRefresh in UserGroupInformation is currently based on:
{code:java}
nextRefresh = Math.max(getRefreshTime(tgt),
    now + kerberosMinSecondsBeforeRelogin);
{code}
Most of the time, nextRefresh equals getRefreshTime(tgt). If renewal happens
exactly at refreshTime while parallel operations are still using the expired
ticket, there is a time gap during which some operations might not proceed
until the next TGT is obtained. Ideally, we want to keep the service
uninterrupted, so getNextTgtRenewalTime is supposed to calculate a time a few
minutes before the Kerberos TGT expires and use it as the nextRefresh time. It
looks like we are not using the getNextTgtRenewalTime method to calculate
nextRefresh and instead opt to use the ticket expiration time as the baseline
for nextRefresh. I think the patch 2 approach can create a time gap and then
put strain on the KDC server when the ticket cannot be renewed. It would be
better to calculate nextRefresh based on getNextTgtRenewalTime.
> UserGroupInformation TGT renewer refreshTime should be based on
> getNextTgtRenewalTime
> -------------------------------------------------------------------------------------
>
> Key: HADOOP-15622
> URL: https://issues.apache.org/jira/browse/HADOOP-15622
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Gabor Bota
> Priority: Major