[ https://issues.apache.org/jira/browse/HADOOP-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Bota updated HADOOP-15622:
--------------------------------
    Description:
The calculation of nextRefresh in UserGroupInformation#spawnAutoRenewalThreadForUserCreds is currently based on:
{code:java}
nextRefresh = Math.max(getRefreshTime(tgt),
  now + kerberosMinSecondsBeforeRelogin);
{code}
Most of the time nextRefresh = getRefreshTime(tgt). If the renewal happens exactly at refreshTime while parallel operations are still using the expired ticket, there is a time gap during which some operations might not succeed until the next TGT is obtained. Ideally, we want to keep the service uninterrupted, so getNextTgtRenewalTime is supposed to calculate a time a few minutes before the Kerberos TGT expires, and that time should determine nextRefresh.
It looks like we are not using the getNextTgtRenewalTime method to calculate nextRefresh, and instead use the ticket expiration time as the baseline for nextRefresh.

  was:
The calculation of nextRefresh in UserGroupInformation#spawnAutoRenewalThreadForUserCreds is currently based on:
{code:java}
nextRefresh = Math.max(getRefreshTime(tgt),
  now + kerberosMinSecondsBeforeRelogin);
{code}
Most of the time nextRefresh = getRefreshTime(tgt). If the renewal happens exactly at refreshTime while parallel operations are still using the expired ticket, there is a time gap during which some operations might not succeed until the next TGT is obtained. Ideally, we want to keep the service uninterrupted, so getNextTgtRenewalTime is supposed to calculate a time a few minutes before the Kerberos TGT expires, and that time should determine nextRefresh.
It looks like we are not using the getNextTgtRenewalTime method to calculate nextRefresh, and instead use the ticket expiration time as the baseline for nextRefresh.
I think the approach in patch 2 can create a time gap and then put strain on the KDC server when the ticket cannot be renewed. It would be better to calculate nextRefresh based on getNextTgtRenewalTime.
> UserGroupInformation TGT renewer refreshTime should be based on getNextTgtRenewalTime
> -------------------------------------------------------------------------------------
>
>                 Key: HADOOP-15622
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15622
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Gabor Bota
>            Priority: Major
>
> The calculation of nextRefresh in
> UserGroupInformation#spawnAutoRenewalThreadForUserCreds is currently based on:
> {code:java}
> nextRefresh = Math.max(getRefreshTime(tgt),
>   now + kerberosMinSecondsBeforeRelogin);
> {code}
> Most of the time nextRefresh = getRefreshTime(tgt). If the renewal happens
> exactly at refreshTime while parallel operations are still using the expired
> ticket, there is a time gap during which some operations might not succeed
> until the next TGT is obtained. Ideally, we want to keep the service
> uninterrupted, so getNextTgtRenewalTime is supposed to calculate a time a few
> minutes before the Kerberos TGT expires, and that time should determine
> nextRefresh.
> It looks like we are not using the getNextTgtRenewalTime method to calculate
> nextRefresh, and instead use the ticket expiration time as the baseline for
> nextRefresh.
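
For illustration, the difference between the two baselines discussed in the description can be sketched as follows. This is a minimal sketch under stated assumptions, not the UserGroupInformation source: the class RefreshTimeSketch, both method names, and the five-minute margin are hypothetical, and the refresh time is set equal to the ticket end time only because the description says expiration is the current baseline.

```java
import java.time.Duration;

/** Hypothetical sketch; names and values are illustrative, not Hadoop's API. */
final class RefreshTimeSketch {

    /**
     * Formula quoted in the description: the result is never earlier than
     * the supplied refresh time, so renewal can coincide with that instant.
     */
    static long currentNextRefresh(long refreshTimeMs, long nowMs, long minBeforeReloginMs) {
        return Math.max(refreshTimeMs, nowMs + minBeforeReloginMs);
    }

    /**
     * Assumed alternative: aim for a safety margin before the ticket end,
     * while still respecting the minimum wait between relogin attempts.
     * If the ticket is already inside the margin, the minimum wait wins.
     */
    static long marginNextRefresh(long tgtEndMs, long nowMs, long minBeforeReloginMs, long marginMs) {
        long target = tgtEndMs - marginMs;
        return Math.max(target, nowMs + minBeforeReloginMs);
    }

    public static void main(String[] args) {
        long now = 0L;
        long tgtEnd = Duration.ofHours(10).toMillis();   // assumed ticket lifetime
        long minWait = Duration.ofSeconds(60).toMillis(); // assumed minimum relogin interval
        long margin = Duration.ofMinutes(5).toMillis();   // assumed safety margin

        // Assumption from the description: the refresh time equals ticket expiration.
        long current = currentNextRefresh(tgtEnd, now, minWait);
        long proposed = marginNextRefresh(tgtEnd, now, minWait, margin);

        System.out.println("current nextRefresh (ms):  " + current);
        System.out.println("proposed nextRefresh (ms): " + proposed);
        System.out.println("head start (ms):           " + (current - proposed));
    }
}
```

With these assumed inputs the margin-based variant schedules the renewal ahead of the expiry-based one by exactly the margin, which is the window in which parallel operations would otherwise run on an expired ticket.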