[ https://issues.apache.org/jira/browse/HADOOP-18581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17652501#comment-17652501 ]
ASF GitHub Bot commented on HADOOP-18581:
-----------------------------------------

surendralilhore commented on code in PR #5248:
URL: https://github.com/apache/hadoop/pull/5248#discussion_r1058364619


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:
##########
@@ -2206,7 +2206,25 @@ private void saslProcess(RpcSaslProto saslMessage)
           AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":"
               + attemptingUser + " (" + e.getLocalizedMessage()
               + ") with true cause: (" + tce.getLocalizedMessage() + ")");
-          throw tce;
+          if (!UserGroupInformation.getLoginUser().isLoginSuccess()) {
+            LOG.info("Initiating re-login from IPC Server");
+            if (UserGroupInformation.isLoginKeytabBased()) {
+              UserGroupInformation.getLoginUser().reloginFromKeytab();
+            } else if (UserGroupInformation.isLoginTicketBased()) {
+              UserGroupInformation.getLoginUser().reloginFromTicketCache();
+            }
+            try {
+              // try processing message again
+              saslResponse = processSaslMessage(saslMessage);
+              AUDITLOG.info("Retry " + AUTH_SUCCESSFUL_FOR + this.toString()
+                  + ":" + attemptingUser + " after failure");
+            } catch (IOException exp) {
+              tce = (IOException) getTrueCause(e);

Review Comment:
   There is a proper null check inside getTrueCause(), so it will not return null.
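
   To make the point about the null check concrete, here is a minimal, self-contained sketch of a cause-unwrapping helper that cannot return null for a non-null argument. It is not the code from Server.java: the class `TrueCauseSketch`, the method `trueCause` and the exception type `AccessDenied` are hypothetical names used only for illustration, and the real helper may match different exception types.

```java
import java.io.IOException;

/** Simplified, hypothetical stand-in for the cause-unwrapping helper discussed above. */
final class TrueCauseSketch {

  /** Hypothetical exception type that the caller wants surfaced to the client. */
  static class AccessDenied extends IOException {
    AccessDenied(String msg) {
      super(msg);
    }
  }

  /**
   * Walks the cause chain looking for an exception of interest.
   * The loop stops when the chain ends (cause == null), and the method then
   * falls back to the original exception, so the result is never null as
   * long as the argument itself is non-null.
   */
  static Throwable trueCause(IOException e) {
    Throwable cause = e;
    while (cause != null) {
      if (cause instanceof AccessDenied) {
        return cause;
      }
      cause = cause.getCause();
    }
    return e; // nothing matched: hand back the original exception
  }

  public static void main(String[] args) {
    IOException plain = new IOException("GSS initiate failed");
    IOException wrapped =
        new IOException("sasl failure", new AccessDenied("denied"));

    // No matching cause in the chain: the original exception comes back.
    System.out.println(trueCause(plain) == plain);                  // prints true
    // A matching cause is found one level down the chain.
    System.out.println(trueCause(wrapped) instanceof AccessDenied); // prints true
  }
}
```

   In a helper of this shape the caller never sees null, but a cast of the result to IOException still relies on every matched type being an IOException, as `AccessDenied` is in this sketch.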

> Handle Server KDC re-login when Server and Client run in same JVM.
> ------------------------------------------------------------------
>
>                 Key: HADOOP-18581
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18581
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Surendra Singh Lilhore
>            Priority: Major
>              Labels: pull-request-available
>
> Handle re-login in the Server when the client and server run in the same JVM
> and a client re-login attempt fails.
> For example, the NameNode is a server, but a JournalNode (JN) client also runs
> in the same JVM to push edit logs. When the JN client tries to re-login and
> fails, it also destroys the server's service ticket, and the NameNode can no
> longer serve client requests. The following errors appear in the NameNode log file.
>
> {noformat}
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause: (GSS initiate failed)
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause: (GSS initiate failed)
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause: (GSS initiate failed)
> {noformat}
> Same discussion happened in HADOOP-17996.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)