[ 
https://issues.apache.org/jira/browse/HADOOP-18581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17652501#comment-17652501
 ] 

ASF GitHub Bot commented on HADOOP-18581:
-----------------------------------------

surendralilhore commented on code in PR #5248:
URL: https://github.com/apache/hadoop/pull/5248#discussion_r1058364619


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:
##########
@@ -2206,7 +2206,25 @@ private void saslProcess(RpcSaslProto saslMessage)
           AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":"
               + attemptingUser + " (" + e.getLocalizedMessage()
               + ") with true cause: (" + tce.getLocalizedMessage() + ")");
-          throw tce;
+          if (!UserGroupInformation.getLoginUser().isLoginSuccess()) {
+            LOG.info("Initiating re-login from IPC Server");
+            if (UserGroupInformation.isLoginKeytabBased()) {
+              UserGroupInformation.getLoginUser().reloginFromKeytab();
+            } else if (UserGroupInformation.isLoginTicketBased()) {
+              UserGroupInformation.getLoginUser().reloginFromTicketCache();
+            }
+            try {
+              // try processing message again
+              saslResponse = processSaslMessage(saslMessage);
+              AUDITLOG.info("Retry " + AUTH_SUCCESSFUL_FOR + this.toString()
+                  + ":" + attemptingUser + " after failure");
+            } catch (IOException exp) {
+              tce = (IOException) getTrueCause(e);

Review Comment:
   There is a proper null check inside getTrueCause(), so it will not return null.








> Handle Server KDC re-login when Server and Client run in same JVM.
> ------------------------------------------------------------------
>
>                 Key: HADOOP-18581
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18581
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Surendra Singh Lilhore
>            Priority: Major
>              Labels: pull-request-available
>
> Handle re-login in the Server when the client and server run in the same 
> JVM and the client tries to re-login but fails.
> For example, the NameNode is a server, but a JournalNode client also runs in 
> the same JVM to push edit logs. When the JN client tries to re-login and 
> fails, it also destroys the server's service ticket, and the NameNode is no 
> longer able to serve client requests. The NameNode log file then shows the 
> errors below.
>  
> {noformat}
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause: 
> (GSS initiate failed)
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause: 
> (GSS initiate failed)
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause: 
> (GSS initiate failed){noformat}
> The same discussion happened in HADOOP-17996.
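
The retry-after-re-login idea described above can be sketched in isolation as follows. This is a minimal illustration, not the code from the PR: `LoginContext`, `SaslStep`, and `ReloginRetry` are hypothetical stand-ins for the UserGroupInformation login state and the SASL message processing in Server.java, and the choice to rethrow the second failure with the first attached as suppressed is an assumption of this sketch.

```java
import java.io.IOException;

/** Hypothetical stand-in for the login state the server consults. */
interface LoginContext {
    boolean isLoginValid();
    void relogin() throws IOException;
}

/** Hypothetical stand-in for processing one SASL message. */
interface SaslStep {
    String process() throws IOException;
}

final class ReloginRetry {
    private ReloginRetry() {}

    /**
     * Runs the step once. If it fails while the login is no longer valid,
     * re-logs in and retries exactly once. A failure with a valid login is
     * rethrown unchanged, because a re-login would not help in that case.
     */
    static String processWithRelogin(SaslStep step, LoginContext login)
            throws IOException {
        try {
            return step.process();
        } catch (IOException first) {
            if (login.isLoginValid()) {
                throw first;
            }
            login.relogin();
            try {
                return step.process();
            } catch (IOException second) {
                // Keep the original failure attached for diagnosis.
                second.addSuppressed(first);
                throw second;
            }
        }
    }
}
```

Retrying only once keeps a persistently broken KDC or keytab from turning every failed authentication into a re-login loop.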



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

