[
https://issues.apache.org/jira/browse/HADOOP-18581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17652501#comment-17652501
]
ASF GitHub Bot commented on HADOOP-18581:
-----------------------------------------
surendralilhore commented on code in PR #5248:
URL: https://github.com/apache/hadoop/pull/5248#discussion_r1058364619
##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:
##########
@@ -2206,7 +2206,25 @@ private void saslProcess(RpcSaslProto saslMessage)
AUDITLOG.warn(AUTH_FAILED_FOR + this.toString() + ":"
+ attemptingUser + " (" + e.getLocalizedMessage()
+ ") with true cause: (" + tce.getLocalizedMessage() + ")");
- throw tce;
+ if (!UserGroupInformation.getLoginUser().isLoginSuccess()) {
+ LOG.info("Initiating re-login from IPC Server");
+ if (UserGroupInformation.isLoginKeytabBased()) {
+ UserGroupInformation.getLoginUser().reloginFromKeytab();
+ } else if (UserGroupInformation.isLoginTicketBased()) {
+ UserGroupInformation.getLoginUser().reloginFromTicketCache();
+ }
+ try {
+ // try processing message again
+ saslResponse = processSaslMessage(saslMessage);
+ AUDITLOG.info("Retry " + AUTH_SUCCESSFUL_FOR + this.toString()
+ + ":" + attemptingUser + " after failure");
+ } catch (IOException exp) {
+ tce = (IOException) getTrueCause(e);
Review Comment:
There is a proper null check inside getTrueCause(), so it will not return null.
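
For readers without the source at hand, the following is a minimal sketch of the kind of cause-chain walk that gives such a non-null guarantee. It is not the actual Server.getTrueCause() code: the class TrueCauseSketch, the method trueCause and the marker type SignificantException are hypothetical names used only for illustration. The point is the fallback on the last line of the method, which returns the original exception when nothing more specific is found.

```java
import java.io.IOException;

public class TrueCauseSketch {
    /** Hypothetical marker type standing in for exceptions the server treats as significant. */
    static class SignificantException extends IOException {
        SignificantException(String msg) {
            super(msg);
        }
    }

    /**
     * Walks the cause chain of e and returns the first SignificantException found.
     * Falls back to e itself, so the result is never null for a non-null argument.
     */
    static Throwable trueCause(IOException e) {
        Throwable cause = e;
        while (cause != null) {
            if (cause instanceof SignificantException) {
                return cause;
            }
            cause = cause.getCause();
        }
        return e; // nothing more specific found: hand back the original exception
    }

    public static void main(String[] args) {
        IOException plain = new IOException("GSS initiate failed");
        IOException wrapped = new IOException("outer", new SignificantException("inner"));
        // A plain exception with no cause is returned unchanged.
        System.out.println(trueCause(plain) == plain);
        // A wrapped significant exception is unwrapped.
        System.out.println(trueCause(wrapped).getMessage());
    }
}
```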
> Handle Server KDC re-login when Server and Client run in same JVM.
> ------------------------------------------------------------------
>
> Key: HADOOP-18581
> URL: https://issues.apache.org/jira/browse/HADOOP-18581
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 3.1.1
> Reporter: Surendra Singh Lilhore
> Assignee: Surendra Singh Lilhore
> Priority: Major
> Labels: pull-request-available
>
> Handle re-login in the Server when the client and server run in the same JVM
> and the client's re-login attempt fails.
> For example, the NameNode is a server, but a JournalNode client also runs in
> the same JVM to push edit logs. When the JN client tries to re-login and fails,
> it also destroys the server's service ticket, and the NameNode can no longer
> serve client requests. The following errors appear in the NameNode log file.
>
> {noformat}
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause:
> (GSS initiate failed)
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause:
> (GSS initiate failed)
> Auth failed for x.x.x.x:42199:null (GSS initiate failed) with true cause:
> (GSS initiate failed){noformat}
> A similar discussion took place in HADOOP-17996.
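
The recovery described in this issue (re-login on the server side, then process the failed SASL message once more) can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the code from the pull request: LoginContext, SaslStep, processWithRelogin and FakeLogin are hypothetical stand-ins for the UserGroupInformation login state and for processSaslMessage, and the sketch retries exactly once.

```java
import java.io.IOException;

public class ReloginRetrySketch {
    /** Hypothetical stand-in for the login state the server consults. */
    interface LoginContext {
        boolean isLoginSuccess();
        void relogin() throws IOException;
    }

    /** Hypothetical stand-in for one SASL processing step. */
    interface SaslStep<T> {
        T process() throws IOException;
    }

    /**
     * Runs the step once. If it fails and the login is marked unsuccessful,
     * re-logs in and retries once; otherwise rethrows the original failure.
     */
    static <T> T processWithRelogin(SaslStep<T> step, LoginContext login) throws IOException {
        try {
            return step.process();
        } catch (IOException first) {
            if (login.isLoginSuccess()) {
                throw first; // credentials look fine, so a retry would not help
            }
            login.relogin();
            return step.process(); // a second failure propagates to the caller
        }
    }

    /** Test double: the login is broken until relogin() is called. */
    static class FakeLogin implements LoginContext {
        boolean ok = false;
        int relogins = 0;

        public boolean isLoginSuccess() {
            return ok;
        }

        public void relogin() {
            ok = true;
            relogins++;
        }
    }

    public static void main(String[] args) throws IOException {
        FakeLogin login = new FakeLogin();
        String result = processWithRelogin(() -> {
            if (!login.ok) {
                throw new IOException("GSS initiate failed");
            }
            return "authenticated";
        }, login);
        System.out.println(result + " after " + login.relogins + " re-login(s)");
    }
}
```

Retrying only when the login is flagged as failed keeps ordinary authentication errors, such as a client presenting bad credentials, from triggering a KDC round trip on every failed request.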