[
https://issues.apache.org/jira/browse/HADOOP-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255866#comment-16255866
]
Hudson commented on HADOOP-14982:
---------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13249 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/13249/])
HADOOP-14982. Clients using FailoverOnNetworkExceptionRetry can go into
(rkanter: rev f2efaf013f7577948061abbb49c6d17c375e92cc)
* (edit)
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/retry/UnreliableImplementation.java
* (edit)
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/retry/TestRetryProxy.java
* (edit)
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryPolicies.java
* (edit)
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/retry/UnreliableInterface.java
> Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're
> used without authenticating with kerberos in HA env
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-14982
> URL: https://issues.apache.org/jira/browse/HADOOP-14982
> Project: Hadoop Common
> Issue Type: Bug
> Components: common
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Fix For: 3.1.0, 2.10.0
>
> Attachments: HADOOP-14892-001.patch, HADOOP-14892-002.patch,
> HADOOP-14982-003.patch
>
>
> If HA is configured for the Resource Manager in a secure environment, using
> the mapred client goes into a loop if the user is not authenticated with
> Kerberos.
> {noformat}
> [root@pb6sec-1 ~]# mapred job -list
> 17/10/25 06:37:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over
> to rm36
> 17/10/25 06:37:43 WARN ipc.Client: Exception encountered while connecting to
> the server : javax.security.sasl.SaslException: GSS initiate failed [Caused
> by GSSException: No valid credentials provided (Mechanism level: Failed to
> find any Kerberos tgt)]
> 17/10/25 06:37:43 INFO retry.RetryInvocationHandler: java.io.IOException:
> Failed on local exception: java.io.IOException:
> javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to find
> any Kerberos tgt)]; Host Details : local host is:
> "host_redacted/IP_redacted"; destination host is: "com.host2.redacted:8032; ,
> while invoking ApplicationClientProtocolPBClientImpl.getApplications over
> rm36 after 1 failover attempts. Trying to failover after sleeping for 160ms.
> 17/10/25 06:37:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over
> to rm25
> 17/10/25 06:37:43 INFO retry.RetryInvocationHandler:
> java.net.ConnectException: Call From host_redacted/IP_redacted to
> com.host.redacted:8032 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking
> ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 2
> failover attempts. Trying to failover after sleeping for 582ms.
> 17/10/25 06:37:44 INFO client.ConfiguredRMFailoverProxyProvider: Failing over
> to rm36
> 17/10/25 06:37:44 WARN ipc.Client: Exception encountered while connecting to
> the server : javax.security.sasl.SaslException: GSS initiate failed [Caused
> by GSSException: No valid credentials provided (Mechanism level: Failed to
> find any Kerberos tgt)]
> 17/10/25 06:37:44 INFO retry.RetryInvocationHandler: java.io.IOException:
> Failed on local exception: java.io.IOException:
> javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to find
> any Kerberos tgt)]; Host Details : local host is:
> "host_redacted/IP_redacted"; destination host is: "com.host2.redacted:8032; ,
> while invoking ApplicationClientProtocolPBClientImpl.getApplications over
> rm36 after 3 failover attempts. Trying to failover after sleeping for 977ms.
> 17/10/25 06:37:45 INFO client.ConfiguredRMFailoverProxyProvider: Failing over
> to rm25
> 17/10/25 06:37:45 INFO retry.RetryInvocationHandler:
> java.net.ConnectException: Call From host_redacted/IP_redacted to
> com.host.redacted:8032 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking
> ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 4
> failover attempts. Trying to failover after sleeping for 1667ms.
> 17/10/25 06:37:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over
> to rm36
> 17/10/25 06:37:46 WARN ipc.Client: Exception encountered while connecting to
> the server : javax.security.sasl.SaslException: GSS initiate failed [Caused
> by GSSException: No valid credentials provided (Mechanism level: Failed to
> find any Kerberos tgt)]
> 17/10/25 06:37:46 INFO retry.RetryInvocationHandler: java.io.IOException:
> Failed on local exception: java.io.IOException:
> javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to find
> any Kerberos tgt)]; Host Details : local host is:
> "host_redacted/IP_redacted"; destination host is: "com.host2.redacted:8032; ,
> while invoking ApplicationClientProtocolPBClientImpl.getApplications over
> rm36 after 5 failover attempts. Trying to failover after sleeping for 2776ms.
> 17/10/25 06:37:49 INFO client.ConfiguredRMFailoverProxyProvider: Failing over
> to rm25
> 17/10/25 06:37:49 INFO retry.RetryInvocationHandler:
> java.net.ConnectException: Call From host_redacted/IP_redacted to
> com.host.redacted:8032 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking
> ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 6
> failover attempts. Trying to failover after sleeping for 1055ms.
> 17/10/25 06:37:50 INFO client.ConfiguredRMFailoverProxyProvider: Failing over
> to rm36
> 17/10/25 06:37:50 WARN ipc.Client: Exception encountered while connecting to
> the server : javax.security.sasl.SaslException: GSS initiate failed [Caused
> by GSSException: No valid credentials provided (Mechanism level: Failed to
> find any Kerberos tgt)]
> 17/10/25 06:37:50 INFO retry.RetryInvocationHandler: java.io.IOException:
> Failed on local exception: java.io.IOException:
> javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to find
> any Kerberos tgt)]; Host Details : local host is:
> "host_redacted/IP_redacted"; destination host is: "com.host2.redacted:8032; ,
> while invoking ApplicationClientProtocolPBClientImpl.getApplications over
> rm36 after 7 failover attempts. Trying to failover after sleeping for 2608ms.
> ...
> {noformat}
> The reason is that the retry handler sees a {{ConnectException}}, then fails
> over to the inactive RM. That attempt cannot succeed either, so the client
> comes back to the active RM and the whole process starts again. The retry
> handler should check whether the {{ConnectException}} is actually caused by a
> {{GSSException}} (and probably also check for the "No valid credentials
> provided" message) and, if so, it should not perform a failover.
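
The check proposed in the description can be sketched in plain Java. This is a
minimal illustration only, not the committed patch and not the actual
RetryPolicies code: the class SaslAwareFailoverSketch, the Decision enum and
the methods hasSaslCause and decide are hypothetical names, and the only
assumption taken from the log above is that the SaslException arrives wrapped
in one or more IOExceptions. It uses JDK classes only.

```java
import java.io.IOException;
import java.net.ConnectException;
import javax.security.sasl.SaslException;
import org.ietf.jgss.GSSException;

// Hypothetical sketch: decide whether an exception should trigger a failover.
public class SaslAwareFailoverSketch {

    /** Illustrative decision values; these are not Hadoop's RetryAction. */
    public enum Decision { FAIL, FAILOVER_AND_RETRY }

    /** Returns true if the exception or any of its causes is a SaslException. */
    public static boolean hasSaslCause(Throwable t) {
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (t instanceof SaslException) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    /** Fails fast on authentication errors, fails over on connection errors. */
    public static Decision decide(Exception e) {
        if (hasSaslCause(e)) {
            // Missing Kerberos credentials will fail on every RM, so give up.
            return Decision.FAIL;
        }
        if (e instanceof ConnectException) {
            // The target may simply be down; another RM may be reachable.
            return Decision.FAILOVER_AND_RETRY;
        }
        return Decision.FAIL;
    }

    public static void main(String[] args) {
        // Mirrors the nesting in the log: IOException -> IOException
        // -> SaslException -> GSSException.
        Exception auth = new IOException("Failed on local exception",
            new IOException(new SaslException("GSS initiate failed",
                new GSSException(GSSException.NO_CRED))));
        Exception refused = new ConnectException("Connection refused");

        System.out.println(decide(auth));
        System.out.println(decide(refused));
    }
}
```

Matching on the exception type rather than on the "No valid credentials
provided" text avoids depending on the exact wording of a message; whether the
real fix also inspects the message is not stated in this notification.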
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)