[ 
https://issues.apache.org/jira/browse/HADOOP-15487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500348#comment-16500348
 ] 

Daryn Sharp commented on HADOOP-15487:
--------------------------------------

The second exception is an unrelated JDK bug fixed in 8u161: [JDK-8170278: 
ticket renewal won't happen with debugging turned 
on|https://bugs.openjdk.java.net/browse/JDK-8170278].  GSSAPI is smart enough 
to recognize and handle expired tickets from a keytab.  The problem is that 
{{KerberosTicket#toString}} throws an IllegalStateException (ISE) if the ticket 
is expired.  The easy workaround is to not enable debug logging.

The original issue is distinct.  If there truly are no custom plugins, it may 
be related to Curator/ZooKeeper/AuthenticatedURL.  Which specific Apache 
release is this?  Did the server recover?

We may need to consider using a distinct subject/UGI for RPC servers to prevent 
other code from munging our JAAS, but there are a few possible grues lurking 
there.
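
As a rough illustration of why a shared subject is fragile, here is a minimal single-threaded sketch. It is not taken from Hadoop and is not a confirmed root cause for this issue. It assumes the {{Subject}} credential sets are backed by a fail-fast list, as the {{LinkedList$ListItr}} frames in the stack trace suggest for this JDK. The class and method names ({{SubjectCmeDemo}}, {{cmeWhileIterating}}, and so on) are hypothetical, and the modification made during iteration stands in for another thread touching the same subject.

```java
import java.util.ConcurrentModificationException;
import javax.security.auth.Subject;

// Hypothetical demo class; not part of Hadoop or the JDK.
class SubjectCmeDemo {

    // Builds a Subject holding one placeholder private credential.
    static Subject subjectWith(String credential) {
        Subject s = new Subject();
        s.getPrivateCredentials().add(credential);
        return s;
    }

    // Iterates the reader's private credentials while adding to the writer's.
    // The add stands in for "other code" mutating credentials mid-iteration.
    // Returns true if the iteration failed with a ConcurrentModificationException.
    static boolean cmeWhileIterating(Subject reader, Subject writer) {
        try {
            for (Object cred : reader.getPrivateCredentials()) {
                writer.getPrivateCredentials().add("injected-" + cred);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Reader and writer share one Subject, as with a process-wide login subject.
    static boolean sharedSubjectThrows() {
        Subject shared = subjectWith("server-key");
        return cmeWhileIterating(shared, shared);
    }

    // Reader and writer use separate Subjects, as with a dedicated server subject.
    static boolean distinctSubjectsThrow() {
        return cmeWhileIterating(subjectWith("server-key"), subjectWith("other-key"));
    }

    public static void main(String[] args) {
        System.out.println("shared subject CME:    " + sharedSubjectThrows());
        System.out.println("distinct subjects CME: " + distinctSubjectsThrow());
    }
}
```

With a distinct subject the reader's credential set is never touched by the other code path, so the iteration completes. This says nothing about the grues, such as keeping the separate subject's credentials refreshed on relogin.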



> ConcurrentModificationException resulting in Kerberos authentication error.
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-15487
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15487
>             Project: Hadoop Common
>          Issue Type: Bug
>         Environment: CDH 5.13.3. Kerberized, Hadoop-HA, jdk1.8.0_152
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> We found the following exception message in a NameNode log. It seems the 
> ConcurrentModificationException caused a Kerberos authentication error.
> It appears to be a JDK bug, similar to HADOOP-13433 (Race in 
> UGI.reloginFromKeytab), but this version of Hadoop (CDH 5.13.3) is already 
> patched for HADOOP-13433, and the stack trace also differs. This cluster runs 
> on JDK 1.8.0_152.
> {noformat}
> 2018-05-19 04:00:00,182 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/no...@example.com (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2018-05-19 04:00:00,183 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8020: readAndProcess from client 10.16.20.122 threw exception [java.util.ConcurrentModificationException]
> java.util.ConcurrentModificationException
>         at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
>         at java.util.LinkedList$ListItr.next(LinkedList.java:888)
>         at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1070)
>         at javax.security.auth.Subject$ClassSet$1.run(Subject.java:1401)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1399)
>         at javax.security.auth.Subject$ClassSet.<init>(Subject.java:1372)
>         at javax.security.auth.Subject.getPrivateCredentials(Subject.java:767)
>         at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:127)
>         at sun.security.jgss.krb5.SubjectComber.findMany(SubjectComber.java:69)
>         at sun.security.jgss.krb5.ServiceCreds.getInstance(ServiceCreds.java:96)
>         at sun.security.jgss.krb5.Krb5Util.getServiceCreds(Krb5Util.java:203)
>         at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:74)
>         at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:72)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at sun.security.jgss.krb5.Krb5AcceptCredential.getInstance(Krb5AcceptCredential.java:71)
>         at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:127)
>         at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
>         at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
>         at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62)
>         at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154)
>         at com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>         at com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>         at org.apache.hadoop.security.SaslRpcServer$FastSaslServerFactory.createSaslServer(SaslRpcServer.java:398)
>         at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:164)
>         at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:161)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         at org.apache.hadoop.security.SaslRpcServer.create(SaslRpcServer.java:160)
>         at org.apache.hadoop.ipc.Server$Connection.createSaslServer(Server.java:1742)
>         at org.apache.hadoop.ipc.Server$Connection.processSaslMessage(Server.java:1522)
>         at org.apache.hadoop.ipc.Server$Connection.saslProcess(Server.java:1433)
>         at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1396)
>         at org.apache.hadoop.ipc.Server$Connection.processRpcOutOfBandRequest(Server.java:2080)
>         at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1920)
>         at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1682)
>         at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:896)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:752)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:723)
> {noformat}
> We saw a few GSSExceptions in the NN log, but only one threw the 
> ConcurrentModificationException. This NN had a failover, which was caused by 
> ZKFC hitting a GSSException too. We suspect it is a related issue.


