[ https://issues.apache.org/jira/browse/MAPREDUCE-7273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085325#comment-17085325 ]
Eric Yang commented on MAPREDUCE-7273:
--------------------------------------

I just committed this. Thank you [~pbacsko].

> JHS: make sure that Kerberos relogin is performed when KDC becomes offline then online again
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7273
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7273
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>    Affects Versions: 2.10.0, 3.2.1, 3.1.3
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: MAPREDUCE-7273-001.patch, MAPREDUCE-7273-002.patch
>
>
> In JHS, if the KDC goes offline, the IPC layer does try to relogin, but it's
> not always enough. You have to wait for 60 seconds for the next retry. In the
> meantime, if the KDC comes back, the following error might occur:
> {noformat}
> 2020-04-09 03:27:52,075 DEBUG ipc.Server (Server.java:processSaslToken(1952)) - Have read input token of size 708 for processing by saslServer.evaluateResponse()
> 2020-04-09 03:27:52,077 DEBUG ipc.Server (Server.java:saslProcess(1829)) - javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Invalid argument (400) - Cannot find key of appropriate type to decrypt AP REP - AES128 CTS mode with HMAC SHA1-96)]
>         at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
>         ...
> {noformat}
> When this happens, JHS has to be restarted.
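
The 60-second wait described in the issue can be illustrated with a small model. This is only a sketch and not Hadoop code or the committed patch: the class ReloginThrottle, its Result enum, and the tryRelogin method are hypothetical names, and the model assumes the attempt time is recorded whether or not the login succeeds, so a failed attempt also starts the 60-second window.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical model of a throttled relogin (not taken from Hadoop).
 * Assumption: every attempt, successful or not, restarts the wait window.
 */
class ReloginThrottle {
    enum Result { OK, FAILED, SKIPPED }

    private final long minIntervalMillis;
    private long lastAttemptMillis;
    private boolean attempted = false;

    ReloginThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Attempts a relogin at the given time. The kdcLogin supplier stands in
     * for the real login call and returns true when the KDC answers.
     */
    Result tryRelogin(long nowMillis, BooleanSupplier kdcLogin) {
        if (attempted && nowMillis - lastAttemptMillis < minIntervalMillis) {
            // Too soon after the previous attempt: nothing is sent to the KDC.
            return Result.SKIPPED;
        }
        attempted = true;
        // Recorded before the login call, so a failure also counts as an attempt.
        lastAttemptMillis = nowMillis;
        return kdcLogin.getAsBoolean() ? Result.OK : Result.FAILED;
    }

    public static void main(String[] args) {
        ReloginThrottle throttle = new ReloginThrottle(60_000L);
        // t = 0 s: KDC is offline, the attempt fails.
        System.out.println(throttle.tryRelogin(0L, () -> false));
        // t = 10 s: KDC is back, but the attempt is still throttled.
        System.out.println(throttle.tryRelogin(10_000L, () -> true));
        // t = 60 s: the window has elapsed, the relogin goes through.
        System.out.println(throttle.tryRelogin(60_000L, () -> true));
    }
}
```

In this model, any request arriving between the failed attempt and the end of the window is served with the stale credentials, which is the gap the issue describes.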