David Ayres created HDDS-9191:
---------------------------------
Summary: OM initialization fails when securely talking to SCM via
Kerberos Tickets
Key: HDDS-9191
URL: https://issues.apache.org/jira/browse/HDDS-9191
Project: Apache Ozone
Issue Type: Bug
Components: OM, SCM
Affects Versions: 1.3.0, 1.4.0
Environment: OM server: ddl07zone01
SCM server: ddl07zone02
ozone-site.xml security settings:
{code:xml}
<!-- SCM Security Settings -->
<property>
  <name>hdds.scm.kerberos.principal</name>
  <value>scm/[email protected]</value>
</property>
<property>
  <name>hdds.scm.kerberos.keytab.file</name>
  <value>/opt/ozone/ozone.keytab</value>
</property>
<!-- OM Security Settings -->
<property>
  <name>ozone.om.kerberos.principal</name>
  <value>om/[email protected]</value>
</property>
<property>
  <name>ozone.om.kerberos.keytab.file</name>
  <value>/opt/ozone/ozone.keytab</value>
</property>
{code}
Reporter: David Ayres
When cluster security is enabled via Kerberos and the UPN and keytab file are
provided in ozone-site.xml, initialization of OM fails with the following error:
{code:java}
2023-08-18 11:27:50,208 [main] DEBUG security.UserGroupInformation: PrivilegedAction [as: om/[email protected] (auth:KERBEROS)][action: org.apache.hadoop.ipc.Client$Connection$1@42714a7]
java.lang.Exception
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:752)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:856)
    at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
    at org.apache.hadoop.ipc.Client.call(Client.java:1502)
    at org.apache.hadoop.ipc.Client.call(Client.java:1455)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:235)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:122)
    at com.sun.proxy.$Proxy31.send(Unknown Source)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy31.send(Unknown Source)
    at org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.submitRequest(ScmBlockLocationProtocolClientSideTranslatorPB.java:121)
    at org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.getScmInfo(ScmBlockLocationProtocolClientSideTranslatorPB.java:266)
    at org.apache.hadoop.hdds.utils.HAUtils.getScmInfo(HAUtils.java:100)
    at org.apache.hadoop.ozone.om.OzoneManager.omInit(OzoneManager.java:1233)
    at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.init(OzoneManagerStarter.java:204)
    at org.apache.hadoop.ozone.om.OzoneManagerStarter.initOm(OzoneManagerStarter.java:102)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1972)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
    at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
    at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:58)
2023-08-18 11:27:50,208 [main] DEBUG ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed
{code}
On the SCM side I get this error:
{code:java}
2023-08-17 16:05:53,628 [Socket Reader #1 for port 9863] WARN SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 10.236.152.241:35587:null (GSS initiate failed) with true cause: (GSS initiate failed)
2023-08-17 16:05:53,628 [Socket Reader #1 for port 9863] DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 for port 9863: processOneRpc from client 10.236.152.241:35587 threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]]
{code}
One thing that stands out:
{code:java}
2023-08-18 11:27:51,438 [main] DEBUG security.SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolPB info:@org.apache.hadoop.security.KerberosInfo(clientPrincipal="", serverPrincipal="hdds.scm.kerberos.principal")
{code}
Is {{clientPrincipal}} supposed to be blank?
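
A possible explanation for the blank value, offered as a sketch rather than a confirmed diagnosis: if the {{KerberosInfo}} annotation declares {{clientPrincipal}} with an empty-string default, and the protocol interface only sets {{serverPrincipal}}, then reading the annotation by reflection would print exactly what the log line shows. The example below uses a hypothetical stand-in annotation ({{DemoKerberosInfo}}) and a hypothetical protocol interface ({{DemoScmProtocol}}); neither is the real Hadoop or Ozone class, and the empty default is an assumption inferred from the log output.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class KerberosInfoDemo {

  // Hypothetical stand-in for org.apache.hadoop.security.KerberosInfo.
  // Assumption: clientPrincipal is optional and defaults to "".
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface DemoKerberosInfo {
    String serverPrincipal();
    String clientPrincipal() default "";
  }

  // Hypothetical protocol interface that names only the config key
  // holding the server principal, as the log line above suggests.
  @DemoKerberosInfo(serverPrincipal = "hdds.scm.kerberos.principal")
  interface DemoScmProtocol { }

  // Reads the annotation reflectively and formats both attributes.
  static String describe(Class<?> protocol) {
    DemoKerberosInfo info = protocol.getAnnotation(DemoKerberosInfo.class);
    return "clientPrincipal=\"" + info.clientPrincipal()
        + "\", serverPrincipal=\"" + info.serverPrincipal() + "\"";
  }

  public static void main(String[] args) {
    System.out.println(describe(DemoScmProtocol.class));
  }
}
```

Under that assumption, a blank {{clientPrincipal}} only means the protocol does not restrict which client principal may connect, so it would not by itself explain the {{Checksum failed}} error on the SCM side.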