David Ayres created HDDS-9191:
---------------------------------

             Summary: OM initialization fails when securely talking to SCM via 
Kerberos Tickets 
                 Key: HDDS-9191
                 URL: https://issues.apache.org/jira/browse/HDDS-9191
             Project: Apache Ozone
          Issue Type: Bug
          Components: OM, SCM
    Affects Versions: 1.3.0, 1.4.0
         Environment: OM server: ddl07zone01

SCM server: ddl07zone02

ozone-site.xml security settings:
{code:xml}
<!-- SCM Security Settings -->
<property>
   <name>hdds.scm.kerberos.principal</name>
   <value>scm/[email protected]</value>
</property>
<property>
   <name>hdds.scm.kerberos.keytab.file</name>
   <value>/opt/ozone/ozone.keytab</value>
</property>
<!-- OM Security Settings -->
<property>
   <name>ozone.om.kerberos.principal</name>
   <value>om/[email protected]</value>
</property>
<property>
   <name>ozone.om.kerberos.keytab.file</name>
   <value>/opt/ozone/ozone.keytab</value>
</property>
{code}
            Reporter: David Ayres


After enabling cluster security via Kerberos and setting the principal (UPN) 
and keytab file in ozone-site.xml, OM initialization fails with the following 
error:
{code:java}
2023-08-18 11:27:50,208 [main] DEBUG security.UserGroupInformation: PrivilegedAction [as: om/[email protected] (auth:KERBEROS)][action: org.apache.hadoop.ipc.Client$Connection$1@42714a7]
java.lang.Exception
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:752)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:856)
        at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
        at org.apache.hadoop.ipc.Client.call(Client.java:1502)
        at org.apache.hadoop.ipc.Client.call(Client.java:1455)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:235)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:122)
        at com.sun.proxy.$Proxy31.send(Unknown Source)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy31.send(Unknown Source)
        at org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.submitRequest(ScmBlockLocationProtocolClientSideTranslatorPB.java:121)
        at org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.getScmInfo(ScmBlockLocationProtocolClientSideTranslatorPB.java:266)
        at org.apache.hadoop.hdds.utils.HAUtils.getScmInfo(HAUtils.java:100)
        at org.apache.hadoop.ozone.om.OzoneManager.omInit(OzoneManager.java:1233)
        at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.init(OzoneManagerStarter.java:204)
        at org.apache.hadoop.ozone.om.OzoneManagerStarter.initOm(OzoneManagerStarter.java:102)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1972)
        at picocli.CommandLine.access$1300(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
        at picocli.CommandLine.execute(CommandLine.java:2078)
        at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
        at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
        at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:58)
2023-08-18 11:27:50,208 [main] DEBUG ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed
{code}
On the SCM side I get this error:
{code:java}
2023-08-17 16:05:53,628 [Socket Reader #1 for port 9863] WARN SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 10.236.152.241:35587:null (GSS initiate failed) with true cause: (GSS initiate failed)
2023-08-17 16:05:53,628 [Socket Reader #1 for port 9863] DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 for port 9863: processOneRpc from client 10.236.152.241:35587 threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]]
{code}
One thing that stands out in the OM debug log:
{code:java}
2023-08-18 11:27:51,438 [main] DEBUG security.SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolPB info:@org.apache.hadoop.security.KerberosInfo(clientPrincipal="", serverPrincipal="hdds.scm.kerberos.principal")
{code}
Is {{clientPrincipal}} supposed to be blank?
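
One possible explanation is that the blank value is only an annotation default and not a misconfiguration. The following is a minimal sketch of that behaviour, assuming the annotation declares an empty-string default for clientPrincipal (not verified here against the Hadoop source). DemoKerberosInfo, DemoProtocol and describe are hypothetical stand-ins for illustration, not Hadoop classes:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class KerberosInfoDefaultDemo {

    // Hypothetical stand-in annotation (NOT org.apache.hadoop.security.KerberosInfo).
    // Assumption: clientPrincipal is optional and defaults to the empty string.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface DemoKerberosInfo {
        String serverPrincipal();
        String clientPrincipal() default "";
    }

    // Hypothetical protocol interface that only names the server-side config key,
    // mirroring the values seen in the debug log above.
    @DemoKerberosInfo(serverPrincipal = "hdds.scm.kerberos.principal")
    public interface DemoProtocol { }

    // Reads the annotation reflectively and formats both attributes.
    public static String describe(Class<?> protocol) {
        DemoKerberosInfo info = protocol.getAnnotation(DemoKerberosInfo.class);
        return "clientPrincipal=\"" + info.clientPrincipal()
            + "\", serverPrincipal=\"" + info.serverPrincipal() + "\"";
    }

    public static void main(String[] args) {
        // The omitted attribute surfaces as an empty string, not as null.
        System.out.println(describe(DemoProtocol.class));
    }
}
```

If that assumption holds, the empty clientPrincipal in the log would be expected, and the "Checksum failed" message on the SCM side would be the more relevant lead.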



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
