Github user bolkedebruin commented on the pull request: https://github.com/apache/spark/pull/7489#issuecomment-122700827

Ok. I have tested the same on a CDH 5.4.2 cluster and I see a difference between 1) Spark 1.3.0 (bundled with CDH 5.4.2), 2) Spark 1.5.0-SNAPSHOT, and 3) HDP vs. CDH.

1) Spark 1.3.0 does not connect to the resource manager but to the scheduler (which runs on port 8030) instead:

```
15/07/19 21:42:00 INFO YarnRMClient: Registering the ApplicationMaster
15/07/19 21:42:00 DEBUG Client: The ping interval is 60000 ms.
15/07/19 21:42:00 DEBUG Client: Connecting to master01.paymentslab.int/172.17.12.10:8030
15/07/19 21:42:00 DEBUG UserGroupInformation: PrivilegedAction as:bolkedebruin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/19 21:42:00 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/19 21:42:00 DEBUG SaslRpcClient: Received SASL message state: NEGOTIATE auths { method: "TOKEN" mechanism: "DIGEST-MD5" protocol: "" serverId: "default" challenge: "realm=\"default\",nonce=\"P6hVNMbIZZ+KtpdxwsktwDpkortSjhlXdI1heRHb\",qop=\"auth\",charset=utf-8,algorithm=md5-sess" }
15/07/19 21:42:00 DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB info:org.apache.hadoop.yarn.security.SchedulerSecurityInfo$1@3d362683
15/07/19 21:42:00 DEBUG AMRMTokenSelector: Looking for a token with service 172.17.12.10:8030
15/07/19 21:42:00 DEBUG AMRMTokenSelector: Token kind is HDFS_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8020
15/07/19 21:42:00 DEBUG AMRMTokenSelector: Token kind is YARN_AM_RM_TOKEN and the token's service name is 172.17.12.10:8030
15/07/19 21:42:00 DEBUG SaslRpcClient: Creating SASL DIGEST-MD5(TOKEN) client to authenticate to service at default
15/07/19 21:42:00 DEBUG SaslRpcClient: Use TOKEN authentication for protocol ApplicationMasterProtocolPB
15/07/19 21:42:00 DEBUG SaslRpcClient: SASL client callback: setting username: AAABTqfFTXUAAAAFAAAAARKwkeU=
15/07/19 21:42:00 DEBUG SaslRpcClient: SASL client callback: setting userPassword
15/07/19 21:42:00 DEBUG SaslRpcClient: SASL client callback: setting realm: default
15/07/19 21:42:00 DEBUG SaslRpcClient: Sending sasl message state: INITIATE token: "charset=utf-8,username=\"AAABTqfFTXUAAAAFAAAAARKwkeU=\",realm=\"default\",nonce=\"P6hVNMbIZZ+KtpdxwsktwDpkortSjhlXdI1heRHb\",nc=00000001,cnonce=\"rL0eXrixoIFyuiPaGRUGeYwFWiPbGv8JcMIqHrAV\",digest-uri=\"/default\",maxbuf=65536,response=c00d228ec16b5fc9e0a4bab4f906c249,qop=auth" auths { method: "TOKEN" mechanism: "DIGEST-MD5" protocol: "" serverId: "default" }
15/07/19 21:42:00 DEBUG SaslRpcClient: Received SASL message state: SUCCESS token: "rspauth=9f9908f9b225fd633c9efe57caa5f09c"
```

2) *Spark 1.5.0-SNAPSHOT without my patch*

```
15/07/19 21:56:30 DEBUG AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
15/07/19 21:56:30 DEBUG Client: The ping interval is 60000 ms.
15/07/19 21:56:30 DEBUG Client: Connecting to master01.paymentslab.int/172.17.12.10:8032
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedAction as:bolkedebruin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/19 21:56:30 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/19 21:56:30 DEBUG SaslRpcClient: Received SASL message state: NEGOTIATE auths { method: "TOKEN" mechanism: "DIGEST-MD5" protocol: "" serverId: "default" challenge: "realm=\"default\",nonce=\"TcfchRLxjw/FLx4eooDgeKHp+Oqh4D5I/e/b39oC\",qop=\"auth\",charset=utf-8,algorithm=md5-sess" } auths { method: "KERBEROS" mechanism: "GSSAPI" protocol: "yarn" serverId: "master01.paymentslab.int" }
15/07/19 21:56:30 DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@7e758e43
15/07/19 21:56:30 DEBUG RMDelegationTokenSelector: Looking for a token with service 172.17.12.10:8032
15/07/19 21:56:30 DEBUG RMDelegationTokenSelector: Token kind is HDFS_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8020
15/07/19 21:56:30 DEBUG RMDelegationTokenSelector: Token kind is YARN_AM_RM_TOKEN and the token's service name is
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedActionException as:bolkedebruin (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedAction as:bolkedebruin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
15/07/19 21:56:30 WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/19 21:56:30 DEBUG UserGroupInformation: PrivilegedActionException as:bolkedebruin (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/19 21:56:30 DEBUG Client: closing ipc connection to master01.paymentslab.int/172.17.12.10:8032: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
	at org.apache.hadoop.ipc.Client.call(Client.java:1438)
	at org.apache.hadoop.ipc.Client.call(Client.java:1399)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
	at com.sun.proxy.$Proxy21.getClusterNodes(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy22.getClusterNodes(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
	at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
	at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
	at scala.Option.foreach(Option.scala:236)
	at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
	at org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:1993)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:544)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:516)
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
	at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
	at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
	... 28 more
15/07/19 21:56:30 DEBUG Client: IPC Client (1657081124) connection to master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin: closed
15/07/19 21:56:30 INFO YarnClusterSchedulerBackend: Node Report API is not available in the version of YARN being used, so AM logs link will not appear in application UI
```

So it does hit the same error, only here it is considered non-fatal.

*Spark 1.5.0-SNAPSHOT with my patch*

```
15/07/19 21:47:37 DEBUG Client: The ping interval is 60000 ms.
15/07/19 21:47:37 DEBUG Client: Connecting to master01.paymentslab.int/172.17.12.10:8032
15/07/19 21:47:37 DEBUG UserGroupInformation: PrivilegedAction as:bolkedebruin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/19 21:47:37 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/19 21:47:37 DEBUG SaslRpcClient: Received SASL message state: NEGOTIATE auths { method: "TOKEN" mechanism: "DIGEST-MD5" protocol: "" serverId: "default" challenge: "realm=\"default\",nonce=\"cU+ygZlbzCbqdsTml1q7BDO3FzcAJOseZPtYKkml\",qop=\"auth\",charset=utf-8,algorithm=md5-sess" } auths { method: "KERBEROS" mechanism: "GSSAPI" protocol: "yarn" serverId: "master01.paymentslab.int" }
15/07/19 21:47:37 DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@39cc736c
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Looking for a token with service 172.17.12.10:8032
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Token kind is HDFS_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8020
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Token kind is YARN_AM_RM_TOKEN and the token's service name is
15/07/19 21:47:37 DEBUG RMDelegationTokenSelector: Token kind is RM_DELEGATION_TOKEN and the token's service name is 172.17.12.10:8032
15/07/19 21:47:37 DEBUG SaslRpcClient: Creating SASL DIGEST-MD5(TOKEN) client to authenticate to service at default
15/07/19 21:47:37 DEBUG SaslRpcClient: Use TOKEN authentication for protocol ApplicationClientProtocolPB
15/07/19 21:47:37 DEBUG SaslRpcClient: SASL client callback: setting username: ABxib2xrZWRlYnJ1aW5AUEFZTUVOVFNMQUIuSU5UBHlhcm4AigFOp9tNVYoBTsvn0VUCAg==
15/07/19 21:47:37 DEBUG SaslRpcClient: SASL client callback: setting userPassword
15/07/19 21:47:37 DEBUG SaslRpcClient: SASL client callback: setting realm: default
15/07/19 21:47:37 DEBUG SaslRpcClient: Sending sasl message state: INITIATE token: "charset=utf-8,username=\"ABxib2xrZWRlYnJ1aW5AUEFZTUVOVFNMQUIuSU5UBHlhcm4AigFOp9tNVYoBTsvn0VUCAg==\",realm=\"default\",nonce=\"cU+ygZlbzCbqdsTml1q7BDO3FzcAJOseZPtYKkml\",nc=00000001,cnonce=\"xWU7TjKq9IKtci8lG185kDi4t9r9jUcM9ADW6PJY\",digest-uri=\"/default\",maxbuf=65536,response=81dc2419495d5c5c3886f031a54a78ea,qop=auth" auths { method: "TOKEN" mechanism: "DIGEST-MD5" protocol: "" serverId: "default" }
15/07/19 21:47:37 DEBUG SaslRpcClient: Received SASL message state: SUCCESS token: "rspauth=b84e94b9d514c0ea602ba59f4394adfe"
15/07/19 21:47:37 DEBUG Client: Negotiated QOP is :auth
15/07/19 21:47:37 DEBUG Client: IPC Client (1586183723) connection to master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin: starting, having connections 1
15/07/19 21:47:37 DEBUG Client: IPC Client (1586183723) connection to master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin sending #0
15/07/19 21:47:37 DEBUG Client: IPC Client (1586183723) connection to master01.paymentslab.int/172.17.12.10:8032 from bolkedebruin got value #0
15/07/19 21:47:37 DEBUG ProtobufRpcEngine: Call: getClusterNodes took 157ms
```

3) On HDP we found the error to be fatal; on CDH, for some reason, it is not.

Conclusion (imho): the RM delegation token is not included in any of the tested Spark versions, and in every case its absence produces an error.
The consequences just differ across versions, it seems. My patch fixes these errors, and I think it should be considered.
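To make the failure mode in the logs concrete, here is a toy model of how a Hadoop token selector matches credentials against the target service. This is plain Python, not the actual `RMDelegationTokenSelector`/`AMRMTokenSelector` code; the token kinds and service addresses are taken directly from the logs above, and the dict-based "token" shape is purely illustrative:

```python
# Toy model: a token selector scans the credentials for a token whose
# kind AND service both match the connection being set up. If none
# matches, SASL has nothing to offer for the TOKEN mechanism and the
# connection fails with "Client cannot authenticate via:[TOKEN, KERBEROS]".
def select_token(tokens, kind, service):
    for t in tokens:
        if t["kind"] == kind and t["service"] == service:
            return t
    return None

# Credentials as shown in the Spark 1.5.0-SNAPSHOT log without the patch.
creds_without_patch = [
    {"kind": "HDFS_DELEGATION_TOKEN", "service": "172.17.12.10:8020"},
    {"kind": "YARN_AM_RM_TOKEN",      "service": "172.17.12.10:8030"},
]

# With the patch, the log additionally shows an RM delegation token
# bound to the ResourceManager client address (port 8032).
creds_with_patch = creds_without_patch + [
    {"kind": "RM_DELEGATION_TOKEN", "service": "172.17.12.10:8032"},
]

# Spark 1.3.0 talks to the scheduler on 8030: the AM/RM token matches.
am_token = select_token(creds_without_patch, "YARN_AM_RM_TOKEN", "172.17.12.10:8030")

# The client protocol on 8032 finds no RM delegation token -> auth failure.
missing = select_token(creds_without_patch, "RM_DELEGATION_TOKEN", "172.17.12.10:8032")

# With the patch, selection succeeds and DIGEST-MD5 can proceed.
found = select_token(creds_with_patch, "RM_DELEGATION_TOKEN", "172.17.12.10:8032")
```

This also illustrates why Spark 1.3.0 never hit the problem: its AM only contacted the scheduler endpoint on 8030, where the `YARN_AM_RM_TOKEN` matched, while 1.5.0's `getDriverLogUrls` path goes through the client protocol on 8032, which needs the missing `RM_DELEGATION_TOKEN`.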