[ https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629432#comment-14629432 ]
Bolke de Bruin edited comment on SPARK-9019 at 7/16/15 8:39 AM:
----------------------------------------------------------------

I tried running this on an updated environment where YARN-3103 was fixed. It still fails, although the behavior is a bit different now: the task is accepted but stays in the running state forever without executing anything.

Please note that the trace below was captured without keytab usage, but with an authorized user (kinit admin/admin):

15/07/16 04:27:34 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@53abb73
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] received message AkkaMessage(ReviveOffers,false) from Actor[akka://sparkDriver/deadLetters]
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: Received RPC message: AkkaMessage(ReviveOffers,false)
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] handled message (1.632126 ms) AkkaMessage(ReviveOffers,false) from Actor[akka://sparkDriver/deadLetters]
15/07/16 04:27:34 DEBUG AbstractService: Service org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl is started
15/07/16 04:27:34 DEBUG AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
15/07/16 04:27:34 DEBUG Client: The ping interval is 60000 ms.
15/07/16 04:27:34 DEBUG Client: Connecting to node6.local/10.79.10.6:8050
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedAction as:admin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/16 04:27:34 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/16 04:27:34 DEBUG SaslRpcClient: Received SASL message state: NEGOTIATE
auths {
  method: "TOKEN"
  mechanism: "DIGEST-MD5"
  protocol: ""
  serverId: "default"
  challenge: "realm=\"default\",nonce=\"wjgFp9L22uDJt41FNtY9M8CP/T+dswfBoF48r9+s\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
}
auths {
  method: "KERBEROS"
  mechanism: "GSSAPI"
  protocol: "rm"
  serverId: "node6.local"
}
15/07/16 04:27:34 DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@69990fa7
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Looking for a token with service 10.79.10.6:8050
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is YARN_AM_RM_TOKEN and the token's service name is
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is HIVE_DELEGATION_TOKEN and the token's service name is
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is TIMELINE_DELEGATION_TOKEN and the token's service name is 10.79.10.6:8188
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is HDFS_DELEGATION_TOKEN and the token's service name is 10.79.10.4:8020
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedActionException as:admin (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedAction as:admin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
15/07/16 04:27:34 WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedActionException as:admin (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG Client: closing ipc connection to node6.local/10.79.10.6:8050: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy30.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy31.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
    at org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:2016)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:552)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
    ... 33 more
15/07/16 04:27:34 DEBUG Client: IPC Client (2062834323) connection to node6.local/10.79.10.6:8050 from admin: closed
15/07/16 04:27:34 INFO YarnClusterSchedulerBackend: Node Report API is not available in the version of YARN being used, so AM logs link will not appear in application UI
java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "node6/10.79.10.6"; destination host is: "node6.local":8050;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy30.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy31.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
    at org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:2016)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:552)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 30 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
    ... 33 more

> spark-submit fails on yarn with kerberos enabled
> ------------------------------------------------
>
> Key: SPARK-9019
> URL: https://issues.apache.org/jira/browse/SPARK-9019
> Project: Spark
> Issue Type: Bug
> Components: Spark Submit
> Affects Versions: 1.5.0
> Environment: Hadoop 2.6 with YARN and kerberos enabled
> Reporter: Bolke de Bruin
> Labels: kerberos, spark-submit, yarn
>
> It is not possible to run jobs using spark-submit on yarn with a kerberized cluster.
> Commandline:
> /usr/hdp/2.2.0.0-2041/spark-1.5.0/bin/spark-submit --principal sparkjob --keytab sparkjob.keytab --num-executors 3 --executor-cores 5 --executor-memory 5G --master yarn-cluster /tmp/get_peers.py
> Fails with:
> 15/07/13 22:48:31 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/07/13 22:48:31 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:58380
> 15/07/13 22:48:31 INFO util.Utils: Successfully started service 'SparkUI' on port 58380.
> 15/07/13 22:48:31 INFO ui.SparkUI: Started SparkUI at http://10.111.114.9:58380
> 15/07/13 22:48:31 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
> 15/07/13 22:48:31 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
> 15/07/13 22:48:32 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43470.
> 15/07/13 22:48:32 INFO netty.NettyBlockTransferService: Server created on 43470
> 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 15/07/13 22:48:32 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.111.114.9:43470 with 265.1 MB RAM, BlockManagerId(driver, 10.111.114.9, 43470)
> 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Registered BlockManager
> 15/07/13 22:48:32 INFO impl.TimelineClientImpl: Timeline service address: http://lxhnl002.ad.ing.net:8188/ws/v1/timeline/
> 15/07/13 22:48:33 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
> 15/07/13 22:48:33 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> 15/07/13 22:48:33 INFO retry.RetryInvocationHandler: Exception while invoking getClusterNodes of class ApplicationClientProtocolPBClientImpl over rm2 after 1 fail over attempts. Trying to fail over after sleeping for 32582ms.
> java.net.ConnectException: Call From lxhnl006.ad.ing.net/10.111.114.9 to lxhnl013.ad.ing.net:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1472)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>     at com.sun.proxy.$Proxy24.getClusterNodes(Unknown Source)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy25.getClusterNodes(Unknown Source)
>     at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
>     at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
>     at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
>     at org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:1993)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:544)
>     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>     at py4j.Gateway.invoke(Gateway.java:214)
>     at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
>     at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
>     at py4j.GatewayConnection.run(GatewayConnection.java:207)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>     at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>     at
org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>     ... 30 more
> If not using --principal and --keytab the same error shows.