[
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629432#comment-14629432
]
Bolke de Bruin edited comment on SPARK-9019 at 7/16/15 8:39 AM:
----------------------------------------------------------------
I tried running this on an updated environment where YARN-3103 was fixed; however, it still fails, although the behavior is somewhat different now. The task is now accepted but stays in the RUNNING state forever without executing anything. Note that the trace below is without keytab usage, but with an authenticated user (kinit admin/admin):
15/07/16 04:27:34 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@53abb73
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] received message AkkaMessage(ReviveOffers,false) from Actor[akka://sparkDriver/deadLetters]
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: Received RPC message: AkkaMessage(ReviveOffers,false)
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] handled message (1.632126 ms) AkkaMessage(ReviveOffers,false) from Actor[akka://sparkDriver/deadLetters]
15/07/16 04:27:34 DEBUG AbstractService: Service org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl is started
15/07/16 04:27:34 DEBUG AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
15/07/16 04:27:34 DEBUG Client: The ping interval is 60000 ms.
15/07/16 04:27:34 DEBUG Client: Connecting to node6.local/10.79.10.6:8050
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedAction as:admin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/16 04:27:34 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE
15/07/16 04:27:34 DEBUG SaslRpcClient: Received SASL message state: NEGOTIATE
auths {
  method: "TOKEN"
  mechanism: "DIGEST-MD5"
  protocol: ""
  serverId: "default"
  challenge: "realm=\"default\",nonce=\"wjgFp9L22uDJt41FNtY9M8CP/T+dswfBoF48r9+s\",qop=\"auth\",charset=utf-8,algorithm=md5-sess"
}
auths {
  method: "KERBEROS"
  mechanism: "GSSAPI"
  protocol: "rm"
  serverId: "node6.local"
}
15/07/16 04:27:34 DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@69990fa7
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Looking for a token with service 10.79.10.6:8050
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is YARN_AM_RM_TOKEN and the token's service name is
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is HIVE_DELEGATION_TOKEN and the token's service name is
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is TIMELINE_DELEGATION_TOKEN and the token's service name is 10.79.10.6:8188
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is HDFS_DELEGATION_TOKEN and the token's service name is 10.79.10.4:8020
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedActionException as:admin (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedAction as:admin (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
15/07/16 04:27:34 WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedActionException as:admin (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG Client: closing ipc connection to node6.local/10.79.10.6:8050: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy30.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy31.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
    at org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:2016)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:552)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
    ... 33 more
15/07/16 04:27:34 DEBUG Client: IPC Client (2062834323) connection to node6.local/10.79.10.6:8050 from admin: closed
15/07/16 04:27:34 INFO YarnClusterSchedulerBackend: Node Report API is not available in the version of YARN being used, so AM logs link will not appear in application UI
java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "node6/10.79.10.6"; destination host is: "node6.local":8050;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy30.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy31.getClusterNodes(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
    at org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:2016)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:552)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 30 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
    ... 33 more
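The token-selection step in the trace can be sketched as follows. This is a minimal, illustrative Python sketch of the behavior implied by the RMDelegationTokenSelector log lines, not Hadoop's actual code: the selector only accepts a token of the expected kind whose service string matches the RM address exactly, so none of the four cached tokens qualifies, and the client then falls through to Kerberos, which also fails because the UGI login is auth:SIMPLE.

```python
# Hypothetical sketch of token selection by (kind, service); the function
# name and token tuples are illustrative, not Hadoop's API.
def select_token(tokens, wanted_kind, wanted_service):
    """Return the first token matching both kind and service, else None."""
    for kind, service in tokens:
        if kind == wanted_kind and service == wanted_service:
            return (kind, service)
    return None

# The four tokens reported by RMDelegationTokenSelector in the trace above:
cached = [
    ("YARN_AM_RM_TOKEN", ""),
    ("HIVE_DELEGATION_TOKEN", ""),
    ("TIMELINE_DELEGATION_TOKEN", "10.79.10.6:8188"),
    ("HDFS_DELEGATION_TOKEN", "10.79.10.4:8020"),
]

# The selector wants an RM delegation token for service 10.79.10.6:8050;
# no cached token has that kind and service, so nothing is selected.
print(select_token(cached, "RM_DELEGATION_TOKEN", "10.79.10.6:8050"))  # None
```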
was (Author: bolke):
I tried running this on an update environment, however it still fails although
behavior is a bit different now. The task is now being accepted but stays in
the running state forever without executing anything. Please note that the
trace below is without key tab usage, but with an authorized user (kinit
admin/admin)
> spark-submit fails on yarn with kerberos enabled
> ------------------------------------------------
>
> Key: SPARK-9019
> URL: https://issues.apache.org/jira/browse/SPARK-9019
> Project: Spark
> Issue Type: Bug
> Components: Spark Submit
> Affects Versions: 1.5.0
> Environment: Hadoop 2.6 with YARN and kerberos enabled
> Reporter: Bolke de Bruin
> Labels: kerberos, spark-submit, yarn
>
> It is not possible to run jobs using spark-submit on yarn with a kerberized cluster.
> Commandline:
> /usr/hdp/2.2.0.0-2041/spark-1.5.0/bin/spark-submit --principal sparkjob --keytab sparkjob.keytab --num-executors 3 --executor-cores 5 --executor-memory 5G --master yarn-cluster /tmp/get_peers.py
> Fails with:
> 15/07/13 22:48:31 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/07/13 22:48:31 INFO server.AbstractConnector: Started [email protected]:58380
> 15/07/13 22:48:31 INFO util.Utils: Successfully started service 'SparkUI' on port 58380.
> 15/07/13 22:48:31 INFO ui.SparkUI: Started SparkUI at http://10.111.114.9:58380
> 15/07/13 22:48:31 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
> 15/07/13 22:48:31 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
> 15/07/13 22:48:32 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43470.
> 15/07/13 22:48:32 INFO netty.NettyBlockTransferService: Server created on 43470
> 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 15/07/13 22:48:32 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.111.114.9:43470 with 265.1 MB RAM, BlockManagerId(driver, 10.111.114.9, 43470)
> 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Registered BlockManager
> 15/07/13 22:48:32 INFO impl.TimelineClientImpl: Timeline service address: http://lxhnl002.ad.ing.net:8188/ws/v1/timeline/
> 15/07/13 22:48:33 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
> 15/07/13 22:48:33 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> 15/07/13 22:48:33 INFO retry.RetryInvocationHandler: Exception while invoking getClusterNodes of class ApplicationClientProtocolPBClientImpl over rm2 after 1 fail over attempts. Trying to fail over after sleeping for 32582ms.
> java.net.ConnectException: Call From lxhnl006.ad.ing.net/10.111.114.9 to lxhnl013.ad.ing.net:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> at org.apache.hadoop.ipc.Client.call(Client.java:1399)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> at com.sun.proxy.$Proxy24.getClusterNodes(Unknown Source)
> at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy25.getClusterNodes(Unknown Source)
> at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
> at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
> at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:73)
> at scala.Option.foreach(Option.scala:236)
> at org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend.getDriverLogUrls(YarnClusterSchedulerBackend.scala:73)
> at org.apache.spark.SparkContext.postApplicationStart(SparkContext.scala:1993)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:544)
> at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:214)
> at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
> at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
> at py4j.GatewayConnection.run(GatewayConnection.java:207)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
> at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
> at org.apache.hadoop.ipc.Client.call(Client.java:1438)
> ... 30 more
> If not using --principal and --keytab the same error shows.
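The first failure in both traces comes out of SaslRpcClient.selectSaslClient: the server offers TOKEN (DIGEST-MD5) and KERBEROS (GSSAPI), and the client can use neither, since it has no matching delegation token and only a SIMPLE (non-Kerberos) login. A rough Python sketch of that selection logic (illustrative only; the function and parameter names are not Hadoop's API):

```python
def select_sasl_method(offered, has_matching_token, has_kerberos_login):
    """Pick the first server-offered SASL method the client can actually
    use; raise the familiar error when none applies (illustrative sketch):
    TOKEN needs a matching delegation token, KERBEROS needs a Kerberos
    login."""
    for method in offered:
        if method == "TOKEN" and has_matching_token:
            return method
        if method == "KERBEROS" and has_kerberos_login:
            return method
    raise PermissionError("Client cannot authenticate via:%s" % offered)

# As in the traces: TOKEN and KERBEROS are offered, but the client has
# neither a matching RM delegation token nor Kerberos credentials
# (its UGI login is auth:SIMPLE), so selection fails.
try:
    select_sasl_method(["TOKEN", "KERBEROS"], False, False)
except PermissionError as e:
    print(e)  # Client cannot authenticate via:['TOKEN', 'KERBEROS']
```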
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)