Re: spark keeps on creating executors and each one fails with "TransportClient has not yet been set."
Does anyone have any idea what I could enable to find out what it is trying to connect to?

On Thu, Mar 2, 2017 at 5:34 PM, Aseem Bansal wrote:
> Is there a way to find out what it is trying to connect to? I am running
> my Spark client from within a Docker container, so I opened up various ports
> as per
> http://stackoverflow.com/questions/27729010/how-to-configure-apache-spark-random-worker-ports-for-tight-firewalls
> after adding all the properties to conf/spark-defaults.conf on my Spark
> cluster's installation.
>
> In the stdout of the executor I can see the following debug message:
>
> 17/03/02 12:01:17 DEBUG UserGroupInformation: PrivilegedActionException
> as:root (auth:SIMPLE) cause:org.apache.spark.rpc.RpcTimeoutException:
> Cannot receive any reply in 120 seconds. This timeout is controlled by
> spark.rpc.askTimeout
>
> What is it trying to connect to?
spark keeps on creating executors and each one fails with "TransportClient has not yet been set."
Is there a way to find out what it is trying to connect to? I am running my Spark client from within a Docker container, so I opened up various ports as per http://stackoverflow.com/questions/27729010/how-to-configure-apache-spark-random-worker-ports-for-tight-firewalls after adding all the properties to conf/spark-defaults.conf on my Spark cluster's installation.

In the stdout of the executor I can see the following debug message:

17/03/02 12:01:17 DEBUG UserGroupInformation: PrivilegedActionException as:root (auth:SIMPLE) cause:org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 120 seconds. This timeout is controlled by spark.rpc.askTimeout

What is it trying to connect to?

17/03/02 11:46:47 INFO SecurityManager: Changing modify acls groups to:
17/03/02 11:46:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ec2-user, root); groups with view permissions: Set(); users with modify permissions: Set(ec2-user, root); groups with modify permissions: Set()
java.lang.IllegalArgumentException: requirement failed: TransportClient has not yet been set.
        at scala.Predef$.require(Predef.scala:224)
        at org.apache.spark.rpc.netty.RpcOutboxMessage.onTimeout(Outbox.scala:70)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$ask$1.applyOrElse(NettyRpcEnv.scala:232)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$ask$1.applyOrElse(NettyRpcEnv.scala:231)
        at scala.concurrent.Future$$anonfun$onFailure$1.apply(Future.scala:138)
        at scala.concurrent.Future$$anonfun$onFailure$1.apply(Future.scala:136)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
        at org.spark_project.guava.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
        at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:136)
        at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
        at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
        at scala.concurrent.Promise$class.tryFailure(Promise.scala:112)
        at scala.concurrent.impl.Promise$DefaultPromise.tryFailure(Promise.scala:153)
        at org.apache.spark.rpc.netty.NettyRpcEnv.org$apache$spark$rpc$netty$NettyRpcEnv$$onFailure$1(NettyRpcEnv.scala:205)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anon$1.run(NettyRpcEnv.scala:239)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 120 seconds. This timeout is controlled by spark.rpc.askTimeout
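
One way to narrow down the "what is it trying to connect to?" question is to test the driver address directly from the machine where the executor runs. The trace above fails inside CoarseGrainedExecutorBackend, which, as far as I know, is started with a --driver-url argument of the form spark://CoarseGrainedScheduler@<host>:<port> (it should be visible in the executor's launch command). The Python sketch below is only an illustration under that assumption: parse_driver_url and can_connect are hypothetical helper names, not part of Spark, and the regular expression does not handle IPv6 addresses.

```python
import re
import socket


def parse_driver_url(url):
    """Split a spark://[name@]host:port URL into a (host, port) tuple."""
    match = re.match(r"^spark://(?:[^@/]+@)?([^:/@]+):(\d+)$", url)
    if match is None:
        raise ValueError("not a spark://host:port URL: %r" % (url,))
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Run from the executor host, a call such as can_connect(*parse_driver_url(url)), with url set to the driver URL copied from the executor's launch command, returns False when the driver's RPC port is not reachable from there. In a setup like the one described, that would point at the Docker port mapping or the address the driver advertises rather than at the timeout setting itself.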