This:

Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]

It could happen for many reasons; one of them is insufficient memory.
Are you running all 20 apps on the same node? How are you submitting
the apps (with spark-submit?) I see you have driver memory specified
as 8g; does that mean all 20 apps are running on the same machine and
each will have 8g of driver memory on a 12GB machine?
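
If so, the drivers alone would oversubscribe the node: even two 8g
drivers already exceed a 12GB machine, before executors take their
share. As a rough sketch (the class name, jar, and the 512m figure
are placeholders to tune for your jobs), you could lower the per-app
driver memory at submit time:

# hypothetical example: class name, jar and memory sizes are placeholders
./bin/spark-submit \
  --master spark://master:7077 \
  --driver-memory 512m \
  --executor-memory 1g \
  --class com.example.YourApp \
  your-app.jar

--driver-memory on the command line overrides spark.driver.memory from
spark-defaults.conf for that submission.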

Can you look at each worker node's stderr and see the exact reason?
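
Also, the 30 second figure matches the default Akka lookup timeout in
Spark 1.x (spark.akka.lookupTimeout, 30 seconds), so if the stderr
points at slow executor registration rather than OOM, raising it may
buy some time. This is only an assumption about which timeout is
firing; values below are in seconds:

# assumption: the 30s timeout is the Akka lookup timeout (Spark 1.x)
spark.akka.lookupTimeout     120
spark.akka.timeout           300

That only masks the contention, though; reducing the memory pressure
is the real fix.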


Thanks
Best Regards

On Mon, Jun 29, 2015 at 3:08 PM, <luohui20...@sina.com> wrote:

> Hi there
>
>       I am running 30 APPs in my Spark cluster, and some of the APPs got an
> exception like the one below:
>
> [root@slave3 0]# cat stderr
> 15/06/29 17:20:08 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
> 15/06/29 17:20:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 15/06/29 17:20:09 INFO spark.SecurityManager: Changing view acls to: root
> 15/06/29 17:20:09 INFO spark.SecurityManager: Changing modify acls to: root
> 15/06/29 17:20:09 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
> 15/06/29 17:20:09 INFO slf4j.Slf4jLogger: Slf4jLogger started
> 15/06/29 17:20:09 INFO Remoting: Starting remoting
> 15/06/29 17:20:10 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@slave3:51026]
> 15/06/29 17:20:10 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 51026.
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
>         at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:128)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:224)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
>         at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>         at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>         at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>         at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>         at scala.concurrent.Await$.result(package.scala:107)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:144)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         ... 4 more
>
> When I am running 20 APPs, it is OK. So I suspect the executors get
> disassociated from the driver due to high I/O pressure or network latency;
> however, I have no idea which Spark parameter could fix this. Any idea
> will be appreciated.
>
>
> Here is some information about my cluster:
>
> 1 master and 6 workers; every node has 8 cores and 12GB memory.
>
>
> The settings in my spark-defaults.conf and spark-env.sh are like this:
>
>
> spark-defaults.conf
>
> spark.master                     spark://master:7077
> spark.eventLog.enabled           true
> spark.eventLog.dir               /var/log/spark
> spark.serializer                 org.apache.spark.serializer.KryoSerializer
> spark.driver.memory              8g
> spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
> spark.kryoserializer.buffer.max.mb    128
> spark.storage.memoryFraction     0.2
> spark.shuffle.memoryFraction     0.4
> spark.sql.shuffle.partitions     32
> spark.scheduler.mode             FAIR
> spark.worker.cleanup.appDataTtl  259200
> spark.port.maxRetries            10000
>
> spark.scheduler.maxRegisteredResourcesWaitingTime   40
>
>
> spark-env.sh:
>
> export SPARK_WORKER_INSTANCES=1
> export SPARK_EXECUTOR_INSTANCES=8
> export SPARK_EXECUTOR_CORES=1
> export SPARK_EXECUTOR_MEMORY=1g
>
>
>
> --------------------------------
>
> Thanks & Best regards!
> San.Luo
>
