Hi, I'm getting the following error when running SparkPi on a clean, freshly compiled and checked Mesos 0.29.0 installation with Spark 1.6.1:
16/05/15 23:05:52 ERROR TaskSchedulerImpl: Lost executor e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/0 on xxx Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

The Mesos examples run fine; only the SparkPi example fails. I'm not sure what to do. I thought it had to do with the installation, so I installed and compiled everything again, but that did not help.

Please help, thanks in advance,
Richard

The complete logs are:

sudo ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master mesos://192.168.33.10:5050 --deploy-mode client ./lib/spark-examples* 10
16/05/15 23:05:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
I0515 23:05:38.393546 10915 sched.cpp:224] Version: 0.29.0
I0515 23:05:38.402220 10909 sched.cpp:328] New master detected at master@192.168.33.10:5050
I0515 23:05:38.403033 10909 sched.cpp:338] No credentials provided. Attempting to register without authentication
I0515 23:05:38.431784 10909 sched.cpp:710] Framework registered with e23f2d53-22c5-40f0-918d-0d73805fdfec-0006
Pi is roughly 3.145964
16/05/15 23:05:52 ERROR TaskSchedulerImpl: Lost executor e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/0 on xxx: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
16/05/15 23:05:52 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorRemoved(1463346352364,e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/0,Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.)
I0515 23:05:52.380164 10810 sched.cpp:1921] Asked to stop the driver
I0515 23:05:52.382272 10910 sched.cpp:1150] Stopping framework 'e23f2d53-22c5-40f0-918d-0d73805fdfec-0006'

The Mesos sandbox gives the following messages in STDERR:

16/05/15 23:05:52 INFO Executor: Finished task 7.0 in stage 0.0 (TID 7). 1029 bytes result sent to driver
16/05/15 23:05:52 INFO CoarseGrainedExecutorBackend: Got assigned task 8
16/05/15 23:05:52 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
16/05/15 23:05:52 INFO Executor: Finished task 8.0 in stage 0.0 (TID 8). 1029 bytes result sent to driver
16/05/15 23:05:52 INFO CoarseGrainedExecutorBackend: Got assigned task 9
16/05/15 23:05:52 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
16/05/15 23:05:52 INFO Executor: Finished task 9.0 in stage 0.0 (TID 9). 1029 bytes result sent to driver
16/05/15 23:05:52 INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
16/05/15 23:05:52 INFO MemoryStore: MemoryStore cleared
16/05/15 23:05:52 INFO BlockManager: BlockManager stopped
16/05/15 23:05:52 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/05/15 23:05:52 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/05/15 23:05:52 WARN CoarseGrainedExecutorBackend: An unknown (anabrix:45663) driver disconnected.
16/05/15 23:05:52 ERROR CoarseGrainedExecutorBackend: Driver 192.168.33.10:45663 disassociated! Shutting down.
I0515 23:05:52.388991 11120 exec.cpp:399] Executor asked to shutdown
16/05/15 23:05:52 INFO ShutdownHookManager: Shutdown hook called
16/05/15 23:05:52 INFO ShutdownHookManager: Deleting directory /tmp/mesos/slaves/e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/frameworks/e23f2d53-22c5-40f0-918d-0d73805fdfec-0006/executors/0/runs/b9df4275-a597-4b8e-9a7b-45e7fb79bd93/spark-a99d0380-2d0d-4bbd-a593-49ad885e5430

And the following messages in STDOUT:

Registered executor on xxx
Starting task 0
sh -c 'cd spark-1*; ./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.33.10:45663 --executor-id e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/0 --hostname xxx --cores 1 --app-id e23f2d53-22c5-40f0-918d-0d73805fdfec-0006'
Forked command at 11124
Shutting down
Sending SIGTERM to process tree at pid 11124
Sent SIGTERM to the following process trees:
[
-+- 11124 sh -c cd spark-1*; ./bin/spark-class org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.33.10:45663 --executor-id e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/0 --hostname xxx --cores 1 --app-id e23f2d53-22c5-40f0-918d-0d73805fdfec-0006
 \--- 11125 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-0.b14.el7_2.x86_64/jre/bin/java -cp /tmp/mesos/slaves/e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/frameworks/e23f2d53-22c5-40f0-918d-0d73805fdfec-0006/executors/0/runs/b9df4275-a597-4b8e-9a7b-45e7fb79bd93/spark-1.6.1-bin-hadoop2.6/conf/:/tmp/mesos/slaves/e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/frameworks/e23f2d53-22c5-40f0-918d-0d73805fdfec-0006/executors/0/runs/b9df4275-a597-4b8e-9a7b-45e7fb79bd93/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar:/tmp/mesos/slaves/e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/frameworks/e23f2d53-22c5-40f0-918d-0d73805fdfec-0006/executors/0/runs/b9df4275-a597-4b8e-9a7b-45e7fb79bd93/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/tmp/mesos/slaves/e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/frameworks/e23f2d53-22c5-40f0-918d-0d73805fdfec-0006/executors/0/runs/b9df4275-a597-4b8e-9a7b-45e7fb79bd93/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/tmp/mesos/slaves/e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/frameworks/e23f2d53-22c5-40f0-918d-0d73805fdfec-0006/executors/0/runs/b9df4275-a597-4b8e-9a7b-45e7fb79bd93/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar -Xms512m -Xmx512m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.33.10:45663 --executor-id e23f2d53-22c5-40f0-918d-0d73805fdfec-S0/0 --hostname xxx --cores 1 --app-id e23f2d53-22c5-40f0-918d-0d73805fdfec-0006
]
Command terminated with signal Terminated (pid: 11124)