Hey, this seems to be a problem in the docs about how to set the executor URI. 
It looks like the SPARK_EXECUTOR_URI variable is not actually used. Instead, 
set the spark.executor.uri Java system property using 
System.setProperty("spark.executor.uri", "<your URI>") before you create a 
SparkContext.
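For a standalone driver (outside spark-shell, where the context is created for you), that looks roughly like the sketch below. The class name is hypothetical, and the HDFS URI is the one from Bart's setup, used purely for illustration:

```java
public class WordCountDriver {
    public static void main(String[] args) {
        // The property must be set *before* the SparkContext is constructed,
        // because the Mesos scheduler backend reads it when it starts up.
        System.setProperty("spark.executor.uri",
            "hdfs://hdfs-namenode:8020/spark/spark-0.8.0-incubating.tar.gz");

        // With the property in place, create the context as usual, e.g.:
        // JavaSparkContext sc = new JavaSparkContext(
        //     "mesos://mesos-master:5050", "WordCount");

        // Sanity check: prints the HDFS URI set above.
        System.out.println(System.getProperty("spark.executor.uri"));
    }
}
```

Setting the property after the context exists has no effect, which is the usual cause of the executor tarball not being fetched.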

Matei

On Oct 12, 2013, at 2:26 AM, Bart Vercammen <[email protected]> wrote:

> Hi,
> 
> I have an issue getting Spark jobs to run on a mesos cluster.
> (most probably a config issue, I hope, but let me explain what I did):
> 
> - installed mesos on a cluster (1 master and 3 workers) with 
> zookeeper support.
> - mesos is running fine:
>   curl 'http://mesos-master:5050/master/state.json' | python -mjson.tool 
> shows me the master and the slaves
> - then, as mentioned in the spark readme, I created a Spark distribution with 
> 'make-distribution.sh' and uploaded it to HDFS (=> 
> spark/spark-0.8.0-incubating.tar.gz)
> - I configured the environment variables on all the instances:
>    * 
> SPARK_EXECUTOR_URI="hdfs://hdfs-namenode:8020/spark/spark-0.8.0-incubating.tar.gz"
>    * MESOS_NATIVE_LIBRARY="/usr/local/lib/libmesos.so"
> 
> When I start spark-shell, it starts up fine,
> MASTER=mesos://mesos-master:5050 ./spark-shell
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 0.8.0
>       /_/
> 
> Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_45)
> Initializing interpreter...
> 13/10/11 15:04:34 INFO server.Server: jetty-7.x.y-SNAPSHOT
> 13/10/11 15:04:34 INFO server.AbstractConnector: Started 
> [email protected]:38491
> Creating SparkContext...
> 13/10/11 15:04:50 INFO slf4j.Slf4jEventHandler: Slf4jEventHandler started
> 13/10/11 15:04:50 INFO spark.SparkEnv: Registering BlockManagerMaster
> 13/10/11 15:04:50 INFO storage.MemoryStore: MemoryStore started with capacity 
> 326.7 MB.
> 13/10/11 15:04:50 INFO storage.DiskStore: Created local directory at 
> /tmp/spark-local-20131011150450-3616
> 13/10/11 15:04:50 INFO network.ConnectionManager: Bound socket to port 36220 
> with id = ConnectionManagerId(ip-*****,36220)
> 13/10/11 15:04:50 INFO storage.BlockManagerMaster: Trying to register 
> BlockManager
> 13/10/11 15:04:50 INFO storage.BlockManagerMaster: Registered BlockManager
> 13/10/11 15:04:50 INFO server.Server: jetty-7.x.y-SNAPSHOT
> 13/10/11 15:04:50 INFO server.AbstractConnector: Started 
> [email protected]:60233
> 13/10/11 15:04:50 INFO broadcast.HttpBroadcast: Broadcast server started at 
> http://*****:60233
> 13/10/11 15:04:50 INFO spark.SparkEnv: Registering MapOutputTracker
> 13/10/11 15:04:51 INFO spark.HttpFileServer: HTTP File server directory is 
> /tmp/spark-ca7119f6-4190-4d93-83ed-0fda261f3071
> 13/10/11 15:04:51 INFO server.Server: jetty-7.x.y-SNAPSHOT
> 13/10/11 15:04:51 INFO server.AbstractConnector: Started 
> [email protected]:44652
> 13/10/11 15:04:51 INFO server.Server: jetty-7.x.y-SNAPSHOT
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/storage/rdd,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/storage,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/stages/stage,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/stages/pool,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/stages,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/environment,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/executors,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/metrics/json,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/static,null}
> 13/10/11 15:04:51 INFO handler.ContextHandler: started 
> o.e.j.s.h.ContextHandler{/,null}
> 13/10/11 15:04:51 INFO server.AbstractConnector: Started 
> [email protected]:4040
> 13/10/11 15:04:51 INFO ui.SparkUI: Started Spark Web UI at 
> http://ip-*****:4040
> 13/10/11 15:04:51 INFO mesos.MesosSchedulerBackend: Registered as framework 
> ID 201310110648-2269968906-5050-12731-0007
> Spark context available as sc.
> Type in expressions to have them evaluated.
> Type :help for more information.
> 
> (the '*****' are of course the masked IPs of the instances ;-)
> 
> but when I want to launch a job (e.g. a word-count of something on HDFS), I 
> can see the mesos-slaves shooting into action, and I also see the 'tasks' 
> popping up in the mesos-UI,
> but the tasks are failing with the following error:
> 13/10/11 15:06:59 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 
> as 1581 bytes in 0 ms
> 13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Re-queueing tasks for 
> 201310110648-2269968906-5050-12731-34 from TaskSet 0.0
> 13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Lost TID 23 (task 0.0:5)
> 13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Lost TID 21 (task 0.0:4)
> 13/10/11 15:07:05 INFO cluster.ClusterTaskSetManager: Lost TID 22 (task 0.0:1)
> 13/10/11 15:07:05 INFO scheduler.DAGScheduler: Executor lost: 
> 201310110648-2269968906-5050-12731-34 (epoch 6)
> 13/10/11 15:07:05 INFO storage.BlockManagerMasterActor: Trying to remove 
> executor 201310110648-2269968906-5050-12731-34 from BlockManagerMaster.
> 13/10/11 15:07:05 INFO storage.BlockManagerMaster: Removed 
> 201310110648-2269968906-5050-12731-34 successfully in removeExecutor
> 
> In the mesos logs (in stderr on the mesos-slaves):
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/spark/executor/MesosExecutorBackend
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.spark.executor.MesosExecutorBackend
>  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> Could not find the main class: 
> org.apache.spark.executor.MesosExecutorBackend.  Program will exit.
> 
> Can someone explain what might be going on?
> 
> Also: I did not install spark on any of the mesos-slaves, as I am under the 
> assumption that a Spark installation on the worker nodes is no longer needed 
> when running on top of Mesos. Is this a correct assumption? If not, 
> how should I configure this?
> 
> Thanks in advance.
> Greets,
> Bart
