Many thanks for the replies.

The way I currently have my setup is as follows:
6 nodes running Hadoop, with each node holding approximately 5GB of data.
I launched a Spark Master (and Shark via ./shark) on one of the Hadoop nodes
and launched 5 Spark worker nodes on the remaining 5 Hadoop nodes.
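
For reference, here is a minimal sketch (host names and the HDFS path are placeholders, not my actual ones) of how a job points at that standalone master and reads straight out of the co-located HDFS:

import org.apache.spark.SparkContext

// Master URL points at the standalone Spark Master launched on one of the
// Hadoop nodes; host name, port, and HDFS path below are placeholders.
val sc = new SparkContext("spark://hadoop-node1:7077", "LocalityCheck")

// Reading an HDFS path served by the same 6 nodes, so the standalone
// scheduler can place tasks on the workers that hold the blocks.
val lines = sc.textFile("hdfs://hadoop-node1:8020/data/input")
println(lines.count())

sc.stop()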

So I'm assuming the setup above constitutes a standalone deployment? From
reading the documentation, it seems best to run Spark as close as possible
to HDFS, hence my choice of this setup.

Is that the best way to set up Spark to ensure data locality? And would there
be any benefit in running Mesos with that setup as well?

Thanks
Majd
