Hi Michael,

I see you capped the cores to 60.

I wonder what settings you used for standalone mode that you compared with? I can try to run an MLlib workload on both to compare.

Tim

> On Jan 9, 2015, at 6:42 AM, Michael V Le <m...@us.ibm.com> wrote:
>
> Hi Tim,
>
> Thanks for your response.
>
> The benchmark I used just reads data in from HDFS and builds the Linear
> Regression model using methods from MLlib.
> Unfortunately, for various reasons, I can't open the source code for the
> benchmark at this time.
> I will try to replicate the problem using some sample benchmarks provided by
> the vanilla Spark distribution.
> It is very possible that I have something very screwy in my workload or setup.
>
> The parameters I used for Spark on Mesos are the following:
> driver memory = 1G
> total-executor-cores = 60
> spark.executor.memory 6g
> spark.storage.memoryFraction 0.9
> spark.mesos.coarse = true
>
> The rest are default values, so spark.locality.wait should just be 3000ms.
>
> I launched the Spark job on a separate node from the 10-node cluster using
> spark-submit.
>
> With regards to Mesos in fine-grained mode, do you have a feel for the
> overhead of launching executors for every task? Of course, any perceived
> slowdown will probably be very dependent on the workload. I just want to
> have a feel for the possible overhead (e.g., a factor of 2 or 3 slowdown?).
> If not a data locality issue, perhaps this overhead can be a factor in the
> slowdown I observed, at least in the fine-grained case.
>
> BTW: I'm using Spark ver 1.1.0 and Mesos ver 0.20.0
>
> Thanks,
> Mike
>
>
> Tim Chen ---01/08/2015 03:04:51 PM---How did you run this benchmark,
> and is there an open version I can try it with?
>
> From: Tim Chen <t...@mesosphere.io>
> To: Michael V Le/Watson/IBM@IBMUS
> Cc: user <user@spark.apache.org>
> Date: 01/08/2015 03:04 PM
> Subject: Re: Data locality running Spark on Mesos
>
>
>
> How did you run this benchmark, and is there an open version I can try it with?
> And what is your configuration, like spark.locality.wait, etc?
>
> Tim
>
> On Thu, Jan 8, 2015 at 11:44 AM, mvle <m...@us.ibm.com> wrote:
> Hi,
>
> I've noticed running Spark apps on Mesos is significantly slower compared to
> stand-alone or Spark on YARN.
> I don't think it should be the case, so I am posting the problem here in
> case someone has some explanation or can point me to some configuration
> options I've missed.
>
> I'm running the LinearRegression benchmark with a dataset of 48.8GB.
> On a 10-node stand-alone Spark cluster (each node 4-core, 8GB of RAM),
> I can finish the workload in about 5 min (I don't remember exactly).
> The data is loaded into HDFS spanning the same 10-node cluster.
> There are 6 worker instances per node.
>
> However, when running the same workload on the same cluster but now with
> Spark on Mesos (coarse-grained mode), the execution time is somewhere around
> 15 min. Actually, I tried fine-grained mode, giving each Mesos node 6
> VCPUs (to hopefully get 6 executors like the stand-alone test), and I still got
> roughly 15 min.
>
> I've noticed that when Spark is running on Mesos, almost all tasks execute
> with locality NODE_LOCAL (even with Mesos in coarse-grained mode). On
> stand-alone, the locality is mostly PROCESS_LOCAL.
>
> I think this locality issue might be the reason for the slowdown, but I
> can't figure out why, especially for coarse-grained mode, as the executors
> supposedly do not go away until job completion.
>
> Any ideas?
>
> Thanks,
> Mike
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Data-locality-running-Spark-on-Mesos-tp21041.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
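[Editor's note: the Mesos settings Michael lists in this thread map onto a spark-submit invocation roughly as follows. This is a sketch against Spark 1.1; the master URL, jar path, benchmark class, and HDFS path are placeholders, not details from the thread.]

```shell
# Sketch of the coarse-grained Mesos submission described above.
# The Mesos master URL, class name, jar, and input path are placeholders.
spark-submit \
  --class com.example.LinearRegressionBenchmark \
  --master mesos://zk://mesos-master:2181/mesos \
  --driver-memory 1G \
  --total-executor-cores 60 \
  --conf spark.executor.memory=6g \
  --conf spark.storage.memoryFraction=0.9 \
  --conf spark.mesos.coarse=true \
  --conf spark.locality.wait=3000 \
  benchmark.jar hdfs:///path/to/dataset
```

One way to probe whether the NODE_LOCAL scheduling is the bottleneck would be to raise `spark.locality.wait` (e.g. to 10000), which makes the scheduler hold tasks back longer waiting for a PROCESS_LOCAL slot before falling back to a less local one.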