Re: Spark on YARN performance

2014-04-18 Thread Nishkam Ravi
Spark-on-YARN takes 10-30 seconds of setup time for workloads like WordCount and PageRank on a small cluster and thereafter performs as well as Spark standalone, as Tom and Patrick have noted. However, a certain amount of configuration and tuning effort is required to match peak performance.

Re: Spark on Yarn or Mesos?

2014-04-17 Thread Arpit Tak
Hi Wei, Take a look at this post... http://apache-spark-user-list.1001560.n3.nabble.com/Job-initialization-performance-of-Spark-standalone-mode-vs-YARN-td2016.html Regards, Arpit Tak On Thu, Apr 17, 2014 at 3:42 PM, Wei Wang wrote: > Hi, there > > I would like to know is there any differences

Re: Spark on YARN performance

2014-04-11 Thread Mayur Rustagi
I am using Mesos right now and it works great. Mesos has fine-grained as well as coarse-grained allocation and is really useful for prioritizing different pipelines. On Apr 11, 2014 1:19 PM, "Patrick Wendell" wrote: > To reiterate what Tom was saying - the code that runs inside of Spark on > YARN is exa

Re: Spark on YARN performance

2014-04-11 Thread Patrick Wendell
To reiterate what Tom was saying - the code that runs inside of Spark on YARN is exactly the same code that runs in any deployment mode. There shouldn't be any performance difference once your application starts (assuming you are comparing apples-to-apples in terms of hardware). The differences ar

Re: Spark on YARN performance

2014-04-11 Thread Tom Graves
I haven't run on Mesos before, but I do run on YARN. The performance differences are going to be in how long it takes you to get the executors allocated. On YARN that is going to depend on the cluster setup. If you have dedicated resources to a queue where you are running your spark job the ov

Re: Spark on YARN performance

2014-04-10 Thread Flavio Pompermaier
Thank you for the reply Mayur, it would be nice to have a comparison about that. I hope one day it will be available, or to have the time to test it myself :) So you're using Mesos for the moment, right? What are the main differences in your experience? YARN seems to be more flexible and interopera

Re: Spark on YARN performance

2014-04-10 Thread Mayur Rustagi
I've had better luck with standalone in terms of speed and latency. I think there is an impact, but not a very large one. The bigger impact is on being able to manage resources and share the cluster. Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur_rustagi
