Is there any difference in the performance of Spark standalone mode and YARN when it comes to initializing a new Spark job?
In my application, response time is absolutely critical, and I'm hoping to have the executors working within a few seconds of submitting the job. Both options ran quickly for me (running the SparkPi example) on a single-node cluster: it took only a couple of seconds until the executors began work. On my 10-node cluster, YARN takes over 10 seconds before the executors actually begin work.

Could I expect Spark standalone to get going any quicker? If so, I will take the time to configure it on the 10-node cluster.

Why does the example run so much quicker on my local single-node cluster than on my 10 EC2 m1.larges?

Aside from YARN being able to schedule Spark, MRv2 and other job types, are there any major differences between Spark standalone and YARN?

Thanks.
- Dan

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Job-initialization-performance-of-Spark-standalone-mode-vs-YARN-tp2016.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.