The linked thread does a good job of answering your question. You should
create a SparkContext at startup and reuse it for all of your queries. For
example, we create a SparkContext in a web server at startup and then use
the Spark cluster to serve Ajax queries with latencies of a second or less.
The executors keep running for as long as the context is alive, so starting
a job incurs minimal overhead.
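
A minimal sketch of that pattern in PySpark, in case it helps. Everything
named here (the module-level holder, the "query-server" app name, the
local[2] master URL, and the handle_query function) is an illustrative
assumption, not code from a real server. The point is only that the context
is built once and every request runs its job on that same context.

```python
import threading

_lock = threading.Lock()
_sc = None  # the single long-lived SparkContext for this process


def default_factory():
    # Imported lazily so this module loads even where pyspark is absent.
    from pyspark import SparkConf, SparkContext

    # Assumed settings: a real deployment would point at its own cluster.
    conf = SparkConf().setAppName("query-server").setMaster("local[2]")
    return SparkContext(conf=conf)


def get_context(factory=default_factory):
    """Return the shared context, creating it on first use only."""
    global _sc
    with _lock:
        if _sc is None:
            _sc = factory()
        return _sc


def handle_query(n, factory=default_factory):
    """Hypothetical request handler: runs one small job per call."""
    sc = get_context(factory)
    # Only the job is submitted here; the context and its executors
    # already exist after the first call.
    return sc.parallelize(range(n)).sum()
```

A web server would call get_context() once during startup so the first
request does not pay for context creation, and then call something like
handle_query() from each request handler.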


On Thu, Apr 17, 2014 at 8:02 PM, Jim Carroll <jimfcarr...@gmail.com> wrote:

> Is there a way to create continuously-running, or at least
> continuously-loaded, jobs that can be 'invoked' rather than 'sent' to
> avoid the job creation overhead of a couple seconds?
>
> I read through the following:
>
> http://apache-spark-user-list.1001560.n3.nabble.com/Job-initialization-performance-of-Spark-standalone-mode-vs-YARN-td2016.html
>
> Thanks.
> Jim
