Could you describe in more detail what you are trying to do? Here are some
basics of submitting Spark jobs programmatically:

- Create a SparkContext instance and use that to build your RDDs (a minimal
sketch follows this list).
- You can only have one SparkContext per JVM, so if you need to satisfy
concurrent job requests you would need to manage the SparkContext as a shared
resource on that server (see the second sketch below). Keep in mind that if
something goes wrong with that SparkContext, all running jobs will probably
end up in a failed state and you'd need to obtain a new SparkContext.
- There are System.exit calls built into Spark as of now that can kill your
running JVM. We have shadowed some of the most offensive bits within our own
application to work around this; you'd likely want to do the same or maintain
your own Spark fork. For example, if the SparkContext can't connect to your
cluster master node when it is created, it will call System.exit.
- You'll need to provide all of the classes that your platform uses in its
jobs on the classpath of the Spark cluster. We do this with a JAR file that
the SparkContext loads dynamically from S3 (as in the first sketch below),
but there are other options.
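
Here's a minimal sketch of the first and last points. The app name, master
URL, and jar location are placeholders for whatever your deployment actually
uses; we happen to host our jar on S3, but any URL the executors can fetch
works:

    import org.apache.spark.{SparkConf, SparkContext}

    // All of these values are illustrative -- substitute your own.
    val conf = new SparkConf()
      .setAppName("analytics-platform")
      .setMaster("spark://your-master-host:7077")
      // Ship the jar containing your job classes to the executors.
      // Ours happens to live on S3; any URL the workers can reach is fine.
      .setJars(Seq("https://your-bucket.s3.amazonaws.com/platform-jobs.jar"))

    val sc = new SparkContext(conf)

    // Build RDDs from the context and run jobs as usual.
    val doubledSum = sc.parallelize(1 to 1000).map(_ * 2).sum()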
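
And a rough sketch of treating the SparkContext as a shared, recoverable
resource. The object and method names (SharedSparkContext, get, invalidate)
are just illustrative, not anything Spark provides:

    import org.apache.spark.{SparkConf, SparkContext}

    object SharedSparkContext {
      private var sc: Option[SparkContext] = None

      private def newConf(): SparkConf =
        new SparkConf()
          .setAppName("analytics-platform")
          .setMaster("spark://your-master-host:7077")

      // Hand out the single context, creating it lazily on first use
      // or after it has been invalidated.
      def get(): SparkContext = synchronized {
        if (sc.isEmpty) sc = Some(new SparkContext(newConf()))
        sc.get
      }

      // Call this when a job fails because the context died, so the
      // next request gets a fresh one.
      def invalidate(): Unit = synchronized {
        sc.foreach { ctx => try ctx.stop() catch { case _: Throwable => () } }
        sc = None
      }
    }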

On Mon, Apr 20, 2015 at 10:12 PM, firemonk9 <dhiraj.peech...@gmail.com>
wrote:

> I have built a data analytics SaaS platform by creating REST endpoints;
> based on the type of job request I invoke the necessary Spark job(s) and
> return the results as JSON (async). I used yarn-client mode to submit the
> jobs to the YARN cluster.
>
> Hope this helps.
