Spark still supports submitting jobs programmatically, without shell scripts.
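For reference, a minimal sketch of that programmatic path against a standalone cluster; the master URL, app name, jar path, and memory setting below are illustrative placeholders, not values taken from this thread:

    import org.apache.spark.{SparkConf, SparkContext}

    object EmbeddedApp {
      def main(args: Array[String]): Unit = {
        // Configure everything in code instead of via spark-submit flags.
        val conf = new SparkConf()
          .setMaster("spark://master-host:7077")      // placeholder standalone master URL
          .setAppName("embedded-example")             // placeholder app name
          .setJars(Seq("/path/to/app-assembly.jar"))  // jar(s) shipped to the executors
          .set("spark.executor.memory", "2g")

        val sc = new SparkContext(conf)
        try {
          val evens = sc.parallelize(1 to 1000).filter(_ % 2 == 0).count()
          println(s"even numbers: $evens")
        } finally {
          sc.stop()
        }
      }
    }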
Koert,

The main reason that the unification can't be part of SparkContext is that YARN and standalone support deploy modes where the driver runs in a managed process on the cluster. In this case, the SparkContext is created on a remote node well after the application is launched.

On Wed, Jul 9, 2014 at 8:34 AM, Andrei <faithlessfri...@gmail.com> wrote:
> Another +1. For me it's a question of embedding. With
> SparkConf/SparkContext I can easily create larger projects with Spark as a
> separate service (just like MySQL and JDBC, for example). With spark-submit
> I'm bound to Spark as the main framework that defines how my application
> should look. In my humble opinion, using Spark as an embeddable library
> rather than the main framework and runtime is much easier.
>
> On Wed, Jul 9, 2014 at 5:14 PM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> +1 as well for being able to submit jobs programmatically without using a
>> shell script.
>>
>> We also experience issues submitting jobs programmatically without
>> using spark-submit. In fact, even in the Hadoop world, I rarely used
>> "hadoop jar" to submit jobs from the shell.
>>
>> On Wed, Jul 9, 2014 at 9:47 AM, Robert James <srobertja...@gmail.com>
>> wrote:
>>
>>> +1 to be able to do anything via SparkConf/SparkContext. Our app
>>> worked fine in Spark 0.9, but, after several days of wrestling with
>>> uber jars and spark-submit, and so far failing to get Spark 1.0
>>> working, we'd like to go back to doing it ourselves with SparkConf.
>>>
>>> As the previous poster said, a few scripts should be able to give us
>>> the classpath and any other params we need, and be a lot more
>>> transparent and debuggable.
>>>
>>> On 7/9/14, Surendranauth Hiraman <suren.hira...@velos.io> wrote:
>>> > Are there any gaps beyond convenience and code/config separation in
>>> > using spark-submit versus SparkConf/SparkContext if you are willing to
>>> > set your own config?
>>> >
>>> > If there are any gaps, +1 on having parity within SparkConf/SparkContext
>>> > where possible. In my use case, we launch our jobs programmatically. In
>>> > theory, we could shell out to spark-submit, but it's not the best option
>>> > for us.
>>> >
>>> > So far, we are only using Standalone Cluster mode, so I'm not
>>> > knowledgeable about the complexities of other modes, though.
>>> >
>>> > -Suren
>>> >
>>> > On Wed, Jul 9, 2014 at 8:20 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>> >
>>> >> Not sure I understand why unifying how you submit an app for different
>>> >> platforms and dynamic configuration cannot be part of SparkConf and
>>> >> SparkContext?
>>> >>
>>> >> For the classpath, a simple script similar to "hadoop classpath" that
>>> >> shows what needs to be added should be sufficient.
>>> >>
>>> >> On Spark standalone I can launch a program just fine with just SparkConf
>>> >> and SparkContext. Not on YARN, so the spark-launch script must be doing
>>> >> a few things extra there that I am missing... which makes things more
>>> >> difficult, because I am not sure it's realistic to expect every
>>> >> application that needs to run something on Spark to be launched using
>>> >> spark-submit.
>>> >>
>>> >> On Jul 9, 2014 3:45 AM, "Patrick Wendell" <pwend...@gmail.com> wrote:
>>> >>
>>> >>> It fulfills a few different functions. The main one is giving users a
>>> >>> way to inject Spark as a runtime dependency separately from their
>>> >>> program and make sure they get exactly the right version of Spark. So
>>> >>> a user can bundle an application and then use spark-submit to send it
>>> >>> to different types of clusters (or use different versions of Spark).
>>> >>>
>>> >>> It also unifies the way you bundle and submit an app for YARN, Mesos,
>>> >>> etc... this was something that became very fragmented over time before
>>> >>> this was added.
>>> >>>
>>> >>> Another feature is allowing users to set configuration values
>>> >>> dynamically rather than compiling them into their program. That's
>>> >>> the one you mention here. You can choose to use this feature or not.
>>> >>> If you know your configs are not going to change, then you don't need
>>> >>> to set them with spark-submit.
>>> >>>
>>> >>> On Wed, Jul 9, 2014 at 10:22 AM, Robert James <srobertja...@gmail.com>
>>> >>> wrote:
>>> >>> > What is the purpose of spark-submit? Does it do anything outside of
>>> >>> > the standard val conf = new SparkConf ... val sc = new SparkContext
>>> >>> > ... ?
>>> >
>>> > --
>>> > SUREN HIRAMAN, VP TECHNOLOGY
>>> > Velos
>>> > Accelerating Machine Learning
>>> >
>>> > 440 NINTH AVENUE, 11TH FLOOR
>>> > NEW YORK, NY 10001
>>> > O: (917) 525-2466 ext. 105
>>> > F: 646.349.4063
>>> > E: suren.hiraman@velos.io
>>> > W: www.velos.io
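One pattern that follows from Patrick's last point, sketched here as an assumption rather than anything prescribed in the thread: build the SparkConf so it honors whatever spark.* properties spark-submit injects, and only falls back to values set in code when the program is launched directly. The class name and default values below are illustrative placeholders.

    import org.apache.spark.{SparkConf, SparkContext}

    object FlexibleLaunch {
      def main(args: Array[String]): Unit = {
        // new SparkConf() picks up any spark.* properties that spark-submit
        // has already set; when the app is launched directly, they are absent.
        val conf = new SparkConf().setAppName("flexible-launch-example")

        // Fall back to compiled-in defaults only if nothing was injected.
        if (!conf.contains("spark.master")) conf.setMaster("local[*]")
        if (!conf.contains("spark.executor.memory")) conf.set("spark.executor.memory", "1g")

        val sc = new SparkContext(conf)
        try {
          println(sc.parallelize(1 to 100).count())
        } finally {
          sc.stop()
        }
      }
    }

Written this way, the same assembly jar can be handed to spark-submit for a YARN or standalone cluster, or run as an ordinary JVM process when Spark is embedded as a library, which is the use case Andrei and Suren describe above.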