I actually do not use spark-submit for several use cases, all of which currently 
revolve around running Spark directly with python.
One of the most important ones is developing in PyCharm.
Basically, I am using PyCharm configured with a remote interpreter that runs on 
the server, while PyCharm itself runs on my local Windows machine.
In order to be able to debug effectively (stepping etc.), I want to define a 
run configuration in PyCharm which integrates fully with its debug tools. 
Unfortunately I couldn't figure out a way to use spark-submit for this. 
Instead I chose the following solution:
I defined the project to use the remote interpreter running on the driver in 
the cluster.
I defined environment variables in the run configuration, including setting 
PYTHONPATH to include pyspark and py4j manually, set up the relevant 
configurations (e.g. the relevant jars) and made sure it ended with 

By providing this type of behavior I could debug Spark remotely as if it was 
running locally.

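The environment setup described above can be sketched in plain Python; this is a minimal illustration, not the exact configuration from the thread. The helper name `pyspark_env`, the `SPARK_HOME` layout, and the jar paths are assumptions, though `PYSPARK_SUBMIT_ARGS` (which must end with `pyspark-shell`) is the real mechanism PySpark uses to pick up launch options when started without spark-submit:

```python
import glob
import os


def pyspark_env(spark_home, extra_jars=()):
    """Build the environment a PyCharm run configuration would need to
    start a PySpark driver directly with `python`, without spark-submit.

    `spark_home` and `extra_jars` are illustrative; point them at your
    actual installation.
    """
    python_lib = os.path.join(spark_home, "python")
    # py4j ships as a versioned zip under $SPARK_HOME/python/lib
    # (e.g. py4j-0.10.3-src.zip); glob so the version isn't hardcoded.
    py4j_zips = glob.glob(os.path.join(python_lib, "lib", "py4j-*-src.zip"))
    pythonpath = os.pathsep.join([python_lib] + py4j_zips)

    env = dict(os.environ)
    env["SPARK_HOME"] = spark_home
    env["PYTHONPATH"] = pythonpath
    if extra_jars:
        # Read by PySpark's java gateway at startup; must end with
        # "pyspark-shell" or the gateway refuses to launch.
        env["PYSPARK_SUBMIT_ARGS"] = (
            "--jars " + ",".join(extra_jars) + " pyspark-shell"
        )
    return env
```

In the run configuration these would simply be typed in as environment variables; the function only makes the construction explicit.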
Similar use cases include using standard tools that know how to run a python 
script but are not aware of spark-submit.

I haven't found similar reasons for Scala/Java code though (although I wish 
there was a similar "remote" setup for Scala).

From: RussS [via Apache Spark Developers List] 
Sent: Monday, October 10, 2016 9:14 PM
To: Mendelson, Assaf
Subject: Re: Official Stance on Not Using Spark Submit

Just folks who don't want to use spark-submit, no real use-cases I've seen yet.

I didn't know about SparkLauncher myself, and I don't think there are any 
official docs on it or on launching Spark as an embedded library for tests.

On Mon, Oct 10, 2016 at 11:09 AM Matei Zaharia <[hidden email]> wrote:
What are the main use cases you've seen for this? Maybe we can add a page to 
the docs about how to launch Spark as an embedded library.


On Oct 10, 2016, at 10:21 AM, Russell Spitzer <[hidden email]> wrote:

I actually had not seen SparkLauncher before, that looks pretty great :)

On Mon, Oct 10, 2016 at 10:17 AM Russell Spitzer <[hidden email]> wrote:
I'm definitely only talking about non-embedded uses here, as I also use embedded 
Spark (and Cassandra, and Kafka) to run tests. This is almost always safe since 
everything is in the same JVM. It's only once we get to launching against a 
real distributed environment that we end up with issues.

Since PySpark uses spark-submit in the Java gateway, I'm not sure if that 
matters :)

The cases I see usually involve going through main directly and adding jars, 
which usually ends up with:
- classpath errors (Spark not on the classpath, their jar not on the 
classpath, dependencies not on the classpath),
- conf errors (executors have the incorrect environment, executor classpath 
broken, not understanding that spark-defaults won't do anything),
- jar version mismatches,
- etc.
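A small preflight check can surface several of those failure modes before the driver even starts. This is a hypothetical helper, not anything Spark ships with; the checks and messages are assumptions, though the underlying fact is real: `spark-defaults.conf` is read by spark-submit, not by `SparkConf`, so its settings silently do nothing when you launch directly:

```python
import os


def preflight(env, app_jars=()):
    """Collect likely problems before starting a Spark driver directly
    (without spark-submit). Purely illustrative checks mirroring the
    failure modes listed above."""
    problems = []

    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set; Spark classes won't be found")
    elif os.path.exists(os.path.join(spark_home, "conf", "spark-defaults.conf")):
        # spark-submit reads spark-defaults.conf; a directly-launched
        # driver does not, so these settings will be silently ignored.
        problems.append("spark-defaults.conf present but will be ignored")

    for jar in app_jars:
        if not os.path.exists(jar):
            problems.append("jar not found (classpath error at runtime): %s" % jar)

    return problems
```

Running it against an empty environment reports the missing `SPARK_HOME` and any absent jars, which is exactly the "their jar not on the classpath" case above.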

On Mon, Oct 10, 2016 at 10:05 AM Sean Owen <[hidden 
email]</user/SendEmail.jtp?type=node&node=19384&i=3>> wrote:
I have also 'embedded' a Spark driver without much trouble. It isn't that it 
can't work.

The Launcher API is probably the recommended way to do that, though. 
spark-submit is the way to go for non-programmatic access.

If you're not doing one of those things and it's not working, then yeah, I 
think people would tell you you're on your own. I think that's consistent with 
all the JIRA discussions I've seen over time.

On Mon, Oct 10, 2016, 17:33 Russell Spitzer <[hidden 
email]</user/SendEmail.jtp?type=node&node=19384&i=4>> wrote:
I've seen a variety of users attempting to work around using Spark Submit with 
at best middling levels of success. I think it would be helpful if the project 
had a clear statement that submitting an application without using Spark Submit 
is truly for experts only or is unsupported entirely.

I know this is a pretty strong stance and other people have had different 
experiences than me so please let me know what you think :)

Sent from the Apache Spark Developers List mailing list archive at Nabble.com.