I actually had not seen SparkLauncher before, that looks pretty great :)

On Mon, Oct 10, 2016 at 10:17 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
> I'm definitely only talking about non-embedded uses here, as I also use
> embedded Spark (Cassandra and Kafka) to run tests. This is almost always
> safe since everything is in the same JVM. It's only once we launch
> against a real distributed env that we end up with issues.
>
> Since PySpark uses spark-submit in the Java gateway, I'm not sure if that
> matters :)
>
> The cases I see are usually going through main directly, adding
> jars programmatically.
>
> This usually ends up with classpath errors (Spark not on the CP, their jar
> not on the CP, dependencies not on the CP),
> conf errors (executors have the incorrect environment, executor classpath
> broken, not understanding that spark-defaults won't do anything),
> jar version mismatches,
> etc.
>
> On Mon, Oct 10, 2016 at 10:05 AM Sean Owen <so...@cloudera.com> wrote:
>
> I have also 'embedded' a Spark driver without much trouble. It isn't that
> it can't work.
>
> The Launcher API is probably the recommended way to do that though.
> spark-submit is the way to go for non-programmatic access.
>
> If you're not doing one of those things and it is not working, yeah I
> think people would tell you you're on your own. I think that's consistent
> with all the JIRA discussions I have seen over time.
>
>
> On Mon, Oct 10, 2016, 17:33 Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
> I've seen a variety of users attempting to work around using Spark Submit
> with at best middling levels of success. I think it would be helpful if the
> project had a clear statement that submitting an application without using
> Spark Submit is truly for experts only or is unsupported entirely.
>
> I know this is a pretty strong stance and other people have had different
> experiences than me so please let me know what you think :)
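
For anyone reading this later, here is a rough sketch of what "still go through spark-submit, just drive it from code" could look like, as opposed to calling main directly and adding jars by hand. This is not from anyone in the thread: the helper name build_spark_submit_command, the jar names, and the main class are all made up for illustration. The only things taken from Spark itself are the spark-submit flags --master, --class, --jars, and --conf. It assumes spark-submit is on the PATH when you actually run the command.

```python
import shlex


def build_spark_submit_command(app_jar, main_class, master,
                               jars=(), conf=None, app_args=()):
    """Build an argument list for spark-submit (hypothetical helper).

    Dependencies and settings are passed as spark-submit flags, so
    spark-submit itself sets up the driver and executor classpaths
    instead of the application adding jars programmatically.
    """
    cmd = ["spark-submit", "--master", master, "--class", main_class]
    if jars:
        # --jars takes a single comma-separated list of extra jars.
        cmd += ["--jars", ",".join(jars)]
    # Sort the conf keys so the generated command is deterministic.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    # The application jar comes after the options, then its own arguments.
    cmd.append(app_jar)
    cmd.extend(app_args)
    return cmd


if __name__ == "__main__":
    # All names below are placeholders, not real artifacts.
    command = build_spark_submit_command(
        app_jar="my-app.jar",
        main_class="com.example.MyApp",
        master="yarn",
        jars=["dep-a.jar", "dep-b.jar"],
        conf={"spark.executor.memory": "2g"},
        app_args=["input-path"],
    )
    # Print the shell-quoted command; subprocess.run(command, check=True)
    # would launch it on a machine where spark-submit is installed.
    print(shlex.join(command))
```

On the JVM side, the Launcher API Sean mentions (org.apache.spark.launcher.SparkLauncher) fills the same role without building a command line yourself.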