Hey Evan, any chance you could find the link to the above-mentioned SBT recipe? I would greatly appreciate it.
Thanks,
Grega

On Fri, Aug 9, 2013 at 10:00 AM, Evan Chan <[email protected]> wrote:

> Hey Patrick,
>
> A while back I posted an SBT recipe that lets users build Scala job
> assemblies excluding Spark and its dependencies, which I believe is what
> most people want. This allows you to include your own libraries and
> exclude Spark's, for the smallest possible assembly.
>
> We don't use Spark's run script; instead we have SBT configured so that
> you can simply type "run" to run jobs. I believe this gives maximum
> developer velocity. We also have "sbt console" hooked up so that you can
> run the Spark shell from it (no need for the ./spark-shell script).
>
> And, as you know, we are going to contribute back a job server. We
> believe that for most organizations this will provide the easiest way to
> submit and manage jobs -- IT/OPS sets up Spark as an HTTP service (using
> the job server), and users/developers submit jobs to a managed service.
> We even have a giter8 template to make creating jobs for the job server
> super simple. The template has support for local run, Spark shell,
> assembly, and testing.
>
> So anyway, I believe we'll have a lot to contribute to your guide -- both
> now and especially once the job server is contributed. Feel free to
> touch base offline.
>
> -Evan
>
> On Fri, Aug 2, 2013 at 9:50 PM, Patrick Wendell <[email protected]>
> wrote:
>
> > Hey All,
> >
> > I'm working on SPARK-800 [1]. The goal is to document a best practice
> > or recommended way of bundling and running Spark jobs. We have a
> > quickstart guide for writing a standalone job, but it doesn't cover
> > how to package up your dependencies and set the environment variables
> > required to submit a full job to a cluster. This can be a confusing
> > process for beginners, so it would be good to extend the guide to
> > cover it.
> >
> > First, though, I wanted to sample this list and see how people tend
> > to run Spark jobs inside their orgs. Knowing any of the following
> > would be helpful:
> >
> > - Do you create an uber jar with all of your job's (and Spark's)
> >   recursive dependencies?
> > - Do you use sbt run or maven exec, with some way to pass the correct
> >   environment variables?
> > - Do you use a modified version of Spark's own `run` script?
> > - Do you have some other way of submitting jobs?
> >
> > Any notes would be helpful in compiling this!
> >
> > [1] https://spark-project.atlassian.net/browse/SPARK-800
>
> --
> Evan Chan
> Staff Engineer
> [email protected] | <http://www.ooyala.com/>
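For anyone finding this thread later: below is a minimal sketch of the kind of
recipe Evan describes, assuming sbt-assembly and current Apache Spark artifact
coordinates (the original link was never reposted, and Evan's actual recipe may
have differed; the 2013-era artifacts lived under org.spark-project). The idea
is to mark Spark "provided" so the assembly contains only your code and your
libraries, while a small override keeps plain `sbt run` working:

  // project/plugins.sbt -- illustrative plugin version
  addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.5")

  // build.sbt
  name := "my-spark-job"
  scalaVersion := "2.12.18"

  libraryDependencies ++= Seq(
    // "provided": on the compile classpath, excluded from the assembly
    "org.apache.spark" %% "spark-core" % "3.5.1" % "provided",
    // your own dependencies stay unscoped, so they are bundled
    "com.typesafe" % "config" % "1.4.3"
  )

  // By default `run` uses the runtime classpath, which drops "provided"
  // jars; this puts them back so "just type run" development works.
  Compile / run := Defaults.runTask(
    Compile / fullClasspath,
    Compile / run / mainClass,
    Compile / run / runner
  ).evaluated

Running `sbt assembly` then produces a jar containing everything except Spark
and its transitive dependencies -- the smallest artifact you can ship to a
cluster that already has Spark on its classpath.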
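Evan's "sbt console hooked up" setup can likewise be approximated with an
initialCommands setting; this is a guess at the shape of his configuration,
not a quote from it. It pre-creates a local SparkContext so the REPL behaves
like a spark-shell:

  // build.sbt (continued) -- hypothetical sketch
  Compile / console / initialCommands :=
    """
      |import org.apache.spark.{SparkConf, SparkContext}
      |val conf = new SparkConf().setMaster("local[*]").setAppName("sbt-console")
      |val sc = new SparkContext(conf)
      |""".stripMargin

Typing `sbt console` then drops you into a Scala REPL with `sc` already
defined, with no need for the ./spark-shell script.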
