Ruslan, thanks for your feedback.

Do you mean the spark-submit context, or the SparkContext and
SparkSession? I don't think we could keep those alive, because that
wouldn't work with multiple calls to spark-submit. I do feel your pain,
though. Maybe someone else can see how this might be done?

If a SparkContext were able to open the spark/pyspark console, then
multiple job submissions would be possible. I didn't have this in mind,
but an InteractiveSparkContext or SparkConsoleContext might be able to
do it?
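For what it's worth, here is a minimal sketch of how such an operator might
assemble the spark-submit command line. All of the names and defaults below
(the function name, the yarn/cluster defaults) are hypothetical illustrations,
not an existing Airflow API:

```python
def build_spark_submit_command(application, master="yarn",
                               deploy_mode="cluster", conf=None,
                               application_args=None):
    """Assemble a spark-submit argv list from operator-style parameters.

    Hypothetical helper for illustration only. An operator's execute()
    could then run this list with subprocess.check_call(cmd).
    """
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    # Each Spark config entry becomes a --conf key=value flag.
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    # The application (Python file or JAR) comes last, followed by its args.
    cmd.append(application)
    cmd += list(application_args or [])
    return cmd


# Example: submitting a PySpark job with one config override.
cmd = build_spark_submit_command(
    "jobs/etl.py",
    conf={"spark.executor.memory": "2g"},
    application_args=["--date", "2017-03-18"],
)
```

Keeping the command construction separate from the execution would also make
the operator easy to unit-test without a live cluster.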

Russell Jurney @rjurney <http://twitter.com/rjurney>
[email protected] LI <http://linkedin.com/in/russelljurney> FB
<http://facebook.com/jurney> datasyndrome.com

On Sat, Mar 18, 2017 at 3:02 PM, Ruslan Dautkhanov <[email protected]>
wrote:

> +1 Great idea.
>
> my two cents - it would be nice (as an option) if SparkOperator were
> able to keep the context open between calls, since it takes 30+ seconds
> to create a new context on our cluster. Not sure how well that fits the
> Airflow architecture.
>
>
>
> --
> Ruslan Dautkhanov
>
> On Sat, Mar 18, 2017 at 3:45 PM, Russell Jurney <[email protected]>
> wrote:
>
> > What do people think about creating a SparkOperator that uses
> > spark-submit to submit jobs? It would work for Scala/Java Spark and
> > PySpark. The patterns
> > outlined in my presentation on Airflow and PySpark
> > <http://bit.ly/airflow_pyspark> would fit well inside an Operator, I
> > think.
> > BashOperator works, but why not tailor something to spark-submit?
> >
> > I'm open to doing the work, but I wanted to see what people thought
> > about it, get feedback about things they would like to see in
> > SparkOperator, and get any pointers people have on the implementation.
> >
> > Russell Jurney @rjurney <http://twitter.com/rjurney>
> > [email protected] LI <http://linkedin.com/in/russelljurney> FB
> > <http://facebook.com/jurney> datasyndrome.com
> >
>