A Spark operator exists as of 1.8.0 (which will be released tomorrow); you might want to take a look at that. I know that an update is coming to that operator that improves communication with YARN.
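
For readers following the thread, the core of an operator like the one proposed is assembling a spark-submit command line from its parameters. The sketch below is only an illustration of that idea and is not the code of the operator shipping in 1.8.0. The function name `build_spark_submit_command` and its parameters are hypothetical; the flags used (`--master`, `--deploy-mode`, `--name`, `--conf`) are standard spark-submit options.

```python
def build_spark_submit_command(application, master="yarn", deploy_mode=None,
                               name=None, conf=None, application_args=None):
    """Return the argv list for a spark-submit call (hypothetical helper).

    `application` is the path to a .py file or a jar; `conf` is a dict of
    Spark properties that become repeated `--conf key=value` flags.
    """
    cmd = ["spark-submit", "--master", master]
    if deploy_mode:
        cmd += ["--deploy-mode", deploy_mode]
    if name:
        cmd += ["--name", name]
    # Sort the properties so the generated command is deterministic.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", "{}={}".format(key, value)]
    # The application and its own arguments must come after all options.
    cmd.append(application)
    cmd += list(application_args or [])
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; an operator would hand the
    # list to something like subprocess.check_call(cmd).
    print(" ".join(build_spark_submit_command(
        "job.py",
        deploy_mode="cluster",
        conf={"spark.executor.memory": "2g"},
        application_args=["2017-03-18"],
    )))
```

Because each call builds and launches a separate spark-submit process, a design like this starts a fresh context every time, which is the startup cost Ruslan mentions in the thread.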

Bolke

> On 18 Mar 2017, at 18:43, Russell Jurney <[email protected]> wrote:
>
> Ruslan, thanks for your feedback.
>
> You mean the spark-submit context? Or like the SparkContext and
> SparkSession? I don't think we could keep that alive, because it wouldn't
> work out with multiple calls to spark-submit. I do feel your pain, though.
> Maybe someone else can see how this might be done?
>
> If SparkContext was able to open the spark/pyspark console, then multiple
> job submissions would be possible. I didn't have this in mind but an
> InteractiveSparkContext or SparkConsoleContext might be able to do this?
>
> Russell Jurney @rjurney <http://twitter.com/rjurney>
> [email protected] LI <http://linkedin.com/in/russelljurney> FB
> <http://facebook.com/jurney> datasyndrome.com
>
> On Sat, Mar 18, 2017 at 3:02 PM, Ruslan Dautkhanov <[email protected]>
> wrote:
>
>> +1 Great idea.
>>
>> my two cents - it would be nice (as an option) if SparkOperator would be
>> able to keep context open between different calls,
>> as it takes 30+ seconds to create a new context (on our cluster). Not sure
>> how well it fits Airflow architecture.
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Sat, Mar 18, 2017 at 3:45 PM, Russell Jurney <[email protected]>
>> wrote:
>>
>>> What do people think about creating a SparkOperator that uses spark-submit
>>> to submit jobs? Would work for Scala/Java Spark and PySpark. The patterns
>>> outlined in my presentation on Airflow and PySpark
>>> <http://bit.ly/airflow_pyspark> would fit well inside an Operator, I
>>> think.
>>> BashOperator works, but why not tailor something to spark-submit?
>>>
>>> I'm open to doing the work, but I wanted to see what people thought about it
>>> and get feedback about things they would like to see in SparkOperator and
>>> get any pointers people had to doing the implementation.
>>>
>>> Russell Jurney @rjurney <http://twitter.com/rjurney>
>>> [email protected] LI <http://linkedin.com/in/russelljurney> FB
>>> <http://facebook.com/jurney> datasyndrome.com
