+1 Great idea.

My two cents: it would be nice if SparkOperator could (as an option) keep the
Spark context open between calls, since creating a new context takes 30+
seconds on our cluster. Not sure how well that fits the Airflow architecture.
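For what it's worth, here is a rough sketch of what the command-building part of such an operator might look like. All names and parameters here are illustrative, not an actual Airflow API; a real operator would subclass BaseOperator and run this command via a subprocess in its execute() method:

```python
# Hypothetical helper for a SparkOperator: assemble the spark-submit
# command line from operator arguments. Parameter names are illustrative.
def build_spark_submit_cmd(application, master="yarn", deploy_mode="cluster",
                           conf=None, application_args=None):
    """Build the argv list a SparkOperator could pass to subprocess."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    # Each Spark config entry becomes a --conf key=value pair.
    for key, value in (conf or {}).items():
        cmd += ["--conf", "%s=%s" % (key, value)]
    cmd.append(application)            # the Scala/Java jar or PySpark script
    cmd += list(application_args or [])  # args passed through to the job
    return cmd

cmd = build_spark_submit_cmd(
    "etl_job.py",
    conf={"spark.executor.memory": "4g"},
    application_args=["--date", "2017-03-18"],
)
print(" ".join(cmd))
```

This only covers the fire-and-forget spark-submit case; keeping a context alive between tasks would need something else entirely (a long-running driver or a REST gateway), which is the part I'm unsure fits Airflow.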



-- 
Ruslan Dautkhanov

On Sat, Mar 18, 2017 at 3:45 PM, Russell Jurney <[email protected]>
wrote:

> What do people think about creating a SparkOperator that uses spark-submit
> to submit jobs? Would work for Scala/Java Spark and PySpark. The patterns
> outlined in my presentation on Airflow and PySpark
> <http://bit.ly/airflow_pyspark> would fit well inside an Operator, I
> think.
> BashOperator works, but why not tailor something to spark-submit?
>
> I'm open to doing the work, but I wanted to see what people thought about it,
> get feedback on what they would like to see in SparkOperator, and collect any
> pointers people have on the implementation.
>
> Russell Jurney @rjurney <http://twitter.com/rjurney>
> [email protected] LI <http://linkedin.com/in/russelljurney> FB
> <http://facebook.com/jurney> datasyndrome.com
>
