Re: SparkOperator - tips and feedback?

2017-03-18 Thread Ruslan Dautkhanov
Thanks Bolke. That's awesome. 1) So each task would create its own Spark session? Is there a way to share a Spark session, as discussed in this email chain? 2) It looks like SparkSqlHook calls the `spark-sql` shell with all those parameters?
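To make question 2) concrete, here is a rough sketch of what "calling the `spark-sql` shell with parameters" could look like. This is not SparkSqlHook's actual code: `build_spark_sql_cmd` and its arguments are hypothetical names chosen for illustration, and only the `-e`, `--master`, `--conf` and `--name` options of the `spark-sql` CLI are assumed. It only assembles the argument list and does not launch anything.

```python
import shlex


def build_spark_sql_cmd(sql, master="yarn", conf=None, name=None):
    """Hypothetical helper: assemble the argv for one spark-sql CLI call."""
    cmd = ["spark-sql", "-e", sql, "--master", master]
    # Each Spark property becomes its own --conf key=value pair.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    if name:
        cmd += ["--name", name]
    return cmd


# Example values are made up for illustration.
cmd = build_spark_sql_cmd("SELECT 1", conf={"spark.executor.memory": "2g"})
print(shlex.join(cmd))  # shell-quoted form of the command (Python 3.8+)
```

If a hook works this way, every task starts a separate `spark-sql` process, which would be why each task ends up with its own session.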

Re: SparkOperator - tips and feedback?

2017-03-18 Thread Bolke de Bruin
A Spark operator exists as of 1.8.0 (which will be released tomorrow); you might want to take a look at that. I know that an update is coming to that operator that improves communication with YARN.
Bolke
> On 18 Mar 2017, at 18:43, Russell Jurney wrote:
>
> Ruslan,

Re: SparkOperator - tips and feedback?

2017-03-18 Thread Russell Jurney
Ruslan, thanks for your feedback. You mean the spark-submit context? Or like the SparkContext and SparkSession? I don't think we could keep that alive, because it wouldn't work out with multiple calls to spark-submit. I do feel your pain, though. Maybe someone else can see how this might be done?

Re: SparkOperator - tips and feedback?

2017-03-18 Thread Ruslan Dautkhanov
+1 Great idea. My two cents: it would be nice (as an option) if SparkOperator were able to keep the context open between different calls, as it takes 30+ seconds to create a new context (on our cluster). Not sure how well it fits the Airflow architecture.
-- Ruslan Dautkhanov
On Sat, Mar 18,