Hi!

I'm currently working on adding SSH super-powers to SparkSubmitOperator. It's
really simple, using SshHook and a small wrapper around the connection to mimic
the Popen interface. We are using it internally at our company because we have
several secured Spark clusters with different software versions, and it would be
really difficult to manage them by installing an Airflow worker in every
cluster or by installing the spark-submit binary on the Airflow worker. I think
this is a common problem.
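To make the idea concrete, here is a minimal sketch of what such a Popen-mimicking wrapper might look like. The class name `SSHPopen` and the channel interface (a paramiko-style `Channel` with `makefile()`, `recv_exit_status()`, and `exit_status_ready()`) are my assumptions for illustration, not the actual WIP code:

```python
class SSHPopen:
    """Illustrative sketch: wraps an SSH channel so code that expects a
    subprocess.Popen-like object (stdout iteration, wait, returncode)
    can drive a command running on a remote host instead.

    Assumes a paramiko-style Channel object; names are hypothetical."""

    def __init__(self, channel):
        self._channel = channel
        self.returncode = None

    @property
    def stdout(self):
        # makefile() returns a file-like object over the remote stdout,
        # so callers can iterate over lines just like Popen.stdout
        return self._channel.makefile("r")

    def wait(self):
        # Blocks until the remote command finishes, like Popen.wait()
        self.returncode = self._channel.recv_exit_status()
        return self.returncode

    def poll(self):
        # Non-blocking exit-status check, like Popen.poll()
        if self._channel.exit_status_ready():
            self.returncode = self._channel.recv_exit_status()
        return self.returncode
```

With a shim like this, the log-parsing and exit-code handling in SparkSubmitHook could stay untouched, since it only sees the Popen-shaped surface.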

I want to know whether anyone else would like this kind of feature; if so, I can
continue the work with tests and documentation and open a PR. I would also
welcome ideas, concerns, etc. about this approach. I will be happy to hear
feedback from the Airflow community.

The WIP code is available at
https://github.com/flolas/airflow/blob/5bc837a03d226718f78eecbf4c637de222280adc/airflow/contrib/hooks/spark_submit_hook.py



Cheers,

Felipe L.
