Sorry if this comes through twice; I accidentally submitted it before
subscribing to the list, so I'm resending.

For starters: I'm familiar with all the parts involved. I have created an
SSH connection, a tunnel from that connection, and a plain connection to
the Spark master that doesn't use SSH (and therefore can't connect). I
also see the myriad ways to interact with Spark in Airflow, both in
contrib and the main package.
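For concreteness, here's roughly the tunnel setup I have, using the
contrib SSHHook. This is only a sketch; the connection ID and hostnames
are placeholders for my real ones:

    from airflow.contrib.hooks.ssh_hook import SSHHook

    # "ssh_bastion" is an SSH connection defined in the Airflow UI
    ssh_hook = SSHHook(ssh_conn_id="ssh_bastion")

    # Forward a local port to the Spark master's port on the far side
    tunnel = ssh_hook.get_tunnel(
        remote_port=7077,
        remote_host="spark-master.internal",  # placeholder
        local_port=7077,
    )
    tunnel.start()
    # spark://localhost:7077 should now reach the master through the tunnel
    tunnel.stop()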

*What I can't find a single discussion about is: how do I submit a Spark
job to a Spark master through an SSH tunnel?*

SSH tunnels are created inside DAGs via the SSHHook rather than defined as
connections (which seems like a questionable design decision), so I can't
find any way to make a connection to the Spark master that actually goes
through a tunnel. None of the spark-submit operators take a parameter that
would use an SSH tunnel, so I am stuck.
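The closest workaround I can sketch is abandoning the Spark operators
entirely and shelling out to spark-submit from a PythonOperator while the
tunnel is held open, something like this (all IDs, hosts, and paths are
placeholders, not working code):

    import subprocess

    from airflow.contrib.hooks.ssh_hook import SSHHook
    from airflow.operators.python_operator import PythonOperator


    def submit_through_tunnel():
        hook = SSHHook(ssh_conn_id="ssh_bastion")  # placeholder
        # get_tunnel() returns an sshtunnel.SSHTunnelForwarder, which
        # opens on __enter__ and closes on __exit__
        with hook.get_tunnel(
            remote_port=7077,
            remote_host="spark-master.internal",  # placeholder
            local_port=7077,
        ):
            # Point spark-submit at the local end of the tunnel
            subprocess.run(
                ["spark-submit",
                 "--master", "spark://localhost:7077",
                 "/path/to/job.py"],  # placeholder job
                check=True,
            )


    submit = PythonOperator(
        task_id="spark_submit_via_tunnel",
        python_callable=submit_through_tunnel,
        dag=dag,  # assumes a DAG defined elsewhere in the file
    )

But that bypasses the Spark hooks and operators completely, which can't be
the intended approach. Am I missing something?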

Thanks,
Russell Jurney @rjurney <http://twitter.com/rjurney>
russell.jur...@gmail.com LI <http://linkedin.com/in/russelljurney> FB
<http://facebook.com/jurney> datasyndrome.com
