manuel garrido created AIRFLOW-1331:
---------------------------------------
Summary: Contrib.SparkSubmitOperator should allow --packages parameter
Key: AIRFLOW-1331
URL: https://issues.apache.org/jira/browse/AIRFLOW-1331
Project: Apache Airflow
Issue Type: Bug
Components: contrib
Reporter: manuel garrido
Right now SparkSubmitOperator (and its related hook, SparkSubmitHook) does not
support a packages parameter, an option that is very useful for pulling
dependencies from the spark-packages repository.
I am by no means an expert, but given how SparkSubmitHook builds the
spark-submit command, this could be as easy as adding
{code:python}
if self._packages:
    connection_cmd += ["--packages", self._packages]
{code}
right under [this line|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_submit_hook.py#L167],
as well as adding a *packages* parameter (defaulting to None) to both the
SparkSubmitHook and SparkSubmitOperator init methods (basically, everywhere
the jars parameter is handled).
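To illustrate the idea, here is a minimal sketch of the proposed change. The function name and structure below are simplified stand-ins for the hook's internal command builder, not the actual Airflow code; only the {{--jars}}/{{--packages}} forwarding pattern is the point:

```python
def build_spark_submit_command(jars=None, packages=None, application="app.py"):
    """Simplified, hypothetical sketch of how SparkSubmitHook assembles
    the spark-submit command line. Parameter names are illustrative."""
    connection_cmd = ["spark-submit"]
    # Existing behavior: forward extra jars if configured
    if jars:
        connection_cmd += ["--jars", jars]
    # Proposed addition: forward --packages the same way as --jars,
    # so Maven coordinates from spark-packages can be resolved at submit time
    if packages:
        connection_cmd += ["--packages", packages]
    connection_cmd += [application]
    return connection_cmd
```

For example, passing {{packages="com.databricks:spark-csv_2.11:1.5.0"}} would simply append {{--packages com.databricks:spark-csv_2.11:1.5.0}} to the generated command, mirroring what is already done for jars.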
To be honest, I would not mind submitting a pull request to fix this; however,
I am not knowledgeable enough about either Airflow or how the contribution
guidelines are set up. If the community thinks this is an easy fix that a
newbie like me can do (I do believe it is), please let me know and I will do
my best.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)