Chris Sng created AIRFLOW-1319:
----------------------------------

             Summary: Fix misleading SparkSubmitOperator and SparkSubmitHook docstring
                 Key: AIRFLOW-1319
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1319
             Project: Apache Airflow
          Issue Type: Improvement
          Components: contrib
            Reporter: Chris Sng
            Assignee: Chris Sng
            Priority: Trivial


The community-contributed Spark submit hook and operator support 
`spark-submit`'s `--files` command-line option. The `--files` option ships 
files to each executor for use at run time. Serialized objects are a good 
example of such files.
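
For illustration, here is a minimal sketch of how a list of files could end up 
as a `--files` argument on the `spark-submit` command line. The helper 
`build_spark_submit_cmd`, its parameters and the file names are hypothetical 
and are not taken from Airflow's hook; only the `--files` and `--master` 
options themselves come from `spark-submit`.

```python
def build_spark_submit_cmd(application, files=None, master="yarn"):
    """Assemble a spark-submit argument list.

    Hypothetical helper for illustration only; this is not Airflow's code.
    """
    cmd = ["spark-submit", "--master", master]
    if files:
        # spark-submit expects --files as a single comma-separated list of paths
        cmd += ["--files", ",".join(files)]
    # The application (script or jar) comes after the options
    cmd.append(application)
    return cmd


# Made-up file names: two serialized objects shipped alongside the job
cmd = build_spark_submit_cmd("job.py", files=["model.pkl", "lookup.pkl"])
print(" ".join(cmd))
```

Serialized objects such as the `.pkl` files above would be a more fitting 
docstring example than a Hive configuration file.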

However, both docstrings 
(https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/spark_submit_operator.py#L37
 and 
https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_submit_hook.py#L36)
 give `hive-site.xml` as an example. This may mislead less-informed 
developers into assuming that Hive configuration files can be submitted to the 
cluster in this manner. According to Apache Hive's documentation, Hive 
configuration files are located in the directory specified by the 
`HIVE_CONF_DIR` environment variable.

I propose excluding this example from the docstrings.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
