[ https://issues.apache.org/jira/browse/AIRFLOW-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bolke de Bruin resolved AIRFLOW-1255.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.8.3

Issue resolved by pull request #2438
[https://github.com/apache/incubator-airflow/pull/2438]

> SparkSubmitOperator logs do not stream correctly
> ------------------------------------------------
>
>                 Key: AIRFLOW-1255
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1255
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks, operators
>    Affects Versions: Airflow 1.8
>         Environment: Spark 1.6.0 with Yarn cluster
> Airflow 1.8
>            Reporter: Himanshu Jain
>            Priority: Minor
>              Labels: easyfix
>             Fix For: 1.8.3
>
>
> Logging in SparkSubmitOperator does not work as intended: log lines are not 
> streamed continuously as the subprocess emits them. This is because 
> spark-submit internally redirects all logs (including stderr) to stdout, so 
> the current two-iterator logging blocks on an empty stderr pipe. The logs are 
> written only when the subprocess finishes, which means yarn_application_id is 
> not available until the application ends.
> Specifically,
> {code:title= spark_submit_hook.py (lines 217-220)|borderStyle=solid}
> self._sp = subprocess.Popen(spark_submit_cmd,
>                             stdout=subprocess.PIPE,
>                             stderr=subprocess.PIPE,
>                             **kwargs)
> {code}
> needs to be changed to 
> {code:title= spark_submit_hook.py|borderStyle=solid}
> self._sp = subprocess.Popen(spark_submit_cmd,
>                             stdout=subprocess.PIPE,
>                             **kwargs)
> {code}
> with corresponding changes to the log-reading lines that follow it.
> I have not tested whether the issue also exists with Spark 2 versions.
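
For illustration, the single-pipe reading pattern described in the quoted report could look roughly like the sketch below. It is not the code from the pull request. The function name stream_command, the on_line callback and the application-id regex are all hypothetical, and merging stderr into stdout with stderr=subprocess.STDOUT is an assumption made here so that only one pipe has to be read (the report itself only drops the stderr pipe).

```python
import re
import subprocess
import sys

# Hypothetical pattern for a YARN application id, e.g. application_1490000000000_0001.
APP_ID_RE = re.compile(r"application_\d+_\d+")


def stream_command(cmd, on_line=print):
    """Run cmd, hand each output line to on_line as it arrives,
    and return (returncode, first application id seen or None)."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: merge stderr so one pipe is read
        universal_newlines=True,   # decode bytes to text
        bufsize=1,                 # line-buffered on our side
    )
    app_id = None
    # readline returns "" only at EOF, so this loop yields lines as they are written.
    for line in iter(proc.stdout.readline, ""):
        line = line.rstrip("\n")
        on_line(line)
        if app_id is None:
            match = APP_ID_RE.search(line)
            if match:
                app_id = match.group(0)
    proc.stdout.close()
    return proc.wait(), app_id


if __name__ == "__main__":
    # Stand-in child process; a real caller would pass the spark-submit command.
    child = "print('Submitted application application_1490000000000_0001')"
    print(stream_command([sys.executable, "-c", child]))
```

Because only one pipe is read, the loop cannot stall waiting on a second, empty pipe, and the application id is available as soon as the line containing it is printed rather than after the process exits.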



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
