[ https://issues.apache.org/jira/browse/AIRFLOW-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287490#comment-16287490 ]
ASF subversion and git services commented on AIRFLOW-1854:
----------------------------------------------------------
Commit 3e6babe8ed8f8f281b67aa3f4e03bf3cfc1bcbaa in incubator-airflow's branch
refs/heads/master from [~milanvdm]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=3e6babe ]
[AIRFLOW-1854] Improve Spark Submit operator for standalone cluster mode
Closes #2852 from milanvdmria/svend/submit2
> Improve Spark submit hook for cluster mode
> ------------------------------------------
>
> Key: AIRFLOW-1854
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1854
> Project: Apache Airflow
> Issue Type: Improvement
> Components: hooks
> Reporter: Milan van der Meer
> Assignee: Milan van der Meer
> Priority: Minor
> Labels: features
> Fix For: 1.9.1
>
>
> *We are already working on this issue and will submit a PR soon*
> When a job is submitted to a standalone cluster in cluster mode via the
> Spark submit hook, the hook receives the return code of the spark-submit
> action itself, not of the Spark job.
> This means that as soon as the submission is successfully received by the
> cluster, the Airflow task is marked as successful, even if the Spark job
> fails on the cluster later on.
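> A minimal sketch of the failure mode (the master URL, jar path, and class
> name are hypothetical): the exit code of the spark-submit process only
> reflects whether the submission was accepted, not the job's outcome.
>
> {code:python}
> # Hypothetical illustration: master URL, jar path and class are assumptions.
> import subprocess
>
> result = subprocess.run(
>     [
>         "spark-submit",
>         "--master", "spark://master-host:6066",  # standalone REST submission port
>         "--deploy-mode", "cluster",
>         "--class", "com.example.SparkJob",
>         "/path/to/job.jar",
>     ],
>     capture_output=True,
>     text=True,
> )
>
> # returncode == 0 only means the driver was accepted by the master;
> # the job itself may still fail on the cluster afterwards.
> print(result.returncode)
> {code}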
> Suggested solution:
> * When a Spark submit is executed in cluster mode, the logs contain a
> driver ID.
> * Use this driver ID to poll the cluster for the driver state.
> * Based on the driver's state, mark the Airflow task as succeeded or
> failed (see the sketch after this list).
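> A minimal polling sketch under those assumptions; the helper name, master
> URL, driver ID format, and regex are illustrative, not the committed
> implementation. In standalone cluster mode, spark-submit supports
> --status SUBMISSION_ID, whose output includes the driver state.
>
> {code:python}
> # Hypothetical sketch: helper name, master URL and driver ID are assumptions.
> import re
> import subprocess
> import time
>
> TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED", "ERROR"}
>
> def poll_driver_state(master, driver_id, interval=10):
>     """Poll the standalone master until the driver reaches a terminal state."""
>     while True:
>         out = subprocess.run(
>             ["spark-submit", "--master", master, "--status", driver_id],
>             capture_output=True, text=True,
>         ).stdout
>         # The status response contains a line like: "driverState" : "FINISHED"
>         match = re.search(r'"driverState"\s*:\s*"(\w+)"', out)
>         if match and match.group(1) in TERMINAL_STATES:
>             return match.group(1)
>         time.sleep(interval)
>
> # The driver ID would be parsed from the submit logs, e.g. from a line like
> # "Driver successfully submitted as driver-20171128111415-0001".
> state = poll_driver_state("spark://master-host:6066", "driver-20171128111415-0001")
> if state != "FINISHED":
>     raise RuntimeError("Spark job failed with driver state: " + state)
> {code}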
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)