[
https://issues.apache.org/jira/browse/AIRFLOW-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209421#comment-17209421
]
Cyril Shcherbin commented on AIRFLOW-7052:
------------------------------------------
Fixed in https://github.com/apache/airflow/pull/8730
> spark 3.0.0 does not work with sparksubmitoperator
> --------------------------------------------------
>
> Key: AIRFLOW-7052
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7052
> Project: Apache Airflow
> Issue Type: Bug
> Components: operators
> Affects Versions: 1.10.9
> Reporter: t oo
> Priority: Major
>
> from slack:
> If anyone runs into this in the future I've found out where the issue is in
> the spark_submit_hook.py.
> Line 419
> ```match_exit_code = re.search(r'\s*Exit code: (\d+)', line)```
> In Spark 3.0 the line that prints the exit code actually uses a lower-case
> "e" in "exit code:", so this re.search will never find that value. To fix
> this, you can simply switch the line to this:
> ```match_exit_code = re.search(r'\s*Exit code: (\d+)', line, re.IGNORECASE)```
> Which should also be backwards compatible.
> MattD
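The case-sensitivity problem described above can be reproduced with a minimal snippet, using the "exit code: 0" log line from the report below:

```python
import re

# Log line as printed by Spark 3.0 (see the report below: "exit code: 0").
line = "exit code: 0"

# Original pattern from spark_submit_hook.py line 419: case-sensitive,
# so it never matches the lower-case Spark 3.0 output.
old = re.search(r'\s*Exit code: (\d+)', line)

# Proposed fix: IGNORECASE matches both Spark 2.x "Exit code:" and
# Spark 3.0 "exit code:".
new = re.search(r'\s*Exit code: (\d+)', line, re.IGNORECASE)

print(old)           # None
print(new.group(1))  # 0
```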
> Having some difficulty understanding why my spark-submit task is being marked
> as failed even though the spark job has completed successfully. I see these
> logs at the end of the job:
> exit code: 0
> termination reason: Completed
> But then, right after, it also displays this:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 966, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/spark_submit_operator.py", line 187, in execute
>     self._hook.submit(self._application)
>   File "/usr/local/lib/python3.7/site-packages/airflow/contrib/hooks/spark_submit_hook.py", line 403, in submit
>     self._mask_cmd(spark_submit_cmd), returncode
> airflow.exceptions.AirflowException: Cannot execute: spark-submit (spark submit args would be here) Error code is: 0.
> I took a look at spark_submit_hook.py line 403 and it shows that it shouldn't
> be throwing that exception if the error code is 0. Anyone have any ideas? I'm
> only finding this happens now that I've switched to using spark 3.0, never
> ran into this with spark 2.4.5. *Also running 1.10.9 now
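To see why the task can fail even though the exception prints "Error code is: 0", here is a hedged sketch (not the actual hook code; the parsing and failure-check functions are hypothetical stand-ins) of how a never-matching regex trips the check: the parsed exit code stays `None`, and `None != 0` is truthy.

```python
import re

def parse_spark_exit_code(log_lines):
    """Scan driver logs for an exit code, roughly as the hook does (sketch)."""
    exit_code = None
    for line in log_lines:
        # Case-sensitive pattern, as in Airflow 1.10.9.
        m = re.search(r'\s*Exit code: (\d+)', line)
        if m:
            exit_code = int(m.group(1))
    return exit_code

# Spark 3.0-style driver log: lower-case "exit code", so nothing matches.
logs = ["exit code: 0", "termination reason: Completed"]
returncode = 0                           # spark-submit itself succeeded
exit_code = parse_spark_exit_code(logs)  # stays None

# Hypothetical failure check mirroring the reported behavior: with
# exit_code still None, "exit_code != 0" is True and the task fails.
task_failed = bool(returncode) or exit_code != 0
print(task_failed)  # True
```

This matches the symptom in the report: the process return code is 0, but the task is still marked failed because the exit code was never parsed from the Spark 3.0 logs.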
--
This message was sent by Atlassian Jira
(v8.3.4#803005)