karthyk16 opened a new issue #20949:
URL: https://github.com/apache/airflow/issues/20949
### Apache Airflow Provider(s)
apache-spark
### Versions of Apache Airflow Providers
2.0.3 (the same issue occurs with 2.0.1 as well).
### Apache Airflow version
2.1.2
### Operating System
Red Hat Enterprise Linux
### Deployment
Other
### Deployment details
_No response_
### What happened
We use the SparkSubmitOperator from Airflow to submit a Spark job in standalone cluster mode, with spark.standalone.submit.waitAppCompletion=true set in the conf so that the client waits for the application to complete. The Spark job is submitted successfully, and the application completes and its executors terminate, but Airflow is unable to poll the job's completion status and mark the task successful. Instead it marks the task as failed even though the job finished successfully; the application status in the Spark History Server shows FINISHED.
```
[2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ClientEndpoint: State of driver driver-20220118204802-0000 is FINISHED, exiting spark-submit JVM.
[2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ShutdownHookManager: Shutdown hook called
[2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ShutdownHookManager: Deleting directory /tmp/spark-1b7f430e-2cb8-493e-83b3-07b1a9c5728b
[2022-01-18, 20:49:38 UTC] {spark_submit.py:456} DEBUG - Should track driver: True
...
[2022-01-18, 20:50:08 UTC] {spark_submit.py:587} DEBUG - polling status of spark driver with id driver-20220118204802-0000
[2022-01-18, 20:50:08 UTC] {spark_submit.py:409} DEBUG - Poll driver status cmd: ['spark-submit', '--master', 'spark://ip-10-150-101-107.ap-south-1.compute.internal:7077', '--status', 'driver-20220118204802-0000']
[2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: 22/01/18 20:50:10 WARN RestSubmissionClient: Unable to connect to server spark://ip-14-1XXXX-127.ap-south-1.compute.internal:7077.
[2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: Exception in thread "main" org.apache.spark.deploy.rest.SubmitRestConnectionException: Unable to connect to server
[2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: at org.apache.spark.deploy.rest.RestSubmissionClient.$anonfun$requestSubmissionStatus$3(RestSubmissionClient.scala:163)
```
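For reference, the status poll the hook runs is roughly equivalent to invoking spark-submit --status by hand. A minimal Python sketch of that check, using the driver id from the logs above and a placeholder master URL (both illustrative here, not authoritative values):

```python
import subprocess

# Placeholder values; substitute your standalone master URL and driver id.
MASTER = "spark://<master-host>:7077"
DRIVER_ID = "driver-20220118204802-0000"

# Roughly the same command the hook logs as "Poll driver status cmd".
result = subprocess.run(
    ["spark-submit", "--master", MASTER, "--status", DRIVER_ID],
    capture_output=True,
    text=True,
)

# On the affected cluster this prints the SubmitRestConnectionException
# seen in the hook's debug log instead of a driverState line.
print(result.stdout)
print(result.stderr)
```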
### What you expected to happen
The Airflow hook should poll the driver status correctly and mark the task appropriately: failed on a genuine failure, and successful when the job completes successfully.
### How to reproduce
Create an Airflow DAG with the SparkSubmitOperator. Set "spark.standalone.submit.waitAppCompletion": "true" in the conf so that the client waits for the application to complete, and enable debug logging in Airflow to see the trace above. A minimal DAG is sketched below.
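A DAG along these lines triggers the behaviour; the dag id, connection id, and application path are illustrative placeholders, not values from our environment:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="spark_standalone_wait_app_completion",  # placeholder
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # The Spark connection (id is a placeholder) is assumed to point at the
    # standalone master, e.g. spark://<master-host>:7077, with extra
    # {"deploy-mode": "cluster"} so the driver runs on the cluster.
    submit_job = SparkSubmitOperator(
        task_id="submit_spark_job",
        conn_id="spark_standalone",
        application="/path/to/app.jar",  # placeholder application jar
        conf={"spark.standalone.submit.waitAppCompletion": "true"},
    )
```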
### Anything else
This happens every time and is a blocker for using the SparkSubmitOperator.
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)