karthyk16 opened a new issue #20949:
URL: https://github.com/apache/airflow/issues/20949


   ### Apache Airflow Provider(s)
   
   apache-spark
   
   ### Versions of Apache Airflow Providers
   
   2.0.3 (the same issue occurs with 2.0.1 as well).
   
   ### Apache Airflow version
   
   2.1.2
   
   ### Operating System
   
   Red Hat Enterprise Linux
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   We used the SparkSubmitOperator in Airflow to submit a Spark job in standalone cluster mode. The job is submitted successfully, and we set `spark.standalone.submit.waitAppCompletion=true` in the conf, so the client waits for the application to complete. When the application completes successfully and the executors terminate, Airflow is unable to poll the final status of the job and mark it as successful. Instead, it marks the job as failed even though the job finished successfully. The application status in the Spark History Server shows FINISHED.
   
   [2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ClientEndpoint: State of driver driver-20220118204802-0000 is FINISHED, exiting spark-submit JVM.
   [2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ShutdownHookManager: Shutdown hook called
   [2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ShutdownHookManager: Deleting directory /tmp/spark-1b7f430e-2cb8-493e-83b3-07b1a9c5728b
   [2022-01-18, 20:49:38 UTC] {spark_submit.py:456} DEBUG - Should track driver: True
   ...
   [2022-01-18, 20:50:08 UTC] {spark_submit.py:587} DEBUG - polling status of spark driver with id driver-20220118204802-0000
   [2022-01-18, 20:50:08 UTC] {spark_submit.py:409} DEBUG - Poll driver status cmd: ['spark-submit', '--master', 'spark://ip-10-150-101-107.ap-south-1.compute.internal:7077', '--status', 'driver-20220118204802-0000']
   [2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: 22/01/18 20:50:10 WARN RestSubmissionClient: Unable to connect to server spark://ip-14-1XXXX-127.ap-south-1.compute.internal:7077.
   [2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: Exception in thread "main" org.apache.spark.deploy.rest.SubmitRestConnectionException: Unable to connect to server
   [2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: at org.apache.spark.deploy.rest.RestSubmissionClient.$anonfun$requestSubmissionStatus$3(RestSubmissionClient.scala:163)
   
   
   ### What you expected to happen
   
   The Airflow hook should poll the status correctly and mark the job status appropriately (failed in case of a genuine failure, successful when the job completes successfully).
   
   ### How to reproduce
   
   Create an Airflow DAG with the SparkSubmitOperator. Set `spark.standalone.submit.waitAppCompletion=true` in the conf so that the client waits for the application to complete. Enable debug logging in Airflow to see the log trace. A minimal sketch follows.
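
   For reference, a minimal DAG along these lines; the connection id, application path, and main class below are placeholders for illustration, not taken from the original report:

   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

   with DAG(
       dag_id="spark_submit_standalone",
       start_date=datetime(2022, 1, 1),
       schedule_interval=None,
       catchup=False,
   ) as dag:
       # The "spark_default" connection is assumed to point at the standalone
       # master (e.g. spark://<master-host>:7077) with deploy-mode=cluster
       # set in the connection extras.
       submit_job = SparkSubmitOperator(
           task_id="submit_job",
           conn_id="spark_default",
           application="/path/to/app.jar",  # placeholder application
           java_class="com.example.App",  # placeholder main class
           conf={"spark.standalone.submit.waitAppCompletion": "true"},
       )
   ```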
   
   ### Anything else
   
   This happens every time, and it is a blocker for using the SparkSubmitOperator.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

