Hi All,

We are facing an issue and would be grateful if anyone could help us with it.
Environment: Spark, Kubernetes, and Airflow.
Airflow is used to schedule Spark jobs on Kubernetes.
We use a bash script that runs the spark-submit command to submit the Spark
jobs.
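
For reference, the Airflow task looks roughly like the following. This is a
simplified sketch (Airflow 2.x import path assumed; the DAG/task names, API
server address, container image, and application path are placeholders, not
our real values):

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="spark_k8s_example",
    start_date=datetime(2021, 10, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    submit_job = BashOperator(
        task_id="submit_spark_job",
        # spark-submit runs in cluster mode against the Kubernetes master;
        # Airflow only considers the task done once this command exits
        bash_command=(
            "spark-submit "
            "--master k8s://https://<k8s-apiserver>:443 "
            "--deploy-mode cluster "
            "--name example-job "
            "--conf spark.kubernetes.container.image=<spark-image> "
            "local:///opt/spark/app/job.py"
        ),
    )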
Issue:
We are submitting the Spark job through Airflow in *cluster mode*. However,
when the job is completed and the executors are shut down, Airflow is not
able to schedule another job.

From our investigation, we found that the jobs are completed, but the
spark-submit command in our script does not exit and keeps running,
repeatedly printing the following log:
21/10/12 08:54:26 INFO LoggingPodStatusWatcherImpl: Application status for
spark-3f914f93ad684743b1a7b17aa26b4329 (phase: Running)

To confirm the issue is not on the Airflow side, we killed the spark-submit
command manually, and Airflow was then able to schedule another job. So our
observation is that, after the job completes, the spark-submit process
somehow fails to exit and keeps running.

FYI, we already stop the Spark session in our code.
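A simplified sketch of how the job ends (PySpark shown for illustration; the
actual application logic and names are omitted):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("example-job").getOrCreate()
try:
    # ... actual job logic runs here ...
    pass
finally:
    # The session is stopped explicitly before the driver exits
    spark.stop()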
One odd observation is that everything works completely fine in local mode;
we are currently testing client mode as well.


We would be grateful for any guidance on this.


Thanks,
Shishir
