Ferdinanddb commented on issue #57416:
URL: https://github.com/apache/airflow/issues/57416#issuecomment-3458933409

   This is the behavior I am talking about:
   
   As the screenshot shows, the SparkApplication completed but was not 
removed, presumably because the worker had an issue:
   <img width="1940" height="256" alt="Image" 
src="https://github.com/user-attachments/assets/a394b871-a148-4b53-8446-700548f260a8" />
   
   The first attempt did not end well (this happens occasionally, but I 
guess it is normal with CeleryExecutor that an attempt sometimes fails):
   
   <img width="2220" height="716" alt="Image" 
src="https://github.com/user-attachments/assets/0876233e-82b7-4c83-ba4e-278f6c0cc221" />
   
   However, I have the feeling that the issue is fixed now: the Spark 
application started from scratch and succeeded on the next retry. Before 
upgrading to 3.1.1 (and upgrading all the providers using the constraints 
file), I believe I saw a different behavior: the next runs were blocked, and 
I had to manually remove the SparkApplication objects using `kubectl` so that 
the job could restart properly. I am closing this issue, sorry for the noise.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]