atrbgithub commented on PR #58047:
URL: https://github.com/apache/airflow/pull/58047#issuecomment-3723315582

   Hi @uranusjr, this still seems to be broken in the latest release, 3.1.5.
   
   To test, I did the following:
   1. Spun up a cluster locally with unmodified code.
   2. Started tasks via the executor.
   3. Stopped the scheduler before the tasks finished.
   4. The pods went to a completed state.
   5. Started the scheduler again.
   6. The pods were never cleaned up.
   
   However, it seems that in main the `event_scheduler` this change relies on 
has been removed. To fix the issue in 3.1.5, a diff to 
`kubernetes_executor.py` looks as follows:
   
   ```
   74a75
   > from airflow.utils.event_scheduler import EventScheduler
   161a163
   >         self.event_scheduler: EventScheduler | None = None
   252a255,262
   >         self.event_scheduler = EventScheduler()
   > 
   >         self.event_scheduler.call_regular_interval(
   >             conf.getfloat("scheduler", "orphaned_tasks_check_interval", fallback=300.0),
   >             self._adopt_completed_pods,
   >             (self.kube_client,),
   >         )
   > 
   419a430,432
   >         if self.event_scheduler:
   >             self.event_scheduler.run(blocking=False)
   > 
   ```
   
   That ensures completed pods are cleaned up. I'm not sure whether re-adding 
the event scheduler is the right approach for main, given that it has been 
removed; there might be a better way to resolve it.
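
   For illustration, here is a minimal sketch of the pattern the diff restores: 
a callback registered to run at a regular interval, driven by a non-blocking 
`run()` call on each executor sync. It uses only the standard library `sched` 
module. `PeriodicScheduler`, `adopt_completed_pods`, the fake clock, and the 
`"fake-client"` value are hypothetical stand-ins for this example and are not 
taken from the Airflow code base.

```python
import sched


class PeriodicScheduler(sched.scheduler):
    """Hypothetical stand-in for the event scheduler used in the diff."""

    def call_regular_interval(self, delay, action, arguments=()):
        # Wrap the action so that it re-registers itself after each run.
        def repeat(*args):
            action(*args)
            self.enter(delay, 1, repeat, args)

        self.enter(delay, 1, repeat, arguments)


calls = []


def adopt_completed_pods(kube_client):
    # Placeholder for the real clean-up; it only records that it was invoked.
    calls.append(kube_client)


# A fake clock keeps the example deterministic (assumption for the demo only).
fake_clock = [0.0]
scheduler = PeriodicScheduler(
    timefunc=lambda: fake_clock[0],
    delayfunc=lambda _seconds: None,
)
scheduler.call_regular_interval(300.0, adopt_completed_pods, ("fake-client",))

scheduler.run(blocking=False)  # nothing is due yet, returns immediately
fake_clock[0] = 300.0
scheduler.run(blocking=False)  # the callback fires and is re-registered
fake_clock[0] = 900.0
scheduler.run(blocking=False)  # fires again; one call per run() at most here
```

   Because `run(blocking=False)` only executes events that are already due, 
calling it from the sync loop adds no waiting; the callback simply fires on the 
first sync after each interval has elapsed.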

