atrbgithub commented on PR #58047:
URL: https://github.com/apache/airflow/pull/58047#issuecomment-3723315582
Hi @uranusjr, this still seems to be broken in the latest release,
3.1.5.
To test, I did the following:
1. Spun up a cluster locally with unmodified code.
2. Spun up tasks via the executor.
3. Stopped the scheduler before they finished.
4. The pods went to a completed state.
5. Started the scheduler again.
6. The pods were never cleaned up.
However, it seems that in main the event_scheduler, which this change relies
on, has been removed. A diff to `kubernetes_executor.py` that fixes the issue
in 3.1.5 looks as follows:
```
74a75
> from airflow.utils.event_scheduler import EventScheduler
161a163
>         self.event_scheduler: EventScheduler | None = None
252a255,262
>         self.event_scheduler = EventScheduler()
>
>         self.event_scheduler.call_regular_interval(
>             conf.getfloat("scheduler", "orphaned_tasks_check_interval", fallback=300.0),
>             self._adopt_completed_pods,
>             (self.kube_client,),
>         )
>
419a430,432
>         if self.event_scheduler:
>             self.event_scheduler.run(blocking=False)
>
```
That ensures completed pods are cleaned up. I'm not sure whether re-adding the
event scheduler is the right approach for main, given that it has been removed;
there might be a better way to resolve it.
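
For anyone following along, here is a rough standalone sketch of the pattern the diff relies on: register a callback at a fixed interval, then poll with `run(blocking=False)` from the executor's sync loop so only events that are already due get fired. It uses the standard library `sched` module only. `RegularIntervalScheduler`, `FakeClock`, and `demo` are hypothetical names for illustration, not Airflow code, and the callback simply records the time instead of adopting pods.

```python
import sched


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class RegularIntervalScheduler(sched.scheduler):
    """Hypothetical stand-in for the EventScheduler used in the diff."""

    def call_regular_interval(self, delay, action, arguments=()):
        """Call action(*arguments) every `delay` seconds by re-arming after each run."""

        def repeat():
            action(*arguments)
            # Re-arm: schedule the next call `delay` seconds after this one.
            self.enter(delay, 1, repeat)

        self.enter(delay, 1, repeat)


def demo():
    clock = FakeClock()
    fired_at = []

    # The delay function is never invoked when run() is called with blocking=False.
    scheduler = RegularIntervalScheduler(clock.time, clock.advance)
    # 300.0 mirrors the fallback interval in the diff; the callback is a placeholder.
    scheduler.call_regular_interval(300.0, lambda: fired_at.append(clock.now))

    # Simulated sync loop: each tick advances time by 100s, then polls once.
    for _ in range(10):
        clock.advance(100.0)
        scheduler.run(blocking=False)  # fires due events only, then returns
    return fired_at
```

Polling with `blocking=False` is what keeps the sync loop from stalling: if nothing is due, the call returns immediately.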