Friendly ping on the above! Has anyone encountered this, by chance? We're still seeing it occasionally on longer-running tasks.

On Tue, Nov 20, 2018 at 10:31 AM Kevin Lam <[email protected]> wrote:
> Hi,
>
> We run Apache Airflow in Kubernetes in a manner very similar to what is
> outlined in puckel/docker-airflow [1] (Celery Executor, Redis for
> messaging, Postgres).
>
> Lately, we've encountered some of our Tasks getting stuck in a running
> state, and printing out the errors:
>
> [2018-11-20 05:31:23,009] {models.py:1329} INFO - Dependencies not met for
> <TaskInstance: BLAH 2018-11-19T19:19:50.757184+00:00 [running]>, dependency
> 'Task Instance Not Already Running' FAILED: Task is already running, it
> started on 2018-11-19 23:29:11.974497+00:00.
>> [2018-11-20 05:31:23,016] {models.py:1329} INFO - Dependencies not met for
>> <TaskInstance: BLAH 2018-11-19T19:19:50.757184+00:00 [running]>, dependency
>> 'Task Instance State' FAILED: Task is in the 'running' state which is not a
>> valid state for execution. The task must be cleared in order to be run.
>>
>>
> Is there any way to avoid this? Does anyone know what causes this issue?
>
> This is quite problematic. The task is stuck in running state without
> making any progress when the above error occurs, and so turning on
> retries doesn't help with getting our DAGs to reliably run to completion.
>
> Thanks!
>
> [1] https://github.com/puckel/docker-airflow
>
