internetcoffeephone commented on issue #41123:
URL: https://github.com/apache/airflow/issues/41123#issuecomment-2286514059

   @EvertonSA You are right, I included these for completeness.
   
   I found another clue: the second (erroneous) execution of the task always starts ~30 seconds after the first, never off by more than 1 second.
   For example:
   ```
   [2024-08-13, 03:55:37 CEST] {{taskinstance.py:2077}} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: test_dag.check_basics scheduled__2024-08-12T00:00:00+00:00 [queued]>
   ...
   [2024-08-13, 03:56:06 CEST] {{taskinstance.py:2067}} INFO - Dependencies not met for <TaskInstance: test_dag.check_basics scheduled__2024-08-12T00:00:00+00:00 [running]>, dependency 'Task Instance State' FAILED: Task is in the 'running' state.
   ```
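
For anyone who wants to check the gap on their own logs, here is a minimal sketch that computes the interval between the two timestamps quoted above. The helper name `gap_seconds` and the constants are hypothetical, not part of Airflow; the format string assumes the `YYYY-MM-DD, HH:MM:SS` prefix shown in these log lines, and the timezone label is dropped because both lines share it.

```python
from datetime import datetime

# Timestamps copied from the two log lines above (both CEST, so the offset cancels out).
FIRST_RUN = "2024-08-13, 03:55:37"
SECOND_ATTEMPT = "2024-08-13, 03:56:06"

# Assumed layout of the timestamp prefix in the task log.
LOG_TIME_FORMAT = "%Y-%m-%d, %H:%M:%S"


def gap_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    ended = datetime.strptime(end, LOG_TIME_FORMAT)
    return (ended - started).total_seconds()


gap = gap_seconds(FIRST_RUN, SECOND_ATTEMPT)
print(f"second attempt started {gap:.0f} s after the first")
```

Running the same calculation over several affected task instances would show whether the gap really stays within 1 second of 30.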
   
   The fixed interval hints at which config settings may be involved, but the only candidate I was able to find was: `min_serialized_dag_update_interval = 30`
   
   I don't understand the interaction between Airflow and Celery well enough to know exactly where to look. Either Airflow is scheduling or picking up tasks that it shouldn't, or Celery is not correctly communicating the state of running tasks. Any pointers to where in the Airflow code this process happens would be appreciated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
