robinedwards opened a new issue #18943:
URL: https://github.com/apache/airflow/issues/18943


   ### Apache Airflow version
   
   2.2.0 (latest released)
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   ```
   apache-airflow-providers-amazon @ file:///root/.cache/pypoetry/artifacts/c9/69/16/ffa2eb7a2e6e850a7048eaf66b6c40c990ef7c58149f20d3d3f333a2e9/apache_airflow_providers_amazon-2.2.0-py3-none-any.whl
   apache-airflow-providers-celery @ file:///root/.cache/pypoetry/artifacts/6e/1b/2f/f968318a7474e979af4dc53893ecafe8cd11a98a94077a9c3c27304eb7/apache_airflow_providers_celery-2.1.0-py3-none-any.whl
   apache-airflow-providers-ftp @ file:///root/.cache/pypoetry/artifacts/8b/9a/dd/79a36c62bc7f37f98d0ea33652570e19272e8a7a2297db13a6785698d1/apache_airflow_providers_ftp-2.0.1-py3-none-any.whl
   apache-airflow-providers-http @ file:///root/.cache/pypoetry/artifacts/52/28/81/03a89147daf7daceb55f1218189d1c4af01c33c45849b568769ca6765f/apache_airflow_providers_http-2.0.1-py3-none-any.whl
   apache-airflow-providers-imap @ file:///root/.cache/pypoetry/artifacts/1c/5d/c5/269e8a8098e7017a26a2a376eb3020e1a864775b7ff310ed39e1bd503d/apache_airflow_providers_imap-2.0.1-py3-none-any.whl
   apache-airflow-providers-postgres @ file:///root/.cache/pypoetry/artifacts/fb/69/ac/e8e25a0f6a4b0daf162c81c9cfdbb164a93bef6bd652c1c00eee6e0815/apache_airflow_providers_postgres-2.3.0-py3-none-any.whl
   apache-airflow-providers-redis @ file:///root/.cache/pypoetry/artifacts/cf/2b/56/75563b6058fe45b70f93886dd92541e8349918eeea9d70c703816f2639/apache_airflow_providers_redis-2.0.1-py3-none-any.whl
   apache-airflow-providers-sqlite @ file:///root/.cache/pypoetry/artifacts/61/ba/e9/c0b4b7ef2599dbd902b32afc99f2620d8a616b3072122e90f591de4807/apache_airflow_providers_sqlite-2.0.1-py3-none-any.whl
   ```
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   AWS ECS, Celery Executor, Postgres 13, S3 Logging, Sentry integration
   
   ### What happened
   
   We noticed Sentry reporting a lot of integrity errors when inserting into the `task_fail` table with a null execution date.
   
   This appears to be caused specifically by zombie task failures (we use AWS ECS Spot instances).
   
   Specifically this callback from the dag file processor:
   
   
https://github.com/apache/airflow/blob/e6c56c4ae475605636f4a1b5ab3884383884a8cf/airflow/models/taskinstance.py#L1746
   
   Adds a task_fail here: 
https://github.com/apache/airflow/blob/e6c56c4ae475605636f4a1b5ab3884383884a8cf/airflow/models/taskinstance.py#L1705
   
   This blows up when the session flushes further down the method. I believe this is because when the task instance is refreshed from the database, the `dag_run` relationship is not populated, so the proxy from `ti.execution_date` to `ti.dag_run.execution_date` returns `None`.
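
   The null propagation described above can be sketched with a toy stand-in for the ORM proxy (class names mirror Airflow's, but this is not its actual code):

   ```python
   # Hypothetical sketch: a property that proxies through an unloaded
   # relationship silently yields None instead of raising.
   from datetime import datetime


   class DagRun:
       def __init__(self, execution_date):
           self.execution_date = execution_date


   class TaskInstance:
       def __init__(self, dag_run=None):
           # After a plain refresh-from-DB, the dag_run relationship
           # may not be populated, leaving this attribute as None.
           self.dag_run = dag_run

       @property
       def execution_date(self):
           # Proxy to the parent DagRun; returns None whenever the
           # relationship was never loaded.
           return self.dag_run.execution_date if self.dag_run else None


   fresh = TaskInstance(dag_run=DagRun(datetime(2021, 10, 9)))
   stale = TaskInstance()  # refreshed without loading dag_run

   print(fresh.execution_date)  # 2021-10-09 00:00:00
   print(stale.execution_date)  # None -> NULL in the task_fail INSERT
   ```

   With a `None` execution date, the subsequent `task_fail` insert violates the table's NOT NULL constraint at flush time.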
   
   ### What you expected to happen
   
   The row is inserted into `task_fail` successfully and the failure callback is triggered.
   
   ### How to reproduce
   
   Run this dag:
   
   ```python
   import logging
   import time
   from datetime import datetime
   
   from airflow import DAG
   from airflow.operators.python import PythonOperator
   
   
   def long_running_task():
        for _ in range(60):
           time.sleep(5)
           logging.info("Slept for 5")
   
   
   def log_failure_dag(*args, **kwargs):
       logging.error("Our failure callback")
   
   
   dag = DAG(
       dag_id="test_null_task_fail",
       schedule_interval='@daily',
       catchup=True,
       start_date=datetime(2021, 10, 9),
       max_active_runs=1,
       max_active_tasks=1,
       on_failure_callback=log_failure_dag,
   )
   
   with dag:
       PythonOperator(
           task_id="long_running",
           python_callable=long_running_task,
           on_failure_callback=log_failure_dag
       )
   ```
   
   Kill the Celery worker while it is executing the `long_running` task, then wait for the scheduler's zombie reaper to detect the task and call the failure handler.
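
   To simulate the Spot-instance interruption locally, the worker can be hard-killed; the process pattern below is an assumption about a typical Celery deployment, so adjust it for your setup:

   ```shell
   # Hard-kill any Celery worker process so its running task becomes a
   # zombie (SIGKILL skips Celery's graceful warm-shutdown handling).
   pkill -9 -f "celery worker" || true
   echo "worker killed"
   ```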
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

