robinedwards opened a new issue #18943:
URL: https://github.com/apache/airflow/issues/18943
### Apache Airflow version
2.2.0 (latest released)
### Operating System
Debian GNU/Linux 11 (bullseye)
### Versions of Apache Airflow Providers
```
apache-airflow-providers-amazon @
file:///root/.cache/pypoetry/artifacts/c9/69/16/ffa2eb7a2e6e850a7048eaf66b6c40c990ef7c58149f20d3d3f333a2e9/apache_airflow_providers_amazon-2.2.0-py3-none-any.whl
apache-airflow-providers-celery @
file:///root/.cache/pypoetry/artifacts/6e/1b/2f/f968318a7474e979af4dc53893ecafe8cd11a98a94077a9c3c27304eb7/apache_airflow_providers_celery-2.1.0-py3-none-any.whl
apache-airflow-providers-ftp @
file:///root/.cache/pypoetry/artifacts/8b/9a/dd/79a36c62bc7f37f98d0ea33652570e19272e8a7a2297db13a6785698d1/apache_airflow_providers_ftp-2.0.1-py3-none-any.whl
apache-airflow-providers-http @
file:///root/.cache/pypoetry/artifacts/52/28/81/03a89147daf7daceb55f1218189d1c4af01c33c45849b568769ca6765f/apache_airflow_providers_http-2.0.1-py3-none-any.whl
apache-airflow-providers-imap @
file:///root/.cache/pypoetry/artifacts/1c/5d/c5/269e8a8098e7017a26a2a376eb3020e1a864775b7ff310ed39e1bd503d/apache_airflow_providers_imap-2.0.1-py3-none-any.whl
apache-airflow-providers-postgres @
file:///root/.cache/pypoetry/artifacts/fb/69/ac/e8e25a0f6a4b0daf162c81c9cfdbb164a93bef6bd652c1c00eee6e0815/apache_airflow_providers_postgres-2.3.0-py3-none-any.whl
apache-airflow-providers-redis @
file:///root/.cache/pypoetry/artifacts/cf/2b/56/75563b6058fe45b70f93886dd92541e8349918eeea9d70c703816f2639/apache_airflow_providers_redis-2.0.1-py3-none-any.whl
apache-airflow-providers-sqlite @
file:///root/.cache/pypoetry/artifacts/61/ba/e9/c0b4b7ef2599dbd902b32afc99f2620d8a616b3072122e90f591de4807/apache_airflow_providers_sqlite-2.0.1-py3-none-any.whl
```
### Deployment
Other Docker-based deployment
### Deployment details
AWS ECS, Celery Executor, Postgres 13, S3 Logging, Sentry integration
### What happened
We noticed Sentry reporting a lot of integrity errors when inserting into the
task_fail table with a null execution date.
This seemed to be caused specifically by zombie task failures (we use AWS
ECS Spot instances).
Specifically this callback from the dag file processor:
https://github.com/apache/airflow/blob/e6c56c4ae475605636f4a1b5ab3884383884a8cf/airflow/models/taskinstance.py#L1746
Adds a task_fail here:
https://github.com/apache/airflow/blob/e6c56c4ae475605636f4a1b5ab3884383884a8cf/airflow/models/taskinstance.py#L1705
This blows up when the session flushes further down the method. I believe this
is because when the task instance is refreshed from the database, the `dag_run`
relationship is not populated, so the proxy from `ti.execution_date` to
`ti.dag_run.execution_date` returns `None`.
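The failure mode described above can be sketched with simplified, hypothetical stand-in classes (not Airflow's actual models): `execution_date` delegates to the related `dag_run`, so it silently becomes `None` when that relationship was not loaded on refresh.

```python
from datetime import datetime


class DagRun:
    """Stand-in for the dag_run row that owns the execution date."""

    def __init__(self, execution_date):
        self.execution_date = execution_date


class TaskInstance:
    """Stand-in task instance whose execution_date proxies to dag_run."""

    def __init__(self, dag_run=None):
        # May be None when the instance is refreshed from the database
        # without the dag_run relationship being populated.
        self.dag_run = dag_run

    @property
    def execution_date(self):
        # Proxies to the related DagRun; returns None instead of raising
        # when the relationship is missing.
        return self.dag_run.execution_date if self.dag_run else None


# Normal case: the relationship is loaded, the proxy works.
ti = TaskInstance(DagRun(datetime(2021, 10, 9)))

# Zombie-failure case: refreshed instance with no dag_run attached, so a
# task_fail insert built from ti.execution_date gets a NULL and violates
# the NOT NULL constraint at flush time.
stale = TaskInstance(dag_run=None)
```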
### What you expected to happen
The row is inserted into task_fail successfully and the failure callback is triggered.
### How to reproduce
Run this dag:
```python
import logging
import time
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def long_running_task():
    for i in range(60):
        time.sleep(5)
        logging.info("Slept for 5")


def log_failure_dag(*args, **kwargs):
    logging.error("Our failure callback")


dag = DAG(
    dag_id="test_null_task_fail",
    schedule_interval='@daily',
    catchup=True,
    start_date=datetime(2021, 10, 9),
    max_active_runs=1,
    max_active_tasks=1,
    on_failure_callback=log_failure_dag,
)

with dag:
    PythonOperator(
        task_id="long_running",
        python_callable=long_running_task,
        on_failure_callback=log_failure_dag,
    )
```
Kill the Celery worker while it is executing the long_running task, then wait
for the scheduler's zombie reaper to detect the zombie and invoke the failure
handler.
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)