awildturtok opened a new issue #19469:
URL: https://github.com/apache/airflow/issues/19469
### Apache Airflow version
2.1.3
### Operating System
Debian GNU/Linux 10 (buster)
### Versions of Apache Airflow Providers
apache-airflow==2.1.3
apache-airflow-providers-amazon==2.1.0
apache-airflow-providers-celery==2.0.0
apache-airflow-providers-cncf-kubernetes==2.0.2
apache-airflow-providers-docker==2.1.0
apache-airflow-providers-elasticsearch==2.0.2
apache-airflow-providers-ftp==2.0.0
apache-airflow-providers-google==5.0.0
apache-airflow-providers-grpc==2.0.0
apache-airflow-providers-hashicorp==2.0.0
apache-airflow-providers-http==2.0.0
apache-airflow-providers-imap==2.0.0
apache-airflow-providers-microsoft-azure==3.1.0
apache-airflow-providers-mysql==2.1.0
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-postgres==2.0.0
apache-airflow-providers-redis==2.0.0
apache-airflow-providers-sendgrid==2.0.0
apache-airflow-providers-sftp==2.1.0
apache-airflow-providers-slack==4.0.0
apache-airflow-providers-sqlite==2.0.0
apache-airflow-providers-ssh==2.1.0
### Deployment
Other Docker-based deployment
### Deployment details
Container image based on official airflow image.
### What happened
To circumvent troubles with one of our database providers, we have resorted
to very brute force retry methodology:
```{python}
except pyodbc.Error as error:
# We retry the query if we can guess that it's due to too much
contention.
if should_retry(error):
raise error
reschedule_date = datetime.now() +
timedelta(minutes=int(random.uniform(60, 120)))
logging.info(f"Rescheduling for {reschedule_date}")
raise AirflowRescheduleException(reschedule_date=reschedule_date)
```
This does work somewhat, in that airflow does reschedule our tasks, but they
are re-executed much too soon. As can be seen in the santizied log from below.
```
[2021-11-08 10:51:41,965] {full_export.py:129} INFO - Rescheduling for
2021-11-08 12:50:41.965787
[2021-11-08 10:51:42,001] {local_task_job.py:151} INFO - Task exited with
return code 1
[2021-11-08 10:51:42,012] {taskinstance.py:1505} INFO - Marking task as
UP_FOR_RETRY. execution_date=20211106T111520, start_date=20211108T094536,
end_date=20211108T095142
---
[2021-11-08 12:28:56,945] {full_export.py:129} INFO - Rescheduling for
2021-11-08 13:41:56.945495
[2021-11-08 12:28:57,005] {local_task_job.py:151} INFO - Task exited with
return code 1
[2021-11-08 12:28:57,014] {taskinstance.py:1505} INFO - Marking task as
UP_FOR_RETRY. execution_date=20211106T111520, start_date=20211108T111715,
end_date=20211108T112857
----
[2021-11-08 12:49:51,009] {full_export.py:129} INFO - Rescheduling for
2021-11-08 14:13:51.009492
[2021-11-08 12:49:51,058] {local_task_job.py:151} INFO - Task exited with
return code 1
[2021-11-08 12:49:51,067] {taskinstance.py:1505} INFO - Marking task as
UP_FOR_RETRY. execution_date=20211106T111520, start_date=20211108T113956,
end_date=20211108T114951
---
[2021-11-08 13:05:59,583] {full_export.py:129} INFO - Rescheduling for
2021-11-08 15:03:59.583298
[2021-11-08 13:05:59,636] {local_task_job.py:151} INFO - Task exited with
return code 1
[2021-11-08 13:05:59,646] {taskinstance.py:1505} INFO - Marking task as
UP_FOR_RETRY. execution_date=20211106T111520, start_date=20211108T115742,
end_date=20211108T120559
----
[2021-11-08 13:16:13,485] {full_export.py:129} INFO - Rescheduling for
2021-11-08 14:45:13.485722
[2021-11-08 13:16:13,551] {local_task_job.py:151} INFO - Task exited with
return code 1
[2021-11-08 13:16:13,562] {taskinstance.py:1505} INFO - Marking task as
FAILED. execution_date=20211106T111520, start_date=20211108T121101,
end_date=20211108T121613
```
### What you expected to happen
The tasks should be started at the time specified in the exception and not
an arbitrary time within 10 minutes.
### How to reproduce
```{python}
import logging
from datetime import datetime, timedelta
import random
from airflow import DAG
from airflow.exceptions import AirflowRescheduleException
from airflow.operators.python import PythonOperator
def retry():
try:
# Maybe also sleep a bit here
raise ValueError("Hi")
except ValueError:
reschedule_date = datetime.now() +
timedelta(minutes=int(random.uniform(60, 120)))
logging.info(f"Rescheduling for {reschedule_date}")
raise AirflowRescheduleException(reschedule_date=reschedule_date)
with DAG(dag_id="Am300_Pipeline") as dag:
for iteration in range(30):
do_export = PythonOperator(task_id="retry", python_callable=retry)
```
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]