CSammy opened a new issue #19844:
URL: https://github.com/apache/airflow/issues/19844
### Apache Airflow version
2.2.2 (latest released)
### Operating System
Debian GNU/Linux 10 (buster) / official Airflow Docker image
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.4.0
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-cncf-kubernetes==2.1.0
apache-airflow-providers-docker==2.3.0
apache-airflow-providers-elasticsearch==2.1.0
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-google==6.1.0
apache-airflow-providers-grpc==2.0.1
apache-airflow-providers-hashicorp==2.1.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-azure==3.3.0
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-postgres==2.3.0
apache-airflow-providers-redis==2.0.1
apache-airflow-providers-sendgrid==2.0.1
apache-airflow-providers-sftp==2.2.0
apache-airflow-providers-slack==4.1.0
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.3.0
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
Deployment via Helm chart on GKE: Helm chart v1.3.0, Docker tag pinned to
`2.2.2-python3.9`, in an isolated namespace on Kubernetes 1.16.
Customization:
- git-sync activated
I can provide the full output of `airflow info` if desired.
Since the question came up in a previous conversation: the executor is the
`CeleryExecutor`.
### What happened
In a DAG with KubernetesPodOperators, the following settings were used:
```python
schedule_interval="0 0 * * 6",
start_date=datetime.datetime(2021, 11, 1),
catchup=False,
```
When the DAG is run via the Airflow UI, backfill runs for the dates
`2021-11-13` and `2021-11-20` are created and executed.
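For reference, the fire times of the cron expression `0 0 * * 6` (Saturday midnight) between the `start_date` and the approximate filing date of this issue can be enumerated with the standard library alone. The `now` value below is an assumption based on the issue timeline, not something stated in the report:

```python
import datetime

def saturday_midnights(start, end):
    """Fire times of cron "0 0 * * 6" (Saturday 00:00) in [start, end)."""
    day = datetime.datetime.combine(start.date(), datetime.time.min)
    if day < start:
        day += datetime.timedelta(days=1)
    fires = []
    while day < end:
        if day.weekday() == 5:  # Saturday
            fires.append(day)
        day += datetime.timedelta(days=1)
    return fires

# "now" is an assumption: roughly when this issue was filed (2021-11-24).
print(saturday_midnights(datetime.datetime(2021, 11, 1),
                         datetime.datetime(2021, 11, 24)))
# -> 2021-11-06, 2021-11-13, 2021-11-20
```

So the two logical dates reported above correspond to the two most recent completed Saturday intervals before the assumed trigger date.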
### What you expected to happen
I expected a single run for today to be created and executed, with no
backfill runs.
### How to reproduce
Complete DAG file:
```python
import datetime
import os

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Placeholder: the actual default_args were not included in this report.
default_args = {}

with DAG(
    dag_id="debug_dag",
    # Saturday midnight
    schedule_interval="0 0 * * 6",
    start_date=datetime.datetime(2021, 11, 1),
    catchup=False,
    tags=["debug dag for catchup tests"],
    default_args=default_args,
) as dag:
    gcp_test_task = KubernetesPodOperator(
        # Task name in the Airflow 2 UI
        task_id="gcp-test-task",
        # Pod name
        name="task-gcp-test-task",
        image="google/cloud-sdk:slim",
        cmds=["sleep", "300"],
        namespace=os.environ["K8S_NAMESPACE"],
        # K8s service account linked to the GCP service account
        service_account_name="airflow2-dag-default",
        image_pull_policy="Always",
        get_logs=True,
    )
```
Click the "Run" button in the UI to see the backfill runs being created.
### Anything else
This behaviour has been reproducible with multiple DAGs having this
`schedule_interval` and `start_date`.
It is not reproducible in the same way, however, with `schedule_interval="10 3
* * *", start_date=datetime.datetime(2021, 11, 1), catchup=False`. For that
combination the UI shows "Next Run: 2021-11-25 03:10:00" (still not what I
expected, but at least it does not backfill the entire month).
Possibly this is a misunderstanding about scheduling and/or backfill on my
part.
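For what it's worth, the "Next Run" value shown for the daily schedule matches the next cron fire time after the assumed creation date of the DAG. This is only an arithmetic check with the standard library, not a model of Airflow's internal scheduling; the `now` value is an assumption:

```python
import datetime

def next_daily_fire(now, hour=3, minute=10):
    """Next fire time of cron "10 3 * * *" strictly after `now` (stdlib only)."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += datetime.timedelta(days=1)
    return candidate

# Assumption: the DAG was created sometime on 2021-11-24 after 03:10.
print(next_daily_fire(datetime.datetime(2021, 11, 24, 12, 0)))
# -> 2021-11-25 03:10:00
```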
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)