o-nikolas commented on a change in pull request #19237:
URL: https://github.com/apache/airflow/pull/19237#discussion_r737876692
##########
File path:
airflow/providers/amazon/aws/example_dags/example_dms_full_load_task.py
##########
@@ -50,20 +49,14 @@
with DAG(
dag_id='dms_full_load_task_run_dag',
- default_args={
- 'owner': 'airflow',
- 'depends_on_past': False,
- 'email': ['[email protected]'],
- 'email_on_failure': False,
- 'email_on_retry': False,
- },
dagrun_timeout=timedelta(hours=2),
- start_date=days_ago(2),
+ start_date=datetime(2021, 1, 1),
schedule_interval='0 3 * * *',
+ catchup=False,
Review comment:
> I feel like there are plenty of examples in the repo, docs, and in the
> wild where catchup=False is used, but maybe the question is "why don't new
> users know about catchup early on"? It might be better to present this
> concept/parameter more explicitly for new users in the tutorial documentation.
> WDYT?

I think this relies on the good intentions of new users: that they have read
all the docs and have a fuller understanding than can be expected when they
first get started with Airflow. Mechanisms always work better than good
intentions IMHO, so to me it seems like a cheap and easy fix to add
`catchup=False` to all examples. This will prevent the possible catchup
explosion more often and, to your point, it also elevates that option so people
see it and are perhaps more likely to read the docs about it.
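
To make the "catchup explosion" concrete, here is a rough back-of-the-envelope sketch in plain Python. It does not use Airflow and is not the scheduler's actual logic; it only approximates how many daily intervals have elapsed since the `start_date` in this example DAG, assuming a daily 03:00 schedule. The function names and the "now" value are hypothetical, chosen for illustration.

```python
from datetime import datetime, timedelta


def count_completed_daily_intervals(start_date: datetime, now: datetime, hour: int = 3) -> int:
    """Count daily intervals (ticking at `hour`:00) that fully elapsed between start_date and now."""
    # First schedule tick at or after start_date.
    first_tick = start_date.replace(hour=hour, minute=0, second=0, microsecond=0)
    if first_tick < start_date:
        first_tick += timedelta(days=1)
    # No interval is complete until one full day after the first tick.
    if now < first_tick + timedelta(days=1):
        return 0
    # timedelta // timedelta yields the whole number of elapsed days.
    return (now - first_tick) // timedelta(days=1)


def runs_created_on_first_enable(start_date: datetime, now: datetime, catchup: bool) -> int:
    """Approximate number of DAG runs created when the DAG is first unpaused (assumption, not Airflow code)."""
    completed = count_completed_daily_intervals(start_date, now)
    if catchup:
        # Every elapsed interval gets its own run.
        return completed
    # Without catchup, at most the latest interval is run.
    return min(completed, 1)


if __name__ == "__main__":
    start = datetime(2021, 1, 1)
    hypothetical_now = datetime(2021, 10, 28, 12, 0)  # illustrative value only
    print(runs_created_on_first_enable(start, hypothetical_now, catchup=True))
    print(runs_created_on_first_enable(start, hypothetical_now, catchup=False))
```

Under these assumptions, a new user who unpauses the example ten months after its `start_date` would trigger roughly 300 runs with catchup enabled, versus a single run with `catchup=False`.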