iroddis commented on a change in pull request #5787: [AIRFLOW-5172] Add choice
of interval edge scheduling
URL: https://github.com/apache/airflow/pull/5787#discussion_r340614930
##########
File path: airflow/jobs/scheduler_job.py
##########
@@ -648,7 +648,7 @@ def create_dag_run(self, dag, session=None):
if dag.schedule_interval == '@once':
period_end = next_run_date
elif next_run_date:
- period_end = dag.following_schedule(next_run_date)
+ period_end = dag.period_end(next_run_date)
Review comment:
Nomenclature has absolutely been a pain in this change. The entire
vocabulary around scheduling seems to have exploded, and I can't find a clear
definition for all the terms. This is well outside the scope of this change,
but having well defined terms for things like:
period_start_ts - Timestamp of the start of the period to be processed
period_end_ts - Timestamp of the end of the period to be processed
scheduled_execution_ts - Timestamp of when the execution should start
execution_ts - Timestamp of actual execution start
Many of the current names, like start_date, are datetimes, not dates. It's
also not immediately obvious what "start" means: is it period_start_ts, or
execution_ts?
I'm happy to change the functions to be whatever is popular, but I think
there'd be real value in defining, clearly, and in a single spot, all of the
timestamps around a particular DAG's execution, giving them meaningful names,
and consolidating on that.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services