dstandish opened a new pull request, #59724: URL: https://github.com/apache/airflow/pull/59724
Currently it's a bit complicated and confusing. It remains so after this update, but hopefully a little improved. For example we have an argument "last automated dagrun" that takes not a dag run but a data interval. And, when we call it, in places in the scheduler, we may call it with a run that is not actually "the last automated dagrun" -- it could be any dag run the scheduler touches, which of course could be a very old dag run. I don't solve that problem here. But I do try to make things a little clearer and a little simpler at the scheduler call sites. In particular, there were two things that bothered me. 1. in the function `_should_update_dag_next_dagruns`, we did not only *check* whether we should update. In some branches we actually *did* the update then said "no, don't update." I pull this update higher up where it's more visible. 2. it's not obvious that the `calculate_dagrun_date_fields` function actually mutates the dag. It just says "calculate," right? So, to improve this slightly, and to reduce noise a bit at call site, I wrap this (along with the update mentioned above) in a function called `_update_next_dagrun_fields`. At least it's clear we are (at least probably) updating the dag fields. This has the side benefit bringing all invocations of `calculate_dagrun_date_fields` into one function in the scheduler (which is invoked in a few places). So hopefully modest clarity improvement, and hopefully no behavior change. Will have to update some tests but will wait for them to fail first. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
