dstandish commented on a change in pull request #22184:
URL: https://github.com/apache/airflow/pull/22184#discussion_r828760938
##########
File path: airflow/dag_processing/processor.py
##########
@@ -412,20 +410,20 @@ def manage_slas(self, dag: DAG, session: Session = None)
-> None:
else:
while next_info.logical_date < ts:
next_info = dag.next_dagrun_info(next_info.data_interval,
restricted=False)
-
if next_info is None:
break
- if (ti.dag_id, ti.task_id, next_info.logical_date) in
recorded_slas_query:
+ next_run_id = DR.generate_run_id(DagRunType.SCHEDULED,
next_info.logical_date)
+ if (ti.dag_id, ti.task_id, next_run_id, ti.map_index) in
recorded_sla_misses:
Review comment:
though it is weird ... i don't understand why we immediately call next
run here
https://github.com/apache/airflow/blob/main/airflow/dag_processing/processor.py#L414
we're already at "the next run" relative to the last completed TI -- why
don't we see if _that_ run has an SLA miss? it seems we skip to "the run
_after_ the run after" the latest TI. so that we could only get an SLA miss if
airflow is 2 runs behind 🤪
i need to actually do some live testing to understand how this actually
behaves.
update: i did some live testing, and yeah, SLAs are all screwed up
here's an example:

i created a task that is sleep(300), SLA of one minute, and runs every 5
minutes. here you can see that first TI gets no miss (expected) and the second
TI does not get any miss. the one that is created is always 2 runs ahead of
the latest successful TI 🤦
and one ahead of the one that is running.
also visible here is an odd quirk that appears to be a bug re timetables or
data intervals or something in that area: the execution dates are wiggly i.e.
not a value `N * timedelta + dag.start_date`. they have seemingly random
millis precision. and as a consequence, not only are the SlaMiss records in
the future, but i have found that they may _never_ correspond to TIs that
actually come into existence. 🤷
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]