dstandish commented on PR #50984: URL: https://github.com/apache/airflow/pull/50984#issuecomment-2914266370
@pierrejeambrun re

> We shouldn't have to do this because it can yield wrong results. We are capable of emitting a query that will select 1 Run per dag_id, the one that has the max start_date and in case of multiple rows for the same dag_id and same max start_date will then choose the single row with the max_dag_run_id as a second criteria. (maybe with a window function over the partition of dagrun with the latest_start date or two nested subqueries)

Yes, it is _possible_ to write such a query, but it would be expensive. This is a simplification that would generally be true: it would always show the latest created run. But if you cleared an old dag run, it would still show the latest created dag run rather than the one you cleared. So yeah, we could do the more complicated query, but to me it doesn't really seem worth it.

What do you think, @jedcunningham?
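
For illustration only, here is a minimal sketch of the two approaches being compared. It does not use the PR's actual code or the real Airflow schema: the `dag_run` table below is a hypothetical three-column stand-in, the sample rows are made up, and it runs on an in-memory SQLite database (window functions assume SQLite 3.25 or newer). Run 1 plays the role of an old run that was cleared and therefore has a newer `start_date` than the more recently created run 2.

```python
import sqlite3

# Hypothetical, simplified stand-in for the dag_run table (not the real Airflow schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag_run (id INTEGER PRIMARY KEY, dag_id TEXT, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO dag_run (id, dag_id, start_date) VALUES (?, ?, ?)",
    [
        (1, "etl", "2025-05-27T10:00:00"),     # old run, cleared and re-run later
        (2, "etl", "2025-05-26T00:00:00"),     # latest created run for "etl"
        (3, "report", "2025-05-26T08:00:00"),
        (4, "report", "2025-05-26T08:00:00"),  # ties with run 3 on start_date
    ],
)

# Simplification: the latest created run per dag_id, i.e. the highest id.
SIMPLE = """
SELECT dag_id, MAX(id) AS id
FROM dag_run
GROUP BY dag_id
ORDER BY dag_id
"""

# More precise variant: max start_date per dag_id, ties broken by highest id.
PRECISE = """
SELECT dag_id, id
FROM (
    SELECT dag_id, id,
           ROW_NUMBER() OVER (
               PARTITION BY dag_id
               ORDER BY start_date DESC, id DESC
           ) AS rn
    FROM dag_run
) AS ranked
WHERE rn = 1
ORDER BY dag_id
"""

simple_rows = conn.execute(SIMPLE).fetchall()
precise_rows = conn.execute(PRECISE).fetchall()

print(simple_rows)   # [('etl', 2), ('report', 4)]
print(precise_rows)  # [('etl', 1), ('report', 4)]
```

The two queries agree except for the cleared run: the simplification reports run 2 for `etl`, while the window-function variant reports run 1. The sketch says nothing about relative cost on a real `dag_run` table, which is the trade-off under discussion.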
