Daniel,

Thank you for writing this up. I commented on the wiki page.

Vikram

On Fri, Dec 20, 2024 at 3:26 PM Daniel Standish <dpstand...@gmail.com> wrote:

> OK folks, let's get into it.
>
> AIP-83 fundamentally alters Airflow's dag run semantics in a way that I
> think no one fully appreciated when it was proposed and adopted.
> Previously the logical key of dag_run was dag_id + logical_date. That
> combination defined what a dag run means, and uniquely identified a dag
> run. In AIP-83, we removed the constraint, which is easy enough; but we
> did not do anything to maintain backcompat, or to address the semantic
> ambiguities introduced. Ultimately, we need to decide as a community what
> to do.
>
> I don't expect to impose my will upon the community. I let go of outcome.
> But I aim to help folks understand the issue so that we can collectively
> arrive at a good decision for the project. And I would only ask that, if
> you intend to engage and comment, you try to be patient and read and
> consider the whole doc. It probably won't take as long as it looks.
>
> One fundamental question is, should we continue to support the old
> semantics or not? Do users expect or depend on the old semantics? And do
> we care?
>
> If not, then we ought to be clear about this with users. If we want to
> support the old semantics, then we need to decide what that looks like.
>
> Here is a draft AIP amendment
> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-83+amendment+to+support+classic+Airflow+authoring+style>
> where I've outlined the history, motivations, concerns, and options
> that I am aware of, and as I understand them. There are 5 options. My
> vote would be anything between 1 and 3, with mild preference for 3.
>
> Personally, I don't see any reason why we cannot or should not allow users
> to elect to design dags with the old semantics. To be clear, I am no
> champion of this design pattern; in my former life as a data engineer, I
> would say my dags were execution-date-driven something less than 1% of the
> time. It makes a lot of sense for a hive / presto / athena shop, but much
> less e.g. for snowflake. But I recognize that many folks do use it, and
> Airflow has gotten this far assuming *all* dags are designed this way, so
> maybe we should allow users to optionally keep those semantics in Airflow
> 3.
>
> The other thing that we should think through is, *why* are we removing
> uniqueness? Why do we care? What does this buy us? If it's just about
> allowing manual triggering of dags that don't care about logical date, then
> one option would be to just make logical date nullable and call it a day.
>
> Alright, happy holidays. Let's try and step our way towards consensus and
> figure out the best path forward for Airflow.
>
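
To make the "nullable logical date" option concrete, here is a minimal sketch of how a uniqueness constraint on dag_id + logical_date could coexist with manual runs that carry no logical date. It is not Airflow's actual schema or migration: the simplified dag_run table and the helpers build_demo and try_insert are hypothetical, and the sketch assumes a database that treats NULLs as distinct inside a UNIQUE constraint (SQLite does; some other databases differ).

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a simplified, hypothetical dag_run table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE dag_run (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            logical_date TEXT,              -- nullable: manual runs may omit it
            UNIQUE (dag_id, logical_date)   -- the old logical key, kept for dated runs
        )
        """
    )
    return conn


def try_insert(conn, dag_id, logical_date):
    """Insert a run; return False if the uniqueness constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO dag_run (dag_id, logical_date) VALUES (?, ?)",
            (dag_id, logical_date),
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = build_demo()
    print(try_insert(conn, "etl", "2024-12-20"))  # first dated run is accepted
    print(try_insert(conn, "etl", "2024-12-20"))  # duplicate dated run is rejected
    print(try_insert(conn, "etl", None))          # manual run without a logical date
    print(try_insert(conn, "etl", None))          # a second such run is also accepted
```

Under these assumptions, dated runs keep the old one-run-per-logical-date meaning, while any number of runs without a logical date can be triggered for the same dag.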