Hi @Tzu-ping Chung <t...@astronomer.io>, I just had a chance to look over the AIP after noticing that the vote thread started.
I have a few questions / things to note.

It looks like this is more than simply removing the unique constraint, but actually following through on removing execution date entirely and replacing it with logical date -- is that right? E.g. here:

> The execution date value may still be shown to the user when useful. It
> must be under another name, such as *logical date*.

And here:

> The execution_date key shall be removed from the task execution context.

So basically, this AIP removes execution_date completely, it seems. If that's right, I would think we should just go ahead and rename the column (to logical_date) on dag run, no? WDYT? I.e. get rid of any reference to it? Why not? (A toy sketch of what that migration might look like is at the bottom of this mail.)

Other question.... It's one thing to remove the constraint (and possibly rename). But have you thought through under what circumstances we would create / delete / keep dag runs with the same execution_date? E.g. do you envision that we keep a copy of all dag runs for all time? E.g. if you clear a run with execution_date X, do we create another one with date X and keep the old one? Or do we mutate it, as is the behavior now? What about if there are 3 runs for execution_date X? What happens if we "clear that execution_date for that dag"? Should we run all 3 instances? Or delete them and just run one?

Re:

> Arguments in Python functions and the REST API using execution_date to look
> up a DAG run should be removed. The arguments should all have already been
> deprecated.

Should we add to this, "and ensure that we replace it with logical_date in all cases"?

Other question. Right now, the dag run table, and specifically the execution date field there, counts as the store of data completeness for dags. Thus when we backfill a certain range, if there's already a run for execution_date X, we don't rerun it. But if there can be multiple runs with the same execution date, it sorta becomes not well defined whether the "data is there" or not. Might not be a real problem, but it seems there might be some ambiguities arising from that (rough sketch of what I mean below).

Thinking about the backfill case, it's similar to the clear case mentioned above. Probably we just rerun all of them if it's backfill with clear? And if not clear, we just rerun the failed ones?

Another one that comes to mind is if `catchup=True` and the task has `depends_on_past=True`. In that scenario, if we have 3 runs with execution date X, 2 success and one failed, when evaluating depends on past, I guess there's some ambiguity: for run X+1, do we check that all the TIs for all 3 immediately preceding runs are success? Probably yes, I guess? (Sketch of the two readings below.) There might be other kinds of cases.
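On the rename idea above, just to make it concrete, something roughly like the following is what I have in mind. This is only a toy sketch, not a real migration -- revision ids, the unique-constraint change, and the actual column type (I put a placeholder TIMESTAMP there) are all hand-waved:

```python
# Toy sketch only -- not a real Airflow migration.
import sqlalchemy as sa
from alembic import op


def upgrade():
    op.alter_column(
        "dag_run",
        "execution_date",
        new_column_name="logical_date",
        existing_type=sa.TIMESTAMP(timezone=True),  # placeholder for the real column type
    )


def downgrade():
    op.alter_column(
        "dag_run",
        "logical_date",
        new_column_name="execution_date",
        existing_type=sa.TIMESTAMP(timezone=True),
    )
```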
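On the data-completeness point, here's a toy illustration of the ambiguity I mean (plain dataclasses and made-up function names, not Airflow internals):

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DagRun:
    logical_date: datetime
    created_at: datetime
    state: str  # e.g. "success", "failed", "running"


# Today: at most one run per date, so "is this date covered?" is just an
# existence check against dag_run.
def is_covered_today(runs: list[DagRun], when: datetime) -> bool:
    return any(r.logical_date == when for r in runs)


# Once duplicates are allowed, backfill has to pick a policy. Two of
# several plausible readings:
def covered_if_any_success(runs: list[DagRun], when: datetime) -> bool:
    return any(r.logical_date == when and r.state == "success" for r in runs)


def covered_if_latest_success(runs: list[DagRun], when: datetime) -> bool:
    matching = sorted(
        (r for r in runs if r.logical_date == when),
        key=lambda r: r.created_at,
    )
    return bool(matching) and matching[-1].state == "success"
```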
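And for the depends_on_past question, a rough sketch of the two possible readings, with a made-up list standing in for this task's TI states across the 3 preceding runs:

```python
# Toy model: the state of this task's TI in each of the 3 runs for the
# immediately preceding logical date (the "2 success, 1 failed" case).
preceding_ti_states = ["success", "failed", "success"]

# Strict reading: the task must have succeeded in every run for that date.
strict_ok = all(s == "success" for s in preceding_ti_states)  # False

# Looser reading: one successful run for that date is enough.
loose_ok = any(s == "success" for s in preceding_ti_states)  # True

print(strict_ok, loose_ok)
```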