Replied inline

> On 2 Aug 2024, at 03:52, Daniel Standish <daniel.stand...@astronomer.io> wrote:
>
> Hi @Tzu-ping Chung <mailto:t...@astronomer.io>
>
> I just had a chance to look over the AIP after noticing that the vote thread started.
>
> I have a few questions / things to note. It looks like this is more than simply removing the unique constraint, but actually following through on removing execution date entirely, and replacing with logical date -- is that right?
>
> E.g. here
>
>> The execution date value may still be shown to the user when useful. It must be under another name, such as logical date.
>
> And here
>
>> The execution_date key shall be removed from the task execution context.
>
> So basically, this AIP removes execution_date completely, it seems.
Logical date *is* execution date, under another name, so I guess yes and no, depending on your definition of “entirely”. In many places we keep the same value under both keys for backward compatibility (accessing the value as execution_date emits a deprecation warning), and this AIP cleans up the old key so the value is not exposed redundantly.

> If that's right, I would think we should just go ahead and rename the column (to logical) on dag run, no? WDYT? I.e. get rid of any reference to it? Why not?

We can, especially in the SQLAlchemy model. I’m not sure how practical it is to rename the column in the database—we can certainly do it if it’s cheap, but if it requires a long migration I’d say it’s not worthwhile, since SQLAlchemy can easily map a field to a column under a different name anyway.

> Other question.... It's one thing to remove the constraint (and possibly rename). But, have you thought through under what circumstances we would create / delete / keep dag runs with the same execution_date? E.g., do you envision that we keep a copy of all dag runs for all time? E.g. if you clear a run with execution_date X, do we create another one with date X and keep the old one? Or do we mutate it, like is the behavior now? What about if there's 3 runs for execution_date X. What happens if we "clear that execution_date for that dag"? Should we run all 3 instances? Or delete them and just run one?
>
> Re
>
>> Arguments in Python functions and the REST API using execution_date to look up a DAG run should be removed. The arguments should all have already been deprecated.
>
> Should we add to this, "and ensure that we replace it with logical_date in all cases"?

Runs are already identified by run_id now, and will continue to be. Clearing a run clears the entry associated with that run_id. Two runs with the same execution_date have different run_id values; they are different entities, and operating on one does not affect the other.
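On the rename point above, what I have in mind is SQLAlchemy’s standard attribute-to-column name override. A minimal sketch (a toy model for illustration, not the actual Airflow DagRun definition):

```python
from sqlalchemy import Column, DateTime, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class DagRun(Base):
    """Toy model showing only the attribute-to-column mapping."""

    __tablename__ = "dag_run"

    run_id = Column(String(250), primary_key=True)
    # The Python attribute is logical_date, but it still reads from
    # and writes to the existing execution_date column, so no
    # (potentially expensive) database rename is required.
    logical_date = Column("execution_date", DateTime)
```

So the ORM-level rename is cheap either way; whether the physical column follows is purely a migration-cost question.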
This is the implication of the proposal on only removing the unique constraint. Note that different runs of the same execution date are conceptually different from different task instance tries. The DAG run table does not currently have the ability to keep history. We do have clear_number, which is similar to try_number on TI, but it is not currently part of the unique constraint, and changing it à la AIP-64 is out of this proposal’s scope.

> Other question. Right now, the dag run table, and specifically the execution date field there, counts as the store of data completeness for dags. Thus when we backfill a certain range, if there's already a run for execution_date X, we don't rerun it. But, if there can be multiple runs with same execution date, it sorta becomes not well defined whether the "data is there" or not. Might not be a real problem. But seems there might be some ambiguities arising from that. Thinking about the backfill case, it's similar to the clear case mentioned above. Probably we just rerun all of them if it's backfill with clear? And if not clear, we just rerun the failed ones?
>
> Another one that comes to mind is if `catchup=True` and task has depends_on_past=True. In that scenario, if we have 3 runs with execution date X, 2 success one fail, when evaluating depends on past, for run X+1, do we check that all the TIs for all 3 immediately preceding runs are success? Probably yes, I guess? There might be other kinds of cases.

This is what partitioning aims to solve—if a value contributes to the data’s identity, it should be a part of the partition key. Before we can implement that on good ol’ DAGs, however, I think it’s more practical to follow two principles:

- Runs of the same execution date are distinct and represent distinct data
- Use id as the secondary ordering

This should work for most cases, especially since we still reuse the same dag_run row when clearing/rerunning.
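To make the ordering idea concrete, here is a toy sketch in plain Python (invented data, not actual scheduler code) of how a depends_on_past-style check could treat multiple runs sharing a logical date, using (logical_date, id) as the total order:

```python
from datetime import date

# Hypothetical runs; ids break ties within the same logical date.
runs = [
    {"id": 1, "logical_date": date(2024, 1, 1), "state": "success"},
    {"id": 2, "logical_date": date(2024, 1, 1), "state": "failed"},
    {"id": 3, "logical_date": date(2024, 1, 1), "state": "success"},
    {"id": 4, "logical_date": date(2024, 1, 2), "state": "queued"},
]


def preceding_runs(run):
    """All runs strictly before `run` in (logical_date, id) order."""
    key = (run["logical_date"], run["id"])
    return [r for r in runs if (r["logical_date"], r["id"]) < key]


current = runs[3]
# "Probably yes" above: require *every* run of the immediately
# preceding logical date to have succeeded.
prev_date = max(r["logical_date"] for r in preceding_runs(current))
prev_runs = [r for r in preceding_runs(current) if r["logical_date"] == prev_date]
deps_met = all(r["state"] == "success" for r in prev_runs)
# deps_met is False here: the run with id=2 on 2024-01-01 failed.
```

This matches the 2-success-1-failure scenario quoted above: the new run would wait until all three predecessors succeed.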
Again, changing this is out of scope here.

TP