dstandish commented on code in PR #59724:
URL: https://github.com/apache/airflow/pull/59724#discussion_r2645737644


##########
airflow-core/src/airflow/jobs/scheduler_job_runner.py:
##########
@@ -1815,9 +1816,10 @@ def _create_dag_runs(self, dag_models: 
Collection[DagModel], session: Session) -
             # we need to set DagModel.next_dagrun_info if the DagRun already 
exists or if we
             # create a new one. This is so that in the next scheduling loop we 
try to create new runs
             # instead of falling in a loop of IntegrityError.

Review Comment:
   do you mean because i added the if condition here?
   
   ```
               if created_dag_run:
                   self._update_next_dagrun_fields(
                       serdag=serdag,
                       dag_model=dag_model,
                       session=session,
                       
active_non_backfill_runs=active_runs_of_dags[serdag.dag_id],
                       data_interval=data_interval,
                   )
   ```
   
   > Would that cause a bug where a DagRun exists but DagModel.next_dagrun 
wasn't updated (the edge case mentioned in the comment), the scheduler will get 
stuck in an infinite loop attempting to create the same dag run repeatedly?
   
   No this is not possible because the dag processor, every time it processes a 
dag, it runs dag_model.calculate_dagrun_date_fields and if it needs to be 
incremented, it will be.
   
   Basically, if the scheduler is already trying to create the dagrun that 
already exists, then something effed up, or something weird happed (like manual 
trigger of a future logical date).
   
   The most common scenario would be, I believe, that the scheduler  
_decremented_ next_dagrun.  
   
   This can happen in the scheduler because suppose it creates 5 runs for 
catchup, but then an older one finishes after a later one -- then the scheduler 
may set the next_dagrun backward because of the way this code works.
   
   **I think we should add logic so that the scheduler is never setting the 
next_dagrun backward**. but i am not trying to change that here.
   
   So ultimately, I don't think it makes much difference whether we 
conditionally update based on created or not -- but it doesn't hurt to always 
update in this context, so I removed the condition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to