hussein-awala commented on PR #35392:
URL: https://github.com/apache/airflow/pull/35392#issuecomment-1804762730

   > I have yet to find the first person that finds catchup's False current 
behavior logical
   
   Actually, I find it logical and even necessary in some cases. For example, I 
have a daily DAG that runs a heavy Spark job; this job processes all the data, 
not just the records whose timestamp falls within the DAG run's data interval. 
Sometimes a run takes more than one day to finish (sensors are delayed, the job 
itself takes about 8h, and if it fails there is a retry with backoff, etc.). 
Creating a new DAG run when the current one hasn't finished before the next 
execution date is useless for me, so to skip it I use `max_active_runs=1` with 
`catchup=False`.
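   
   To make the skipping concrete, here is a toy simulation in plain Python. It 
is a sketch, not Airflow's scheduler code: the function name `created_runs`, 
the dates, and the run durations are all hypothetical, and the three rules in 
the docstring are simplifying assumptions about how `catchup=False` and 
`max_active_runs=1` interact on a daily schedule.

```python
from datetime import datetime, timedelta

DAY = timedelta(days=1)


def created_runs(first_interval_start, run_durations):
    """Return the data-interval start of every run the toy scheduler creates.

    Simplified model of a daily DAG with catchup=False and max_active_runs=1
    (an illustration under stated assumptions, not Airflow's scheduler code):
      * the run for interval [d, d + 1 day) cannot start before d + 1 day;
      * no new run is created while another run is still active;
      * once the scheduler is free, it jumps to the most recent completed
        interval instead of backfilling the ones it missed.
    """
    created = []
    interval_start = first_interval_start
    free_at = first_interval_start  # nothing is running at the beginning
    for duration in run_durations:
        created.append(interval_start)
        started_at = max(interval_start + DAY, free_at)
        free_at = started_at + duration

        # Start of the most recent interval that has fully elapsed at free_at.
        whole_days = (free_at - first_interval_start) // DAY
        latest_completed = first_interval_start + (whole_days - 1) * DAY

        # Never go backwards; skip ahead only if intervals were missed.
        interval_start = max(interval_start + DAY, latest_completed)
    return created


# Hypothetical durations: the second run is stuck for 60 hours.
runs = created_runs(
    datetime(2023, 11, 1),
    [timedelta(hours=8), timedelta(hours=60), timedelta(hours=8)],
)
print([r.strftime("%m-%d") for r in runs])  # ['11-01', '11-02', '11-04']
```

   In this model the 11-03 interval never gets a run: the 11-02 run is still 
active when 11-03 becomes due, and once it finishes only the latest completed 
interval (11-04) is scheduled.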
   
   This is just one example, but other users do the same thing in some ML use 
cases, such as re-training a model on the full dataset.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

