Jorricks commented on PR #25489:
URL: https://github.com/apache/airflow/pull/25489#issuecomment-1209121259

   @argibbs these changes look great! Very well described as well.
   
   >  am acutely aware / wary of the fact that this 
separation-of-the-queues-plus-config-param is purely solving a problem that is 
(for me at least) purely a theoretically-possible issue. I've never seen it, 
but I don't think it's premature optimisation to tackle it in this change.
   
   We have had the case where because of a crucial singel point of failure, 
hardly any DAG was able to continue (as we use a lot of ExternalTaskSensors). 
This meant that all of our schedulers were down and handling SLA fires for a 
long time so I am very happy to see this optimisation 👍 
   
   ### Note: I don't think this should be included in this MR but given you 
worked on this, you probably have a good view on this.
   I was thinking. Part of what we can speed up is to instead of fire each SLA 
for the same DAG individually, group SLAs per DAG. Thereby, we add a dag with 
an SLA to the queue instead of an actual SLA. This prevents us from parsing a 
DAG file 10 times, for 10 different SLAs of the same DAG. I guess you could 
optimise this by including a dummy operator at the end that requires all the 
important tasks to succeed, but that's not super user friendly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to