potiuk commented on PR #59604:
URL: https://github.com/apache/airflow/pull/59604#issuecomment-3716169724

   This is a pretty cool "reliability" feature. I think it is also 
something we should implement in a number of other places, because it can 
provide resilience to transient issues. But I think it needs someone who has 
a deeper understanding of deps handling, so I will refrain from approving it 
for some time (though it's tempting).
   
   One thing to add, though: I think we should have a better way of 
signalling that those issues are happening, metrics for example, or maybe even 
a warning in the UI, displayed as a dismissable notification when it happens. 
While I think it's cool that we handle this on our own, it might hide systemic 
issues that the deployment manager should handle, so while we should let it 
self-recover, we should also notify about those issues pretty aggressively.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
