potiuk commented on issue #27614: URL: https://github.com/apache/airflow/issues/27614#issuecomment-1336132339
Generally speaking - yes, the callback can be executed both by the running task (when the task CAN execute the callback) and by the DagFileProcessor (when the callback cannot be executed in the task - for example when the task was forcefully killed). That's the theory.

However, this is a distributed system, and there are different delivery modes for such distributed operations: at-most-once, at-least-once and exactly-once. They are explained, for example, here: https://medium.com/@madhur25/meaning-of-at-least-once-at-most-once-and-exactly-once-delivery-10e477fafe16#:~:text=As%20the%20name%20suggests%2C%20At,exception%2C%20the%20message%20is%20lost.

This needs a closer look, as I think it is not explicitly specified - it would be great to verify it and at least document how the callbacks are delivered. Exactly-once is surprisingly hard to achieve in distributed systems, and I think it's almost impossible when you consider all kinds of failure cases. I believe what our callback mechanism tries to achieve is "at-least-once". I am not 100% sure that this is achieved in all kinds of situations, and I am not sure exactly what sequence of events leads to duplicates, but generally speaking I think we cannot guarantee that a callback will be executed exactly once.

If this is easily reproducible, then it might be a bug, because with "at-least-once" delivery duplicates should only happen when some unexpected events occur. If this is happening in "normal" circumstances - this is a bug. If it is triggered by some error scenarios - not so much, and likely we won't be able to do anything about it, because it would be too complex and costly to try to implement "exactly-once".
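
If duplicates do show up, one common way to live with "at-least-once" delivery is to make the callback itself idempotent. Below is a minimal sketch of that idea, not something Airflow provides: `make_idempotent`, the `seen` table and the SQLite file are hypothetical names chosen for illustration, and I am assuming the callback context exposes a `task_instance` with `dag_id`, `task_id`, `run_id` and `try_number`. A shared store is used rather than an in-memory set because the two deliveries may come from different processes (the task and the DagFileProcessor).

```python
import os
import sqlite3
import tempfile
from types import SimpleNamespace


def make_idempotent(callback, db_path):
    """Wrap `callback` so repeated deliveries for the same task try run it once.

    Hypothetical helper: the dedup key and the SQLite store are assumptions.
    """
    def wrapper(context):
        ti = context["task_instance"]
        key = f"{ti.dag_id}|{ti.task_id}|{ti.run_id}|{ti.try_number}"
        conn = sqlite3.connect(db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY)"
                )
                # The primary key makes the insert a no-op for a repeated key.
                cur = conn.execute(
                    "INSERT OR IGNORE INTO seen (key) VALUES (?)", (key,)
                )
                first_delivery = cur.rowcount == 1
        finally:
            conn.close()
        if first_delivery:
            callback(context)
        return first_delivery

    return wrapper


def demo():
    """Deliver the same failure twice and return the notifications sent."""
    sent = []
    with tempfile.TemporaryDirectory() as tmp:
        notify = make_idempotent(
            lambda ctx: sent.append(ctx["task_instance"].task_id),
            os.path.join(tmp, "seen.db"),
        )
        # Stand-in for the context a real callback would receive.
        ctx = {
            "task_instance": SimpleNamespace(
                dag_id="example_dag", task_id="load",
                run_id="manual__1", try_number=1,
            )
        }
        notify(ctx)  # e.g. delivered from the task process
        notify(ctx)  # e.g. delivered again by the DagFileProcessor
    return sent
```

Note the trade-off in this sketch: the key is recorded before the callback runs, so if the callback crashes halfway, a later delivery is skipped and the side effect becomes "at-most-once". Recording the key after the callback succeeds keeps "at-least-once" but lets duplicates through again when two deliveries race.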
