potiuk commented on issue #27614:
URL: https://github.com/apache/airflow/issues/27614#issuecomment-1336132339

   Generally speaking - yes, the callback can be executed both by the running 
task (when the task CAN execute the callback) and by the DagFileProcessor (when 
the callback cannot be executed in the task - for example when the task was 
forcefully killed). 
   
   That's the theory. 
   
   However - this is a distributed system - and there are different modes of 
execution of such distributed operations: at-most-once, at-least-once and 
exactly-once.
   
   Explained for example here: 
https://medium.com/@madhur25/meaning-of-at-least-once-at-most-once-and-exactly-once-delivery-10e477fafe16#:~:text=As%20the%20name%20suggests%2C%20At,exception%2C%20the%20message%20is%20lost.
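   
   To make the difference concrete, here is a minimal sketch of at-most-once 
versus at-least-once delivery. It is a toy simulation, not Airflow code: the 
`Receiver` class and both `send_*` functions are hypothetical names made up for 
this illustration, and the "lost message" / "lost ack" failures are injected by 
hand.

```python
class Receiver:
    """Toy receiver that can lose a message or lose the acknowledgement."""

    def __init__(self, drop_message_on=(), drop_ack_on=()):
        self.handled = []          # every time the handler actually ran
        self.attempt = 0           # number of delivery attempts seen so far
        self.drop_message_on = set(drop_message_on)
        self.drop_ack_on = set(drop_ack_on)

    def receive(self, message):
        """Return True only if the sender gets an acknowledgement back."""
        self.attempt += 1
        if self.attempt in self.drop_message_on:
            return False           # message lost: handler never ran
        self.handled.append(message)
        if self.attempt in self.drop_ack_on:
            return False           # handler ran, but the ack was lost
        return True


def send_at_most_once(receiver, message):
    # Fire and forget: never retry, so a lost message is simply lost.
    receiver.receive(message)


def send_at_least_once(receiver, message, max_attempts=5):
    # Retry until acknowledged: a lost ack causes a duplicate execution.
    for _ in range(max_attempts):
        if receiver.receive(message):
            return True
    return False


lossy = Receiver(drop_message_on={1})
send_at_most_once(lossy, "on_failure_callback")
print(len(lossy.handled))   # handler ran zero times

flaky_ack = Receiver(drop_ack_on={1})
send_at_least_once(flaky_ack, "on_failure_callback")
print(len(flaky_ack.handled))   # handler ran twice
```

   The duplicate in the second case only shows up because something went wrong 
(the ack was lost). When nothing fails, at-least-once delivers exactly one 
execution.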
   
   I believe this needs to be looked at a bit closer, as I think it is not 
explicitly specified - it would be great to verify it and at least document 
how the callbacks are delivered. Exactly-once is surprisingly hard to achieve 
in distributed systems - and I think it's almost impossible when you consider 
all kinds of failure cases.
   
   I believe what our callback mechanism tries to achieve is "at-least-once". I 
am not 100% sure if this is achieved in all kinds of situations, and I am not 
sure exactly what sequence of events leads to duplicates, but generally 
speaking I think we cannot guarantee that a callback will be executed 
exactly once. 
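   
   If duplicates are possible, one common way to cope is to make the callback 
itself idempotent. Below is a minimal sketch under stated assumptions, not 
something Airflow provides: `InMemoryClaims`, `callback_key` and `notify_once` 
are hypothetical names, and the `context` here is a plain dict standing in for 
the real callback context. The in-memory store only deduplicates within one 
process. Since the callback may run in the task process or in the 
DagFileProcessor, a real setup would need a store shared between them (for 
example a database table with a unique key).

```python
import threading


class InMemoryClaims:
    """Hypothetical dedup store; a real one must be shared across processes."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def claim(self, key):
        """Return True the first time a key is claimed, False afterwards."""
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


def callback_key(context):
    # Assumed shape: a simplified dict, not Airflow's actual context object.
    ti = context["ti"]
    return (ti["dag_id"], ti["task_id"], ti["run_id"], ti["try_number"])


def notify_once(context, claims, send):
    """Call send() only for the first delivery of a given task attempt."""
    key = callback_key(context)
    if not claims.claim(key):
        return False  # duplicate delivery: skip the side effect
    dag_id, task_id, run_id, try_number = key
    send(f"{dag_id}.{task_id} failed (run {run_id}, try {try_number})")
    return True


claims = InMemoryClaims()
sent = []
ctx = {"ti": {"dag_id": "etl", "task_id": "load",
              "run_id": "manual_1", "try_number": 1}}

notify_once(ctx, claims, sent.append)   # first delivery: message is sent
notify_once(ctx, claims, sent.append)   # duplicate delivery: skipped
print(sent)
```

   Note the trade-off: claiming the key before calling `send` means a crash 
between the two loses the notification, so this turns the side effect into 
at-most-once per attempt.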
   
   If this is easily reproducible, then it might be a bug, because with 
"at-least-once" duplicates should only happen when some unexpected events 
occur. If this is happening in "normal" circumstances - this is a bug. If it is 
triggered by some error scenarios - not so much, and likely we won't be able to 
do anything about it - because it would be too complex and costly to try to 
implement "exactly-once".
   
   

