jens-scheffler-bosch commented on issue #32375:
URL: https://github.com/apache/airflow/issues/32375#issuecomment-1631376367

   I read through the code for a while, trying to understand the "why". I 
don't directly see why the `task_instance_mutation_hook` fails on DB 
connectivity. To be on the safe side, you could add `finally:` blocks to the 
functions where a DB connection/engine is used, to ensure it is really closed.
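
   To illustrate the `finally:` suggestion, here is a minimal sketch. It uses 
the standard-library `sqlite3` module only as a stand-in for whatever driver or 
engine the hook really uses; the function name `fetch_state` and the `status` 
table are hypothetical and not taken from the issue.

```python
import sqlite3
from typing import Optional


def fetch_state(db_path: str) -> Optional[str]:
    """Read one value and always release the connection afterwards.

    `db_path` and the `status` table are assumptions made for this example.
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute("SELECT state FROM status LIMIT 1").fetchone()
        return row[0] if row else None
    finally:
        # Runs on normal return and on any exception, so the connection
        # is not left open if the query fails.
        conn.close()
```

   With sqlalchemy the same pattern would probably close the connection in 
`finally:` and, if the engine is created inside the hook, also call 
`engine.dispose()` there.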
   
   Question: Is the DB you connect to in the mutation hook the same DB that 
Airflow uses as storage for its data, or is it a different DB/instance? I am 
not 100% sure whether sqlalchemy shares anything when different functions open 
connections to the same backend, which could interfere with the session that 
sets up the task instance around `task_instance_mutation_hook`.
   
   Nevertheless, from a design point of view, irrespective of whether it is 
really a bug or not, I would recommend a different strategy. I believe 
`task_instance_mutation_hook` is not meant to open additional connections, 
carry a lot of business logic, or do "heavy lifting" operations. These take 
time, and since the mutation hooks are executed within the scheduler or 
webserver (and not the workers), any blocking activity (waiting for a DB...) 
will slow down the UI and/or scheduler.
   
   I'd propose moving the DB status checks into a dedicated task at the 
beginning of the DAG instead of using the `task_instance_mutation_hook`.
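
   A rough sketch of that proposal, under several assumptions: Airflow 2.4+ 
(for the `schedule` argument), a hypothetical `status` table read via `sqlite3` 
as a stand-in for the real DB, and made-up names (`upstream_is_ready`, 
`build_dag`, `example_with_precheck`). The check runs on a worker as the first 
task, so a slow DB does not block the scheduler or webserver.

```python
import sqlite3
from contextlib import closing


def upstream_is_ready(db_path: str) -> bool:
    """Return True if the (hypothetical) status table reports 'ready'."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute("SELECT state FROM status LIMIT 1").fetchone()
    return row is not None and row[0] == "ready"


def build_dag(db_path: str):
    # Imported lazily so the check above can be used without Airflow;
    # in a real DAG file these imports would sit at the top of the module.
    from datetime import datetime

    from airflow.decorators import dag, task
    from airflow.exceptions import AirflowFailException

    @dag(
        dag_id="example_with_precheck",
        schedule=None,
        start_date=datetime(2023, 1, 1),
        catchup=False,
    )
    def example_with_precheck():
        @task
        def check_status():
            # Fail early, on a worker, instead of inside a mutation hook.
            if not upstream_is_ready(db_path):
                raise AirflowFailException("upstream status is not 'ready'")

        @task
        def do_work():
            print("running the actual work")

        # do_work only runs if the pre-check task succeeded.
        check_status() >> do_work()

    return example_with_precheck()
```

   In a DAG file, the result of `build_dag(...)` would need to be assigned at 
module level so that Airflow can discover the DAG.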


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
