jens-scheffler-bosch commented on issue #32375: URL: https://github.com/apache/airflow/issues/32375#issuecomment-1631376367
I read through the code for a while trying to understand the "why". I don't directly see why the `pod_instance_mutation_hook` fails on DB connectivity. To be on the safe side, you could add `finally:` blocks to the areas/functions of code where a DB connection/engine is used, to ensure it is really closed.

Question: Is the DB you are connecting to in the mutation hook the same DB that Airflow uses to store its data? Or is it a different DB/instance? I am not 100% sure whether SQLAlchemy shares anything when different functions open connections to the same backend, and whether that could interfere with the session that sets up the task instance around `task_instance_mutation_hook`.

Nevertheless, from a design point of view, irrespective of whether this is really a bug or not, I would recommend a different strategy. I believe `task_instance_mutation_hook` is not meant to open additional connections, carry a lot of business logic, or do "heavy lifting". Such work can take time, and since mutation hooks are executed within the scheduler or webserver (and not the workers), any blocking activity (waiting for the DB...) will slow down the UI and/or the scheduler.

I'd propose instead moving the DB status checks into a dedicated task at the beginning of the DAG, rather than using `task_instance_mutation_hook`.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
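To illustrate the two suggestions in the comment above (closing the connection in a `finally:` block, and doing the status check in an ordinary task instead of a mutation hook), here is a minimal sketch. It is not taken from the issue: the function name `status_is_ready`, the `job_status` table and its columns are hypothetical, and the standard-library `sqlite3` module only stands in for whatever database the check really uses.

```python
import sqlite3


def status_is_ready(db_path: str, job_name: str) -> bool:
    """Return True if the (hypothetical) job_status table marks job_name as 'ready'.

    Intended to be the callable of a first task in the DAG,
    not something invoked from a mutation hook.
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT status FROM job_status WHERE name = ?", (job_name,)
        ).fetchone()
        # A missing row is treated as "not ready" (an assumption of this sketch).
        return row is not None and row[0] == "ready"
    finally:
        # Runs on success and on error, so the connection is never left open.
        conn.close()
```

In a DAG, a function like this could be used as the callable of a first task, for example via Airflow's `ShortCircuitOperator`, so that downstream tasks are skipped when the check returns `False`. The check then runs on a worker and cannot block the scheduler or webserver.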
