potiuk commented on pull request #19860: URL: https://github.com/apache/airflow/pull/19860#issuecomment-983661659
In short (for future reference):

* It seems that ORM libraries in spawned processes share some resources with the parent process, and reinitializing the ORM engine in the spawned process will corrupt sessions already opened in the parent :exploding_head:. We should address this separately, as it is a potentially real (though not very probable) issue that could affect our users who run the processor manager in `spawn` mode. The result is unpredictable behaviour of SQLAlchemy queries when the reload happens while an SQLAlchemy session is open.
* It was triggered in our tests by an unlikely race condition (this is an issue I am going to fix in Airflow separately). When the processor manager was terminated before it managed to change its process group, our `reap_process_group` function did not actually terminate that process. Changing the process group is the very first thing that DagProcessorManager does, so this is rather unlikely to happen in "reality", but it is rather likely to happen when our CI runners are busy running a lot of tests in parallel and the test terminates the processor_manager very quickly after it started (which was the case in the `spawn` case). I am fixing that one by catching the error that is raised in this case (which we originally re-raised) and additionally killing the DagProcessor process itself rather than killing the whole group.
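The fallback described in the second point can be sketched roughly as follows. This is a minimal POSIX-only illustration, not the actual Airflow change: the helper name `terminate_group_or_process` is hypothetical, and it assumes the error seen in the race is the `ProcessLookupError` that `os.killpg` raises when no process group with the given ID exists yet.

```python
import os
import signal
import subprocess
import sys


def terminate_group_or_process(pid: int, sig: int = signal.SIGTERM) -> str:
    """Signal the process group led by `pid`, falling back to `pid` alone.

    Hypothetical helper (not Airflow's `reap_process_group`). Returns
    "group" or "process" to show which path was taken.
    """
    try:
        # Normal case: the child already made itself a group leader,
        # so a group whose ID equals its PID exists.
        os.killpg(pid, sig)
        return "group"
    except ProcessLookupError:
        # Race case (assumed error type): the child was signalled before it
        # changed its process group, so no such group exists. Signal the
        # process directly instead of re-raising.
        os.kill(pid, sig)
        return "process"


if __name__ == "__main__":
    # This child stays in the parent's process group, which imitates a
    # manager that is terminated before it changes its group.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    path_taken = terminate_group_or_process(child.pid)
    child.wait(timeout=10)
    print(path_taken, child.returncode)
```

Without the fallback, the signal is never delivered in the race case and the child keeps running, which matches the behaviour described for the busy CI runners.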
