potiuk commented on pull request #19860:
URL: https://github.com/apache/airflow/pull/19860#issuecomment-983661659


   In short (for future reference):
   
   * It seems that ORM libraries in spawned processes share some resources with 
the parent process, and reinitializing the ORM engine in the spawned process 
will corrupt sessions already opened in the parent :exploding_head:. We should 
address this separately, as it is a potentially real (though not very probable) 
issue that could affect users who run the processor manager in `spawn` mode. 
The result is unpredictable behaviour of SQLAlchemy queries when the 
reload happens while an SQLAlchemy session is open.
   
   * It was triggered in our tests by an unlikely race condition (an 
issue I am going to fix in Airflow separately). When the processor manager was 
terminated before it managed to change its process group, our 
`reap_process_group` function did not actually terminate that process. Changing 
the process group is the very first thing that DagProcessorManager does, so this 
is rather unlikely to happen in "reality", but it is rather likely to happen 
when our CI runners are busy running a lot of tests in parallel and the test 
terminates the processor manager very quickly after it starts (which was the 
case in the `spawn` case). I am fixing that one by catching the error that happens 
in this case (which we originally re-raised) and additionally killing the 
DagProcessor itself rather than the whole group.
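
   The fallback described in the second point can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual Airflow change: `terminate_group_or_process` is a hypothetical helper name, it is POSIX-only, and it assumes the target is expected to become the leader of a process group whose id equals its own pid. If the process is signalled before it has changed its group, no such group exists yet, `os.killpg` raises `ProcessLookupError` (ESRCH), and the sketch signals the single process instead.

```python
import os
import signal


def terminate_group_or_process(pid: int, sig: int = signal.SIGTERM) -> str:
    """Signal the process group led by `pid`, or just `pid` if no such group exists.

    Hypothetical helper for illustration; returns which path was taken.
    """
    try:
        # Normal case: the child already made itself a group leader,
        # so the group id equals its pid.
        os.killpg(pid, sig)
        return "group"
    except ProcessLookupError:
        # Race: the child has not changed its process group yet, so there
        # is no group with this id. Signal the process directly instead
        # of re-raising the error.
        os.kill(pid, sig)
        return "process"
```

   A child started with `subprocess.Popen(..., start_new_session=True)` is already a group leader and would take the "group" path, while a child that still shares the parent's process group would take the "process" path.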
   



