potiuk commented on pull request #19860:
URL: https://github.com/apache/airflow/pull/19860#issuecomment-983440489


   Hey @ashb @ephraimbuddy @uranusjr 
   
   I "caught it more" in the act.
   
   I have added some more debugging for the issue and dumped both the stack 
trace and the process tree at the moment of the "dag processor reload".
   
   We now have three stack traces to analyse; see the latest errors.
   
   Findings: 
   1) Indeed, the `settings reload` that causes the problem is triggered by 
the dag processor manager.
   ```
     ----------------------------- Captured stderr call -----------------------------
       File "<string>", line 1, in <module>
       File "/usr/local/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
         exitcode = _main(fd, parent_sentinel)
       File "/usr/local/lib/python3.9/multiprocessing/spawn.py", line 129, in _main
         return self._bootstrap(parent_sentinel)
       File "/usr/local/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
         self.run()
       File "/usr/local/lib/python3.9/multiprocessing/process.py", line 108, in run
         self._target(*self._args, **self._kwargs)
       File "/opt/airflow/airflow/dag_processing/manager.py", line 268, in _run_processor_manager
         traceback.print_stack()
   ```
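
For reference, a minimal sketch of the kind of debugging hook that produces a trace like the one above. The helper name `dump_stack` and its `label` argument are hypothetical; the only thing taken from the trace is the call to `traceback.print_stack()`. Tagging the output with the PID and parent PID makes it easier to match a trace to an entry in the process tree.

```python
import io
import os
import sys
import traceback


def dump_stack(label, file=None):
    """Write the current call stack, tagged with PID and parent PID.

    Hypothetical helper: not part of Airflow. Returns the text it wrote.
    """
    buf = io.StringIO()
    buf.write(f"[{label}] pid={os.getpid()} ppid={os.getppid()}\n")
    # print_stack() reports the stack up to and including the caller's frame.
    traceback.print_stack(file=buf)
    text = buf.getvalue()
    out = sys.stderr if file is None else file
    out.write(text)
    return text
```

Calling `dump_stack("settings reload")` at the top of the code path under suspicion writes the tagged trace to stderr, which pytest then shows under "Captured stderr call".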
   
   2) The process tree is interesting. I did not know how pytest manages 
separation between different tests, but it seems that it forks a separate 
process for each test and runs one process at a time, while all the other 
processes are already loaded and waiting to start. I have not looked into the 
details yet, but I think this could explain the observed behaviour: if those 
forked processes share some memory via SQL drivers, then running an import in 
the dag processor manager could potentially reload some of that shared memory 
(for example, the mapping of object classes to the actual types of the 
entities). 
   
   I do not think I have ever seen it happen for sqlite; it only happens for 
the "real" databases, so there might be some clever handling of 
multi-processing that we are not aware of.
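
To illustrate one well-known way that forked processes can end up sharing state, here is a minimal POSIX-only sketch showing that a forked child inherits sockets opened by its parent. This does not confirm the hypothesis above; it only shows the general mechanism, which is also why SQLAlchemy's documentation warns against letting pooled connections cross a fork. The function name is hypothetical.

```python
import os
import socket


def child_uses_inherited_socket():
    """Fork, let the child write to a socket the parent opened, return the data."""
    parent_end, shared_end = socket.socketpair()
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: it never opened this socket itself, yet it can use it.
        try:
            shared_end.sendall(f"child {os.getpid()}".encode())
        finally:
            os._exit(0)  # leave without running the parent's cleanup handlers
    # Parent: read what the child sent over the inherited descriptor.
    shared_end.close()
    data = parent_end.recv(1024)
    os.waitpid(pid, 0)
    parent_end.close()
    return data
```

If a database driver keeps a connection open in the parent, every forked child holds the same underlying socket, so traffic from one process can interfere with another unless each process opens its own connection.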
   
   Excerpt:
   
   ```
    ► 1     (root) [dumb-init] 02:14 /usr/bin/dumb-init -- /entrypoint
       ├─7     (root) [bash] 02:14 bash /entrypoint
       │ └─134   (root) [bash] 02:15 bash /opt/airflow/scripts/in_container/run_ci_tests.sh --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --pythonwarnings=ignore::DeprecationWarning --pythonwarnings=ignore::PendingDeprecationWarning --junitxml=/files/test_result-Core-mssql.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 -rfEX --with-db-init tests/core tests/executors tests/jobs tests/models tests/serialization tests/ti_deps tests/utils
       │   └─185   (root) [pytest] 02:15 /usr/local/bin/python /usr/local/bin/pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --pythonwarnings=ignore::DeprecationWarning --pythonwarnings=ignore::PendingDeprecationWarning --junitxml=/files/test_result-Core-mssql.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 -rfEX --with-db-init tests/core tests/executors tests/jobs tests/models tests/serialization tests/ti_deps tests/utils
       │     ├─1448  (root) [python] 02:16 /usr/local/bin/python -B -c from multiprocessing.resource_tracker import main;main(30)
       │     ├─2127  (root) [pytest] 02:18 /usr/local/bin/python /usr/local/bin/pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --pythonwarnings=ignore::DeprecationWarning --pythonwarnings=ignore::PendingDeprecationWarning --junitxml=/files/test_result-Core-mssql.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 -rfEX --with-db-init tests/core tests/executors tests/jobs tests/models tests/serialization tests/ti_deps tests/utils
       │     ├─2141  (root) [pytest] 02:18 /usr/local/bin/python /usr/local/bin/pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --pythonwarnings=ignore::DeprecationWarning --pythonwarnings=ignore::PendingDeprecationWarning --junitxml=/files/test_result-Core-mssql.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 -rfEX --with-db-init tests/core tests/executors tests/jobs tests/models tests/serialization tests/ti_deps tests/utils
       │     ├─2157  (root) [pytest] 02:18 /usr/local/bin/python /usr/local/bin/pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --pythonwarnings=ignore::DeprecationWarning --pythonwarnings=ignore::PendingDeprecationWarning --junitxml=/files/test_result-Core-mssql.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 -rfEX --with-db-init tests/core tests/executors tests/jobs tests/models tests/serialization tests/ti_deps tests/utils
   ```
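
The long command lines make the parent/child structure hard to see, so here is a minimal sketch that renders only the PIDs and process names from the excerpt above as an indented tree. The function `render_tree` and the `(pid, ppid, name)` tuple layout are assumptions made for illustration; the tool that produced the original dump is not shown here.

```python
from collections import defaultdict


def render_tree(procs, root_pid):
    """Render (pid, ppid, name) tuples as an indented tree, one line per process."""
    children = defaultdict(list)
    names = {}
    for pid, ppid, name in procs:
        names[pid] = name
        children[ppid].append(pid)

    lines = []

    def walk(pid, depth):
        lines.append(f"{'  ' * depth}{pid} [{names[pid]}]")
        for child in sorted(children[pid]):
            walk(child, depth + 1)

    walk(root_pid, 0)
    return "\n".join(lines)


# PIDs and names copied from the excerpt; parent PIDs follow its indentation.
EXCERPT = [
    (1, 0, "dumb-init"),
    (7, 1, "bash"),
    (134, 7, "bash"),
    (185, 134, "pytest"),
    (1448, 185, "python"),
    (2127, 185, "pytest"),
    (2141, 185, "pytest"),
    (2157, 185, "pytest"),
]

print(render_tree(EXCERPT, 1))
```

In this condensed view, the main pytest process (185) has the multiprocessing resource tracker and three further pytest-named processes as direct children.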
   
   

