> Hi,
>
> There have been a lot of changes both to the C API and to internal
> implementations to allow multiple interpreters in a single O/S process.
>
> These changes break backwards compatibility, have a negative
> performance impact, and cause a lot of churn.
>
> While I'm in favour of PEP 554, or some similar model for parallelism in
> Python, I am opposed to the changes we are currently making to support it.
>
>
> What are sub-interpreters?
> --------------------------
>
> A sub-interpreter is a logically independent Python process which
> supports inter-interpreter communication built on shared memory and
> channels. Passing of Python objects is supported, but only by copying,
> not by reference. Data can be shared via buffers.
>
>
> How can they be implemented to support parallelism?
> ---------------------------------------------------
>
> There are two obvious options:
> a) Many sub-interpreters in a single O/S process. I will call this the
> many-to-one model (many interpreters in one O/S process).
> b) One sub-interpreter per O/S process. This is what we currently have
> for multiprocessing. I will call this the one-to-one model (one
> interpreter in one O/S process).
>
> There seems to be an assumption amongst those working on PEP 554 that
> the many-to-one model is the only way to support sub-interpreters that
> can execute in parallel.
> This isn't true. The one-to-one model has many advantages.
>
>
> Advantages of the one-to-one model
> ----------------------------------
>
> 1. It's less bug-prone. It is much easier to reason about code working
> in a single address space. Most code assumes
I'm curious where reasoning about address spaces comes into writing
Python code? I can't say that address space has ever been a concern to
me when coding in Python.

> 2. It's more secure. Separate O/S processes provide a much stronger
> boundary between interpreters. This is why some browsers use separate
> processes for browser tabs.
>
> 3. It can be implemented on top of the multiprocessing module, for
> testing. A more efficient implementation can be developed once
> sub-interpreters prove useful.
>
> 4. The required changes should have no negative performance impact.
>
> 5. Third party modules should continue to work as they do now.
>
> 6. It takes much less work :)
>
>
> Performance
> -----------
>
> Creating O/S processes is usually considered to be slow. Whilst
> processes are undoubtedly slower to create than threads, the absolute
> time to create a process is small; well under 1ms on Linux.
>
> Creating a new sub-interpreter typically requires importing quite a few
> modules before any useful work can be done.
> The time spent doing these imports will dominate the time to create an
> O/S process or thread.
> If sub-interpreters are to be used for parallelism, there is no need to
> have many more sub-interpreters than CPU cores, so the overhead should
> be small. For additional concurrency, threads or coroutines can be used.
>
> The one-to-one model is faster as it uses the hardware for interpreter
> separation, whereas the many-to-one model must use software.
> Process separation by the hardware virtual memory system has zero cost.
> Separation done in software needs extra memory reads when doing
> allocation or deallocation.
>
> Overall, for any interpreter that runs for a second or more, it is
> likely that the one-to-one model would be faster.
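For what it's worth, the quoted point 3 can be tried out today: the one-to-one model is straightforward to prototype on top of multiprocessing. Here is a minimal sketch, where `run_in_subprocess` and `worker` are names I made up for illustration (not PEP 554's API). Sending an object through a `Pipe` pickles it, so objects are passed by copying rather than by reference, matching the channel semantics described above.

```python
# Sketch of the one-to-one model: one "interpreter" per O/S process,
# communicating over a pipe. Objects sent through the pipe are pickled,
# so the worker receives a copy, never a shared reference.
from multiprocessing import Process, Pipe

def worker(conn):
    # Receive a copy of the data, transform it, and send a result back.
    data = conn.recv()
    conn.send([x * 2 for x in data])
    conn.close()

def run_in_subprocess(data):
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send(data)      # copied via pickling, not shared
    result = parent_conn.recv()
    p.join()
    return result

if __name__ == "__main__":
    original = [1, 2, 3]
    print(run_in_subprocess(original))  # [2, 4, 6] -- a new list
```

Mutating `original` in the worker would have no effect in the parent, which is exactly the copy-not-reference behaviour the quoted text attributes to sub-interpreter channels.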
>
>
> Timings of multiprocessing & threads on my machine (6-core 2019 laptop)
> -----------------------------------------------------------------------
>
> # Threads
>
> from threading import Thread
>
> def foo():
>     pass
>
> def spawn_and_join(count):
>     threads = [Thread(target=foo, args=()) for _ in range(count)]
>     for t in threads:
>         t.start()
>     for t in threads:
>         t.join()
>
> spawn_and_join(1000)
>
> # Processes
>
> from multiprocessing import Process
>
> def spawn_and_join(count):
>     processes = [Process(target=foo, args=()) for _ in range(count)]
>     for p in processes:
>         p.start()
>     for p in processes:
>         p.join()
>
> if __name__ == '__main__':
>     spawn_and_join(1000)
>
> Wall clock time for threads:
> 86ms. Less than 0.1ms per thread.
>
> Wall clock time for processes:
> 370ms. Less than 0.4ms per process.
>
> Processes are slower, but plenty fast enough.
>
>
> Cheers,
> Mark.
>
>
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at https://mail.python.org/archives/list/python-d...@python.org/message/5YNWDIYECDQDYQ7IFYJS6K5HUDUAWTT6/
> Code of Conduct: http://python.org/psf/codeofconduct/