Re: [Python-Dev] GIL removal question

Sturla Molden Fri, 12 Aug 2011 12:09:07 -0700

On 12.08.2011 18:57, Rene Nejsum wrote:

My two Danish kroner on GIL issues...
I think I understand the background and need for the GIL. Without it, Python programs would have been cluttered with lock/synchronized statements, and C extensions would be harder to write. Thanks to Sturla Molden for his explanation earlier in this thread.


It doesn't seem I managed to explain it :(

Yes, C extensions would be cluttered with synchronization statements, and that is annoying. But that was not my point at all!

Even with fine-grained locking in place, a system using reference counting will not scale on a multi-processor computer. Cache lines containing reference counts will become incoherent between the processors, causing a traffic jam on the memory bus.


The technical term in the parallel computing literature is "false sharing".
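
To make the reference-count traffic concrete, here is a minimal CPython-only sketch. It shows that merely creating references to an object writes to that object's reference count field, so even "read-only" use of a shared object dirties the cache line that holds it. The helper name refcount_delta is made up for this illustration, and sys.getrefcount is a CPython implementation detail.

```python
import sys


def refcount_delta(n):
    """Return how much an object's refcount grows when n references are made."""
    shared = object()
    before = sys.getrefcount(shared)
    # Building a list of n references never modifies `shared` logically,
    # yet CPython increments its reference count n times (n memory writes).
    aliases = [shared] * n
    after = sys.getrefcount(shared)
    return after - before


print(refcount_delta(1000))
```

If several threads on different processors did this to the same object, each increment would be a write to the same memory location, which is the kind of traffic described above.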

However, the GIL is also from a time when single-threaded programs running on single-core CPUs were the common case.
On a new MacBook Pro I have 8 cores and would expect my multithreaded Python program to run significantly faster than on a one-core CPU.
Instead the program slows down to a much worse performance than on a one-core CPU.

A multi-threaded program can be slower on a multi-processor computer as well, if it suffers from extensive "false sharing" (which Python programs nearly always will).

That is, instead of doing useful work, the processors are stepping on each other's toes. So they spend the bulk of the time synchronizing cache lines with RAM instead of computing.
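
As a rough back-of-the-envelope sketch of why distinct Python objects can end up on the same cache line: the 64-byte line size below is an assumption (typical for current x86 hardware, not stated in this thread), and real placement depends on the allocator, so treat the number as an upper-bound estimate only.

```python
import sys

# Assumption: 64-byte cache lines, common on x86 but hardware dependent.
CACHE_LINE_BYTES = 64

# Size of the smallest Python object; its header holds the refcount.
object_size = sys.getsizeof(object())

# Estimate of how many such objects could sit side by side in one line.
objects_per_line = CACHE_LINE_BYTES // object_size

print(f"object size: {object_size} bytes")
print(f"up to {objects_per_line} small objects per {CACHE_LINE_BYTES}-byte cache line")
```

If two threads on different processors each touch a different one of those neighbouring objects, their reference count updates still land on the same cache line, and the processors have to pass that line back and forth.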

On a computer with a single processor, there cannot be any false sharing. So even without a GIL, a multi-threaded program can often run faster on a single-processor computer. That might seem counter-intuitive at first. I have seen this "inverse scaling" blamed on the GIL many times, but that's dead wrong.

Multi-threading is hard to get right, because the programmer must ensure that processors don't access the same cache lines. This is one of the reasons why numerical programs based on MPI (multiple processes and IPC) are likely to perform better than numerical programs based on OpenMP (multiple threads and shared memory).

As for Python, it means that it is easier to make a program based on multiprocessing scale well on a multi-processor computer than a program based on threading and releasing the GIL. And that has nothing to do with the GIL! Even so, I'd estimate 99% of Python programmers would blame it on the GIL. It has to do with what shared memory does if cache lines are shared. Intuition about what affects the performance of a multi-threaded program is very often wrong. If one needs parallel computing, multiple processes are much more likely to scale correctly. Threads are better reserved for things like non-blocking I/O.
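
A minimal sketch of the multiprocessing approach, assuming a CPU-bound job that can be cut into independent slices. The function names (chunk_ranges, sum_squares, parallel_sum_squares) and the sum-of-squares workload are made up for illustration; only multiprocessing.Pool and its map method come from the standard library.

```python
from multiprocessing import Pool


def chunk_ranges(n, workers):
    """Split range(n) into `workers` contiguous (start, stop) pairs."""
    step, extra = divmod(n, workers)
    ranges, start = [], 0
    for i in range(workers):
        # The first `extra` chunks get one additional item each.
        stop = start + step + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def sum_squares(bounds):
    """CPU-bound work on one private slice of the input."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_squares(n, workers=4):
    # Each worker is a separate process with its own address space,
    # so the workers do not update each other's reference counts.
    with Pool(processes=workers) as pool:
        return sum(pool.map(sum_squares, chunk_ranges(n, workers)))


if __name__ == "__main__":
    print(parallel_sum_squares(1_000_000))
```

Only the small (start, stop) tuples and the partial sums cross process boundaries, which keeps the IPC cost low compared with the work done in each slice.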

The problem with the GIL is merely what people think it does -- not what it actually does. It is so easy to blame a performance issue on the GIL, when it is actually the use of threads and shared memory per se that is the problem.


Sturla
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev