On 12.08.2011 18:57, Rene Nejsum wrote:
> My two Danish kroner on GIL issues….

> I think I understand the background and need for the GIL. Without it Python programs would have been cluttered with lock/synchronized statements and C extensions would be harder to write. Thanks to Sturla Molden for his explanation earlier in this thread.

It doesn't seem I managed to explain it :(

Yes, C extensions would be cluttered with synchronization statements, and that is annoying. But that was not my point at all!

Even with fine-grained locking in place, a system using reference counting will not scale on a multi-processor computer. Cache lines containing reference counts are written by every processor that touches the object, so the copies held by the other processors keep getting invalidated, causing a traffic jam on the memory bus.

The technical term in the parallel computing literature is "false sharing".
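
A minimal sketch of why merely referencing an object counts as a write, assuming CPython (where the reference count is stored in the object's own header). The helper name refcount_delta is hypothetical and only serves the illustration:

```python
import sys

def refcount_delta(n_refs):
    """Return how much an object's refcount grows when n_refs new references are made."""
    obj = object()                   # a fresh, ordinary (non-immortal) object
    before = sys.getrefcount(obj)
    holders = [obj] * n_refs         # each list slot holds one new reference to obj
    after = sys.getrefcount(obj)
    return after - before

delta = refcount_delta(5)            # five new references, five increments
```

Each increment modifies memory inside the object itself, so threads that only "read" a shared object still write to the cache line that holds it.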


> However, the GIL is also from a time when single-threaded programs running on single-core CPUs were the common case.

> On a new MacBook Pro I have 8 cores and would expect my multithreaded Python program to run significantly faster than on a one-core CPU.

> Instead, the program runs much slower than on a one-core CPU.

A multi-threaded program can be slower on a multi-processor computer as well, if it suffers from extensive "false sharing" (which Python programs nearly always do).

That is, instead of doing useful work, the processors are stepping on each other's toes. So they spend the bulk of the time synchronizing cache lines with RAM instead of computing.

On a computer with a single processor, there cannot be any false sharing. So even without a GIL, a multi-threaded program can often run faster on a single-processor computer. That might seem counter-intuitive at first. I have seen this "inverse scaling" blamed on the GIL many times, but that is dead wrong.

Multi-threading is hard to get right, because the programmer must ensure that processors don't access the same cache lines. This is one of the reasons why numerical programs based on MPI (multiple processes and IPC) are likely to perform better than numerical programs based on OpenMP (multiple threads and shared memory).
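
To make "accessing the same cache lines" concrete, here is a small layout calculation. It is only a sketch: the 64-byte line size is an assumption (common on x86-64, not universal), the array is assumed to start on a line boundary, and the names cache_line_of and counter_lines are hypothetical.

```python
CACHE_LINE_BYTES = 64  # assumed cache line size; check the target CPU

def cache_line_of(offset, line_bytes=CACHE_LINE_BYTES):
    """Index of the cache line that holds the byte at `offset`."""
    return offset // line_bytes

def counter_lines(n_threads, stride_bytes, line_bytes=CACHE_LINE_BYTES):
    """Cache line used by each thread's counter when counters sit `stride_bytes` apart."""
    return [cache_line_of(i * stride_bytes, line_bytes) for i in range(n_threads)]

# Four per-thread 8-byte counters stored back to back: all in one line.
packed = counter_lines(4, stride_bytes=8)

# The same counters padded to one per cache line: no line is shared.
padded = counter_lines(4, stride_bytes=64)
```

In the packed layout every thread updates a different counter, yet all four updates hit the same cache line, which is exactly the situation the programmer has to avoid. Separate processes sidestep it because they do not share those lines to begin with.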

As for Python, it means that it is easier to make a program based on multiprocessing scale well on a multi-processor computer than a program based on threading and releasing the GIL. And that has nothing to do with the GIL! Yet I'd estimate 99% of Python programmers would blame it on the GIL. It has to do with what shared memory does if cache lines are shared. Intuition about what affects the performance of a multi-threaded program is very often wrong. If one needs parallel computing, multiple processes are much more likely to scale correctly. Threads are better reserved for things like non-blocking I/O.
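
A minimal sketch of the process-based approach, using the standard library's concurrent.futures.ProcessPoolExecutor. The workload (a sum of squares) and the names chunk_bounds, sum_of_squares and parallel_sum_of_squares are hypothetical stand-ins for a real computation:

```python
from concurrent.futures import ProcessPoolExecutor

def chunk_bounds(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def sum_of_squares(bounds):
    """CPU-bound work on one chunk; touches only process-local objects."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))

def parallel_sum_of_squares(n, workers=4):
    """Give each worker process its own chunk and combine the partial sums."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunk_bounds(n, workers)))
```

Call parallel_sum_of_squares from under an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module. Each worker has its own interpreter and its own objects, so no reference counts are shared between processors; only the small chunk tuples and the partial sums cross process boundaries.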

The problem with the GIL is merely what people think it does -- not what it actually does. It is so easy to blame a performance issue on the GIL, when it is actually the use of threads and shared memory per se that is the problem.

Sturla
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev