Python doesn't use a pthreads mutex for the GIL. It has always used a binary semaphore implemented with condition variables (or just a pthreads semaphore if available). The reason the performance is so bad is precisely due to the fact that it is using this implementation and the fact that there *IS* a FIFO queue of threads (associated with the condition variable). The I/O performance problem with the new GIL gets much worse with many CPU-bound threads precisely because there is a FIFO queue involved. This has been covered in my past GIL presentations.
-Dave

On Mar 16, 2010, at 5:52 AM, Kristján Valur Jónsson wrote:

> How about attacking the original problem, then?
>
> The reason they thrash on a pthreads implementation is that a pthreads
> mutex is assumed to be a short-held resource. Therefore it will be
> optimized in the following ways for multicore machines:
> 1) There is a certain amount of spinning done, to try to acquire it
> before blocking.
> 2) It will employ unfair tactics to avoid lock convoying, meaning that a
> thread coming in to acquire the mutex may get in before others that are
> queued. This is why "ticking" the GIL works so badly: the thread that
> releases the lock is usually the one that reacquires it, even though
> others may be waiting. See e.g.
> http://www.bluebytesoftware.com/blog/PermaLink,guid,e40c2675-43a3-410f-8f85-616ef7b031aa.aspx
> for a discussion of this (albeit on Windows).
>
> On Windows, this isn't a problem. The reason is that the GIL on Windows
> is implemented using Event objects that don't cut these corners. The
> Event provides you with a strict FIFO queue of threads waiting for the
> event.
>
> If pthreads doesn't provide a synchronization primitive similar to that,
> one that doesn't thrash and has a true FIFO queue, it is possible to
> construct such a thing using condition variables and critical sections.
> Perhaps the POSIX semaphore API is more appropriate in this case.
>
> By the way, this also shows another problem with (old) Python. There is
> only one core locking primitive, the PyThread_type_lock. It is being
> used both as a critical section in the traditional sense, and also as
> this sort-of-inverse lock that the GIL is. In the modern world, where
> the intended behaviour of these is quite different, there is no
> one-size-fits-all. On Windows in particular, the use of the Event-based
> lock is not ideal for uses other than the GIL.
>
> In the new GIL, there appear to be several problems:
> 1) There is no FIFO queue of threads waiting for the GIL, so thread
> scheduling becomes non-deterministic.
> 2) The "ticking" of the GIL is now controlled by a condition variable
> timeout. There appears to be no way to prevent many such timeouts from
> being in progress at the same time, so you may have an unnecessarily
> high rate of ticking going on.
> 3) There isn't an immediate GIL request made when an I/O thread requests
> the GIL back, only after an initial timeout.
>
> What we are trying to write here is a thread scheduler, and that is
> complex business.
>
> K
>
>> -----Original Message-----
>> From: python-dev-bounces+kristjan=ccpgames....@python.org
>> [mailto:python-dev-bounces+kristjan=ccpgames....@python.org] On Behalf
>> Of David Beazley
>> Sent: 15. mars 2010 03:07
>> To: python-dev@python.org
>> Subject: Re: [Python-Dev] "Fixing" the new GIL
>>
>> happen to be performing CPU intensive work at the same time, it would
>> be nice if they didn't thrash on multiple cores (the problem with the
>> old GIL) and if I/O is

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com