Re: How to go about. On read/write locks
Carl Banks pavlovevide...@gmail.com (CB) wrote:

CB On Apr 6, 2:23 pm, Diez B. Roggisch de...@nospam.web.de wrote:

   This is a classical synchronization problem with a classical
   solution: you treat the readers as a group, and the writers
   individually. So you have a write lock that each writer has to
   acquire and release, but that is acquired only by the first reader
   and released by the last one. Therefore you need a counter of the
   number of readers, and manipulations of this counter must be
   protected by another lock.

   I was going to suggest a similar approach but refrained because of a
   problem I see with your code as well: if the readers are reading
   very fast (the OP didn't state what his actual application is, so it
   might not be a consumer-producer scheme, for which I don't think a
   dict would be the natural choice anyway) they can block the writer
   from writing altogether.

CB You could implement some kind of fair ordering where whoever requests
CB a lock first is guaranteed to get it first, but I can't think of a way
CB to do that without requiring all readers to acquire two locks.

The original implementation (with writer starvation) comes from [1].
They also describe a solution where writers have priority, but it needs
5 semaphores and 2 counters (one for writers and one for readers). It
can cause starvation for readers, however. For the OP this wouldn't be a
problem, because writers are rare in his situation.

However, I found a solution [2] with just one additional counter for the
number of writers and no additional semaphores. The manipulations of the
writers counter are also protected by the same mutex. This solution is
fair for both readers and writers.
Translated into Python this would be:

#
from threading import Lock

mutex = Lock()
writelock = Lock()
numreaders = 0
numwriters = 0
#
# Reader code:
mutex.acquire()
if numwriters > 0 or numreaders == 0:
    mutex.release()
    writelock.acquire()
    mutex.acquire()
numreaders += 1
mutex.release()

## critical section for reader

mutex.acquire()
numreaders -= 1
if numreaders == 0:
    writelock.release()
mutex.release()
#
# Writer code:
mutex.acquire()
numwriters += 1
mutex.release()
writelock.acquire()

## critical section for writer

mutex.acquire()
numwriters -= 1
mutex.release()
writelock.release()
#

I am going to put this in a class, with a context manager, so that it
can be used with the 'with' statement like the normal locks.

I also found a solution with no additional counters but an additional
semaphore, but I found it only in lecture notes [3,4]; I couldn't find
any scientific publications about it. So I don't know for sure if it is
correct, fair, etc.

[1] P. J. Courtois, F. Heymans, D. L. Parnas, "Concurrent control with
    'readers' and 'writers'", Communications of the ACM, v.14 n.10,
    p.667-668, Oct. 1971
[2] Jalal Kawash, "Process Synchronization with Readers and Writers
    Revisited", Journal of Computing and Information Technology - CIT
    13, 2005, 1, 43-51
[3] http://pages.cs.wisc.edu/~haryadi/537/slides/lec18-semaphores.ppt
[4] http://vorlon.case.edu/~jrh23/338/HW3.pdf
--
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
--
http://mail.python.org/mailman/listinfo/python-list
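The class with a context manager that Piet mentions might be sketched as
follows. The class and method names are illustrative, not from the
thread; the algorithm is the fair one quoted above:

```python
import threading
from contextlib import contextmanager

class ReadWriteLock:
    """Sketch of the fair readers/writers algorithm above, wrapped in
    a class with context managers so it works with 'with'."""

    def __init__(self):
        self._mutex = threading.Lock()      # protects the two counters
        self._writelock = threading.Lock()  # held by a writer or the reader group
        self._numreaders = 0
        self._numwriters = 0

    @contextmanager
    def reading(self):
        self._mutex.acquire()
        if self._numwriters > 0 or self._numreaders == 0:
            # A writer is active or waiting, or we are the first reader:
            # queue up on the write lock (this is what makes it fair).
            self._mutex.release()
            self._writelock.acquire()
            self._mutex.acquire()
        self._numreaders += 1
        self._mutex.release()
        try:
            yield
        finally:
            with self._mutex:
                self._numreaders -= 1
                if self._numreaders == 0:
                    # threading.Lock may be released by a thread other
                    # than the acquirer, so the last reader can release
                    # the lock the first reader took.
                    self._writelock.release()

    @contextmanager
    def writing(self):
        with self._mutex:
            self._numwriters += 1
        self._writelock.acquire()
        try:
            yield
        finally:
            with self._mutex:
                self._numwriters -= 1
            self._writelock.release()
```

As with the original snippet, this is untested pseudocode-made-runnable,
not a drop-in replacement for the ActiveState recipe mentioned later in
the thread.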
Re: How to go about. On read/write locks
Diez B. Roggisch de...@nospam.web.de (DBR) wrote:

   This is a classical synchronization problem with a classical
   solution: you treat the readers as a group, and the writers
   individually. So you have a write lock that each writer has to
   acquire and release, but that is acquired only by the first reader
   and released by the last one. Therefore you need a counter of the
   number of readers, and manipulations of this counter must be
   protected by another lock.

DBR I was going to suggest a similar approach but refrained because of a
DBR problem I see with your code as well: if the readers are reading very
DBR fast (the OP didn't state what his actual application is, so it might
DBR not be a consumer-producer scheme, for which I don't think a dict
DBR would be the natural choice anyway) they can block the writer from
DBR writing altogether.

DBR Or do I miss something here?

Yes, you missed note 2 at the end of my posting.
Re: How to go about. On read/write locks
Emanuele D'Arrigo wrote:

   Hi everybody,

   I'm having a threading-related design issue and I suspect it has a
   name that I just don't know. Here's a description.

   Let's assume a resource (e.g. a dictionary) that needs to be
   accessed by multiple threads. A simple lock will do the job, but in
   some circumstances it will create an unnecessary bottleneck. E.g.
   let's assume that most threads only need -read- access to the
   resource, while only a few threads actually change the dictionary.
   Ideally, the reading threads should not block each other. However,
   as soon as a thread intends to change the dictionary it should let
   all the reading threads finish, lock the access to the resource,
   change it, and then release the lock.

   I don't think it'd be difficult to implement, but I'm wondering if
   something in this direction has been done already, if it has a name,
   or if it's even a well-known design pattern. Anybody can shed some
   light?

There are two answers to this: the general one, and the
CPython-specific one.

The general answer is this: if you need reading to finish before you
write, you need a lock that guards every access to the dict. There is
no lock that is cheap to obtain but only effective for some threads.

The CPython-specific answer is that the GIL takes care of that for you
right now anyway. So unless you plan for a distant future where some
kind of swallows fly around that don't have a GIL, you are safe to
simply read and write in threads without any locking whatsoever.

Diez
Re: How to go about. On read/write locks
On Apr 6, 7:49 am, Diez B. Roggisch de...@nospam.web.de wrote:

   The CPython-specific answer is that the GIL takes care of that for
   you right now anyway. So unless you plan for a distant future where
   some kind of swallows fly around that don't have a GIL, you are safe
   to simply read and write in threads without any locking whatsoever.

Diez, thanks for your reply. I didn't know what the GIL is. I did some
research and found an interesting article that clarified many
multithreading-related concepts and issues:

http://jessenoller.com/2009/02/01/python-threads-and-the-global-interpreter-lock/

Python's approach with the GIL is both reasonable and disappointing.
Reasonable because I understand how it can make things easier for its
internals. Disappointing because it means that standard Python cannot
take advantage of the parallelism that can more and more often be
afforded by today's computers. E.g. I found out only recently, almost
by chance, that my wife's laptop has not one but two processors, even
though it isn't a particularly high-end computer. I now understand
that OS-level threading does use them both, but that the GIL
effectively prevents parallel operations. (Am I understanding
correctly?)

I do not completely understand your statement in the context of my
original example though, the shared dictionary. As the GIL is released
every X bytecode operations, surely it can happen that while the
dictionary is iterated through, e.g. in a for/in loop, a different
thread might change it, with potentially catastrophic consequences.
The GIL wouldn't be able to prevent this, would it?

Manu
Re: How to go about. On read/write locks
Emanuele D'Arrigo man...@gmail.com (ED) wrote:

ED Hi everybody,

ED I'm having a threading-related design issue and I suspect it has a
ED name that I just don't know. Here's a description.

ED Let's assume a resource (i.e. a dictionary) that needs to be accessed
ED by multiple threads. A simple lock will do the job but in some
ED circumstances it will create an unnecessary bottleneck. I.e. let's
ED assume that most threads only need to have -read- access to the
ED resource, while only a few threads actually change the dictionary.
ED Ideally, the reading threads should not block each other. However, as
ED soon as a thread intends to change the dictionary it should let all
ED the reading threads finish, lock the access to the resource, change
ED it, and then release the lock.

ED I don't think it'd be difficult to implement but I'm wondering if
ED something in this direction has been done already, if it has a name or
ED if it's even a well known design pattern.

ED Anybody can shed some light?

This is a classical synchronization problem with a classical solution:
you treat the readers as a group, and the writers individually. So you
have a write lock that each writer has to acquire and release, but that
is acquired only by the first reader and released by the last one.
Therefore you need a counter of the number of readers, and manipulations
of this counter must be protected by another lock.

#
from threading import Lock

mutex = Lock()
writelock = Lock()
numreaders = 0
#
# Reader code:
mutex.acquire()
numreaders += 1
if numreaders == 1:
    writelock.acquire()
mutex.release()

## critical section for reader

mutex.acquire()
numreaders -= 1
if numreaders == 0:
    writelock.release()
mutex.release()
#
# Writer code:
writelock.acquire()

## critical section for writer

writelock.release()
#

Notes:

1. From Python 2.6 on you can use the with statement, which makes it
   more robust against exceptions:

       with mutex:
           numreaders += 1
           if numreaders == 1:
               writelock.acquire()

   etc.
   In Python 2.5 you can also use the with statement if you add:

       from __future__ import with_statement

2. The code above can cause starvation for writers if there are many
   readers (or if new readers come in before all other readers have
   finished). You need at least one more lock and probably a writer
   counter to solve this.

3. See also http://code.activestate.com/recipes/465156/

4. The code has not been tested, not even for syntax errors.
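Applying note 1 throughout, the same algorithm might be wrapped in a
pair of functions like these. The function names, the dict d, and the
key/value parameters are illustrative, not from the thread:

```python
import threading

mutex = threading.Lock()      # protects numreaders
writelock = threading.Lock()  # held by any writer, or by the reader group
numreaders = 0
d = {}                        # the shared resource (hypothetical example)

def read(key):
    global numreaders
    with mutex:
        numreaders += 1
        if numreaders == 1:        # first reader locks out writers
            writelock.acquire()
    try:
        return d.get(key)          # critical section for a reader
    finally:
        with mutex:
            numreaders -= 1
            if numreaders == 0:    # last reader lets writers in again
                writelock.release()

def write(key, value):
    with writelock:                # a writer excludes everyone else
        d[key] = value
```

This keeps the writer-starvation problem described in note 2; it only
shows the with-statement form of the same code.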
Re: How to go about. On read/write locks
On Apr 6, 12:44 pm, Piet van Oostrum p...@cs.uu.nl wrote:

   3. See also http://code.activestate.com/recipes/465156/

Thank you for the useful suggestions, Piet. In particular, I just had a
look at the SharedLock class provided through the link above and it
seems to fit the bill quite nicely. I'll give it a go!

Thank you again!

Manu
Re: How to go about. On read/write locks
   This is a classical synchronization problem with a classical
   solution: you treat the readers as a group, and the writers
   individually. So you have a write lock that each writer has to
   acquire and release, but that is acquired only by the first reader
   and released by the last one. Therefore you need a counter of the
   number of readers, and manipulations of this counter must be
   protected by another lock.

I was going to suggest a similar approach but refrained because of a
problem I see with your code as well: if the readers are reading very
fast (the OP didn't state what his actual application is, so it might
not be a consumer-producer scheme, for which I don't think a dict would
be the natural choice anyway) they can block the writer from writing
altogether.

Or do I miss something here?

Diez
Re: How to go about. On read/write locks
   Python's approach with the GIL is both reasonable and disappointing.
   Reasonable because I understand how it can make things easier for
   its internals. Disappointing because it means that standard Python
   cannot take advantage of the parallelism that can more and more
   often be afforded by today's computers. E.g. I found out only
   recently, almost by chance, that my wife's laptop has not one but
   two processors, even though it isn't a particularly high-end
   computer. I now understand that OS-level threading does use them
   both, but that the GIL effectively prevents parallel operations.
   (Am I understanding correctly?)

Not entirely. Yes, if your application is CPU-bound. No, if it's
IO-bound. And a lot of people think that threads are actually the wrong
approach for concurrency anyway, so with Python 2.6 comes the
multiprocessing module that lets you use the full capacity of your CPUs.

   I do not completely understand your statement in the context of my
   original example though, the shared dictionary. As the GIL is
   released every X bytecode operations, surely it can happen that
   while the dictionary is iterated through, e.g. in a for/in loop, a
   different thread might change it, with potentially catastrophic
   consequences. The GIL wouldn't be able to prevent this, would it?

You didn't give a concrete usage scenario for your shared dict, but I
assumed that by reading and writing you meant

    mydict[key] = value
    value = mydict[key]

which are both atomic through the GIL. More complex operations, such as
iteration, might need more coarse-grained locking.

Diez
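To make the distinction concrete, a small sketch (the names shared and
lock are illustrative): single-item reads and writes are atomic under
CPython's GIL, while iteration spans many bytecodes, so one
coarse-grained approach is to copy the dict under a lock and iterate
the private snapshot.

```python
import threading

shared = {"a": 1, "b": 2}
lock = threading.Lock()

# Single-item operations like these are atomic under CPython's GIL,
# so no explicit lock is strictly needed for them:
shared["c"] = 3
value = shared["a"]

# Iteration is NOT atomic: take a snapshot under the lock, then
# iterate the snapshot while other threads mutate the original.
with lock:
    snapshot = dict(shared)
total = sum(snapshot.values())
```

The snapshot costs a copy, but it keeps the lock hold time short and
makes the iteration immune to concurrent changes.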
Re: How to go about. On read/write locks
On Apr 6, 3:30 am, Emanuele D'Arrigo man...@gmail.com wrote:

   Python's approach with the GIL is both reasonable and disappointing.
   Reasonable because I understand how it can make things easier for
   its internals. Disappointing because it means that standard Python
   cannot take advantage of the parallelism that can more and more
   often be afforded by today's computers. E.g. I found out only
   recently, almost by chance, that my wife's laptop has not one but
   two processors, even though it isn't a particularly high-end
   computer. I now understand that OS-level threading does use them
   both, but that the GIL effectively prevents parallel operations.
   (Am I understanding correctly?)

Mostly, but keep in mind that non-Python code can run on a different
core at the same time. This could be stuff like I/O or numerical
calculations written in C.

   I do not completely understand your statement in the context of my
   original example though, the shared dictionary. As the GIL is
   released every X bytecode operations, surely it can happen that
   while the dictionary is iterated through, e.g. in a for/in loop, a
   different thread might change it, with potentially catastrophic
   consequences. The GIL wouldn't be able to prevent this, would it?

It'll prevent catastrophic consequences such as segfaults, yes. It
won't prevent incorrect results, or some other thread changing
something under your feet, though.

Also, I believe dicts know when they're being iterated over and will
raise an exception on any attempt to add or remove a key, so at worst
you might get a value changed under your feet. I would say, therefore,
that as long as you are only modifying values, and not adding or
removing them, Diez is correct. (Someone more familiar with dict
internals might want to verify.)

Carl Banks
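Carl's recollection matches CPython's behavior and can be checked with
a short sketch: adding (or removing) a key while iterating changes the
dict's size, which the iterator detects and rejects, while overwriting
existing values is allowed.

```python
d = {"a": 1, "b": 2}

# Adding a key during iteration resizes the dict; CPython's dict
# iterator notices on the next step and raises RuntimeError:
error = None
try:
    for key in d:
        d["new"] = 3
except RuntimeError as exc:
    error = exc    # "dictionary changed size during iteration"

# Overwriting existing values during iteration is fine, since the
# set of keys (and thus the size) does not change:
d2 = {"a": 1, "b": 2}
for key in d2:
    d2[key] = 0
```

So "a value changed under your feet" really is the worst case for a
value-only workload, exactly as Carl says.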
Re: How to go about. On read/write locks
On Apr 6, 2:23 pm, Diez B. Roggisch de...@nospam.web.de wrote:

   This is a classical synchronization problem with a classical
   solution: you treat the readers as a group, and the writers
   individually. So you have a write lock that each writer has to
   acquire and release, but that is acquired only by the first reader
   and released by the last one. Therefore you need a counter of the
   number of readers, and manipulations of this counter must be
   protected by another lock.

   I was going to suggest a similar approach but refrained because of a
   problem I see with your code as well: if the readers are reading
   very fast (the OP didn't state what his actual application is, so it
   might not be a consumer-producer scheme, for which I don't think a
   dict would be the natural choice anyway) they can block the writer
   from writing altogether.

You could implement some kind of fair ordering where whoever requests a
lock first is guaranteed to get it first, but I can't think of a way to
do that without requiring all readers to acquire two locks.

Carl Banks
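One standard way to get roughly-FIFO fairness is indeed to make every
reader pass through a second lock, often called a turnstile: a waiting
writer holds the turnstile, so readers arriving after it queue up
behind it instead of starving it. This is the classic no-starvation
pattern from the semaphore literature, not code from this thread; the
class and method names are illustrative.

```python
import threading

class FairReadWriteLock:
    """No-starvation readers/writers lock using a 'turnstile' lock
    that every thread passes through, so readers acquire two locks,
    as Carl anticipated."""

    def __init__(self):
        self._turnstile = threading.Lock()  # everyone queues here
        self._roomempty = threading.Lock()  # held while anyone is inside
        self._mutex = threading.Lock()      # protects the reader count
        self._readers = 0

    def acquire_read(self):
        with self._turnstile:               # pass through and release;
            pass                            # a waiting writer blocks us here
        with self._mutex:
            self._readers += 1
            if self._readers == 1:
                self._roomempty.acquire()   # first reader claims the room

    def release_read(self):
        with self._mutex:
            self._readers -= 1
            if self._readers == 0:
                self._roomempty.release()   # last reader empties the room

    def acquire_write(self):
        self._turnstile.acquire()           # hold: new readers queue behind us
        self._roomempty.acquire()           # wait for current readers to leave

    def release_write(self):
        self._turnstile.release()
        self._roomempty.release()
```

Readers still run concurrently with each other (only the first reader
blocks on the room lock), but a writer never waits behind an unbounded
stream of new readers.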