Hi Timothy,

On Tue, Jan 31, 2012 at 16:26, Timothy Baldridge <[email protected]> wrote:
> def foo(d):
>     if "foo" in d:
>         del d["foo"]
>
> Will never cause a segmentation fault (due to multiple threads
> accessing "d" at the same time), but it may throw a KeyError. That is,
> all Python code will continue to be "thread-safe" as it is in CPython,
> but race conditions will continue to exist and must be handled in
> standard ways (locks, CAS, etc.).
No, precisely not. Such code will continue to work as it is, without race conditions or any of the messy multithread-induced headaches. Locks, CAS, etc. are all superfluous.

So what would work and what would not? In one word: all "transactions" work exactly as if run serially, one after another. A transaction is just one unit of work; a callback.

We use this working code for comparison on top of CPython or a non-STM PyPy: https://bitbucket.org/pypy/pypy/raw/stm/lib_pypy/transaction.py . You add transactions with the add() function, and execute them all with the run() function (the transactions it runs typically perform further add()s themselves). The only source of non-determinism is that run() picks a random pending transaction as the next one. Of course this demo code runs the transactions serially, but the point is that even "pypy-stm" gives you the illusion of running them serially.

So you start by writing code that is *safe*, and then you have to think a bit in order to increase the parallelism, instead of the other way around as with traditional multithreading in non-Python languages. There are rules that are a bit subtle (but not too much) about when transactions can parallelize or not. Basically, as soon as a transaction does I/O, all the other transactions are stalled; and if transactions very often change the same objects, then you will get a lot of conflicts and restarts.

> "In PyPy, we look at STM like we would look at the GC. It may be
> replaced in a week by a different one, but for the "end user" writing
> pure Python code, it essentially doesn't make a difference."

I meant to say that STM, in our case, is just (hopefully) an optimization that lets some programs run on multiple CPUs --- the ones that are based on the 'transaction' module. But it's still just an optimization in the sense that the programs run exactly as if using the transaction.py I described above.

In yet other words: notice that transaction.py doesn't even use the 'thread' module. So if we get the same behavior with pypy-stm's built-in 'transaction' module, it means that the example you described is perfectly safe as it is.

(Update: today we have a "pypy-stm" that works exactly like transaction.py and exhibits multiple-CPU usage. It is just terribly slow and never frees any memory :-) But it runs http://paste.pocoo.org/show/543646/ , which is a simple epoll-based server creating new transactions in order to do the CPU-intensive portions of answering the requests. In the code there is no trace of CAS, locks, 'multiprocessing', etc.)

A bientôt,

Armin.
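P.S.: to make the add()/run() workflow above concrete, here is a minimal sketch of how the transaction module is meant to be used. Only the names add() and run() come from transaction.py as described above; the word-counting task, tally() and counts are made up for illustration, and the exact add(callable, *args) signature is an assumption.

    # Minimal sketch (assumed API): schedule independent units of work with
    # transaction.add() and execute them all with transaction.run().
    import transaction

    WORDS = ["foo", "bar", "foo", "baz"]    # hypothetical input data
    counts = {}

    def tally(word):
        # One transaction: it behaves as if run serially, so this
        # read-modify-write on 'counts' needs no lock or CAS.
        counts[word] = counts.get(word, 0) + 1

    for w in WORDS:
        transaction.add(tally, w)   # assumed signature: add(callable, *args)

    transaction.run()   # runs all pending transactions; which one goes
                        # next is the only source of non-determinism

    assert counts == {"foo": 2, "bar": 1, "baz": 1}

On CPython or a non-STM PyPy this simply runs the transactions one after another; on pypy-stm the same program may use several CPUs, but the result is the same as the serial run.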
