Author: Armin Rigo <ar...@tunes.org>
Branch: stm-thread
Changeset: r54985:c33c8f8595e4
Date: 2012-05-09 15:33 +0200
http://bitbucket.org/pypy/pypy/changeset/c33c8f8595e4/
Log:	Complete

diff --git a/pypy/doc/stm.rst b/pypy/doc/stm.rst
--- a/pypy/doc/stm.rst
+++ b/pypy/doc/stm.rst
@@ -9,24 +9,33 @@
 PyPy can be translated in a special mode based on Software Transactional
 Memory (STM).  This mode is not compatible with the JIT so far, and moreover
-adds a constant run-time overhead in the range 2x to 5x.  The benefit is
-that the resulting ``pypy-stm`` can execute multiple threads of Python code
-in parallel.
+adds a constant run-time overhead, expected to be in the range 2x to 5x.
+(XXX for now it is bigger, but past experience shows it can be reduced.)
+The benefit is that the resulting ``pypy-stm`` can execute multiple
+threads of Python code in parallel.
+
+* ``pypy-stm`` is fully compatible with a GIL-based PyPy; you can use it
+  as a drop-in replacement and multithreaded programs will run on multiple
+  cores.
+
+* ``pypy-stm`` adds a low-level API in the ``thread`` module, namely
+  ``thread.atomic``, that can be used as described below.  This is meant
+  to improve existing multithread-based programs.  It is also meant to
+  be used to build higher-level interfaces on top of it.
+
+* A number of higher-level interfaces are planned, using internally
+  threads and ``thread.atomic``.  They are meant to be used in
+  non-thread-based programs.  Given the higher level, we also recommend
+  using them in new programs instead of structuring your program to use
+  raw threads.
 
 High-level interface
 ====================
 
-At the lowest levels, the Global Interpreter Lock (GIL) was just
-replaced with STM techniques.  This gives a ``pypy-stm`` that should
-behave identically to a regular GIL-enabled PyPy, but run multithreaded
-programs in a way that scales with the number of cores.  The details of
-the implementation are explained below.
-
-However, what we are pushing for is *not writing multithreaded programs*
-at all.  It is possible to use higher-level interfaces.  The basic one
-is found in the ``transaction`` module (XXX name to change).
-Minimal example of usage::
+The basic high-level interface is planned in the ``transaction`` module
+(XXX name can change).  A minimal example of usage will be along the
+lines of::
 
     for i in range(10):
         transaction.add(do_stuff, i)
@@ -34,9 +43,9 @@
 
 This schedules and runs all ten ``do_stuff(i)``.  Each one appears to
 run serially, but in random order.  It is also possible to ``add()``
-more transactions within each transaction, to schedule additional pieces
-of work.  The call to ``run()`` returns when all transactions have
-completed.
+more transactions within each transaction, causing additional pieces of
+work to be scheduled.  The call to ``run()`` returns when all
+transactions have completed.
 
 The module is written in pure Python (XXX not written yet, add url).
 See the source code to see how it is based on the `low-level
 interface`_.
 
@@ -45,20 +54,20 @@
 Low-level interface
 ===================
 
-``pypy-stm`` offers one additional low-level API: ``thread.atomic``.
-This is a context manager to use in a ``with`` statement.  Any code
-running in the ``with thread.atomic`` block is guaranteed to be fully
-serialized with respect to any code run by other threads (so-called
-*strong isolation*).
+Besides replacing the GIL with STM techniques, ``pypy-stm`` offers one
+additional explicit low-level API: ``thread.atomic``.  This is a context
+manager to use in a ``with`` statement.  Any code running in the ``with
+thread.atomic`` block is guaranteed to be fully serialized with respect
+to any code run by other threads (so-called *strong isolation*).
 
 Note that this is a guarantee of observed behavior: under the conditions
-described below, multiple ``thread.atomic`` blocks can actually run in
-parallel.
+described below, a ``thread.atomic`` block can actually run in parallel
+with other threads, whether they are in a ``thread.atomic`` or not.
 Classical minimal example: in a thread, you want to pop an item from
 ``list1`` and append it to ``list2``, knowing that both lists can be
-mutated concurrently by other threads.  Using ``thread.atomic`` this
-can be done without careful usage of locks::
+mutated concurrently by other threads.  Using ``thread.atomic`` this can
+be done without careful usage of locks on any mutation of the lists::
 
     with thread.atomic:
         x = list1.pop()
@@ -91,10 +100,9 @@
 inside ``thread.atomic`` blocks.  Writing this kind of code::
 
     with thread.atomic:
-        print "hello, the value is:"
-        print "\t", value
+        print "hello, the value is:", value
 
-actually also helps ensuring that the whole line or lines are printed
+actually also helps ensure that the whole line (or lines) is printed
 atomically, instead of being broken up with interleaved output from
 other threads.
 
@@ -115,8 +123,8 @@
 
 Each thread is actually running as a sequence of "transactions", which
 are separated by "transaction breaks".  The execution of the whole
-multithreaded program works as if all transactions were serialized, but
-actually executing the transactions in parallel.
+multithreaded program works as if all transactions were serialized.
+You don't see the transactions actually running in parallel.
 
 This works as long as two principles are respected.  The first one is
 that the transactions must not *conflict* with each other.  The most
@@ -140,17 +148,17 @@
 
 Transaction breaks *never* occur in ``thread.atomic`` mode.
 
-Every transaction can further be in one of two modes: either "normal" or
-"inevitable".  To simplify, a transaction starts in "normal" mode, but
-switches to "inevitable" as soon as it performs input/output.  If we
-have an inevitable transaction, all other transactions are paused; this
-effect is similar to the GIL.
+Additionally, every transaction can further be in one of two modes:
+either "normal" or "inevitable".
+To simplify, a transaction starts in
+"normal" mode, but switches to "inevitable" as soon as it performs
+input/output.  If we have an inevitable transaction, all other
+transactions are paused; this effect is similar to the GIL.
 
 In the absence of ``thread.atomic``, inevitable transactions only have
 a small effect.  Indeed, as soon as the current bytecode finishes, the
 interpreter notices that the transaction is inevitable and immediately
 introduces a transaction break in order to switch back to a normal-mode
-transaction.  It means that inevitable transactions only run for a short
+transaction.  It means that inevitable transactions only run for a small
 fraction of the time.
 
 With ``thread.atomic`` however you have to be a bit careful, because the
@@ -158,7 +166,15 @@
 ``with thread.atomic``.  Basically, you should organize your code in
 such a way that for any ``thread.atomic`` block that runs for a
 noticeable time, any I/O is done near the end of it, not when there is
-still a lot of CPU time ahead.
+still a lot of CPU (or I/O) time ahead.
+
+In particular, this means that you should ideally avoid blocking I/O
+operations in ``thread.atomic`` blocks.  They work, but because the
+transaction is turned inevitable *before* the I/O is performed, they
+will prevent any parallel work at all.  (This may look like
+``thread.atomic`` blocks reverse the usual effects of the GIL: if the
+block is computation-intensive it will nicely be parallelized, but doing
+any long I/O prevents any parallel work.)
 
 Implementation

_______________________________________________
pypy-commit mailing list
pypy-commit@python.org
http://mail.python.org/mailman/listinfo/pypy-commit
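``thread.atomic`` only exists in ``pypy-stm``, so as a hedged illustration of the classical pop/append example in the diff, here is how the same observable guarantee — full serialization of the block with respect to other threads — can be emulated on an ordinary GIL-based Python with one global lock (the ``_Atomic`` class is a made-up stand-in, not the real implementation, and it gives none of the STM parallelism):

```python
# Sketch: emulate the *observable* guarantee of ``thread.atomic`` on a
# GIL-based Python with a single global re-entrant lock.  Unlike the
# STM version, nothing here ever runs in parallel.
import threading

class _Atomic(object):
    def __init__(self):
        self._lock = threading.RLock()
    def __enter__(self):
        self._lock.acquire()
    def __exit__(self, *exc):
        self._lock.release()
        return False

atomic = _Atomic()   # hypothetical stand-in for thread.atomic

list1 = [1, 2, 3]
list2 = []

def move_one_item():
    # The documentation's classical example: pop from list1 and append
    # to list2, atomically with respect to the other threads.
    with atomic:
        x = list1.pop()
        list2.append(x)

threads = [threading.Thread(target=move_one_item) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever the thread interleaving, each pop/append pair executes as one indivisible step, so no item is lost or duplicated.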
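The advice in the last hunk — keep any I/O near the end of a long ``thread.atomic`` block — amounts to a simple code-organization pattern: compute first, write last. A hypothetical sketch (the enclosing ``with thread.atomic:`` is left out so it runs on any Python; ``process`` and its record format are invented for illustration):

```python
# Pattern sketch: inside a real ``with thread.atomic:`` block on
# pypy-stm, the transaction stays "normal" (parallelizable) until the
# first I/O turns it inevitable, so buffer the output and flush it last.
import io

def process(records, out):
    lines = []
    # 1. Pure computation first: this part could still run in parallel
    #    with other transactions.
    for rec in records:
        lines.append("%s: %d\n" % (rec, len(rec)))
    # 2. I/O last: from the first write(), the transaction would become
    #    inevitable and pause all other transactions.
    for line in lines:
        out.write(line)

buf = io.StringIO()
process(["abc", "de"], buf)
```

The opposite ordering — writing each line as it is computed — would turn the transaction inevitable at the very start and serialize the whole block against all other threads.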