Author: Armin Rigo <ar...@tunes.org>
Branch: extradoc
Changeset: r4666:41c358b89418
Date: 2012-08-17 13:48 +0200
http://bitbucket.org/pypy/extradoc/changeset/41c358b89418/

Log:    Updates

diff --git a/talk/stm2012/stmimpl.rst b/talk/stm2012/stmimpl.rst
--- a/talk/stm2012/stmimpl.rst
+++ b/talk/stm2012/stmimpl.rst
@@ -20,9 +20,9 @@
 original global object is *not* changed --- it is really immutable.  But
 the copy becomes global, and the old global object's header is updated
 with a pointer to the new global object.  We thus make a chained list
-of global versions.
+of global revisions.
 
-It is the job of the GC to collect the older versions when they are
+It is the job of the GC to collect the older revisions when they are
 not referenced any more by any thread.
 
 
@@ -55,22 +55,33 @@
 - h_global (boolean)
 - h_possibly_outdated (boolean)
 - h_written (boolean)
-- h_version (unsigned integer)
+- h_revision (unsigned integer)
 
-The h_version is an unsigned "version number".  More about it below.
-The other fields are flags.  (In practice they are just bits inside the
-GC h_tid field.)
+The h_revision is an unsigned "revision number" that can also
+alternatively contain a pointer.  The other fields are flags.  (In
+practice they are just bits inside the GC h_tid field.)
 
 - ``h_global`` means that the object is a global object.
 
 - ``h_possibly_outdated`` is used as an optimization: it means that the
   object is possibly outdated.  It is False for all local objects.  It
   is also False if the object is a global object, is the most recent of
-  its chained list of versions, and is known to have no modified local
+  its chained list of revisions, and is known to have no modified local
   version in any transaction.
 
 - ``h_written`` is set on local objects that have been written to.
 
+- ``h_revision`` on local objects points to the global object that they
+  come from, if any; otherwise it is NULL.
+
+- ``h_revision`` on global objects depends on whether the object is the
+  head of the chained list of revisions or not.  If it is, then
+  ``h_revision`` contains the "timestamp" of the revision at which this
+  version of the object was committed.  For non-head revisions,
+  ``h_revision`` is a pointer to a more recent revision.  To distinguish
+  these two cases we set the lowest bit of ``h_revision`` in the latter
+  case.
+
 
 Transaction details
 -------------------
@@ -92,32 +103,60 @@
 their corresponding local objects.
 
 ``list_of_read_objects`` is a set of all global objects read from, in
-the version that was used for reading.  It is actually implemented as a
-list, but the order or repeated presence of elements in the list is
-irrelevant.
+the revision that was used for reading.  It is actually implemented as a
+list, but the order or repetition of elements in the list is irrelevant.
 
 ``recent_reads_cache`` is a fixed-size cache that remembers recent
 additions to the preceeding list, in order to avoid inserting too much
 repeated entries into the list, as well as keep lightweight statistics.
 
 
-Pseudo-code: read/write barriers
+Read/write barriers design
 ---------------------------------------
 
-Variable names:
+The read/write barriers are designed with the following goals in mind:
 
-* ``P`` is a pointer to any object.
+- In the source code (graphs from RPython), variables containing
+  pointers can be annotated as beloning to one of 6 categories:
 
-* ``G`` is a pointer to a *global* object.
+  * ``P`` is a pointer to any object.
 
-* ``R`` is a pointer to an object that was checked for being
-  *read-ready*: reading its fields is ok.
+  * ``G`` is a pointer to a *global* object.
 
-* ``L`` is a pointer to a *local* object.  We can always read from
-  but not necessarily write to local objects.
+  * ``R`` is a pointer to an object that was checked for being
+    *read-ready*: reading its fields is ok.
 
-* ``W`` is a pointer to a *writable* local object.
+  * ``O`` is an *old* pointer that used to be read-ready, but in which
+    we may have written to in the meantime
 
+  * ``L`` is a pointer to a *local* object.  We can always read from
+    but not necessarily write to local objects.
+
+  * ``W`` is a pointer to a *writable* local object.
+
+- The goal is to insert calls to the following write barriers so that we
+  only ever read from objects in the ``R``, ``L`` or ``W`` categories,
+  and only ever write to objects in the ``W`` category.
+
+- The read barriers themselves need to ensure that
+  ``list_of_read_objects`` contains exactly the set of global objects
+  that have been read from.  These objects must all be of the most
+  recent revision that is not more recent than ``start_time``.  If an
+  object has got a revision more recent than ``start_time``, then the
+  current transaction is in conflict.  The transaction is aborted as
+  soon as this case is detected.
+
+- The write barriers make sure that all modified objects are local and
+  the ``h_written`` flag is set.
+
+- All barriers ensure that ``global_to_local`` satisfies the following
+  property for any local object ``L``: either ``L`` was created by
+  this transaction (``L->h_revision == NULL``) or else satisfy
+  ``global_to_local[L->h_revision] == L``.
+
+
+Pseudo-code for read/write barriers
+---------------------------------------
 
 ``W = Allocate(size)`` allocates a local object::
 
@@ -126,24 +165,24 @@
         W->h_global = False
         W->h_possibly_outdated = False
         W->h_written = True
-        W->h_version = 0
+        W->h_revision = 0
         return W
 
 
-``R = LatestGlobalVersion(G)`` takes a pointer ``G`` to a global object,
-and if necessary follows the chain of newer versions, until it reaches
-the most recent version ``R``.  Then it checks the version number of
+``R = LatestGlobalRevision(G)`` takes a pointer ``G`` to a global object,
+and if necessary follows the chain of newer revisions, until it reaches
+the most recent revision ``R``.  Then it checks the revision number of
 ``R`` to see that it was not created after ``start_time``.
 Pseudo-code::
 
-    def LatestGlobalVersion(G, ...):
+    def LatestGlobalRevision(G, ...):
         R = G
-        while (v := R->h_version) & 1:    # "has a more recent version"
+        while (v := R->h_revision) & 1:    # "has a more recent revision"
             R = v & ~ 1
-        if v > start_time:                # object too recent?
-            ValidateFast()                # try to move start_time forward
-            return LatestGlobalVersion(G) # restart searching from G
-        PossiblyUpdateChain(G, R, ...)    # see below
+        if v > start_time:                 # object too recent?
+            ValidateFast()                 # try to move start_time forward
+            return LatestGlobalRevision(G) # restart searching from G
+        PossiblyUpdateChain(G, R, ...)     # see below
         return R
 
 
@@ -159,7 +198,7 @@
         if not P->h_possibly_outdated:
             R = P
         else:
-            R = LatestGlobalVersion(P, ...)
+            R = LatestGlobalRevision(P, ...)
             if R->h_possibly_outdated and R in global_to_local:
                 L = ReadGlobalToLocal(R, ...)  # see below
                 return L
@@ -167,20 +206,21 @@
         return R
 
 
-A simple optimization is possible.  If ``R`` is returned by a previous
-call to ``DirectReadBarrier`` and the current transaction is still
-running, but we could have written to ``R`` in the meantime, then we
-need to repeat only part of the logic, because we don't need
-``AddInReadSet`` again.  It gives this::
+A simple optimization is possible.  Assume that ``O`` is a pointer
+returned by a previous call to ``DirectReadBarrier`` and the current
+transaction is still running, but we could have written to ``O`` in the
+meantime.  Then we need to repeat only part of the logic, because we
+don't need ``AddInReadSet`` again.  It gives this::
 
-    def RepeatReadBarrier(R, ...):
-        if not R->h_possibly_outdated:       # fast-path
-            return R
-        # LatestGlobalVersion(R) would either return R or abort
+    def RepeatReadBarrier(O, ...):
+        if not O->h_possibly_outdated:       # fast-path
+            return O
+        # LatestGlobalRevision(R) would either return R or abort
         # the whole transaction, so omitting it is not wrong
-        if R in global_to_local:
-            L = ReadGlobalToLocal(R, ...)    # see below
+        if O in global_to_local:
+            L = ReadGlobalToLocal(O, ...)    # see below
             return L
+        R = O
         return R
 
 
@@ -194,7 +234,7 @@
         L->h_global = False
         L->h_possibly_outdated = False
         L->h_written = False
-        L->h_version = R          # back-reference to the original
+        L->h_revision = R          # back-reference to the original
         L->objectbody... = R->objectbody...
         global_to_local[R] = L
         return L
@@ -207,7 +247,7 @@
         if not P->h_global:       # fast-path
             return P
         if P->h_possibly_outdated:
-            R = LatestGlobalVersion(P)
+            R = LatestGlobalRevision(P)
         else:
             R = P
         W = Localize(R)
@@ -292,18 +332,23 @@
 
 
 Finally, a similar optimization can be applied in
-``LatestGlobalVersion``.  After it follows the chain of global versions,
-it can "compress" that chain in case it contained several hops, and also
-update the original container's field to point directly to the latest
-version::
+``LatestGlobalRevision``.  After it follows the chain of global
+revisions, it can "compress" that chain in case it contained several
+hops, and also update the original container's field to point directly
+to the latest version::
 
     def PossiblyUpdateChain(G, R, R_Container, FieldName):
         if R != G:
             # compress the chain
-            while G->h_version != R | 1:
-                G_next = G->h_version & ~ 1
-                G->h_version = R | 1
+            while G->h_revision != R | 1:
+                G_next = G->h_revision & ~ 1
+                G->h_revision = R | 1
                 G = G_next
             # update the original field
             R_Container->FieldName = R
 
+
+Committing
+------------------------------------
+
+xxxx
_______________________________________________
pypy-commit mailing list
pypy-commit@python.org
http://mail.python.org/mailman/listinfo/pypy-commit

Reply via email to