Author: Armin Rigo <ar...@tunes.org> Branch: extradoc Changeset: r4666:41c358b89418 Date: 2012-08-17 13:48 +0200 http://bitbucket.org/pypy/extradoc/changeset/41c358b89418/
Log: Updates diff --git a/talk/stm2012/stmimpl.rst b/talk/stm2012/stmimpl.rst --- a/talk/stm2012/stmimpl.rst +++ b/talk/stm2012/stmimpl.rst @@ -20,9 +20,9 @@ original global object is *not* changed --- it is really immutable. But the copy becomes global, and the old global object's header is updated with a pointer to the new global object. We thus make a chained list -of global versions. +of global revisions. -It is the job of the GC to collect the older versions when they are +It is the job of the GC to collect the older revisions when they are not referenced any more by any thread. @@ -55,22 +55,33 @@ - h_global (boolean) - h_possibly_outdated (boolean) - h_written (boolean) -- h_version (unsigned integer) +- h_revision (unsigned integer) -The h_version is an unsigned "version number". More about it below. -The other fields are flags. (In practice they are just bits inside the -GC h_tid field.) +The h_revision is an unsigned "revision number" that can also +alternatively contain a pointer. The other fields are flags. (In +practice they are just bits inside the GC h_tid field.) - ``h_global`` means that the object is a global object. - ``h_possibly_outdated`` is used as an optimization: it means that the object is possibly outdated. It is False for all local objects. It is also False if the object is a global object, is the most recent of - its chained list of versions, and is known to have no modified local + its chained list of revisions, and is known to have no modified local version in any transaction. - ``h_written`` is set on local objects that have been written to. +- ``h_revision`` on local objects points to the global object that they + come from, if any; otherwise it is NULL. + +- ``h_revision`` on global objects depends on whether the object is the + head of the chained list of revisions or not. If it is, then + ``h_revision`` contains the "timestamp" of the revision at which this + version of the object was committed. For non-head revisions, + ``h_revision`` is a pointer to a more recent revision. To distinguish + these two cases we set the lowest bit of ``h_revision`` in the latter + case. + Transaction details ------------------- @@ -92,32 +103,60 @@ their corresponding local objects. ``list_of_read_objects`` is a set of all global objects read from, in -the version that was used for reading. It is actually implemented as a -list, but the order or repeated presence of elements in the list is -irrelevant. +the revision that was used for reading. It is actually implemented as a +list, but the order or repetition of elements in the list is irrelevant. ``recent_reads_cache`` is a fixed-size cache that remembers recent additions to the preceeding list, in order to avoid inserting too much repeated entries into the list, as well as keep lightweight statistics. -Pseudo-code: read/write barriers +Read/write barriers design --------------------------------------- -Variable names: +The read/write barriers are designed with the following goals in mind: -* ``P`` is a pointer to any object. +- In the source code (graphs from RPython), variables containing + pointers can be annotated as beloning to one of 6 categories: -* ``G`` is a pointer to a *global* object. + * ``P`` is a pointer to any object. -* ``R`` is a pointer to an object that was checked for being - *read-ready*: reading its fields is ok. + * ``G`` is a pointer to a *global* object. -* ``L`` is a pointer to a *local* object. We can always read from - but not necessarily write to local objects. + * ``R`` is a pointer to an object that was checked for being + *read-ready*: reading its fields is ok. -* ``W`` is a pointer to a *writable* local object. + * ``O`` is an *old* pointer that used to be read-ready, but in which + we may have written to in the meantime + * ``L`` is a pointer to a *local* object. We can always read from + but not necessarily write to local objects. + + * ``W`` is a pointer to a *writable* local object. + +- The goal is to insert calls to the following write barriers so that we + only ever read from objects in the ``R``, ``L`` or ``W`` categories, + and only ever write to objects in the ``W`` category. + +- The read barriers themselves need to ensure that + ``list_of_read_objects`` contains exactly the set of global objects + that have been read from. These objects must all be of the most + recent revision that is not more recent than ``start_time``. If an + object has got a revision more recent than ``start_time``, then the + current transaction is in conflict. The transaction is aborted as + soon as this case is detected. + +- The write barriers make sure that all modified objects are local and + the ``h_written`` flag is set. + +- All barriers ensure that ``global_to_local`` satisfies the following + property for any local object ``L``: either ``L`` was created by + this transaction (``L->h_revision == NULL``) or else satisfy + ``global_to_local[L->h_revision] == L``. + + +Pseudo-code for read/write barriers +--------------------------------------- ``W = Allocate(size)`` allocates a local object:: @@ -126,24 +165,24 @@ W->h_global = False W->h_possibly_outdated = False W->h_written = True - W->h_version = 0 + W->h_revision = 0 return W -``R = LatestGlobalVersion(G)`` takes a pointer ``G`` to a global object, -and if necessary follows the chain of newer versions, until it reaches -the most recent version ``R``. Then it checks the version number of +``R = LatestGlobalRevision(G)`` takes a pointer ``G`` to a global object, +and if necessary follows the chain of newer revisions, until it reaches +the most recent revision ``R``. Then it checks the revision number of ``R`` to see that it was not created after ``start_time``. Pseudo-code:: - def LatestGlobalVersion(G, ...): + def LatestGlobalRevision(G, ...): R = G - while (v := R->h_version) & 1: # "has a more recent version" + while (v := R->h_revision) & 1: # "has a more recent revision" R = v & ~ 1 - if v > start_time: # object too recent? - ValidateFast() # try to move start_time forward - return LatestGlobalVersion(G) # restart searching from G - PossiblyUpdateChain(G, R, ...) # see below + if v > start_time: # object too recent? + ValidateFast() # try to move start_time forward + return LatestGlobalRevision(G) # restart searching from G + PossiblyUpdateChain(G, R, ...) # see below return R @@ -159,7 +198,7 @@ if not P->h_possibly_outdated: R = P else: - R = LatestGlobalVersion(P, ...) + R = LatestGlobalRevision(P, ...) if R->h_possibly_outdated and R in global_to_local: L = ReadGlobalToLocal(R, ...) # see below return L @@ -167,20 +206,21 @@ return R -A simple optimization is possible. If ``R`` is returned by a previous -call to ``DirectReadBarrier`` and the current transaction is still -running, but we could have written to ``R`` in the meantime, then we -need to repeat only part of the logic, because we don't need -``AddInReadSet`` again. It gives this:: +A simple optimization is possible. Assume that ``O`` is a pointer +returned by a previous call to ``DirectReadBarrier`` and the current +transaction is still running, but we could have written to ``O`` in the +meantime. Then we need to repeat only part of the logic, because we +don't need ``AddInReadSet`` again. It gives this:: - def RepeatReadBarrier(R, ...): - if not R->h_possibly_outdated: # fast-path - return R - # LatestGlobalVersion(R) would either return R or abort + def RepeatReadBarrier(O, ...): + if not O->h_possibly_outdated: # fast-path + return O + # LatestGlobalRevision(R) would either return R or abort # the whole transaction, so omitting it is not wrong - if R in global_to_local: - L = ReadGlobalToLocal(R, ...) # see below + if O in global_to_local: + L = ReadGlobalToLocal(O, ...) # see below return L + R = O return R @@ -194,7 +234,7 @@ L->h_global = False L->h_possibly_outdated = False L->h_written = False - L->h_version = R # back-reference to the original + L->h_revision = R # back-reference to the original L->objectbody... = R->objectbody... global_to_local[R] = L return L @@ -207,7 +247,7 @@ if not P->h_global: # fast-path return P if P->h_possibly_outdated: - R = LatestGlobalVersion(P) + R = LatestGlobalRevision(P) else: R = P W = Localize(R) @@ -292,18 +332,23 @@ Finally, a similar optimization can be applied in -``LatestGlobalVersion``. After it follows the chain of global versions, -it can "compress" that chain in case it contained several hops, and also -update the original container's field to point directly to the latest -version:: +``LatestGlobalRevision``. After it follows the chain of global +revisions, it can "compress" that chain in case it contained several +hops, and also update the original container's field to point directly +to the latest version:: def PossiblyUpdateChain(G, R, R_Container, FieldName): if R != G: # compress the chain - while G->h_version != R | 1: - G_next = G->h_version & ~ 1 - G->h_version = R | 1 + while G->h_revision != R | 1: + G_next = G->h_revision & ~ 1 + G->h_revision = R | 1 G = G_next # update the original field R_Container->FieldName = R + +Committing +------------------------------------ + +xxxx _______________________________________________ pypy-commit mailing list pypy-commit@python.org http://mail.python.org/mailman/listinfo/pypy-commit