Author: Remi Meier <remi.me...@gmail.com>
Branch: c8-reshare-pages
Changeset: r2016:afef19229966
Date: 2017-02-27 17:00 +0100
http://bitbucket.org/pypy/stmgc/changeset/afef19229966/

Log:    merge default

diff too long, truncating to 2000 out of 3600 lines

diff --git a/README.txt b/README.md
rename from README.txt
rename to README.md
--- a/README.txt
+++ b/README.md
@@ -1,28 +1,250 @@
 
-STM-GC
-======
+# STM-GC
 
 Welcome!
 
 This is a C library that combines a GC with STM capabilities.
 It is meant to be a general library that can be used in C programs.
 
-The library interface is in "c4/stmgc.h".
+The library interface is in `c8/stmgc.h`.
 
 Progress (these revisions are roughly stable versions, pick the last one):
-- 3aea86a96daf: last rev of "c3", the previous version
-- f1ccf5bbcb6f: first step, working but with no GC
-- 8da924453f29: minor collection seems to be working, no major GC
-- e7249873dcda: major collection seems to be working
 
-The file "c4/doc-objects.txt" contains some low-level explanations.
+ - 3af462f
 
-Run tests with "py.test".
+Run tests with `py.test`.
 
-A demo program can be found in "c4/demo1.c".
-It can be built with "make debug-demo1" or "make build-demo1".
+Demo programs can be found in `c8/demo/`.
 
 The plan is to use this C code directly with PyPy, and not write
 manually the many calls to the shadow stack and the barrier functions.
 But the manual way is possible too, say when writing a small interpreter
 directly in C.
+
+
+# Other resources
+
+http://doc.pypy.org/en/latest/stm.html
+
+# How to run things
+
+## Get PyPy 
+
+ 1. `hg clone https://bitbucket.org/pypy/pypy` (this will take a while, but you
+    can continue with the instructions below)
+ 2. `hg checkout stmgc-c8`
+
+
+## Get STMGC
+
+ 1. `hg clone https://bitbucket.org/pypy/stmgc`
+ 2. `gcc-seg-gs/README.txt` mentions which GCC version should work. The -fXXX
+    flags mentioned at its end may still be needed for compiling the full
+    PyPy-STM.
+
+The folder `c8` contains the current version of the STMGC library.
+
+### Project layout of c8
+
+ - `stmgc.h`: the main header file for the library
+ - `stm/`: 
+ 
+    For the GC part: 
+    
+     - `nursery`: minor collection
+     - `gcpage`: major collection
+     - `largemalloc`, `smallmalloc`: object allocation
+     - `finalizer`: object finalizer support
+     - `weakref`: weak references support
+     
+    For the STM part:
+    
+     - `core`: commit, abort, barrier logic of STM
+     - `sync`: segment management and thread support
+     - `pages`: management of page metadata
+     - `signal_handler`: manages pages together with `pages`
+     - `locks`: a set of locks to protect segments
+     - `rewind_setjmp`: setjmp/longjmp implementation that supports arbitrary rollback
+     - `forksupport`: support for forking an STM process
+     - `extra`: on-commit and on-abort callback mechanism
+     - `detach`: transaction detach mechanism (optimised transactional zones)
+     - `setup`: sets up the memory layout and segments
+     
+    Misc:
+    
+     - `fprintcolor`: colourful debug output
+     - `hash_id`: PyPy-compatible identity and identity-hash functionality
+     - `hashtable`: transactional hash table implementation
+     - `queue`: transactional work-queue implementation
+     - `list`: simple growable list implementation
+     - `marker`, `prof`: mechanism to record events
+     - `misc`: mostly debug and testing interface
+     - `pagecopy`: fast copy implementation for pages
+     - `prebuilt`: logic for PyPy's prebuilt objects
+       
+
+
+### Running Tests
+
+Tests are written in Python and call the C library through CFFI (a Python package).
+
+ 1. install `pytest` and `cffi` packages for Python (via `pip`)
+ 2. running `py.test` in `c8/test` should run all the tests (alternatively, the
+    PyPy-checkout has a pytest.py script in its root-folder, which should work
+    too)
+
+### Running Demos
+
+Demos are small C programs that use the STMGC library directly. They sometimes
+expose real data races that the sequential Python tests cannot.
+
+ 1. for example: `make build-demo_random2`
+ 2. then run `./build-demo_random2`
+ 
+ 
+### Debugging
+
+GDB works fine for debugging programs with the STMGC library. However, you have
+to tell GDB to ignore `SIGSEGV` by default. A `.gdbinit` could look like this:
+
+    handle SIGSEGV nostop pass noprint
+
+    define sigon
+        handle SIGSEGV stop nopass print
+    end
+
+    define sigoff
+        handle SIGSEGV nostop pass noprint
+    end
+
+    define lon
+        set scheduler-locking on
+    end
+    define loff
+        set scheduler-locking off
+    end
+    
+    # run until crash
+    define runloop
+        set pagination off
+        p $_exitcode = 0
+        while $_exitcode == 0
+            p $_exitcode = -1
+            r
+        end
+        set pagination on
+    end
+
+
+The commands `sigon` and `sigoff` enable and disable `SIGSEGV` handling. `lon`
+and `loff` enable and disable stopping of the other threads while stepping
+through one of them. After reaching a breakpoint in GDB, I usually run `sigon`
+and `lon` so that GDB handles real `SIGSEGV`s (e.g., while printing) and stops
+the other threads.
+
+`runloop` re-runs a program until there is a crash (useful for reproducing rare
+race conditions).
+
+Furthermore, there are some useful GDB extensions under `/c7/gdb/gdb_stm.py`
+that allow for inspecting segment-local pointers. To enable them, add the
+following line to your `.gdbinit`:
+
+    python exec(open('PATH-TO-STMGC/c7/gdb/gdb_stm.py').read())
+    
+
+
+
+## Building PyPy-STM
+
+The STM branch of PyPy contains a *copy* of the STMGC library. After changes to
+STMGC, run the `import_stmgc.py` script in `/rpython/translator/stm/`. In the
+following, `/` is the root of your PyPy checkout.
+
+ 0. Follow the [build instructions](http://doc.pypy.org/en/latest/build.html)
+    for PyPy until you get to the point to run the translation.
+
+ 1. The Makefile expects a `gcc-seg-gs` executable to be on the `$PATH`. This
+    should be a GCC that is either patched or a wrapper to GCC 6.1 that passes
+    the necessary options. In my case, this is a script that points to my
+    custom build of GCC with the following content:
+    
+        :::bash
+        #!/bin/bash
+        BUILD=/home/remi/work/bin/gcc-build
+        exec $BUILD/gcc/xgcc -B $BUILD/gcc -fno-ivopts -fno-tree-vectorize -fno-tree-loop-distribute-patterns "$@"
+    
+    
+ 2. `cd /pypy/goal/`
+ 
+ 3. A script to translate PyPy-STM (adapt all paths):
+ 
+        :::bash
+        #!/bin/bash
+        export PYPY_USESSION_KEEP=200
+        export PYPY_USESSION_DIR=~/pypy-usession
+        
+        STM=--stm #--stm
+        JIT=-Ojit #-Ojit #-O2
+        VERSION=$(hg id -i)
+        ionice -c3 pypy ~/pypy_dir/rpython/bin/rpython --no-shared --source $STM $JIT targetpypystandalone.py
+        # --no-allworkingmodules
+        
+        notify-send "PyPy" "C source generated."
+        
+        cd ~/pypy-usession/usession-$(hg branch)-remi/testing_1/
+        ionice -c3 make -Bj4
+        
+        TIME=$(date +%y-%m-%d-%H:%M)
+        cp pypy-c ~/pypy_dir/pypy/goal/pypy-c-$STM-$JIT-$VERSION-$TIME
+        cp pypy-c ~/pypy_dir/pypy/goal/pypy-c
+        
+        notify-send "PyPy" "Make finished."
+    
+    The usession folder keeps the produced C source files. You will need
+    them whenever you change only the STMGC library (no need to
+    retranslate the full PyPy). In that case:
+    
+     1. Go to `~/pypy-usession/usession-stmgc-c8-$USER/testing_1/`
+     2. `make clean && make -j8` will rebuild all C sources
+        
+        Faster alternative that works in most cases: `rm ../module_cache/*.o`
+        instead of `make clean`. This will remove the compiled STMGC files,
+        forcing a rebuild from the *copy* in the `/rpython/translator/stm`
+        folder.
+        
+ 4. The script puts a `pypy-c` into `/pypy/goal/` that should be ready to run.
+
+
+### Log tools
+
+STMGC produces an event log if requested. Some tools to parse and analyse these
+logs are in the PyPy repository under `/pypy/stm/`. To produce a log, set the
+environment variable `PYPYSTM` to a file name. E.g.:
+
+`env PYPYSTM=log.pypystm pypy-c program.py`
+
+and then see some statistics with 
+
+`/pypy/stm/print_stm_log.py log.pypystm`
+
+
+### Benchmarks
+
+In PyPy's benchmark repository (`https://bitbucket.org/pypy/benchmarks`) under
+`multithread` is a collection of multi-threaded Python programs to measure
+performance.
+
+One way to run them is to check out the branch `multithread-runner` and do the
+following:
+
+`./runner.py pypy-c config-raytrace.json result.json`
+
+This will use the configuration in the JSON file and run a few iterations; then
+write the result into a JSON file again. It will also print the command-line
+used to run the benchmark, in case you don't want to use the runner. The
+`getresults.py` script can be used to compare two versions of PyPy against each
+other, but it is very limited.
+
+
+
+
diff --git a/c8/TODO b/c8/TODO
--- a/c8/TODO
+++ b/c8/TODO
@@ -1,3 +1,103 @@
+
+- investigate if userfaultfd() helps:
+  http://kernelnewbies.org/Linux_4.3#head-3deefea7b0add8c1b171b0e72ce3b69c5ed35cb0
+
+  AFAICS, we could avoid the current in-kernel-pagefault+SIGSEGV-handler for
+  making pages accessible on-demand, and replace that mechanism with a
+  user-pagefault "handler". That should save on the number of needed VMAs and
+  possibly be faster (although I'm quite unsure of that).
+
+- investigate if membarrier() helps:
+  http://man7.org/linux/man-pages/man2/membarrier.2.html
+
+
+##################
+Issue: lee_router_tm in one of its versions uses a temporary grid (big list)
+to do some calculations. This grid gets cleared everytime the router does
+one transaction (lay_next_track). Hence, the transaction writes a big amount
+of *old* memory.
+ * For one, clearing an array causes tons of stm_write_card() and creating
+   tons of small backup slices.
+ * Also, all of this stuff goes to the commit log, as we "modify" an old
+   object.
+ * Of course this is all completely unnecessary, as the temporary grid is
+   basically thread-local, always gets cleared at the start of an atomic
+   block ("application-level transaction"), and therefore wouldn't need
+   to be mentioned in the commit log (and in theory wouldn't even need
+   reverting on abort).
+
+Here is a slice of the perf-profile:
+-   Total     Self
+-   14.39%    13.62%  pypy-c-clcollec  libc-2.19.so        [.] __memcpy_sse2_unaligned
+   - __memcpy_sse2_unaligned
+      + 80.47% import_objects.constprop.50
+      + 10.72% make_bk_slices_for_range
++   16.82%     1.43%  pypy-c-clcollec  pypy-c-clcollector  [.] make_bk_slices
++   14.71%     4.44%  pypy-c-clcollec  pypy-c-clcollector  [.] make_bk_slices_for_range
++    4.63%     4.62%  pypy-c-clcollec  pypy-c-clcollector  [.] go_to_the_past
+
+On this benchmark, pypy-stm is ~4x slower than pypy-default. It also doesn't
+scale at all (probably because a lot of the things above are actually protected
+by the privatization locks, which seem to be quite contended).
+Probably around 10% of the time is spent importing the changes done to
+a thread-local object. So here are a few ideas:
+ * Play with card-marking card size to speed up make_bk_slices?
+ * After marking a significant percentage of cards of an obj, maybe just
+   mark all of them (do a full write barrier)?
+ * Should modification-heavy transactions run longer (with high priority)?
+ * Special allocation area for thread-local objects which is always mapped
+   shared between all segments.
+    + changes to objs in this area do not need to be added to the commit log
+    - need to guarantee that only one thread/segment at a time accesses an
+      obj
+    - needs explicit designation as thread-local at allocation time (by the
+      programmer)
+    - doesn't avoid creating backup copies
+ * Special objs that are transaction-local
+    + no need for backup copies, should always "reset" automatically
+    - needs programmer support
+    - limited applicability
+ * As long as a slice is in a page that is only mapped in one segment,
+   we can depend on the slice not being imported in other segs since
+   the page is not mapped. At least it avoids importing to seg0 before
+   major GC and also pushing overflow objs to seg0 on commit. However,
+   this requires the elimination of seg0 as it is now, which is hard:
+    * main concern is that major GC needs to trace objs somewhere, and right
+      now it traces them in seg0. We probably need a way to tell in which seg
+      an obj is accessible.
+    * one way would be to say an obj is always fully accessible in the seg it
+      was first allocated in, but that makes page resharing more difficult.
+      Still, we could record this information in the largemalloc-header, which
+      we first need to find (per obj check for where obj-header is accessible).
+      A check if all pages of an obj are accessible in some segment is probably
+      too slow, as is checking on-the-fly during tracing...
+    * largemalloc currently keeps its data structures in seg0. We would need
+      to keep the data structure up-to-date in all segments (e.g. by always
+      writing to all accessible segments, but this requires us to have the
+      privatization readlock when allocating, and privatization locks are
+      already contended in some cases)
+    * smallmalloc is a bit simpler since objs are always in a single page.
+      But I guess we still need to find the accessible page for each obj...
+  Overall:
+   + modifications to single-mapped pages would not be copied if they stay
+     single-mapped
+   + fully automatic/transparent
+   - provides only page-level granularity
+   - potentially slow complication of largemalloc and major GC
+   - requires us to put effort into making threads always run in the same
+     segment to increase effectiveness
+   - doesn't avoid creating backup copies
+ * One could also argue that keeping around a tempgrid is not good and
+   the programmer should just re-create it in every transaction
+   (counter-intuitive). This indeed speeds up the benchmark ("only" 2x
+   slower than pypy-default), but causes tons of major GCs. These
+   major GCs completely prevent scaling, as they are stop-the-world.
+   So this opens up a whole new can of worms (concurrent, parallel,
+   incremental GC?).
+
+
+##################
+
+
 - stm_identityhash spends a good time figuring out if an obj is prebuilt
   (40% of its time). maybe after setup_prebuilt, we could defer the test
   of GCFLAG_HAS_SHADOW in id_or_identityhash to after an address comparison.
diff --git a/c8/demo/Makefile b/c8/demo/Makefile
--- a/c8/demo/Makefile
+++ b/c8/demo/Makefile
@@ -2,9 +2,9 @@
 # Makefile for the demos.
 #
 
-DEBUG_EXE = debug-demo2
-BUILD_EXE = build-demo2
-RELEASE_EXE = release-demo2
+DEBUG_EXE = debug-demo_simple
+BUILD_EXE = build-demo_simple
+RELEASE_EXE = release-demo_simple
 
 debug: $(DEBUG_EXE)       # with prints and asserts
 build: $(BUILD_EXE)       # without prints, but with asserts
diff --git a/c8/stm/atomic.h b/c8/stm/atomic.h
--- a/c8/stm/atomic.h
+++ b/c8/stm/atomic.h
@@ -24,16 +24,16 @@
 
 #if defined(__i386__) || defined(__amd64__)
 
-  static inline void spin_loop(void) { asm("pause" : : : "memory"); }
-  static inline void write_fence(void) { asm("" : : : "memory"); }
+  static inline void stm_spin_loop(void) { asm("pause" : : : "memory"); }
+  static inline void stm_write_fence(void) { asm("" : : : "memory"); }
 /*# define atomic_exchange(ptr, old, new)  do {         \
           (old) = __sync_lock_test_and_set(ptr, new);   \
       } while (0)*/
 
 #else
 
-  static inline void spin_loop(void) { asm("" : : : "memory"); }
-  static inline void write_fence(void) { __sync_synchronize(); }
+  static inline void stm_spin_loop(void) { asm("" : : : "memory"); }
+  static inline void stm_write_fence(void) { __sync_synchronize(); }
 
 /*# define atomic_exchange(ptr, old, new)  do {           \
           (old) = *(ptr);                                 \
@@ -42,19 +42,19 @@
 #endif
 
 
-static inline void _spinlock_acquire(uint8_t *plock) {
+static inline void _stm_spinlock_acquire(uint8_t *plock) {
  retry:
     if (__builtin_expect(__sync_lock_test_and_set(plock, 1) != 0, 0)) {
-        spin_loop();
+        stm_spin_loop();
         goto retry;
     }
 }
-static inline void _spinlock_release(uint8_t *plock) {
+static inline void _stm_spinlock_release(uint8_t *plock) {
     assert(*plock == 1);
     __sync_lock_release(plock);
 }
-#define spinlock_acquire(lock) _spinlock_acquire(&(lock))
-#define spinlock_release(lock) _spinlock_release(&(lock))
+#define stm_spinlock_acquire(lock) _stm_spinlock_acquire(&(lock))
+#define stm_spinlock_release(lock) _stm_spinlock_release(&(lock))
 
 
 #endif  /* _STM_ATOMIC_H */
diff --git a/c8/stm/core.c b/c8/stm/core.c
--- a/c8/stm/core.c
+++ b/c8/stm/core.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 char *stm_object_pages;
@@ -113,212 +114,6 @@
 }
 
 
-/* ############# signal handler ############# */
-
-static void copy_bk_objs_in_page_from(int from_segnum, uintptr_t pagenum,
-                                      bool only_if_not_modified)
-{
-    /* looks at all bk copies of objects overlapping page 'pagenum' and
-       copies the part in 'pagenum' back to the current segment */
-    dprintf(("copy_bk_objs_in_page_from(%d, %ld, %d)\n",
-             from_segnum, (long)pagenum, only_if_not_modified));
-
-    assert(modification_lock_check_rdlock(from_segnum));
-    struct list_s *list = get_priv_segment(from_segnum)->modified_old_objects;
-    struct stm_undo_s *undo = (struct stm_undo_s *)list->items;
-    struct stm_undo_s *end = (struct stm_undo_s *)(list->items + list->count);
-
-    import_objects(only_if_not_modified ? -2 : -1,
-                   pagenum, undo, end);
-}
-
-static void go_to_the_past(uintptr_t pagenum,
-                           struct stm_commit_log_entry_s *from,
-                           struct stm_commit_log_entry_s *to)
-{
-    assert(modification_lock_check_wrlock(STM_SEGMENT->segment_num));
-    assert(from->rev_num >= to->rev_num);
-    /* walk BACKWARDS the commit log and update the page 'pagenum',
-       initially at revision 'from', until we reach the revision 'to'. */
-
-    /* XXXXXXX Recursive algo for now, fix this! */
-    if (from != to) {
-        struct stm_commit_log_entry_s *cl = to->next;
-        go_to_the_past(pagenum, from, cl);
-
-        struct stm_undo_s *undo = cl->written;
-        struct stm_undo_s *end = cl->written + cl->written_count;
-
-        import_objects(-1, pagenum, undo, end);
-    }
-}
-
-
-long ro_to_acc = 0;
-static void handle_segfault_in_page(uintptr_t pagenum)
-{
-    /* assumes page 'pagenum' is ACCESS_NONE, privatizes it,
-       and validates to newest revision */
-    dprintf(("handle_segfault_in_page(%lu), seg %d\n", pagenum, STM_SEGMENT->segment_num));
-
-    /* XXX: bad, but no deadlocks: */
-    acquire_all_privatization_locks();
-
-    long i;
-    int my_segnum = STM_SEGMENT->segment_num;
-
-    uint8_t page_status = get_page_status_in(my_segnum, pagenum);
-    assert(page_status == PAGE_NO_ACCESS
-           || page_status == PAGE_READONLY);
-
-    if (page_status == PAGE_READONLY) {
-        /* make our page write-ready */
-        page_mark_accessible(my_segnum, pagenum);
-
-        dprintf((" > found READONLY, make others NO_ACCESS\n"));
-        /* our READONLY copy *has* to have the current data, no
-           copy necessary */
-        /* make READONLY pages in other segments NO_ACCESS */
-        for (i = 1; i < NB_SEGMENTS; i++) {
-            if (i == my_segnum)
-                continue;
-
-            if (get_page_status_in(i, pagenum) == PAGE_READONLY)
-                page_mark_inaccessible(i, pagenum);
-        }
-
-        ro_to_acc++;
-
-        release_all_privatization_locks();
-        return;
-    }
-
-    /* find who has the most recent revision of our page */
-    /* XXX: uh, *more* recent would be enough, right? */
-    int copy_from_segnum = -1;
-    uint64_t most_recent_rev = 0;
-    bool was_readonly = false;
-    for (i = 1; i < NB_SEGMENTS; i++) {
-        if (i == my_segnum)
-            continue;
-
-        if (!was_readonly && get_page_status_in(i, pagenum) == PAGE_READONLY) {
-            was_readonly = true;
-            break;
-        }
-
-        struct stm_commit_log_entry_s *log_entry;
-        log_entry = get_priv_segment(i)->last_commit_log_entry;
-        if (get_page_status_in(i, pagenum) != PAGE_NO_ACCESS
-            && (copy_from_segnum == -1 || log_entry->rev_num > most_recent_rev)) {
-            copy_from_segnum = i;
-            most_recent_rev = log_entry->rev_num;
-        }
-    }
-    OPT_ASSERT(copy_from_segnum != my_segnum);
-
-    if (was_readonly) {
-        assert(page_status == PAGE_NO_ACCESS);
-        /* this case could be avoided by making all NO_ACCESS to READONLY
-           when resharing pages (XXX: better?).
-           We may go from NO_ACCESS->READONLY->ACCESSIBLE on write with
-           2 SIGSEGV in a row.*/
-        dprintf((" > make a previously NO_ACCESS page READONLY\n"));
-        page_mark_readonly(my_segnum, pagenum);
-
-        release_all_privatization_locks();
-        return;
-    }
-
-    /* make our page write-ready */
-    page_mark_accessible(my_segnum, pagenum);
-
-    /* account for this page now: XXX */
-    /* increment_total_allocated(4096); */
-
-
-    if (copy_from_segnum == -1) {
-        dprintf((" > found newly allocated page: copy from seg0\n"));
-
-        /* this page is only accessible in the sharing segment seg0 so far (new
-           allocation). We can thus simply mark it accessible here. */
-        pagecopy(get_virtual_page(my_segnum, pagenum),
-                 get_virtual_page(0, pagenum));
-        release_all_privatization_locks();
-        return;
-    }
-
-    dprintf((" > import data from seg %d\n", copy_from_segnum));
-
-    /* before copying anything, acquire modification locks from our and
-       the other segment */
-    uint64_t to_lock = (1UL << copy_from_segnum);
-    acquire_modification_lock_set(to_lock, my_segnum);
-    pagecopy(get_virtual_page(my_segnum, pagenum),
-             get_virtual_page(copy_from_segnum, pagenum));
-
-    /* if there were modifications in the page, revert them. */
-    copy_bk_objs_in_page_from(copy_from_segnum, pagenum, false);
-
-    /* we need to go from 'src_version' to 'target_version'.  This
-       might need a walk into the past. */
-    struct stm_commit_log_entry_s *src_version, *target_version;
-    src_version = get_priv_segment(copy_from_segnum)->last_commit_log_entry;
-    target_version = STM_PSEGMENT->last_commit_log_entry;
-
-
-    dprintf(("handle_segfault_in_page: rev %lu to rev %lu\n",
-             src_version->rev_num, target_version->rev_num));
-    /* adapt revision of page to our revision:
-       if our rev is higher than the page we copy from, everything
-       is fine as we never read/modified the page anyway
-     */
-    if (src_version->rev_num > target_version->rev_num)
-        go_to_the_past(pagenum, src_version, target_version);
-
-    release_modification_lock_set(to_lock, my_segnum);
-    release_all_privatization_locks();
-}
-
-static void _signal_handler(int sig, siginfo_t *siginfo, void *context)
-{
-    assert(_stm_segfault_expected > 0);
-
-    int saved_errno = errno;
-    char *addr = siginfo->si_addr;
-    dprintf(("si_addr: %p\n", addr));
-    if (addr == NULL || addr < stm_object_pages ||
-        addr >= stm_object_pages+TOTAL_MEMORY) {
-        /* actual segfault, unrelated to stmgc */
-        fprintf(stderr, "Segmentation fault: accessing %p\n", addr);
-        detect_shadowstack_overflow(addr);
-        abort();
-    }
-
-    int segnum = get_segment_of_linear_address(addr);
-    OPT_ASSERT(segnum != 0);
-    if (segnum != STM_SEGMENT->segment_num) {
-        fprintf(stderr, "Segmentation fault: accessing %p (seg %d) from"
-                " seg %d\n", addr, segnum, STM_SEGMENT->segment_num);
-        abort();
-    }
-    dprintf(("-> segment: %d\n", segnum));
-
-    char *seg_base = STM_SEGMENT->segment_base;
-    uintptr_t pagenum = ((char*)addr - seg_base) / 4096UL;
-    if (pagenum < END_NURSERY_PAGE) {
-        fprintf(stderr, "Segmentation fault: accessing %p (seg %d "
-                        "page %lu)\n", addr, segnum, pagenum);
-        abort();
-    }
-
-    DEBUG_EXPECT_SEGFAULT(false);
-    handle_segfault_in_page(pagenum);
-    DEBUG_EXPECT_SEGFAULT(true);
-
-    errno = saved_errno;
-    /* now return and retry */
-}
 
 /* ############# commit log ############# */
 
@@ -361,6 +156,7 @@
 }
 
 static void reset_modified_from_backup_copies(int segment_num, object_t *only_obj);  /* forward */
+static void undo_modifications_to_single_obj(int segment_num, object_t *only_obj); /* forward */
 
 static bool _stm_validate(void)
 {
@@ -442,7 +238,7 @@
                     for (; undo < end; undo++) {
                         object_t *obj;
 
-                        if (undo->type != TYPE_POSITION_MARKER) {
+                        if (LIKELY(undo->type != TYPE_POSITION_MARKER)) {
                             /* common case: 'undo->object' was written to
                                in this past commit, so we must check that
                                it was not read by us. */
@@ -472,13 +268,17 @@
                                an abort. However, from now on, we also assume
                                that an abort would not roll-back to what is in
                                the backup copy, as we don't trace the bkcpy
-                               during major GCs.
+                               during major GCs. (Seg0 may contain the version
+                               found in the other segment and thus not the
+                               content of our bk_copy.)
+
                                We choose the approach to reset all our changes
                                to this obj here, so that we can throw away the
                                backup copy completely: */
                             /* XXX: this browses through the whole list of 
modified
                                fragments; this may become a problem... */
-                            reset_modified_from_backup_copies(my_segnum, obj);
+                            undo_modifications_to_single_obj(my_segnum, obj);
+
                             continue;
                         }
 
@@ -604,11 +404,15 @@
 */
 static void _validate_and_attach(struct stm_commit_log_entry_s *new)
 {
+    uintptr_t cle_length = 0;
     struct stm_commit_log_entry_s *old;
 
     OPT_ASSERT(new != NULL);
     OPT_ASSERT(new != INEV_RUNNING);
 
+    cle_length = list_count(STM_PSEGMENT->modified_old_objects);
+    assert(cle_length == new->written_count * 3);
+
     soon_finished_or_inevitable_thread_segment();
 
  retry_from_start:
@@ -617,6 +421,16 @@
         stm_abort_transaction();
     }
 
+    if (cle_length != list_count(STM_PSEGMENT->modified_old_objects)) {
+        /* something changed the list of modified objs during _stm_validate; or
+         * during a major GC that also does _stm_validate(). That "something"
+         * can only be a reset of a noconflict obj. Thus, we recreate the CL
+         * entry */
+        free_cle(new);
+        new = _create_commit_log_entry();
+        cle_length = list_count(STM_PSEGMENT->modified_old_objects);
+    }
+
 #if STM_TESTS
     if (STM_PSEGMENT->transaction_state != TS_INEVITABLE
         && STM_PSEGMENT->last_commit_log_entry->next == INEV_RUNNING) {
@@ -815,6 +629,10 @@
     size_t start_offset;
     if (first_call) {
         start_offset = 0;
+
+        /* flags like a never-touched obj */
+        assert(obj->stm_flags & GCFLAG_WRITE_BARRIER);
+        assert(!(obj->stm_flags & GCFLAG_WB_EXECUTED));
     } else {
         start_offset = -1;
     }
@@ -1249,8 +1067,7 @@
     assert(tree_is_cleared(STM_PSEGMENT->nursery_objects_shadows));
     assert(tree_is_cleared(STM_PSEGMENT->callbacks_on_commit_and_abort[0]));
     assert(tree_is_cleared(STM_PSEGMENT->callbacks_on_commit_and_abort[1]));
-    assert(list_is_empty(STM_PSEGMENT->young_objects_with_light_finalizers));
-    assert(STM_PSEGMENT->finalizers == NULL);
+    assert(list_is_empty(STM_PSEGMENT->young_objects_with_destructors));
     assert(STM_PSEGMENT->active_queues == NULL);
 #ifndef NDEBUG
     /* this should not be used when objects_pointing_to_nursery == NULL */
@@ -1259,6 +1076,13 @@
 
     check_nursery_at_transaction_start();
 
+    if (tl->mem_reset_on_abort) {
+        assert(!!tl->mem_stored_for_reset_on_abort);
+        memcpy(tl->mem_stored_for_reset_on_abort, tl->mem_reset_on_abort,
+               tl->mem_bytes_to_reset_on_abort);
+    }
+
+
     /* Change read-version here, because if we do stm_validate in the
        safe-point below, we should not see our old reads from the last
        transaction. */
@@ -1282,7 +1106,7 @@
 }
 
 #ifdef STM_NO_AUTOMATIC_SETJMP
-static int did_abort = 0;
+int did_abort = 0;
 #endif
 
 long _stm_start_transaction(stm_thread_local_t *tl)
@@ -1294,6 +1118,12 @@
 #else
     long repeat_count = stm_rewind_jmp_setjmp(tl);
 #endif
+    if (repeat_count) {
+        /* only if there was an abort, we need to reset the memory: */
+        if (tl->mem_reset_on_abort)
+            memcpy(tl->mem_reset_on_abort, tl->mem_stored_for_reset_on_abort,
+                   tl->mem_bytes_to_reset_on_abort);
+    }
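+    /* The mem_reset_on_abort logic added here snapshots a caller-given
+       region at transaction start (see _do_start_transaction above) and
+       copies it back when the setjmp returns non-zero after an abort.
+       A stand-alone miniature of that save/restore idea using plain
+       setjmp/longjmp (hypothetical names, not the library's API):
+
+       #include <setjmp.h>
+       #include <stdio.h>
+       #include <string.h>
+
+       static jmp_buf env;
+       static char data[8] = "fresh";   // plays tl->mem_reset_on_abort
+       static char saved[8];            // plays tl->mem_stored_for_reset_on_abort
+       static int aborts;
+
+       int main(void)
+       {
+           if (setjmp(env) == 0) {
+               memcpy(saved, data, sizeof data);   // transaction start: snapshot
+           } else {
+               memcpy(data, saved, sizeof data);   // restarted after abort: restore
+               printf("restored: %s\n", data);
+           }
+           strcpy(data, "dirty");                  // transactional modification
+           if (aborts++ == 0)
+               longjmp(env, 1);                    // simulate one abort
+           printf("final: %s\n", data);
+           return 0;
+       }
+    */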
     _do_start_transaction(tl);
 
     if (repeat_count == 0) {  /* else, 'nursery_mark' was already set
@@ -1419,6 +1249,7 @@
     push_large_overflow_objects_to_other_segments();
     /* push before validate. otherwise they are reachable too early */
 
+
     /* before releasing _stm_detached_inevitable_from_thread, perform
        the commit. Otherwise, the same thread whose (inev) transaction we try
        to commit here may start a new one in another segment *but* w/o
@@ -1436,7 +1267,7 @@
         /* but first, emit commit-event of this thread: */
         timing_event(STM_SEGMENT->running_thread, STM_TRANSACTION_COMMIT);
         STM_SEGMENT->running_thread = NULL;
-        write_fence();
+        stm_write_fence();
         assert(_stm_detached_inevitable_from_thread == -1);
         _stm_detached_inevitable_from_thread = 0;
     }
@@ -1481,6 +1312,48 @@
         invoke_general_finalizers(tl);
 }
 
+static void undo_modifications_to_single_obj(int segment_num, object_t *obj)
+{
+    /* special function used for noconflict objs to reset all their
+     * modifications and make them appear untouched in the current transaction.
+     * I.e., reset modifications and remove from all lists. */
+
+    struct stm_priv_segment_info_s *pseg = get_priv_segment(segment_num);
+
+    reset_modified_from_backup_copies(segment_num, obj);
+
+    /* reset read marker (must not be considered read either) */
+    ((struct stm_read_marker_s *)
+     (pseg->pub.segment_base + (((uintptr_t)obj) >> 4)))->rm = 0;
+
+    /* reset possibly marked cards */
+    if (get_page_status_in(segment_num, (uintptr_t)obj / 4096) == PAGE_ACCESSIBLE
+        && obj_should_use_cards(pseg->pub.segment_base, obj)) {
+        /* if header is not accessible, we didn't mark any cards */
+        _reset_object_cards(pseg, obj, CARD_CLEAR, false, false);
+    }
+
+    /* remove from all other lists */
+    LIST_FOREACH_R(pseg->old_objects_with_cards_set, object_t * /*item*/,
+       {
+           if (item == obj) {
+               /* copy last element over this one (HACK) */
+               _lst->count -= 1;
+               _lst->items[_i] = _lst->items[_lst->count];
+               break;
+           }
+       });
+    LIST_FOREACH_R(pseg->objects_pointing_to_nursery, object_t * /*item*/,
+       {
+           if (item == obj) {
+               /* copy last element over this one (HACK) */
+               _lst->count -= 1;
+               _lst->items[_i] = _lst->items[_lst->count];
+               break;
+           }
+       });
+}
+
 static void reset_modified_from_backup_copies(int segment_num, object_t *only_obj)
 {
 #pragma push_macro("STM_PSEGMENT")
@@ -1490,6 +1363,9 @@
     assert(modification_lock_check_wrlock(segment_num));
     DEBUG_EXPECT_SEGFAULT(false);
 
+    /* WARNING: resetting the obj will remove the WB flag. Make sure you either
+     * re-add it or remove it from lists where it was added based on the flag. */
+
     struct stm_priv_segment_info_s *pseg = get_priv_segment(segment_num);
     struct list_s *list = pseg->modified_old_objects;
     struct stm_undo_s *undo = (struct stm_undo_s *)list->items;
@@ -1500,7 +1376,7 @@
             continue;
 
         object_t *obj = undo->object;
-        if (only_obj != NULL && obj != only_obj)
+        if (UNLIKELY(only_obj != NULL) && LIKELY(obj != only_obj))
             continue;
 
         char *dst = REAL_ADDRESS(pseg->pub.segment_base, obj);
@@ -1515,19 +1391,15 @@
 
         free_bk(undo);
 
-        if (only_obj != NULL) {
-            assert(IMPLY(only_obj != NULL,
-                         (((struct object_s *)dst)->stm_flags
-                          & (GCFLAG_NO_CONFLICT
-                             | GCFLAG_WRITE_BARRIER
-                             | GCFLAG_WB_EXECUTED))
-                         == (GCFLAG_NO_CONFLICT | GCFLAG_WRITE_BARRIER)));
+        if (UNLIKELY(only_obj != NULL)) {
+            assert(((struct object_s *)dst)->stm_flags & GCFLAG_NO_CONFLICT);
+
             /* copy last element over this one */
             end--;
             list->count -= 3;
-            if (undo < end)
-                *undo = *end;
-            undo--;  /* next itr */
+            *undo = *end;
+            /* to neutralise the increment for the next iter: */
+            undo--;
         }
     }
 
@@ -1652,6 +1524,13 @@
 
     if (tl->mem_clear_on_abort)
         memset(tl->mem_clear_on_abort, 0, tl->mem_bytes_to_clear_on_abort);
+    if (tl->mem_reset_on_abort) {
+        /* temporarily set the memory of mem_reset_on_abort to zeros since in 
the
+           case of vmprof, the old value is really wrong if we didn't do the 
longjmp
+           back yet (that restores the C stack). We restore the memory in
+           _stm_start_transaction() */
+        memset(tl->mem_reset_on_abort, 0, tl->mem_bytes_to_reset_on_abort);
+    }
 
     invoke_and_clear_user_callbacks(1);   /* for abort */
 
@@ -1760,7 +1639,7 @@
        0. We have to wait for this to happen bc. otherwise, eg.
        _stm_detach_inevitable_transaction is not safe to do yet */
     while (_stm_detached_inevitable_from_thread == -1)
-        spin_loop();
+        stm_spin_loop();
     assert(_stm_detached_inevitable_from_thread == 0);
 
     soon_finished_or_inevitable_thread_segment();
@@ -1830,13 +1709,13 @@
     assert(STM_PSEGMENT->privatization_lock);
     assert(obj->stm_flags & GCFLAG_WRITE_BARRIER);
     assert(!(obj->stm_flags & GCFLAG_WB_EXECUTED));
+    assert(!(obj->stm_flags & GCFLAG_CARDS_SET));
 
     ssize_t obj_size = stmcb_size_rounded_up(
         (struct object_s *)REAL_ADDRESS(STM_SEGMENT->segment_base, obj));
     OPT_ASSERT(obj_size >= 16);
 
     if (LIKELY(is_small_uniform(obj))) {
-        assert(!(obj->stm_flags & GCFLAG_CARDS_SET));
         OPT_ASSERT(obj_size <= GC_LAST_SMALL_SIZE);
         _synchronize_fragment((stm_char *)obj, obj_size);
         return;
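The two `LIST_FOREACH_R` loops in `undo_modifications_to_single_obj` above remove an entry by copying the last element over its slot. A minimal standalone sketch of that swap-remove idiom (the `list_s` layout here is a simplified stand-in, not the stmgc one):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_s {
    size_t count;
    long items[16];
};

/* Remove the first occurrence of 'value' by copying the last element
   over its slot, as the LIST_FOREACH_R bodies above do.  Order is not
   preserved; returns true if the value was found. */
static bool list_swap_remove(struct list_s *lst, long value)
{
    for (size_t i = 0; i < lst->count; i++) {
        if (lst->items[i] == value) {
            lst->count -= 1;
            lst->items[i] = lst->items[lst->count];
            return true;
        }
    }
    return false;
}
```

The trade-off is the same as in the diff: O(1) removal at the cost of list order, which these bookkeeping lists do not need.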
diff --git a/c8/stm/core.h b/c8/stm/core.h
--- a/c8/stm/core.h
+++ b/c8/stm/core.h
@@ -1,3 +1,9 @@
+#ifndef _STMGC_H
+# error "must be compiled via stmgc.c"
+# include "../stmgc.h"  // silence flymake
+#endif
+
+
 #define _STM_CORE_H_
 
 #include <stdlib.h>
@@ -7,7 +13,8 @@
 #include <errno.h>
 #include <pthread.h>
 #include <signal.h>
-
+#include <stdbool.h>
+#include "list.h"
 
 /************************************************************/
 
@@ -139,9 +146,9 @@
     pthread_t running_pthread;
 #endif
 
-    /* light finalizers */
-    struct list_s *young_objects_with_light_finalizers;
-    struct list_s *old_objects_with_light_finalizers;
+    /* destructors */
+    struct list_s *young_objects_with_destructors;
+    struct list_s *old_objects_with_destructors;
 
     /* regular finalizers (objs from the current transaction only) */
     struct finalizers_s *finalizers;
@@ -304,6 +311,14 @@
 static bool _stm_validate(void);
 static void _core_commit_transaction(bool external);
 
+static void import_objects(
+        int from_segnum,            /* or -1: from undo->backup,
+                                       or -2: from undo->backup if not 
modified */
+        uintptr_t pagenum,          /* or -1: "all accessible" */
+        struct stm_undo_s *undo,
+        struct stm_undo_s *end);
+
+
 static inline bool was_read_remote(char *base, object_t *obj)
 {
     uint8_t other_transaction_read_version =
@@ -326,12 +341,12 @@
 
 static inline void acquire_privatization_lock(int segnum)
 {
-    spinlock_acquire(get_priv_segment(segnum)->privatization_lock);
+    stm_spinlock_acquire(get_priv_segment(segnum)->privatization_lock);
 }
 
 static inline void release_privatization_lock(int segnum)
 {
-    spinlock_release(get_priv_segment(segnum)->privatization_lock);
+    stm_spinlock_release(get_priv_segment(segnum)->privatization_lock);
 }
 
 static inline bool all_privatization_locks_acquired(void)
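The renamed `stm_spinlock_acquire`/`stm_spinlock_release` wrappers are built on GCC's `__sync` primitives. A plausible standalone sketch of such a byte spinlock (the real stmgc definitions may differ; the empty spin body stands in for `stm_spin_loop()`):

```c
#include <assert.h>

typedef volatile char spinlock_t;

/* Atomically set the byte to 1; loop while it was already 1.
   __sync_lock_test_and_set has acquire semantics. */
static void spinlock_acquire(spinlock_t *lock)
{
    while (__sync_lock_test_and_set(lock, 1) != 0)
        ;  /* spin; a real version would call stm_spin_loop() here */
}

/* Store 0 with release semantics, publishing the critical section. */
static void spinlock_release(spinlock_t *lock)
{
    __sync_lock_release(lock);
}
```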
diff --git a/c8/stm/detach.c b/c8/stm/detach.c
--- a/c8/stm/detach.c
+++ b/c8/stm/detach.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 #include <errno.h>
@@ -107,7 +108,7 @@
                is reset to a value different from -1 */
             dprintf(("reattach_transaction: busy wait...\n"));
             while (_stm_detached_inevitable_from_thread == -1)
-                spin_loop();
+                stm_spin_loop();
 
             /* then retry */
             goto restart;
@@ -157,7 +158,7 @@
         /* busy-loop: wait until _stm_detached_inevitable_from_thread
            is reset to a value different from -1 */
         while (_stm_detached_inevitable_from_thread == -1)
-            spin_loop();
+            stm_spin_loop();
         goto restart;
     }
     if (!__sync_bool_compare_and_swap(&_stm_detached_inevitable_from_thread,
@@ -209,7 +210,7 @@
         /* busy-loop: wait until _stm_detached_inevitable_from_thread
            is reset to a value different from -1 */
         while (_stm_detached_inevitable_from_thread == -1)
-            spin_loop();
+            stm_spin_loop();
         goto restart;
     }
 }
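The busy-wait loops above all key on `_stm_detached_inevitable_from_thread`: -1 is a transient marker to spin on, 0 means no detached transaction, and any other value identifies the owning thread. A simplified single-variable sketch of the claim step (the surrounding protocol is elided; names are stand-ins):

```c
#include <assert.h>
#include <stdint.h>

static volatile intptr_t detached_from_thread = 0;  /* stand-in global */

/* Try to claim the detached inevitable transaction tagged with 'self'.
   Returns 1 if claimed, 0 if nothing was detached for this thread. */
static int try_reattach(intptr_t self)
{
    for (;;) {
        intptr_t cur = detached_from_thread;
        if (cur == -1)
            continue;     /* transient state: spin, like stm_spin_loop() */
        if (cur != self)
            return 0;     /* not detached, or detached by another thread */
        if (__sync_bool_compare_and_swap(&detached_from_thread, self, 0))
            return 1;     /* we atomically took ownership */
        /* CAS failed: someone raced us; re-read and retry */
    }
}
```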
diff --git a/c8/stm/extra.c b/c8/stm/extra.c
--- a/c8/stm/extra.c
+++ b/c8/stm/extra.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 
diff --git a/c8/stm/finalizer.c b/c8/stm/finalizer.c
--- a/c8/stm/finalizer.c
+++ b/c8/stm/finalizer.c
@@ -1,68 +1,100 @@
-
+#ifndef _STM_CORE_H_
+# error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
+#endif
+#include "finalizer.h"
+#include "fprintcolor.h"
+#include "nursery.h"
+#include "gcpage.h"
 
 /* callbacks */
-void (*stmcb_light_finalizer)(object_t *);
+void (*stmcb_destructor)(object_t *);
 void (*stmcb_finalizer)(object_t *);
 
 
 static void init_finalizers(struct finalizers_s *f)
 {
     f->objects_with_finalizers = list_create();
-    f->count_non_young = 0;
-    f->run_finalizers = NULL;
-    f->running_next = NULL;
+    f->probably_young_objects_with_finalizers = list_create();
+    f->run_finalizers = list_create();
+    f->lock = 0;
+    f->running_trigger_now = NULL;
 }
 
 static void setup_finalizer(void)
 {
     init_finalizers(&g_finalizers);
+
+    for (long j = 1; j < NB_SEGMENTS; j++) {
+        struct stm_priv_segment_info_s *pseg = get_priv_segment(j);
+
+        assert(pseg->finalizers == NULL);
+        struct finalizers_s *f = malloc(sizeof(struct finalizers_s));
+        if (f == NULL)
+            stm_fatalerror("out of memory in create_finalizers");   /* XXX */
+        init_finalizers(f);
+        pseg->finalizers = f;
+    }
 }
 
-static void teardown_finalizer(void)
+void stm_setup_finalizer_queues(int number, stm_finalizer_trigger_fn *triggers)
 {
-    if (g_finalizers.run_finalizers != NULL)
-        list_free(g_finalizers.run_finalizers);
-    list_free(g_finalizers.objects_with_finalizers);
+    assert(g_finalizer_triggers.count == 0);
+    assert(g_finalizer_triggers.triggers == NULL);
+
+    g_finalizer_triggers.count = number;
+    g_finalizer_triggers.triggers = (stm_finalizer_trigger_fn *)
+        malloc(number * sizeof(stm_finalizer_trigger_fn));
+
+    for (int qindex = 0; qindex < number; qindex++) {
+        g_finalizer_triggers.triggers[qindex] = triggers[qindex];
+        dprintf(("setup_finalizer_queue(qindex=%d,fun=%p)\n", qindex, triggers[qindex]));
+    }
+}
+
+static void teardown_finalizer(void) {
+    LIST_FREE(g_finalizers.run_finalizers);
+    LIST_FREE(g_finalizers.objects_with_finalizers);
+    LIST_FREE(g_finalizers.probably_young_objects_with_finalizers);
     memset(&g_finalizers, 0, sizeof(g_finalizers));
+
+    if (g_finalizer_triggers.triggers)
+        free(g_finalizer_triggers.triggers);
+    memset(&g_finalizer_triggers, 0, sizeof(g_finalizer_triggers));
 }
 
 static void _commit_finalizers(void)
 {
     /* move finalizer lists to g_finalizers for major collections */
     while (__sync_lock_test_and_set(&g_finalizers.lock, 1) != 0) {
-        spin_loop();
+        stm_spin_loop();
     }
 
-    if (STM_PSEGMENT->finalizers->run_finalizers != NULL) {
+    struct finalizers_s *local_fs = STM_PSEGMENT->finalizers;
+    if (!list_is_empty(local_fs->run_finalizers)) {
         /* copy 'STM_PSEGMENT->finalizers->run_finalizers' into
            'g_finalizers.run_finalizers', dropping any initial NULLs
            (finalizers already called) */
-        struct list_s *src = STM_PSEGMENT->finalizers->run_finalizers;
-        uintptr_t frm = 0;
-        if (STM_PSEGMENT->finalizers->running_next != NULL) {
-            frm = *STM_PSEGMENT->finalizers->running_next;
-            assert(frm <= list_count(src));
-            *STM_PSEGMENT->finalizers->running_next = (uintptr_t)-1;
-        }
-        if (frm < list_count(src)) {
-            if (g_finalizers.run_finalizers == NULL)
-                g_finalizers.run_finalizers = list_create();
+        struct list_s *src = local_fs->run_finalizers;
+        if (list_count(src)) {
             g_finalizers.run_finalizers = list_extend(
                 g_finalizers.run_finalizers,
-                src, frm);
+                src, 0);
         }
-        list_free(src);
     }
+    LIST_FREE(local_fs->run_finalizers);
 
     /* copy the whole 'STM_PSEGMENT->finalizers->objects_with_finalizers'
        into 'g_finalizers.objects_with_finalizers' */
     g_finalizers.objects_with_finalizers = list_extend(
         g_finalizers.objects_with_finalizers,
-        STM_PSEGMENT->finalizers->objects_with_finalizers, 0);
-    list_free(STM_PSEGMENT->finalizers->objects_with_finalizers);
+        local_fs->objects_with_finalizers, 0);
+    LIST_FREE(local_fs->objects_with_finalizers);
+    assert(list_is_empty(local_fs->probably_young_objects_with_finalizers));
+    LIST_FREE(local_fs->probably_young_objects_with_finalizers);
 
-    free(STM_PSEGMENT->finalizers);
-    STM_PSEGMENT->finalizers = NULL;
+    // re-init
+    init_finalizers(local_fs);
 
     __sync_lock_release(&g_finalizers.lock);
 }
@@ -71,24 +103,22 @@
 {
     /* like _commit_finalizers(), but forget everything from the
        current transaction */
-    if (pseg->finalizers != NULL) {
-        if (pseg->finalizers->run_finalizers != NULL) {
-            if (pseg->finalizers->running_next != NULL) {
-                *pseg->finalizers->running_next = (uintptr_t)-1;
-            }
-            list_free(pseg->finalizers->run_finalizers);
-        }
-        list_free(pseg->finalizers->objects_with_finalizers);
-        free(pseg->finalizers);
-        pseg->finalizers = NULL;
-    }
+    LIST_FREE(pseg->finalizers->run_finalizers);
+    LIST_FREE(pseg->finalizers->objects_with_finalizers);
+    LIST_FREE(pseg->finalizers->probably_young_objects_with_finalizers);
+    // re-init
+    init_finalizers(pseg->finalizers);
+
+    // if we were running triggers, release the lock:
+    if (g_finalizers.running_trigger_now == pseg)
+        g_finalizers.running_trigger_now = NULL;
 
     /* call the light finalizers for objects that are about to
        be forgotten from the current transaction */
     char *old_gs_register = STM_SEGMENT->segment_base;
     bool must_fix_gs = old_gs_register != pseg->pub.segment_base;
 
-    struct list_s *lst = pseg->young_objects_with_light_finalizers;
+    struct list_s *lst = pseg->young_objects_with_destructors;
     long i, count = list_count(lst);
     if (lst > 0) {
         for (i = 0; i < count; i++) {
@@ -98,15 +128,15 @@
                 set_gs_register(pseg->pub.segment_base);
                 must_fix_gs = false;
             }
-            stmcb_light_finalizer(obj);
+            stmcb_destructor(obj);
         }
         list_clear(lst);
     }
 
     /* also deals with overflow objects: they are at the tail of
-       old_objects_with_light_finalizers (this list is kept in order
+       old_objects_with_destructors (this list is kept in order
        and we cannot add any already-committed object) */
-    lst = pseg->old_objects_with_light_finalizers;
+    lst = pseg->old_objects_with_destructors;
     count = list_count(lst);
     while (count > 0) {
         object_t *obj = (object_t *)list_item(lst, --count);
@@ -117,7 +147,7 @@
             set_gs_register(pseg->pub.segment_base);
             must_fix_gs = false;
         }
-        stmcb_light_finalizer(obj);
+        stmcb_destructor(obj);
     }
 
     if (STM_SEGMENT->segment_base != old_gs_register)
@@ -125,44 +155,42 @@
 }
 
 
-void stm_enable_light_finalizer(object_t *obj)
+void stm_enable_destructor(object_t *obj)
 {
     if (_is_young(obj)) {
-        LIST_APPEND(STM_PSEGMENT->young_objects_with_light_finalizers, obj);
+        LIST_APPEND(STM_PSEGMENT->young_objects_with_destructors, obj);
     }
     else {
         assert(_is_from_same_transaction(obj));
-        LIST_APPEND(STM_PSEGMENT->old_objects_with_light_finalizers, obj);
+        LIST_APPEND(STM_PSEGMENT->old_objects_with_destructors, obj);
     }
 }
 
-object_t *stm_allocate_with_finalizer(ssize_t size_rounded_up)
+
+void stm_enable_finalizer(int queue_index, object_t *obj)
 {
-    object_t *obj = _stm_allocate_external(size_rounded_up);
-
-    if (STM_PSEGMENT->finalizers == NULL) {
-        struct finalizers_s *f = malloc(sizeof(struct finalizers_s));
-        if (f == NULL)
-            stm_fatalerror("out of memory in create_finalizers");   /* XXX */
-        init_finalizers(f);
-        STM_PSEGMENT->finalizers = f;
+    if (_is_young(obj)) {
+        
LIST_APPEND(STM_PSEGMENT->finalizers->probably_young_objects_with_finalizers, 
obj);
+        
LIST_APPEND(STM_PSEGMENT->finalizers->probably_young_objects_with_finalizers, 
queue_index);
     }
-    assert(STM_PSEGMENT->finalizers->count_non_young
-           <= list_count(STM_PSEGMENT->finalizers->objects_with_finalizers));
-    LIST_APPEND(STM_PSEGMENT->finalizers->objects_with_finalizers, obj);
-    return obj;
+    else {
+        assert(_is_from_same_transaction(obj));
+        LIST_APPEND(STM_PSEGMENT->finalizers->objects_with_finalizers, obj);
+        LIST_APPEND(STM_PSEGMENT->finalizers->objects_with_finalizers, queue_index);
+    }
 }
 
 
+
 /************************************************************/
-/*  Light finalizers
+/*  Destructors
 */
 
-static void deal_with_young_objects_with_finalizers(void)
+static void deal_with_young_objects_with_destructors(void)
 {
-    /* for light finalizers: executes finalizers for objs that don't survive
+    /* for destructors: executes destructors for objs that don't survive
        this minor gc */
-    struct list_s *lst = STM_PSEGMENT->young_objects_with_light_finalizers;
+    struct list_s *lst = STM_PSEGMENT->young_objects_with_destructors;
     long i, count = list_count(lst);
     for (i = 0; i < count; i++) {
         object_t *obj = (object_t *)list_item(lst, i);
@@ -171,28 +199,29 @@
         object_t *TLPREFIX *pforwarded_array = (object_t *TLPREFIX *)obj;
         if (pforwarded_array[0] != GCWORD_MOVED) {
             /* not moved: the object dies */
-            stmcb_light_finalizer(obj);
+            stmcb_destructor(obj);
         }
         else {
             obj = pforwarded_array[1]; /* moved location */
             assert(!_is_young(obj));
-            LIST_APPEND(STM_PSEGMENT->old_objects_with_light_finalizers, obj);
+            LIST_APPEND(STM_PSEGMENT->old_objects_with_destructors, obj);
         }
     }
     list_clear(lst);
 }
 
-static void deal_with_old_objects_with_finalizers(void)
+static void deal_with_old_objects_with_destructors(void)
 {
-    /* for light finalizers */
+    /* for destructors */
     int old_gs_register = STM_SEGMENT->segment_num;
     int current_gs_register = old_gs_register;
     long j;
-    assert(list_is_empty(get_priv_segment(0)->old_objects_with_light_finalizers));
+    assert(list_is_empty(get_priv_segment(0)->old_objects_with_destructors));
     for (j = 1; j < NB_SEGMENTS; j++) {
         struct stm_priv_segment_info_s *pseg = get_priv_segment(j);
 
-        struct list_s *lst = pseg->old_objects_with_light_finalizers;
+        assert(list_is_empty(pseg->young_objects_with_destructors));
+        struct list_s *lst = pseg->old_objects_with_destructors;
         long i, count = list_count(lst);
         lst->count = 0;
         for (i = 0; i < count; i++) {
@@ -214,7 +243,7 @@
                     set_gs_register(get_segment_base(j));
                     current_gs_register = j;
                 }
-                stmcb_light_finalizer(obj);
+                stmcb_destructor(obj);
             }
             else {
                 /* object survives */
@@ -227,6 +256,7 @@
 }
 
 
+
 /************************************************************/
 /*  Algorithm for regular (non-light) finalizers.
     Follows closely pypy/doc/discussion/finalizer-order.rst
@@ -325,20 +355,23 @@
 
     struct list_s *marked = list_create();
 
+    assert(list_is_empty(f->probably_young_objects_with_finalizers));
     struct list_s *lst = f->objects_with_finalizers;
     long i, count = list_count(lst);
     lst->count = 0;
-    f->count_non_young = 0;
 
-    for (i = 0; i < count; i++) {
+    for (i = 0; i < count; i += 2) {
         object_t *x = (object_t *)list_item(lst, i);
+        uintptr_t qindex = list_item(lst, i + 1);
 
         assert(_finalization_state(x) != 1);
         if (_finalization_state(x) >= 2) {
             list_set_item(lst, lst->count++, (uintptr_t)x);
+            list_set_item(lst, lst->count++, qindex);
             continue;
         }
         LIST_APPEND(marked, x);
+        LIST_APPEND(marked, qindex);
 
         struct list_s *pending = _finalizer_pending;
         LIST_APPEND(pending, x);
@@ -370,27 +403,29 @@
     struct list_s *run_finalizers = f->run_finalizers;
 
     long i, count = list_count(marked);
-    for (i = 0; i < count; i++) {
+    for (i = 0; i < count; i += 2) {
         object_t *x = (object_t *)list_item(marked, i);
+        uintptr_t qindex = list_item(marked, i + 1);
 
         int state = _finalization_state(x);
         assert(state >= 2);
         if (state == 2) {
-            if (run_finalizers == NULL)
-                run_finalizers = list_create();
             LIST_APPEND(run_finalizers, x);
+            LIST_APPEND(run_finalizers, qindex);
             _recursively_bump_finalization_state_from_2_to_3(pseg, x);
         }
         else {
             struct list_s *lst = f->objects_with_finalizers;
             list_set_item(lst, lst->count++, (uintptr_t)x);
+            list_set_item(lst, lst->count++, qindex);
         }
     }
-    list_free(marked);
+    LIST_FREE(marked);
 
     f->run_finalizers = run_finalizers;
 }
 
+
 static void deal_with_objects_with_finalizers(void)
 {
     /* for non-light finalizers */
@@ -433,11 +468,10 @@
 static void mark_visit_from_finalizer1(
     struct stm_priv_segment_info_s *pseg, struct finalizers_s *f)
 {
-    if (f != NULL && f->run_finalizers != NULL) {
-        LIST_FOREACH_R(f->run_finalizers, object_t * /*item*/,
-                       ({
-                           mark_visit_possibly_overflow_object(item, pseg);
-                       }));
+    long i, count = list_count(f->run_finalizers);
+    for (i = 0; i < count; i += 2) {
+        object_t *x = (object_t *)list_item(f->run_finalizers, i);
+        mark_visit_possibly_overflow_object(x, pseg);
     }
 }
 
@@ -451,40 +485,6 @@
     mark_visit_from_finalizer1(get_priv_segment(0), &g_finalizers);
 }
 
-static void _execute_finalizers(struct finalizers_s *f)
-{
-    if (f->run_finalizers == NULL)
-        return;   /* nothing to do */
-
- restart:
-    if (f->running_next != NULL)
-        return;   /* in a nested invocation of execute_finalizers() */
-
-    uintptr_t next = 0, total = list_count(f->run_finalizers);
-    f->running_next = &next;
-
-    while (next < total) {
-        object_t *obj = (object_t *)list_item(f->run_finalizers, next);
-        list_set_item(f->run_finalizers, next, 0);
-        next++;
-
-        stmcb_finalizer(obj);
-    }
-    if (next == (uintptr_t)-1) {
-        /* transaction committed: the whole 'f' was freed */
-        return;
-    }
-    f->running_next = NULL;
-
-    if (f->run_finalizers->count > total) {
-        memmove(f->run_finalizers->items,
-                f->run_finalizers->items + total,
-                (f->run_finalizers->count - total) * sizeof(uintptr_t));
-        goto restart;
-    }
-
-    LIST_FREE(f->run_finalizers);
-}
 
 /* XXX: according to translator.backendopt.finalizer, getfield_gc
         for primitive types is a safe op in light finalizers.
@@ -492,43 +492,185 @@
         getfield on *dying obj*).
 */
 
+static void _trigger_finalizer_queues(struct finalizers_s *f)
+{
+    /* runs triggers of finalizer queues that have elements in the queue. May
+       NOT run outside of a transaction, but triggers never leave the
+       transactional zone. */
+    assert(in_transaction(STM_PSEGMENT->pub.running_thread));
+
+    bool *to_trigger = (bool*)alloca(g_finalizer_triggers.count * sizeof(bool));
+    memset(to_trigger, 0, g_finalizer_triggers.count * sizeof(bool));
+
+    while (__sync_lock_test_and_set(&f->lock, 1) != 0) {
+        /* somebody is adding more finalizers (_commit_finalizer()) */
+        stm_spin_loop();
+    }
+
+    int count = list_count(f->run_finalizers);
+    for (int i = 0; i < count; i += 2) {
+        int qindex = (int)list_item(f->run_finalizers, i + 1);
+        dprintf(("qindex=%d\n", qindex));
+        to_trigger[qindex] = true;
+    }
+
+    __sync_lock_release(&f->lock);
+
+    // trigger now:
+    for (int i = 0; i < g_finalizer_triggers.count; i++) {
+        if (to_trigger[i]) {
+            dprintf(("invoke-finalizer-trigger(qindex=%d)\n", i));
+            g_finalizer_triggers.triggers[i]();
+        }
+    }
+}
+
+static bool _has_oldstyle_finalizers(struct finalizers_s *f)
+{
+    int count = list_count(f->run_finalizers);
+    for (int i = 0; i < count; i += 2) {
+        int qindex = (int)list_item(f->run_finalizers, i + 1);
+        if (qindex == -1)
+            return true;
+    }
+    return false;
+}
+
+static void _invoke_local_finalizers()
+{
+    /* called inside a transaction; invoke local triggers, process old-style
+     * local finalizers */
+    dprintf(("invoke_local_finalizers %lu\n", list_count(STM_PSEGMENT->finalizers->run_finalizers)));
+    if (list_is_empty(STM_PSEGMENT->finalizers->run_finalizers)
+        && list_is_empty(g_finalizers.run_finalizers))
+        return;
+
+    struct stm_priv_segment_info_s *pseg = get_priv_segment(STM_SEGMENT->segment_num);
+    //try to run local triggers
+    if (STM_PSEGMENT->finalizers->running_trigger_now == NULL) {
+        // we are not recursively running them
+        STM_PSEGMENT->finalizers->running_trigger_now = pseg;
+        _trigger_finalizer_queues(STM_PSEGMENT->finalizers);
+        STM_PSEGMENT->finalizers->running_trigger_now = NULL;
+    }
+
+    // try to run global triggers
+    if (__sync_lock_test_and_set(&g_finalizers.running_trigger_now, pseg) == NULL) {
+        // nobody is already running these triggers (recursively)
+        _trigger_finalizer_queues(&g_finalizers);
+        g_finalizers.running_trigger_now = NULL;
+    }
+
+    if (!_has_oldstyle_finalizers(STM_PSEGMENT->finalizers))
+        return; // no oldstyle to run
+
+    object_t *obj;
+    while ((obj = stm_next_to_finalize(-1)) != NULL) {
+        stmcb_finalizer(obj);
+    }
+}
+
 static void _invoke_general_finalizers(stm_thread_local_t *tl)
 {
-    /* called between transactions */
+    /* called between transactions.
+     * triggers are not called here, since all should have been called
+     * already in _invoke_local_finalizers!
+     * runs old-style finalizers (q_index == -1) of queues that are
+     * not empty. */
+    dprintf(("invoke_general_finalizers %lu\n", list_count(g_finalizers.run_finalizers)));
+    if (list_is_empty(g_finalizers.run_finalizers))
+        return;
+
+    if (!_has_oldstyle_finalizers(&g_finalizers))
+        return; // no oldstyle to run
+
+    // run old-style finalizers:
     rewind_jmp_buf rjbuf;
     stm_rewind_jmp_enterframe(tl, &rjbuf);
     _stm_start_transaction(tl);
-    /* XXX: become inevitable, bc. otherwise, we would need to keep
-       around the original g_finalizers.run_finalizers to restore it
-       in case of an abort. */
-    _stm_become_inevitable(MSG_INEV_DONT_SLEEP);
-    /* did it work? */
-    if (STM_PSEGMENT->transaction_state != TS_INEVITABLE) {   /* no */
-        /* avoid blocking here, waiting for another INEV transaction.
-           If we did that, application code could not proceed (start the
-           next transaction) and it will not be obvious from the profile
-           why we were WAITing. */
-        _stm_commit_transaction();
-        stm_rewind_jmp_leaveframe(tl, &rjbuf);
-        return;
-    }
 
-    while (__sync_lock_test_and_set(&g_finalizers.lock, 1) != 0) {
-        /* somebody is adding more finalizers (_commit_finalizer()) */
-        spin_loop();
-    }
-    struct finalizers_s copy = g_finalizers;
-    assert(copy.running_next == NULL);
-    g_finalizers.run_finalizers = NULL;
-    /* others may add to g_finalizers again: */
-    __sync_lock_release(&g_finalizers.lock);
-
-    if (copy.run_finalizers != NULL) {
-        _execute_finalizers(&copy);
+    dprintf(("invoke_oldstyle_finalizers %lu\n", list_count(g_finalizers.run_finalizers)));
+    object_t *obj;
+    while ((obj = stm_next_to_finalize(-1)) != NULL) {
+        assert(STM_PSEGMENT->transaction_state == TS_INEVITABLE);
+        stmcb_finalizer(obj);
     }
 
     _stm_commit_transaction();
     stm_rewind_jmp_leaveframe(tl, &rjbuf);
+}
 
-    LIST_FREE(copy.run_finalizers);
+object_t* stm_next_to_finalize(int queue_index) {
+    assert(STM_PSEGMENT->transaction_state != TS_NONE);
+
+    /* first check local run_finalizers queue, then global */
+    if (!list_is_empty(STM_PSEGMENT->finalizers->run_finalizers)) {
+        struct list_s *lst = STM_PSEGMENT->finalizers->run_finalizers;
+        int count = list_count(lst);
+        for (int i = 0; i < count; i += 2) {
+            int qindex = (int)list_item(lst, i + 1);
+            if (qindex == queue_index) {
+                /* no need to become inevitable for local ones */
+                /* Remove obj from list and return it. */
+                object_t *obj = (object_t*)list_item(lst, i);
+                int remaining = count - i - 2;
+                if (remaining > 0) {
+                    memmove(&lst->items[i],
+                            &lst->items[i + 2],
+                            remaining * sizeof(uintptr_t));
+                }
+                lst->count -= 2;
+                return obj;
+            }
+        }
+    }
+
+    /* no local finalizers found, continue in global list */
+
+    while (__sync_lock_test_and_set(&g_finalizers.lock, 1) != 0) {
+        /* somebody is adding more finalizers (_commit_finalizer()) */
+        stm_spin_loop();
+    }
+
+    struct list_s *lst = g_finalizers.run_finalizers;
+    int count = list_count(lst);
+    for (int i = 0; i < count; i += 2) {
+        int qindex = (int)list_item(lst, i + 1);
+        if (qindex == queue_index) {
+            /* XXX: become inevitable, bc. otherwise, we would need to keep
+               around the original g_finalizers.run_finalizers to restore it
+               in case of an abort. */
+            if (STM_PSEGMENT->transaction_state != TS_INEVITABLE) {
+                _stm_become_inevitable(MSG_INEV_DONT_SLEEP);
+                /* did it work? */
+                if (STM_PSEGMENT->transaction_state != TS_INEVITABLE) {   /* no */
+                    /* avoid blocking here, waiting for another INEV transaction.
+                       If we did that, application code could not proceed (start the
+                       next transaction) and it will not be obvious from the profile
+                       why we were WAITing. XXX: still true? */
+                    __sync_lock_release(&g_finalizers.lock);
+                    return NULL;
+                }
+            }
+
+            /* Remove obj from list and return it. */
+            object_t *obj = (object_t*)list_item(lst, i);
+            int remaining = count - i - 2;
+            if (remaining > 0) {
+                memmove(&lst->items[i],
+                        &lst->items[i + 2],
+                        remaining * sizeof(uintptr_t));
+            }
+            lst->count -= 2;
+
+            __sync_lock_release(&g_finalizers.lock);
+            return obj;
+        }
+    }
+
+    /* others may add to g_finalizers again: */
+    __sync_lock_release(&g_finalizers.lock);
+
+    return NULL;
 }
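`stm_next_to_finalize` above stores `run_finalizers` as a flat list of (object, queue_index) pairs and unlinks a match with `memmove`. A standalone sketch of that pair-list scan (no locking or inevitability handling, plain integers standing in for `object_t *`):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct pairlist_s {
    int count;             /* number of uintptr_t slots in use */
    uintptr_t items[32];   /* obj0, qindex0, obj1, qindex1, ... */
};

/* Pop the first object whose queue index matches; 0 if none.
   Mirrors the memmove-based removal in stm_next_to_finalize. */
static uintptr_t next_to_finalize(struct pairlist_s *lst, int queue_index)
{
    for (int i = 0; i < lst->count; i += 2) {
        if ((int)lst->items[i + 1] == queue_index) {
            uintptr_t obj = lst->items[i];
            int remaining = lst->count - i - 2;
            if (remaining > 0)
                memmove(&lst->items[i], &lst->items[i + 2],
                        remaining * sizeof(uintptr_t));
            lst->count -= 2;
            return obj;
        }
    }
    return 0;
}
```

Unlike the swap-remove used for the card lists, this removal preserves order, so finalizers still run in the order they were enqueued.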
diff --git a/c8/stm/finalizer.h b/c8/stm/finalizer.h
--- a/c8/stm/finalizer.h
+++ b/c8/stm/finalizer.h
@@ -1,16 +1,20 @@
+#ifndef _STM_FINALIZER_H_
+#define _STM_FINALIZER_H_
+
+#include <stdint.h>
 
 /* see deal_with_objects_with_finalizers() for explanation of these fields */
 struct finalizers_s {
     long lock;
+    struct stm_priv_segment_info_s *running_trigger_now; /* our PSEG, if we are running triggers */
     struct list_s *objects_with_finalizers;
-    uintptr_t count_non_young;
+    struct list_s *probably_young_objects_with_finalizers; /* empty on g_finalizers! */
     struct list_s *run_finalizers;
-    uintptr_t *running_next;
 };
 
 static void mark_visit_from_finalizer_pending(void);
-static void deal_with_young_objects_with_finalizers(void);
-static void deal_with_old_objects_with_finalizers(void);
+static void deal_with_young_objects_with_destructors(void);
+static void deal_with_old_objects_with_destructors(void);
 static void deal_with_objects_with_finalizers(void);
 
 static void setup_finalizer(void);
@@ -27,19 +31,22 @@
 
 /* regular finalizers (objs from already-committed transactions) */
 static struct finalizers_s g_finalizers;
+static struct {
+    int count;
+    stm_finalizer_trigger_fn *triggers;
+} g_finalizer_triggers;
+
 
 static void _invoke_general_finalizers(stm_thread_local_t *tl);
+static void _invoke_local_finalizers(void);
 
 #define invoke_general_finalizers(tl)    do {   \
-    if (g_finalizers.run_finalizers != NULL)    \
-        _invoke_general_finalizers(tl);         \
+     _invoke_general_finalizers(tl);         \
 } while (0)
 
-static void _execute_finalizers(struct finalizers_s *f);
 
-#define any_local_finalizers() (STM_PSEGMENT->finalizers != NULL &&         \
-                               STM_PSEGMENT->finalizers->run_finalizers != NULL)
 #define exec_local_finalizers()  do {                   \
-    if (any_local_finalizers())                         \
-        _execute_finalizers(STM_PSEGMENT->finalizers);  \
+     _invoke_local_finalizers();                     \
 } while (0)
+
+#endif
diff --git a/c8/stm/forksupport.c b/c8/stm/forksupport.c
--- a/c8/stm/forksupport.c
+++ b/c8/stm/forksupport.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 #include <fcntl.h>           /* For O_* constants */
diff --git a/c8/stm/fprintcolor.h b/c8/stm/fprintcolor.h
--- a/c8/stm/fprintcolor.h
+++ b/c8/stm/fprintcolor.h
@@ -1,3 +1,7 @@
+#ifndef _FPRINTCOLOR_H
+#define _FPRINTCOLOR_H
+
+
 /* ------------------------------------------------------------ */
 #ifdef STM_DEBUGPRINT
 /* ------------------------------------------------------------ */
@@ -40,3 +44,5 @@
 __attribute__((unused))
 static void stm_fatalerror(const char *format, ...)
      __attribute__((format (printf, 1, 2), noreturn));
+
+#endif
diff --git a/c8/stm/gcpage.c b/c8/stm/gcpage.c
--- a/c8/stm/gcpage.c
+++ b/c8/stm/gcpage.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 static struct tree_s *tree_prebuilt_objs = NULL;     /* XXX refactor */
@@ -75,7 +76,7 @@
 
 
     /* uncommon case: need to initialize some more pages */
-    spinlock_acquire(lock_growth_large);
+    stm_spinlock_acquire(lock_growth_large);
 
     char *start = uninitialized_page_start;
     if (addr + size > start) {
@@ -99,7 +100,7 @@
 
     ((struct object_s*)addr)->stm_flags = 0;
 
-    spinlock_release(lock_growth_large);
+    stm_spinlock_release(lock_growth_large);
     return (stm_char*)(addr - stm_object_pages);
 }
 
@@ -188,7 +189,7 @@
     DEBUG_EXPECT_SEGFAULT(true);
     release_all_privatization_locks();
 
-    write_fence();     /* make sure 'nobj' is fully initialized from
+    stm_write_fence();     /* make sure 'nobj' is fully initialized from
                           all threads here */
     return (object_t *)nobj;
 }
@@ -976,9 +977,9 @@
 
     LIST_FREE(marked_objects_to_trace);
 
-    /* weakrefs and execute old light finalizers */
+    /* weakrefs and execute old destructors */
     stm_visit_old_weakrefs();
-    deal_with_old_objects_with_finalizers();
+    deal_with_old_objects_with_destructors();
 
     /* cleanup */
     clean_up_segment_lists();
diff --git a/c8/stm/gcpage.h b/c8/stm/gcpage.h
--- a/c8/stm/gcpage.h
+++ b/c8/stm/gcpage.h
@@ -1,3 +1,7 @@
+#ifndef _STM_GCPAGE_H_
+#define _STM_GCPAGE_H_
+
+#include <stdbool.h>
 
 /* Granularity when grabbing more unused pages: take 20 at a time */
 #define GCPAGE_NUM_PAGES   20
@@ -22,3 +26,9 @@
 static void major_collection_with_mutex(void);
 static bool largemalloc_keep_object_at(char *data);   /* for largemalloc.c */
 static bool smallmalloc_keep_object_at(char *data);   /* for smallmalloc.c */
+
+static inline bool mark_visited_test(object_t *obj);
+static bool is_overflow_obj_safe(struct stm_priv_segment_info_s *pseg, object_t *obj);
+static void mark_visit_possibly_overflow_object(object_t *obj, struct stm_priv_segment_info_s *pseg);
+
+#endif
diff --git a/c8/stm/hash_id.c b/c8/stm/hash_id.c
--- a/c8/stm/hash_id.c
+++ b/c8/stm/hash_id.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 
diff --git a/c8/stm/hashtable.c b/c8/stm/hashtable.c
--- a/c8/stm/hashtable.c
+++ b/c8/stm/hashtable.c
@@ -40,7 +40,7 @@
 
 Inspired by: http://ppl.stanford.edu/papers/podc011-bronson.pdf
 */
-
+#include <stdint.h>
 
 uint32_t stm_hashtable_entry_userdata;
 
@@ -216,7 +216,7 @@
     }
     biggertable->resize_counter = rc;
 
-    write_fence();   /* make sure that 'biggertable' is valid here,
+    stm_write_fence();   /* make sure that 'biggertable' is valid here,
                         and make sure 'table->resize_counter' is updated
                         ('table' must be immutable from now on). */
     VOLATILE_HASHTABLE(hashtable)->table = biggertable;
@@ -278,7 +278,7 @@
        just now.  In both cases, this thread must simply spin loop.
     */
     if (IS_EVEN(rc)) {
-        spin_loop();
+        stm_spin_loop();
         goto restart;
     }
     /* in the other cases, we need to grab the RESIZING_LOCK.
@@ -348,7 +348,7 @@
             hashtable->additions++;
         }
         table->items[i] = entry;
-        write_fence();     /* make sure 'table->items' is written here */
+        stm_write_fence();     /* make sure 'table->items' is written here */
         VOLATILE_TABLE(table)->resize_counter = rc - 6;    /* unlock */
         stm_read((object_t*)entry);
         return entry;
@@ -437,7 +437,7 @@
     table = VOLATILE_HASHTABLE(hashtable)->table;
     rc = VOLATILE_TABLE(table)->resize_counter;
     if (IS_EVEN(rc)) {
-        spin_loop();
+        stm_spin_loop();
         goto restart;
     }
 
diff --git a/c8/stm/largemalloc.c b/c8/stm/largemalloc.c
--- a/c8/stm/largemalloc.c
+++ b/c8/stm/largemalloc.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 /* This contains a lot of inspiration from malloc() in the GNU C Library.
@@ -116,12 +117,12 @@
 
 static void lm_lock(void)
 {
-    spinlock_acquire(lm.lock);
+    stm_spinlock_acquire(lm.lock);
 }
 
 static void lm_unlock(void)
 {
-    spinlock_release(lm.lock);
+    stm_spinlock_release(lm.lock);
 }
 
 
diff --git a/c8/stm/list.c b/c8/stm/list.c
--- a/c8/stm/list.c
+++ b/c8/stm/list.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 
diff --git a/c8/stm/list.h b/c8/stm/list.h
--- a/c8/stm/list.h
+++ b/c8/stm/list.h
@@ -1,5 +1,11 @@
+#ifndef _LIST_H
+#define _LIST_H
+
+
 #include <stdlib.h>
 #include <stdbool.h>
+#include <stdint.h>
+
 
 /************************************************************/
 
@@ -11,13 +17,13 @@
 
 static struct list_s *list_create(void) __attribute__((unused));
 
-static inline void list_free(struct list_s *lst)
+static inline void _list_free(struct list_s *lst)
 {
     free(lst);
 }
 
 #define LIST_CREATE(lst)  ((lst) = list_create())
-#define LIST_FREE(lst)  (list_free(lst), (lst) = NULL)
+#define LIST_FREE(lst)  (_list_free(lst), (lst) = NULL)
 
 
 static struct list_s *_list_grow(struct list_s *, uintptr_t);
@@ -245,3 +251,5 @@
     TREE_FIND(tree, addr, result, return false);
     return true;
 }
+
+#endif
diff --git a/c8/stm/marker.c b/c8/stm/marker.c
--- a/c8/stm/marker.c
+++ b/c8/stm/marker.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 
diff --git a/c8/stm/misc.c b/c8/stm/misc.c
--- a/c8/stm/misc.c
+++ b/c8/stm/misc.c
@@ -1,5 +1,6 @@
 #ifndef _STM_CORE_H_
 # error "must be compiled via stmgc.c"
+# include "core.h"  // silence flymake
 #endif
 
 
diff --git a/c8/stm/nursery.c b/c8/stm/nursery.c
--- a/c8/stm/nursery.c
+++ b/c8/stm/nursery.c
_______________________________________________
pypy-commit mailing list
pypy-commit@python.org
https://mail.python.org/mailman/listinfo/pypy-commit
