Hi all,

Here's a patch to replace the locative table with the same approach we
use for weak pair tracking.  Not much to say, the idea is simple, the
patch a bit bigger (because it rips out the locative table).

There's no manual or NEWS entry because this should not be user-visible
unless code creates lots of garbage locatives (as those would not be
traversed during GC, unlike before).

As I point out in the commit message, I think it would be worthwile for
CHICKEN 6 to change the representation of locatives.  Let me elaborate:

Currently, we store a direct C pointer to the target object's slot
in the locative's first slot.  Because we also need to update said
pointer during GC, we also store the offset (in the second slot) and
do lots of calculations on *every* GC, for *every* locative, to find
the original object, chase its forwarding pointer and then re-apply
the offset.  There's a fourth slot for the type of object (which we
use to determine the offset's "stride" and when dereferencing to
figure out how to "convert" it. Finally, there's a fifth slot which
holds either the object (again!) or #f if it's a weak locative.

Graphically:

   ------------------------------------------------
  | HEADER | PTR | FX offset | FX subtype | OBJ/#f |
   ------------------------------------------------

If we change it such that we *always* store the object
in the first slot, we can make the distinction of weak/strong refs
like we do for pairs: set the C_SPECIALBLOCK_BIT conditionally
when the reference is weak, and clear it if it's strong.

Graphically:

   ---------------------------------------
  | HEADER | OBJ | FX offset | FX subtype |
   ---------------------------------------

This change has two benefits:

- locatives are smaller, so (slightly) less GC pressure if
  lots of these are made.  But more importantly:
- when traversing locatives, we *only* have to process
  weak locatives specially in the GC, and there's no need to
  recalculate the pointer.  This means if a program uses only
  strong locatives, there is *no additional penalty* incurred
  after a garbage collection run.

The only disadvantage is that every access of a locative will
now have to do the calculation to find the base pointer.  This
might slow down locative-ref somewhat.  But I think it's
manageable - we already have a switch statement in locative_ref,
so adding the offset calculation for all the different types of
locatives is trivial and should only add *very* minor overhead,
while the GCs should all be faster.

The change should be relatively minor, but because it's a change
in the representation, and there may be code out there that
relies on the current representation, I think this change is
better left for CHICKEN 6.

Cheers,
Peter
From 8f753a2d031a3c23198c397223f55ce3ae7ec087 Mon Sep 17 00:00:00 2001
From: Peter Bex <pe...@more-magic.net>
Date: Tue, 27 Jun 2023 08:13:59 +0200
Subject: [PATCH] Replace locative table with simpler "weak chain" solution

Instead of keeping track of every locative in a table, we instead use
the same approach as for tracking weak pairs: during GC, as we
encounter live locatives, build up a chain which we traverse when the
GC has completed.  We "recycle" the first slot of the locative when it
is turned into a forwarding pointer for storing the chain pointer.

Unlike weak pairs, we have to traverse both strong *and* weak
locatives, because their pointer slots need to be fixed up.  This
could be improved if we change the representation of locatives to be
object+offset instead of pointer+offset(+object), and have the
C_SPECIALBLOCK_BIT set depending on whether it is weak/strong.  This
would be a fundamental representational change, so this would be
better left for CHICKEN 6.
---
 runtime.c                   | 205 +++++++++++-------------------------
 tests/weak-pointer-test.scm |  75 ++++++++++++-
 2 files changed, 136 insertions(+), 144 deletions(-)

diff --git a/runtime.c b/runtime.c
index 5ff08f3f..0dc05feb 100644
--- a/runtime.c
+++ b/runtime.c
@@ -153,7 +153,6 @@ static C_TLS int timezone;
 #define DEFAULT_HEAP_MIN_FREE          (4 * 1024 * 1024)
 #define HEAP_SHRINK_COUNTS             10
 #define DEFAULT_FORWARDING_TABLE_SIZE  32
-#define DEFAULT_LOCATIVE_TABLE_SIZE    32
 #define DEFAULT_COLLECTIBLES_SIZE      1024
 #define DEFAULT_TRACE_BUFFER_SIZE      16
 #define MIN_TRACE_BUFFER_SIZE          3
@@ -402,8 +401,8 @@ static C_TLS C_word
   **mutation_stack_limit,
   **mutation_stack_top,
   *stack_bottom,
-  *locative_table,
   weak_pair_chain,
+  locative_chain,
   error_location,
   interrupt_hook_symbol,
   current_thread_symbol,
@@ -468,8 +467,6 @@ static C_TLS double
 static C_TLS LF_LIST *lf_list;
 static C_TLS int signal_mapping_table[ NSIG ];
 static C_TLS int
-  locative_table_size,
-  locative_table_count,
   live_finalizer_count,
   allocated_finalizer_count,
   pending_finalizer_count,
@@ -544,8 +541,8 @@ static void C_fcall mark_nested_objects(C_byte 
*heap_scan_top, C_byte *tgt_space
 static void C_fcall mark_live_objects(C_byte *tgt_space_start, C_byte 
**tgt_space_top, C_byte *tgt_space_limit) C_regparm;
 static void C_fcall mark_live_heap_only_objects(C_byte *tgt_space_start, 
C_byte **tgt_space_top, C_byte *tgt_space_limit) C_regparm;
 static C_word C_fcall intern0(C_char *name) C_regparm;
-static void C_fcall update_locative_table(int mode) C_regparm;
 static void C_fcall update_weak_pairs(int mode) C_regparm;
+static void C_fcall update_locatives(int mode) C_regparm;
 static LF_LIST *find_module_handle(C_char *name);
 static void set_profile_timer(C_uword freq);
 static void take_profile_sample();
@@ -759,14 +756,6 @@ int CHICKEN_initialize(int heap, int stack, int symbols, 
void *toplevel)
   *forwarding_table = 0;
   forwarding_table_size = DEFAULT_FORWARDING_TABLE_SIZE;
 
-  /* Initialize locative table: */
-  locative_table = (C_word *)C_malloc(DEFAULT_LOCATIVE_TABLE_SIZE * 
sizeof(C_word));
-   
-  if(locative_table == NULL) return 0;
- 
-  locative_table_size = DEFAULT_LOCATIVE_TABLE_SIZE;
-  locative_table_count = 0;
-
   /* Setup collectibles: */
   collectibles = (C_word **)C_malloc(sizeof(C_word *) * 
DEFAULT_COLLECTIBLES_SIZE);
 
@@ -832,6 +821,7 @@ int CHICKEN_initialize(int heap, int stack, int symbols, 
void *toplevel)
   current_module_handle = NULL;
   callback_continuation_level = 0;
   weak_pair_chain = (C_word)NULL;
+  locative_chain = (C_word)NULL;
   gc_ms = 0;
   if (!random_state_initialized) {
     srand(time(NULL));
@@ -3441,6 +3431,7 @@ C_regparm void C_fcall C_reclaim(void *trampoline, C_word 
c)
   tgt_space_top = &C_fromspace_top;
   tgt_space_limit = C_fromspace_limit;
   weak_pair_chain = (C_word)NULL;
+  locative_chain = (C_word)NULL;
 
   start = C_fromspace_top;
 
@@ -3477,6 +3468,7 @@ C_regparm void C_fcall C_reclaim(void *trampoline, C_word 
c)
     tgt_space_top = &tospace_top;
     tgt_space_limit= tospace_limit;
     weak_pair_chain = (C_word)NULL; /* only chain up weak pairs forwarded into 
tospace */
+    locative_chain = (C_word)NULL;  /* same for locatives */
 
     cell.val = "GC_MAJOR";
     C_debugger(&cell, 0, NULL);
@@ -3503,7 +3495,7 @@ C_regparm void C_fcall C_reclaim(void *trampoline, C_word 
c)
     count = (C_uword)C_fromspace_top - (C_uword)start;
     ++gc_count_1;
     ++gc_count_1_total;
-    update_locative_table(GC_MINOR);
+    update_locatives(GC_MINOR);
     update_weak_pairs(GC_MINOR);
   }
   else {
@@ -3591,7 +3583,7 @@ C_regparm void C_fcall C_reclaim(void *trampoline, C_word 
c)
       }
     }
 
-    update_locative_table(gc_mode);
+    update_locatives(gc_mode);
     update_weak_pairs(gc_mode);
 
     count = (C_uword)tospace_top - (C_uword)tospace_start; // Actual used, < 
heap_size/2
@@ -3678,8 +3670,6 @@ C_regparm void C_fcall C_reclaim(void *trampoline, C_word 
c)
     C_dbg("GC", C_text("   to\t" UWORD_FORMAT_STRING "\t" UWORD_FORMAT_STRING 
"\t" UWORD_FORMAT_STRING" \n"), 
          (C_uword)tospace_start, (C_uword)tospace_top, 
          (C_uword)tospace_limit);
-
-    C_dbg("GC", C_text("%d locatives (from %d)\n"), locative_table_count, 
locative_table_size);
   }
 
   /* GC will have copied any live objects out of scratch space: clear it */
@@ -3908,6 +3898,9 @@ static C_regparm void C_fcall really_mark(C_word *x, 
C_byte *tgt_space_start, C_
   if (h == C_WEAK_PAIR_TAG && !C_immediatep(p2->data[0])) {
     p->data[0] = weak_pair_chain; /* "Recycle" the weak pair's CAR to point to 
prev head */
     weak_pair_chain = (C_word)p;  /* Make this fwd ptr the new head of the 
weak pair chain */
+  } else if (h == C_LOCATIVE_TAG) {
+    p->data[0] = locative_chain; /* "Recycle" the locative pointer field to 
point to prev head */
+    locative_chain = (C_word)p;  /* Make this fwd ptr the new head of the 
locative chain */
   }
 }
 
@@ -3993,6 +3986,7 @@ C_regparm void C_fcall C_rereclaim2(C_uword size, int 
relative_resize)
   new_tospace_limit = new_tospace_start + size;
   start = new_tospace_top;
   weak_pair_chain = (C_word)NULL; /* only chain up weak pairs forwarded into 
new heap */
+  locative_chain = (C_word)NULL;  /* same for locatives */
 
   /* Mark standard live objects in nursery and heap */
   mark_live_objects(new_tospace_start, &new_tospace_top, new_tospace_limit);
@@ -4009,14 +4003,9 @@ C_regparm void C_fcall C_rereclaim2(C_uword size, int 
relative_resize)
     remark(&gcrp->value);
   }
 
-  /* Mark locative table (like finalizers, all objects are kept alive in 
GC_REALLOC): */
-  for(i = 0; i < locative_table_count; ++i)
-    remark(&locative_table[ i ]);
-
-  update_locative_table(GC_REALLOC);
-
   /* Mark nested values in already moved (marked) blocks in breadth-first 
manner: */
   mark_nested_objects(start, new_tospace_start, &new_tospace_top, 
new_tospace_limit);
+  update_locatives(GC_REALLOC);
   update_weak_pairs(GC_REALLOC);
 
   heap_free (heapspace1, heapspace1_size);
@@ -4048,106 +4037,6 @@ C_regparm void C_fcall C_rereclaim2(C_uword size, int 
relative_resize)
 }
 
 
-C_regparm void C_fcall update_locative_table(int mode)
-{
-  int i, hi = 0, invalidated = 0;
-  C_header h;
-  C_word loc, obj, obj2, offset, loc2, ptr;
-  C_uword ptr2;
-
-  for(i = 0; i < locative_table_count; ++i) {
-    loc = locative_table[ i ];
-
-    if(loc != C_SCHEME_UNDEFINED) {
-      h = C_block_header(loc);
-
-      switch(mode) {
-      case GC_MINOR:
-        if(is_fptr(h))         /* forwarded? update l-table entry */
-          loc = locative_table[ i ] = fptr_to_ptr(h);
-        /* otherwise it must have been GC'd (since this is a minor one) */
-        else if(C_in_stackp(loc)) {
-          locative_table[ i ] = C_SCHEME_UNDEFINED;
-          C_set_block_item(loc, 0, 0);
-         ++invalidated;
-          break;
-        }
-
-        /* forwarded. fix up ptr and check pointed-at object for being 
forwarded... */
-        ptr = C_block_item(loc, 0);
-        offset = C_unfix(C_block_item(loc, 1));
-        obj = ptr - offset;
-        h = C_block_header(obj);
-
-        if(is_fptr(h)) {       /* pointed-at object forwarded? update */
-          C_set_block_item(loc, 0, (C_uword)fptr_to_ptr(h) + offset);
-         hi = i + 1;
-       }
-        else if(C_in_stackp(obj)) { /* pointed-at object GC'd, locative is 
invalid */
-          locative_table[ i ] = C_SCHEME_UNDEFINED;
-          C_set_block_item(loc, 0, 0);
-        }
-       else hi = i + 1;
-        
-        break;
-
-      case GC_MAJOR:
-        if(is_fptr(h))         /* forwarded? update l-table entry */
-          loc = locative_table[ i ] = fptr_to_ptr(h);
-        else {                 /* otherwise, throw away */
-          locative_table[ i ] = C_SCHEME_UNDEFINED;
-          C_set_block_item(loc, 0, 0);
-         ++invalidated;
-          break;
-        }
-
-        h = C_block_header(loc);
-        
-        if(is_fptr(h))         /* new instance is forwarded itself? update 
again */
-          loc = locative_table[ i ] = fptr_to_ptr(h);
-
-        ptr = C_block_item(loc, 0); /* fix up ptr */
-        offset = C_unfix(C_block_item(loc, 1));
-        obj = ptr - offset;
-        h = C_block_header(obj);
-
-        if(is_fptr(h)) {       /* pointed-at object has been forwarded? */
-         ptr2 = (C_uword)fptr_to_ptr(h);
-         h = C_block_header(ptr2);
-
-         if(is_fptr(h)) {      /* secondary forwarding check for pointed-at 
object */
-           ptr2 = (C_uword)fptr_to_ptr(h) + offset;
-           C_set_block_item(loc, 0, ptr2);
-         }
-         else C_set_block_item(loc, 0, ptr2 + offset); /* everything's fine, 
fixup pointer */
-
-         hi = i + 1;
-        }
-        else {
-          locative_table[ i ] = C_SCHEME_UNDEFINED; /* pointed-at object is 
dead */
-          C_set_block_item(loc, 0, 0);
-         ++invalidated;
-        }
-        
-        break;
-
-      case GC_REALLOC:
-        ptr = C_block_item(loc, 0); /* just update ptr's pointed-at objects */
-        offset = C_unfix(C_block_item(loc, 1));
-        obj = ptr - offset;
-        remark(&obj);
-        C_set_block_item(loc, 0, obj + offset);        
-        break;
-      }
-    }
-  }
-
-  if(gc_report_flag && invalidated > 0)
-    C_dbg(C_text("GC"), C_text("locative-table entries reclaimed: %d\n"), 
invalidated);
-
-  if(mode != GC_REALLOC) locative_table_count = hi;
-}
-
 /* When a weak pair is encountered by GC, it turns it into a
  * forwarding reference as usual, but then it re-uses the now-defunct
  * pair's CAR field.  It clobbers that field with a plain C pointer to
@@ -4213,6 +4102,56 @@ static C_regparm void C_fcall update_weak_pairs(int mode)
     C_dbg("GC", C_text("%d recoverable weak pairs found\n"), weakn);
 }
 
+/* Same as weak pairs (see above), but for locatives.  Note that this
+ * also includes non-weak locatives, as these point *into* an object,
+ * so the updating of that pointer is not handled by the GC proper
+ * (which only deals with full objects).
+ */
+static C_regparm void C_fcall update_locatives(int mode)
+{
+  int weakn = 0;
+  C_word p, loc, ptr, obj, h, offset;
+
+  for (p = locative_chain; p != (C_word)NULL; p = *((C_word 
*)C_data_pointer(p))) {
+    h = C_block_header(p);
+    assert(is_fptr(h));
+    loc = fptr_to_ptr(h);
+    assert(!is_fptr(C_block_header(loc)));
+
+    /* The locative object itself should be live */
+    assert((mode == GC_MINOR && !C_in_stackp(loc)) ||
+           (mode == GC_MAJOR && !C_in_stackp(loc) && !C_in_fromspacep(loc)) ||
+           (mode == GC_REALLOC && !C_in_stackp(loc) && !C_in_heapp(loc))); /* 
NB: *old* heap! */
+
+    ptr = C_block_item(loc, 0); /* fix up ptr */
+    offset = C_unfix(C_block_item(loc, 1));
+    obj = ptr - offset;
+
+    h = C_block_header(obj);
+    while (is_fptr(h)) {
+      obj = fptr_to_ptr(h);
+      h = C_block_header(obj);
+    }
+
+    /* If the object is unreferenced by anyone else, it wasn't moved by GC.  
So drop it: */
+    if((mode == GC_MINOR && C_in_stackp(obj)) ||
+       (mode == GC_MAJOR && (C_in_stackp(obj) || C_in_fromspacep(obj))) ||
+       (mode == GC_REALLOC && (C_in_stackp(obj) || C_in_heapp(obj)))) { /* NB: 
*old* heap! */
+
+      /* NOTE: This does *not* use BROKEN_WEAK_POINTER.  This slot
+       * holds an unaligned raw C pointer, not a Scheme object */
+      C_set_block_item(loc, 0, 0);
+      ++weakn;
+    } else {
+      /* Might have moved, re-set the object to the target value */
+      C_set_block_item(loc, 0, obj + offset);
+    }
+  }
+  locative_chain = (C_word)NULL;
+  if(gc_report_flag && weakn)
+    C_dbg("GC", C_text("%d recoverable weak locatives found\n"), weakn);
+}
+
 
 void handle_interrupt(void *trampoline)
 {
@@ -12048,26 +11987,6 @@ C_regparm C_word C_fcall C_a_i_make_locative(C_word 
**a, int c, C_word type, C_w
   loc[ 3 ] = type;
   loc[ 4 ] = C_truep(weak) ? C_SCHEME_FALSE : object;
 
-  for(i = 0; i < locative_table_count; ++i)
-    if(locative_table[ i ] == C_SCHEME_UNDEFINED) {
-      locative_table[ i ] = (C_word)loc;
-      return (C_word)loc;
-    }
-
-  if(locative_table_count >= locative_table_size) {
-    if(debug_mode == 2)
-      C_dbg(C_text("debug"), C_text("resizing locative table from %d to %d 
(count is %d)\n"), 
-           locative_table_size, locative_table_size * 2, locative_table_count);
-
-    locative_table = (C_word *)C_realloc(locative_table, locative_table_size * 
2 * sizeof(C_word));
-
-    if(locative_table == NULL) 
-      panic(C_text("out of memory - cannot resize locative table"));
-
-    locative_table_size *= 2;
-  }
-
-  locative_table[ locative_table_count++ ] = (C_word)loc;
   return (C_word)loc;
 }
 
diff --git a/tests/weak-pointer-test.scm b/tests/weak-pointer-test.scm
index e53f2ade..d13ccfe2 100644
--- a/tests/weak-pointer-test.scm
+++ b/tests/weak-pointer-test.scm
@@ -1,6 +1,6 @@
 ;; weak-pointer-test.scm
 
-(import (chicken gc) (chicken port))
+(import (chicken gc) (chicken port) (chicken locative))
 
 (include "test.scm")
 
@@ -131,4 +131,77 @@
     (test-assert "car of third weak cons is not a broken weak pair" (not 
(bwp-object? (car ref-c))))
     (test-assert "cdr of third weak cons is not a broken weak pair" (not 
(bwp-object? (cdr ref-c))))))
 
+
+(test-group "Testing that strong locatives get their object updated"
+  (gc #t) ; Improve chances we don't get a minor GC in between
+  (let* ((not-held-onto-value (vector 42))
+        (held-onto-vector (vector 'this-one-stays))
+        (vec-0 (vector 0))
+        (vec-1 (vector 1))
+        (vec-2 (vector 2))
+
+        (nested-not-held-onto-value (vector vec-0 vec-1 vec-2))
+        (nested-held-onto-value (vector (vector 'x) (vector 'y) (vector 'z)))
+        (vec-ohai (vector 'ohai))
+        (vec-fubar (vector 'fubar))
+
+        (loc1 (make-locative not-held-onto-value 0))
+        (loc2 (make-locative (vector 'ohai 'fubar) 1))
+        (loc3 (make-locative held-onto-vector 0))
+
+        (loc4 (make-locative nested-not-held-onto-value 1))
+        (loc5 (make-locative (vector vec-ohai vec-fubar) 1))
+        (loc6 (make-locative nested-held-onto-value 1)))
+
+    ;; break other references to the values
+    (set! not-held-onto-value #f)
+    (set! nested-not-held-onto-value #f)
+
+    (gc)
+
+    (test-equal "First locative is updated" (locative-ref loc1) 42)
+    (test-equal "Second locative is updated" (locative-ref loc2) 'fubar)
+    (test-equal "Third locative is updated" (locative-ref loc3) 
'this-one-stays)
+
+    (test-equal "Fourth locative is updated" (locative-ref loc4) vec-1)
+    (test-equal "Fifth locative is updated" (locative-ref loc5) vec-fubar)
+    (test-equal "Sixth locative is updated" (locative-ref loc6) (vector-ref 
nested-held-onto-value 1))))
+
+
+(test-group "Testing that weak locatives get their object reclaimed"
+  (gc #t) ; Improve chances we don't get a minor GC in between
+  (let* ((not-held-onto-value (vector 42))
+        (held-onto-vector (vector 'this-one-stays))
+        (vec-0 (vector 0))
+        (vec-1 (vector 1))
+        (vec-2 (vector 2))
+
+        (nested-not-held-onto-value (vector vec-0 vec-1 vec-2))
+        (nested-held-onto-value (vector (vector 'x) (vector 'y) (vector 'z)))
+        (vec-ohai (vector 'ohai))
+        (vec-fubar (vector 'fubar))
+
+        (loc1 (make-weak-locative not-held-onto-value 0))
+        (loc2 (make-weak-locative (vector 'ohai 'fubar) 1))
+        (loc3 (make-weak-locative held-onto-vector 0))
+
+        (loc4 (make-weak-locative nested-not-held-onto-value 1))
+        (loc5 (make-weak-locative (vector vec-ohai vec-fubar) 1))
+        (loc6 (make-weak-locative nested-held-onto-value 1)))
+
+    ;; break other references to the values
+    (set! not-held-onto-value #f)
+    (set! nested-not-held-onto-value #f)
+
+    (gc)
+
+    (test-error "First locative is reclaimed" (locative-ref loc1))
+    (test-error "Second locative is reclaimed" (locative-ref loc2))
+    ;; NOTE: It seems we have to go "through" the original vector to ensure 
reference is kept
+    (test-equal "Third locative is NOT reclaimed" (locative-ref loc3) 
(vector-ref held-onto-vector 0))
+
+    (test-error "Fourth locative is reclaimed" (locative-ref loc4))
+    (test-error "Fifth locative is reclaimed" (locative-ref loc5))
+    (test-equal "Sixth locative is NOT reclaimed" (locative-ref loc6) 
(vector-ref nested-held-onto-value 1))))
+
 (test-exit)
-- 
2.40.1

Attachment: signature.asc
Description: PGP signature

Reply via email to