Re: Shortcomings of the API

John Cowan Sat, 14 Oct 2023 14:14:42 -0700

On Sat, Oct 14, 2023 at 3:05 PM Marc Feeley (via srfi-246 list) <
[email protected]> wrote:


> I’m not particularly fond of the proposed API.  Abstractly a guardian is an
> interface to the garbage collector.  At any point in time the guardian can
> be queried to determine if some object registered with the guardian has
> become “unreachable”


That's not actually true.  If I have an object x, it is by definition
(strongly) reachable.  So there is neither a way nor a need to interrogate
a guardian to know if a particular object is reachable.

> (side note: I prefer the term “not strongly reachable” which distinguishes
> the kind of reachability we are talking about, because it is still “weakly
> reachable” through the guardian).
>

Weakness in the sense of weak pointers is orthogonal to guardians, except
that if an object is in a guardian, then there can never be a reason to
break a weak pointer to it.  So I prefer to reserve "weakly reachable" to
mean "reachable through a weak pointer".

>
> The main problems with guardians are that they:
>
> 1) Use a custom protocol for iterating over the currently not strongly
> reachable registered objects.  This protocol is yet another way to iterate
> over a sequence (there are already lists, generators, ports, etc).  It
> would be nice to use one of the existing protocols so that guardians can
> reuse familiar iteration patterns.
>

I (following Chez) am careful not to assert that there is ordering inside a
guardian.  The protocol simply returns some available unreachable object.
In practice the unreachable objects are typically kept in a queue a la SRFI
117, but that should not determine the API.

If we decide to stick with generators, I would be happy to change the SRFI
to return an eof object rather than #f, since there is no reason to put an
eof object in a guardian any more than there is reason to put #f there.
That would mean it was using the lightweight generator protocol.
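
As a minimal sketch of that change, assuming R7RS and a guardian that follows
the Chez convention of returning #f when nothing is available, an adapter can
map #f to an eof object.  The names make-fake-guardian and
guardian->generator are hypothetical and are not part of the SRFI or of Chez;
the fake guardian is only a stand-in so the example runs without a collector.

```scheme
(import (scheme base))

;; Stand-in for a real guardian, for illustration only: it hands back the
;; given objects one at a time and then #f, mimicking the Chez convention.
;; A real guardian is filled by the collector, not from a list, and makes
;; no promise about ordering.
(define (make-fake-guardian objs)
  (lambda ()
    (if (null? objs)
        #f
        (let ((obj (car objs)))
          (set! objs (cdr objs))
          obj))))

;; Hypothetical adapter: wrap a #f-returning guardian as a generator that
;; returns an eof object when nothing is currently available.
(define (guardian->generator g)
  (lambda ()
    (let ((obj (g)))
      (if obj obj (eof-object)))))

(define gen (guardian->generator (make-fake-guardian '(a b))))
```

Calling (gen) three times yields a, then b, then an eof object.  Unlike an
ordinary generator, a guardian that returns eof may later have more objects
available, so eof here only means "nothing right now".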

> 2) There’s a scalability issue because guardians are passive.  In order to
> implement a finalization mechanism there’s a need for some thread of
> execution to “poll” the guardian once in a while to process the newly not
> strongly reachable objects.  An alternative is to use a mechanism (not
> proposed by this SRFI but available with Chez Scheme) to be notified of a
> garbage collection in order to check the new state of the guardian(s) and
> proceed with the required finalizations.  Note that this still requires
> polling, but only when a GC notification is received.  This does not scale
> to large numbers of guardians.
>
> A better API would be to view a guardian as a stream of objects which
> represents the order in which the registered objects have been discovered
> to be not strongly reachable by the GC.


I don't understand what semantics this sequence is supposed to have.

> For simplicity let's just say a guardian is a port.  Then a thread could be
> given the responsibility of reading the next object from the guardian, in a
> loop, to process the required finalization of these objects.  Essentially:
>
>   (define g (make-guardian))
>
>   (thread-start! ;; start a “finalization” thread
>    (make-thread
>     (lambda ()
>       (let loop ()
>         ;; get next not strongly reachable object from guardian
>         ;; (block if there is none)
>         (let ((obj (read g)))
>           (finalize! obj)     ;; finalize it
>           (loop))))))
>
> The important point here is that the garbage collector uses the guardian
> as a mechanism to notify the finalization thread, that their operation is
> asynchronous, and that polling is no longer needed.
>
> To me this is a cleaner API because it corresponds with the reality that
> the GC and main programs are separate threads (indeed this is consistent
> with the “collector process” and “mutator process” vocabulary used when
> talking about garbage collection algorithms).
>

They aren't always, nor is it always necessary to finalize so eagerly.  The
Abstract Example is a way to use guardians that doesn't depend on the GC,
provided you can accept relaxed finalization.  As pointed out in SRFI 124,
there is *never* any guarantee that GC runs at all.
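
To illustrate the relaxed, polling style of use, here is a minimal R7RS
sketch in which the program drains a guardian whenever it finds that
convenient.  It assumes the Chez convention that (g) returns #f when nothing
is available; drain-guardian! and make-fake-guardian are hypothetical names,
and the fake guardian only stands in for one that a collector would fill.

```scheme
(import (scheme base))

;; Stand-in for a real guardian, for illustration only: it hands back the
;; given objects one at a time and then #f.
(define (make-fake-guardian objs)
  (lambda ()
    (if (null? objs)
        #f
        (let ((obj (car objs)))
          (set! objs (cdr objs))
          obj))))

;; Hypothetical helper: finalize every object the guardian currently has
;; available and return how many were finalized.  It never blocks.
(define (drain-guardian! g finalize!)
  (let loop ((n 0))
    (let ((obj (g)))
      (if obj
          (begin (finalize! obj) (loop (+ n 1)))
          n))))

;; Record what was finalized so the effect is observable.
(define finalized '())
(define count
  (drain-guardian! (make-fake-guardian '(a b c))
                   (lambda (obj) (set! finalized (cons obj finalized)))))
```

A program could call drain-guardian! at allocation points or between units
of work; nothing here depends on when, or whether, the collector runs.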


> A variation on this API would be to add an operation on the guardian that
> blocks the calling thread until there is at least one available not
> strongly reachable object for that guardian.  Unfortunately the proposed
> representation of guardians as procedures does not make it easy or elegant
> to do this.  For this reason guardians should be their own type such that
> (make-guardian) returns this type and the operations on a guardian are done
> with specific procedures:
>
> (guardian-register guardian obj [rep])
>   ;; equivalent to the proposed (guardian obj [rep]) operation
> (guardian-unregister guardian)
>   ;; equivalent to the proposed (unregister-guardian guardian)
> (guardian-next guardian)
>   ;; equivalent to the proposed (guardian) operation
> (guardian-wait guardian)
>   ;; new operation that blocks the calling thread until there is at
>   ;; least one not strongly reachable object
>

I'm not sure why Chez does not do it this way: it seems intuitively
obvious.  I just sent Dybvig an email about it.  Note however that subtypes
of procedure can support an external API.
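
As a rough sketch of the distinct-type approach, assuming R7RS records, a
procedure-style guardian can be wrapped in a record and driven through named
procedures.  wrap-guardian and make-fake-guardian are hypothetical names; the
optional rep argument, guardian-unregister, and guardian-wait are omitted
because they need support from the implementation.

```scheme
(import (scheme base))

;; Stand-in for a procedure-style guardian, for illustration only: every
;; registered object is treated as immediately available, whereas a real
;; guardian waits until the collector finds it not strongly reachable.
(define (make-fake-guardian)
  (let ((pending '()))
    (lambda args
      (cond ((null? args)               ; (g): retrieve, or #f if none
             (if (null? pending)
                 #f
                 (let ((obj (car pending)))
                   (set! pending (cdr pending))
                   obj)))
            (else                       ; (g obj): register
             (set! pending (cons (car args) pending)))))))

;; Hypothetical distinct guardian type wrapping the procedure.
(define-record-type <guardian>
  (wrap-guardian proc)
  guardian?
  (proc guardian-proc))

;; Named operations delegating to the wrapped procedure.
(define (guardian-register g obj) ((guardian-proc g) obj))
(define (guardian-next g) ((guardian-proc g)))

(define g (wrap-guardian (make-fake-guardian)))
(guardian-register g 'resource)
```

With this shape, (guardian? g) distinguishes guardians from other
procedures, and further operations can be added without changing how the
guardian itself is called.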

>
> This makes it easier to add operations if needed in the future.
>
> This set of operations allows a program to use a polling approach if
> that’s appropriate (for example low number of guardians), or one based on
> notification (that blocks the finalization thread when it has nothing to
> do).
>

Adding guardian-wait makes sense to me rather than bringing in the whole
Chez events API.
