On Sat, Oct 14, 2023 at 3:05 PM Marc Feeley (via srfi-246 list) < [email protected]> wrote:
> I’m not particularly fond of the proposed API. Abstractly a guardian is an
> interface to the garbage collector. At any point in time the guardian can
> be queried to determine if some object registered with the guardian has
> become “unreachable”

That's not actually true. If I have an object x, it is by definition not (strongly) unreachable. So there is neither a way nor a need to interrogate a guardian to know whether a particular object is reachable.

> (side note: I prefer the term “not strongly reachable” which distinguishes
> the kind of reachability we are talking about, because it is still “weakly
> reachable” through the guardian).

Weakness in the sense of weak pointers is orthogonal to guardians, except that if an object is in a guardian, then there can never be a reason to break a weak pointer to it. So I prefer to reserve "weakly reachable" to mean "reachable through a weak pointer".

> The main problems with guardians are that they:
>
> 1) Use a custom protocol for iterating over the currently not strongly
> reachable registered objects. This protocol is yet another way to iterate
> over a sequence (there are already lists, generators, ports, etc). It
> would be nice to use one of the existing protocols so that guardians can
> reuse familiar iteration patterns.

I (following Chez) am careful not to assert that there is ordering inside a guardian. The protocol simply returns some available unreachable object. In practice the unreachable objects are typically kept in a queue a la SRFI 117, but that should not determine the API. If we decide to stick with generators, I would be happy to change the SRFI to return an eof object rather than #f, since there is no more reason to put an eof object in a guardian than there is to put #f there. That would mean it was using the lightweight generator protocol.

> 2) There’s a scalability issue because guardians are passive.
> In order to implement a finalization mechanism there’s a need for some
> thread of execution to “poll” the guardian once in a while to process the
> newly not strongly reachable objects. An alternative is to use a mechanism
> (not proposed by this SRFI but available with Chez Scheme) to be notified
> of a garbage collection in order to check the new state of the guardian(s)
> and proceed with the required finalizations. Note that this still requires
> polling, but only when a GC notification is received. This does not scale
> to large numbers of guardians.
>
> A better API would be to view a guardian as a stream of objects which
> represents the order in which the registered objects have been discovered
> to be not strongly reachable by the GC.

I don't understand what semantics this sequence is supposed to have.

> For simplicity let's just say a guardian is a port. Then a thread could be
> given the responsibility of reading the next object from the guardian, in a
> loop, to process the required finalization of these objects. Essentially:
>
> (define g (make-guardian))
>
> (thread-start!  ;; start a “finalization” thread
>  (make-thread
>   (lambda ()
>     (let loop ()
>       ;; get next not strongly reachable object from guardian
>       ;; (block if there is none)
>       (let ((obj (read g)))
>         (finalize! obj)  ;; finalize it
>         (loop))))))
>
> The important point here is that the garbage collector uses the guardian
> as a mechanism to notify the finalization thread, that their operation is
> asynchronous, and that polling is no longer needed.
>
> To me this is a cleaner API because it corresponds with the reality that
> the GC and main programs are separate threads (indeed this is consistent
> with the “collector process” and “mutator process” vocabulary used when
> talking about garbage collection algorithms).

They aren't always, nor is it always necessary to finalize so eagerly.
The Abstract Example is a way to use guardians that doesn't depend on the GC, provided you can accept relaxed finalization. As pointed out in SRFI 124, there is *never* any guarantee that GC runs at all.

> A variation on this API would be to add an operation on the guardian that
> blocks the calling thread until there is at least one available not
> strongly reachable object for that guardian. Unfortunately the proposed
> representation of guardians as procedures does not make it easy or elegant
> to do this. For this reason guardians should be their own type such that
> (make-guardian) returns this type and the operations on a guardian are done
> with specific procedures:
>
> (guardian-register guardian obj [rep])
>     ;; equivalent to the proposed (guardian obj [rep]) operation
> (guardian-unregister guardian)
>     ;; equivalent to the proposed (unregister-guardian guardian)
> (guardian-next guardian)
>     ;; equivalent to the proposed (guardian) operation
> (guardian-wait guardian)
>     ;; new operation that blocks the calling thread until there is
>     ;; at least one not strongly reachable object

I'm not sure why Chez does not do it this way: it seems intuitively obvious. I just sent Dybvig an email about it. Note however that subtypes of procedure can support an external API.

> This makes it easier to add operations if needed in the future.
>
> This set of operations allows a program to use a polling approach if
> that’s appropriate (for example low number of guardians), or one based on
> notification (that blocks the finalization thread when it has nothing to
> do).

Adding guardian-wait makes sense to me rather than bringing in the whole Chez events API.
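
For illustration only, here is a minimal sketch of how the typed interface quoted above might be layered over Chez-style procedural guardians, with guardian-next returning an eof object instead of #f as discussed earlier. It assumes Chez Scheme, where (make-guardian) returns a procedure. The names typed-guardian, make-typed-guardian and guardian-drain! are hypothetical, chosen to avoid clashing with Chez's own make-guardian. guardian-unregister is left out for brevity, and guardian-wait is left out because it needs the collector to wake a blocked thread, which cannot be built from the polling interface alone.

```scheme
;; Sketch only, assuming Chez Scheme: (make-guardian) returns a procedure g
;; such that (g obj) or (g obj rep) registers obj, and (g) returns some
;; registered object found inaccessible by the collector, or #f if none.
;; All names defined below are hypothetical, not part of Chez or the SRFI.

;; A distinct record type wrapping the underlying guardian procedure.
(define-record-type (typed-guardian %make-typed-guardian typed-guardian?)
  (fields (immutable proc typed-guardian-proc)))

(define (make-typed-guardian)
  (%make-typed-guardian (make-guardian)))

;; Register obj; when rep is given, rep is what guardian-next hands back.
(define guardian-register
  (case-lambda
    ((g obj) ((typed-guardian-proc g) obj))
    ((g obj rep) ((typed-guardian-proc g) obj rep))))

;; Return some available object, or an eof object when there is none.
;; No ordering among the returned objects is promised.
(define (guardian-next g)
  (let ((obj ((typed-guardian-proc g))))
    (if obj obj (eof-object))))

;; Polling-style drain: apply finalize! to everything currently available
;; and return how many objects were processed.
(define (guardian-drain! g finalize!)
  (let loop ((n 0))
    (let ((obj (guardian-next g)))
      (if (eof-object? obj)
          n
          (begin (finalize! obj) (loop (+ n 1)))))))
```

Because guardian-next signals exhaustion with an eof object, (lambda () (guardian-next g)) behaves like a lightweight generator, though one that can yield more objects after a later collection. A program that accepts relaxed finalization can call guardian-drain! at convenient points without depending on when, or whether, the collector runs.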
