Am Fr., 15. Aug. 2025 um 20:32 Uhr schrieb Daphne Preston-Kendal <d...@nonceword.org>: > > Thank you! > > I am working on a revised and consolidated comparator spec which supports > SRFI 254 by incorporating the notion of an ‘unstable hash function’.
That's great! Consider making ordered and hash comparators disjoint types; practice has shown that a comparator that may possess an ordering function or a hash function is not very useful. Also, please consider making a comparator a two- or three-value type instead of a single-value record type. This avoids needless allocation. With a set of syntactic forms, this can be as convenient as the old compound type. > Here are some more practical questions about how this situation might work – > forgive the somewhat inchoate train of thought: > > > If the pair itself is added to the transport cell guardian, will the guardian > notify if its car or cdr only moved, but not the pair itself? > Presumably with a typical generational collector, yes – as long as the car > and cdr were not set by modification, because then they may be older than the > pair itself, and therefore they might move without the pair itself moving? Yes, if the pair is younger than the car or cdr, every move of the car or cdr after the pair creation will mean that the GC notifies the transport cell guardian about the pair. If the car or cdr is modified so that it becomes younger than the pair, a generational GC will have to move the pair into the younger generation (because it has to update it when the car or cdr moves). This is one reason why mutation is costly. This handling is independent of SRFI 254, but it guarantees that the pair will never be older than its contents. Although this particular scenario works, modifying a key that was inserted in any hash table should be considered UB. Marc > > The implicit restriction would then be that in general, that a transport cell > guardian has to notify whenever an object *or any object it recursively > points to* is changed, regardless of how the object got there. > > How much of a constraint is that (the asterisk-stropped part) actually, on a > real garbage collector implementation? > Is it likely to cause pointless re-hashing or only useful re-hashing in a > data structure? > > The alternative is that an ‘unstable hash function’ gets a transport guardian > as its second argument, and has to add any objects it needs the current > location of to the guardian itself, maybe? Hmm. > > > Daphne > > > On 15 Aug 2025, at 17:38, Marc Nieper-Wißkirchen <marc.nie...@gmail.com> > > wrote: > > > > Thanks for the question! > > > > To make it concrete, an example could be a hash table where the keys > > are pairs of objects such that two pairs are equivalent if and only if > > their car and cdr are the same (in the sense of eq?), respectively. > > Your hash function would be > > > > (define (current-pair-hash p) > > (combine (current-hash (car p)) (current-hash (cdr p)))) > > > > where "combine" is some integer function. > > > > A hash table implementation that wants to insert such a key would > > first place it in a transport cell guarding, which yields a transport > > cell holding that key. Then the hash of the key would be calculated by > > "current-pair-hash" and the transport cell is put in the corresponding > > bucket. When "current-pair-hash" returns, the hash may already be > > invalid - either because both car and cdr have been moved or because > > only one (as in your question) has been moved by the GC. In any case, > > the main point is that such an invalidation is noted by the transport > > guardian because the transport cell is older than any relevant call to > > "current-hash" here; when the car or cdr (or both) have been moved, > > the transport cell must have been touched as well by the GC. > > > > When the key is later looked up in the hash table, its hash value is > > again calculated by "current-pair-hash". If a call can be found with > > the correct key in the bucket corresponding to the calculated hash, > > everything is fine. Otherwise, the transport guardian will be asked > > for transport cells that may have to be rehashed. It reinserts them at > > the right place and repeats the lookup process until either the key is > > found or no more transport cells are in the guardian. > > > > In other words, your scenario doesn't lead to a particular raise > > condition. The algorithm works as long as the probability of the > > garbage collector not moving a finite number of objects within a > > bounded number of instructions ("one loop of the lookup algorithm") is > > non-zero. > > > > Am Fr., 15. Aug. 2025 um 11:00 Uhr schrieb Daphne Preston-Kendal > > <d...@nonceword.org>: > >> > >> Is there a race condition if I use current-hash to find the hashes of two > >> different objects, combine those hashes into a single hash value, but the > >> garbage collector runs between the first and second invocation of > >> current-hash so the locations of the two reflect different compactions of > >> the heap? > >> > >> > >> Daphne > >> >