On 22 Sep 2024, at 12:12, Marc Nieper-Wißkirchen <marc.nie...@gmail.com> wrote:

> Why would you ever want to pass *any* object into an ephemeron or guardian?

I have already explained this.

Here is a real example, from a JavaScript library I was once working on that 
wrapped an Emscripten-compiled C regular expression library, because the 
regular expression library had features I needed which the JS builtin regexps 
don’t have. You can imagine a similar situation in a Scheme implementation with 
an FFI.

JS strings had to be transcoded into UTF-whatever and put into the Emscripten 
memory space. Since users will likely often want to run several regular 
expression searches over the same string (e.g. when lexing), it makes sense to 
do this once when encountering an input string for the first time and not 
repeat the process every time. One can take advantage of the fact that strings 
in JS are immutable and maintain a weak mapping from strings to Emscripten 
pointers. When the JS string is released by the garbage collector, some kind of 
finalizer calls the C free() function so that the transcoded copy also 
disappears from the Emscripten memory space; otherwise there’s a memory leak.

(At the time, JavaScript’s weak referencing/finalization options were not up to 
this, because its WeakMap and WeakSet are not iterable so as not to expose too 
much non-determinism. Since Emscripten got popular and this limitation obviously 
got in the way of applications like this one, the language has gained WeakRef 
and FinalizationRegistry for this kind of cleanup. But at the time I had to give 
up on this idea.)
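
For illustration, here is a minimal sketch of that caching scheme using the 
WeakMap and FinalizationRegistry that JavaScript has today. It is not the code 
from the library: fakeMalloc, fakeFree, heap and pointerFor are hypothetical 
names, and the heap is a plain Map standing in for the Emscripten memory space. 
It also assumes the input arrives as a wrapper object with a text field, because 
a primitive JS string cannot be a WeakMap key or a FinalizationRegistry target.

```javascript
// Hypothetical stand-in for the Emscripten memory space: pointer -> bytes.
const heap = new Map();
let nextPtr = 8;

// Hypothetical malloc: stores the bytes and returns a fresh "pointer".
function fakeMalloc(bytes) {
  const ptr = nextPtr;
  nextPtr += bytes.length + 1;
  heap.set(ptr, bytes);
  return ptr;
}

// Hypothetical free: releases the bytes behind a "pointer".
function fakeFree(ptr) {
  heap.delete(ptr);
}

// The callback runs some time after a registered key has been collected;
// when exactly (or whether it runs before exit) is up to the engine.
const registry = new FinalizationRegistry((ptr) => fakeFree(ptr));

// Weak mapping from input objects to pointers in the simulated C heap.
const pointers = new WeakMap();

// `input` is assumed to be an object of the form { text: "..." }.
function pointerFor(input) {
  let ptr = pointers.get(input);
  if (ptr === undefined) {
    // First sighting: transcode to UTF-8 once and remember the pointer.
    ptr = fakeMalloc(new TextEncoder().encode(input.text));
    pointers.set(input, ptr);
    registry.register(input, ptr);
  }
  return ptr;
}
```

Calling pointerFor twice with the same wrapper transcodes only once. The copy 
may stay in the heap for a while after the wrapper is gone, since the engine 
decides when the finalizer runs.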

In the Scheme parallel version of this, strings are mutable, so I have two 
options: 1. pretend they’re not mutable and make the behaviour undefined if 
someone mutates a string which has been used with the C regexp library; 2. add 
an extra layer of indirection and intern all the strings with string->symbol, 
and put those in my weak/guardian mapping instead.

Whichever I choose, both options potentially break with a blanket 
implementation-dependent ban on ephemeron/guardian keys without location. The 
empty string does not necessarily have location, so I would have to special-case 
that for portable code. Literal strings which are small enough to fit in a 
uintptr with a tag might also not have location (though I don’t know of any 
current implementation that does this). But this is, as I mentioned in my 
previous mail, already the case for symbols with short names in some Scheme 
implementations, so the string->symbol strategy would require even more care.


Daphne
