Mattias Engdegård <mattias.engdeg...@gmail.com> writes:

> On 29 Aug 2025, at 12:37, Daniel Mendler <m...@daniel-mendler.de> wrote:
>
>>> Exactly, and this is not an uncommon bug in string-mutating code: `aset` on
>>> unibyte strings 'worked' for ASCII and Unicode above 255, but not for
>>> 128..255.
>>> Thus the org-habit code was basically always broken in this sense. The
>>> recent mutation reform is an attempt to straighten things up.
>>
>> Well, but it worked before with your string resizing code?
>
> No, you always got the \257 raw byte instead of an actual `·`.
> (Unless, maybe, your buffer were encoded in latin-1.)

Well, in the agenda buffer the habit graph looked correct when I used
`·`, but only before the string resizing. I haven't configured Latin-1.
It should be all Unicode. I am not sure what was going on.

>> According to my understanding, one can still define sensible indices and
>> access functions. Either use code points or graphemes as units. But access
>> won't be O(1) if the string is stored in bytes as the underlying unit.
>> Accessing UTF-8 strings via byte indices indeed makes no sense, like one
>> might do in C when mutating a char[] array.
>
> Actually byte indexing would make sense but not for accessing
> individual bytes. Ideally the index wouldn't be a plain number but a
> distinct Lisp type. Such indices could be obtained from iterating,
> searching and pattern matching, and used to extract substrings or
> individual code points.

Sure, an opaque iterator type is another reasonable alternative, but
maybe more so if a language implementation is started from scratch. It
might not integrate so well into Elisp.

> Whether it's a sufficient improvement to merit a parallel string API
> in addition to the position-based one is a different matter.
>
>> I think there could be some potential to introduce frozen (immutable)
>> objects, in the light of potential garbage collector optimization.
>
> Yes, it's been discussed. Specifically making some strings immutable
> would indeed be useful but there are quite a few technicalities here.

What about other objects? Do you see runtime benefits there, besides
preventing bugs, if we can enforce immutability?

>> Frozen symbol names would resolve the crash issue. Ruby went such a
>> route with its string literals - they started out mutable, became
>> optionally frozen via a magic frozen_string_literal comment (similar to
>> our lexical-binding cookie) and are frozen by default in more recent
>> language versions.
>
> Actually string literals have long been considered de-facto immutable
> in Elisp although this isn't actively enforced. The byte compiler
> warns for some simple cases.

Personally I have never used string mutation in Elisp, so I can only
agree. :)

Daniel