Hello!

Mike Gran <spk...@yahoo.com> writes:

> Hi.  I know there has been a lot of talk about wide characters and
> Unicode over the years.  I'd like to see it happen because how they are
> implemented will determine the future of a couple of my side-projects.
> I could pitch in, if you needed some help.

Indeed, it looks like you have some experience with GuCu!  ;-)

I agree it would be really nice to have Unicode support, but I'm not
aware of any "plan", so please go ahead!  :-)

A few considerations regarding the inevitable debate about the internal
string representation:

  1. IMO it'd be nice to have ASCII strings special-cased so that they
     are always encoded in ASCII.  This would allow for memory savings
     since, e.g., most symbols are expected to contain only ASCII
     characters.  It might also simplify interaction with C in certain
     cases; for instance, it would make it easy to have statically
     initialized ASCII Scheme strings [0].

  2. O(1) `string-{ref,set!}' is somewhat mandated by parts of SRFI-13.
     For instance, `substring' takes indices as parameters,
     `string-index' returns an index, etc.  (John Cowan once argued
     that an abstract type to represent the position would remove this
     limitation [1], but the fact is that we have to live with
     SRFI-13).

  3. GLib et al. like UTF-8, and it'd be nice to minimize the overhead
     when interfacing with these libs (e.g., by avoiding translations
     from one string representation to another).

  4. It might be nice to be friendly to `wchar_t' and friends.

Interestingly, some of these things are contradictory.  Will Clinger has
a good summary of a range of possible implementations:

  https://trac.ccs.neu.edu/trac/larceny/wiki/StringRepresentations

Thanks,
Ludo'.

[0] http://thread.gmane.org/gmane.lisp.guile.devel/7998
[1] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-April/002252.html
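To make considerations 1 and 2 a bit more concrete, here is a rough C
sketch of one possible layout.  It is only an illustration, not how
Guile stores strings: every name in it (`sketch_string',
`sketch_string_ref', etc.) is made up for this message.  The idea is
that a string holds either one byte per character (ASCII only) or one
32-bit code point per character.  Both forms are fixed-width, so
`string-ref' stays O(1); `string-set!' is O(1) too, except for a
one-time O(n) widening the first time a non-ASCII character is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout (an assumption, not Guile's actual one): either
   one byte per character for ASCII-only contents, or one 32-bit code
   point per character otherwise.  */
typedef struct
{
  size_t length;              /* number of characters */
  int is_wide;                /* 0: `narrow' is valid; 1: `wide' is valid */
  union
  {
    unsigned char *narrow;    /* ASCII, one byte per character */
    uint32_t *wide;           /* one code point per character */
  } chars;
} sketch_string;

/* Allocate an ASCII string of LENGTH NUL characters; NULL on failure.  */
sketch_string *
sketch_make_string (size_t length)
{
  sketch_string *s = malloc (sizeof *s);
  if (s == NULL)
    return NULL;
  s->length = length;
  s->is_wide = 0;
  s->chars.narrow = calloc (length ? length : 1, 1);
  if (s->chars.narrow == NULL)
    {
      free (s);
      return NULL;
    }
  return s;
}

/* O(1) in both representations.  */
uint32_t
sketch_string_ref (const sketch_string *s, size_t i)
{
  assert (i < s->length);
  return s->is_wide ? s->chars.wide[i] : s->chars.narrow[i];
}

/* O(1), except for a one-time O(n) widening when the first non-ASCII
   character is stored.  Return 0 on success, -1 if widening fails.  */
int
sketch_string_set (sketch_string *s, size_t i, uint32_t c)
{
  assert (i < s->length);
  if (!s->is_wide && c > 0x7F)
    {
      uint32_t *wide = malloc (s->length * sizeof *wide);
      if (wide == NULL)
        return -1;
      for (size_t k = 0; k < s->length; k++)
        wide[k] = s->chars.narrow[k];
      free (s->chars.narrow);
      s->chars.wide = wide;
      s->is_wide = 1;
    }
  if (s->is_wide)
    s->chars.wide[i] = c;
  else
    s->chars.narrow[i] = (unsigned char) c;
  return 0;
}

void
sketch_string_free (sketch_string *s)
{
  if (s == NULL)
    return;
  if (s->is_wide)
    free (s->chars.wide);
  else
    free (s->chars.narrow);
  free (s);
}
```

Note that this does nothing for consideration 3: handing such a string
to GLib would still mean converting to UTF-8 whenever it is wide, which
is exactly the kind of contradiction mentioned above.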