Hello!

Neil Jerram <neiljer...@googlemail.com> writes:

> But what about the other possible debate, about the API?  Are you
> thinking that we should accept R6RS's choice?

No, I think we have SRFI-1[34] to start with, both of which are well
defined in the context of Unicode.

> (I really haven't read up on all this enough - however when reading
> Tom Lord's analysis just now, I was thinking "why not just specify
> that things like char-upcase don't work in the difficult cases", and
> it seems to me that this is what R6RS chose to do.  So at first glance
> the R6RS API looks OK to me.

Regarding `ß' (German eszet), which is one of the "difficult cases"
mentioned by Tom Lord, SRFI-13 reads:

  Some characters case-map to more than one character.  For example,
  the Latin-1 German eszet character upper-cases to "SS."

    * This means that the R5RS function char-upcase is not well-defined,
      since it is defined to produce a (single) character result.

    * It means that an in-place string-upcase! procedure cannot be
      reliably defined, since the original string may not be long enough
      to contain the result -- an N-character string might upcase to a
      2N-character result.

    * It means that case-insensitive string-matching or searching is
      quite tricky.  For example, an n-character string s might match a
      2N-character string s'.

And then:

  SRFI 13 makes no attempt to deal with these issues; it uses a simple
  1-1 locale- and context-independent case-mapping

I think it's reasonable to stick to this approach at first, at least.
Locale-dependent case folding is part of `(ice-9 i18n)' anyway.

Thanks,
Ludo'.