Hello!

Mike Gran <spk...@yahoo.com> writes:

> On Tue, 2009-04-21 at 23:37 +0200, Ludovic Courtès wrote:
>> You seem to imply that `scm_getc ()' will now return a Unicode
>> codepoint, is that right?  What about `scm_c_{read,write} ()', and
>> `scm_{get,put}s ()'?
>>
>
> I vacillate on this, but, I think the most logical approach is to have
> scm_getc return codepoints and to have the rest of those functions
> return strings that could contain wide characters.

Hmm, `scm_c_{read,write} ()' are biased toward binary data, according to
the manual and to their prototype (they take `void *' buffers).  So I
would keep them this way.

`scm_puts ()' is more of a concern since it takes a `char *', which the
caller may consider an 8-bit-encoded, null-terminated string.  We should
probably deprecate it, and have it return an ISO-8859-1 string,
transcoding as necessary.

And `scm_gets ()' doesn't exist actually.  ;-)

> This is if and only
> if the port has been assigned a character encoding.  If it doesn't have
> an associated encoding, ports will be treated as de facto ISO-8859-1,
> where character values between 0 and 255 are stored without any
> interpretation and characters greater than 255 are invalid.  (Unicode
> codepoints 0 to 255 are by design the same as ISO-8859-1.)

OK.

Thanks,
Ludo'.
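
PS: To make the ISO-8859-1 fallback discussed above concrete, here is a
minimal sketch in plain C.  It is not libguile code: the names
`codepoint_to_latin1' and `codepoints_to_latin1' are hypothetical, and
it only assumes the rule stated in the quoted text, namely that
codepoints 0 to 255 are stored as the same byte value and anything
above 255 is invalid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libguile.  Store CODEPOINT in *OUT
   as a single ISO-8859-1 byte.  Return 0 on success, or -1 when the
   codepoint is above 255 and so has no ISO-8859-1 representation.  */
int
codepoint_to_latin1 (uint32_t codepoint, unsigned char *out)
{
  assert (out != NULL);
  if (codepoint > 0xFF)
    return -1;
  /* Codepoints 0..255 coincide with ISO-8859-1, so no table is needed.  */
  *out = (unsigned char) codepoint;
  return 0;
}

/* Hypothetical helper.  Transcode LEN codepoints from SRC into DST,
   which must have room for LEN bytes.  Return the number of bytes
   written, or -1 as soon as an invalid codepoint is met (DST may then
   be partially filled).  */
long
codepoints_to_latin1 (const uint32_t *src, size_t len, unsigned char *dst)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (codepoint_to_latin1 (src[i], &dst[i]) != 0)
      return -1;
  return (long) len;
}
```

Whether an out-of-range character should signal an error, as here, or
be replaced by a substitute such as `?' is a design choice this sketch
does not try to settle.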