> From: Andy Wingo <wi...@pobox.com>
[...]
> The solution is to use functions that specify the locale. We don't have
> those yet, but we do have the capability to write them
> now. Specifically:
>
>   scm_from_utf8_string
>   scm_from_utf8_symbol
>   scm_from_utf8_keyword
>
>   scm_from_latin1_string
>   scm_from_latin1_symbol
>   scm_from_latin1_keyword
>
> We probably also need the "n" variants.
> [...]
> So then we need, I think:
>
>   scm_to_utf8_string
>   scm_to_utf16_string
>   scm_to_utf32_string
>
> We need the "n" variants here too (perhaps more).

Some of this is already in the bytevectors module, but perhaps not in a
form that is easy to use from C source code.

It would be easy enough to do, but there is a failure case to consider
for scm_from_utf8_string: the C UTF-8 string could contain incorrectly
encoded data. You could throw an encoding error, or you could replace
the bad UTF-8 with U+FFFD or a question mark.

The bytevector procedure utf8->string always throws encoding-error.
Maybe that's good enough. Otherwise, perhaps something like

  scm_from_utf8_stringn (str, len, error_or_replace_strategy)

If you didn't mind the overhead of calling the somewhat heavyweight
scm_{to,from}_stringn, these could be macros or inline functions that
wrap that.

-Mike
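
To make the two failure strategies concrete, here is a rough standalone sketch in plain C. It does not use libguile. The names decode_utf8, decode_one and utf8_strategy are made up for illustration and are not a proposed API. It decodes into an array of code points and either stops at the first malformed sequence or substitutes U+FFFD and resynchronizes one byte later (one of several reasonable resynchronization policies).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strategy flag, standing in for the proposed
   error_or_replace_strategy argument. */
typedef enum {
  UTF8_ON_ERROR_FAIL,     /* report the first malformed sequence */
  UTF8_ON_ERROR_REPLACE   /* substitute U+FFFD and keep going */
} utf8_strategy;

/* Decode one sequence from s, with n >= 1 bytes available.
   Returns the number of bytes consumed and stores the code point
   in *cp, or returns 0 if the sequence is malformed. */
static size_t
decode_one (const unsigned char *s, size_t n, uint32_t *cp)
{
  unsigned char b = s[0];
  size_t need, i;
  uint32_t min, c;

  if (b < 0x80) { *cp = b; return 1; }
  else if ((b & 0xE0) == 0xC0) { need = 2; min = 0x80;    c = b & 0x1F; }
  else if ((b & 0xF0) == 0xE0) { need = 3; min = 0x800;   c = b & 0x0F; }
  else if ((b & 0xF8) == 0xF0) { need = 4; min = 0x10000; c = b & 0x07; }
  else return 0;                 /* stray continuation or invalid lead byte */

  if (n < need) return 0;        /* truncated sequence */
  for (i = 1; i < need; i++)
    {
      if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
      c = (c << 6) | (uint32_t) (s[i] & 0x3F);
    }
  /* Reject overlong forms, surrogates, and values past U+10FFFF. */
  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
  *cp = c;
  return need;
}

/* Decode len bytes of str into out (capacity cap code points).
   Returns the number of code points written, or -1 if the input is
   malformed under UTF8_ON_ERROR_FAIL or if out is too small. */
long
decode_utf8 (const char *str, size_t len, uint32_t *out, size_t cap,
             utf8_strategy strategy)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t i = 0, count = 0;

  assert (str != NULL && out != NULL);
  while (i < len)
    {
      uint32_t cp;
      size_t used = decode_one (s + i, len - i, &cp);
      if (used == 0)
        {
          if (strategy == UTF8_ON_ERROR_FAIL) return -1;
          cp = 0xFFFD;           /* replacement character */
          used = 1;              /* skip one byte and resynchronize */
        }
      if (count == cap) return -1;
      out[count++] = cp;
      i += used;
    }
  return (long) count;
}
```

With the three input bytes 'a', 0xFF, 'b', the FAIL strategy returns -1, while REPLACE returns 3 and yields U+0061, U+FFFD, U+0062. A real scm_from_utf8_stringn would presumably signal a Scheme error instead of returning -1, and would build an SCM string instead of filling an array.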