Hi,

In our C source, we have been trained to use scm_from_locale_string et al. This is usually the right thing to do when interacting with the operating system, since bytes coming from the OS (file names, environment variables, command-line arguments) are in the user's locale encoding.
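For instance, a minimal sketch of the case where locale conversion is correct (the helper is hypothetical, but scm_from_locale_string is the existing API):

    #include <libguile.h>
    #include <stdlib.h>

    /* getenv hands us bytes in the locale encoding, so converting
       with the locale is exactly right here.  */
    static SCM
    home_directory (void)
    {
      const char *home = getenv ("HOME");
      return home ? scm_from_locale_string (home) : SCM_BOOL_F;
    }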
However, when we have literals in C source code, I think this strategy is incorrect. I write my C source code in UTF-8 or in ISO-8859-1, but if the user is running in another locale, they will not load my strings/symbols/keywords correctly. The solution is to use functions that specify the encoding explicitly. We don't have those yet, but we do have the capability to write them now. Specifically:

    scm_from_utf8_string
    scm_from_utf8_symbol
    scm_from_utf8_keyword
    scm_from_latin1_string
    scm_from_latin1_symbol
    scm_from_latin1_keyword

We probably also need the "n" variants. It's unlikely that you have a known UTF-32 string as a char*, but we should probably also provide scm_t_uint16* and scm_t_uint32* variants for UTF-16 and UTF-32.

* * *

We also have the converse problem: since the easiest (and recommended) way to get a char* from a Scheme string has been scm_to_locale_string, in many cases we give external libraries locale-encoded strings instead of the encoding they expect. For example, most GLib-based libraries expect UTF-8 strings, but Guile-GNOME ignorantly passes them the result of calling scm_to_locale_string. Though this will work in UTF-8 locales, it's only by accident. So then we need, I think:

    scm_to_utf8_string
    scm_to_utf16_string
    scm_to_utf32_string

We need the "n" variants here too (perhaps more). To make all this concrete, I've appended a couple of untested sketches after my sig.

What do people think? Any takers on implementing this? :)

Cheers,

Andy
-- 
http://wingolog.org/
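P.S. A sketch of the literal problem, assuming the proposed scm_from_utf8_string with the obvious signature (it doesn't exist yet):

    #include <libguile.h>

    /* This source file is saved as UTF-8, so the bytes of the literal
       below are UTF-8 regardless of the locale the user runs in.  */
    static SCM
    summer_word (void)
    {
      /* Wrong: scm_from_locale_string ("été") would reinterpret
         those UTF-8 bytes in the user's locale encoding.  */

      /* Right, with the proposed API: the encoding is explicit.  */
      return scm_from_utf8_string ("été");
    }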
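P.P.S. And the converse direction, also untested: this assumes the proposed scm_to_utf8_string behaves like scm_to_locale_string, returning a freshly allocated buffer that the caller must free. gtk_label_set_text is a typical GLib-stack entry point that expects UTF-8:

    #include <libguile.h>
    #include <gtk/gtk.h>
    #include <stdlib.h>

    static void
    set_label_from_scheme (GtkLabel *label, SCM str)
    {
      /* scm_to_locale_string here would only be correct in UTF-8
         locales; stating the encoding makes it correct everywhere.  */
      char *utf8 = scm_to_utf8_string (str);
      gtk_label_set_text (label, utf8);
      free (utf8);
    }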