> From: Andy Wingo <[email protected]>
> Hi,
>
> On Sun 06 Dec 2009 21:43, Linas Vepstas writes:
>
> > 2009/12/6 Mike Gran :
> >>
> >>> > need to call (setlocale LC_ALL "")
> >>
> >> But for Guile to store characters as codepoints, declaring a locale
> >> pretty much a requirement now.
> >
> > Would it make sense to add (setlocale LC_ALL "") to some default,
> > e.g. boot-9.scm ?
>
> Mike I admit I don't follow this completely. Does Linas' suggestion
> make sense? I somehow thought that locales would magically just
> work.

If we always call setlocale, legacy code that used UTF-8 and other
non-Latin locales will just work. Legacy code that used strings to
contain binary data would break. (Of course, UTF-8 strings only worked
on Guile 1.8.x so long as you either never looked at substrings or
chars, or did UTF-8 parsing yourself.)

As it is now, the opposite is true: legacy code with strings containing
binary data will just work; legacy code with non-8-bit locale-encoded
strings will break.

|                 | setlocale called |
| Strings contain | 1.8.x  | 2.0     | Guile 2.0 will
-----------------------------------------------------------------
| ASCII           | Y/N    | Y/N     | just work
-----------------------------------------------------------------
| locale-encoded  | Y/N    | Y       | just work
| strings         |        |         |
-----------------------------------------------------------------
| locale-encoded  | Y/N    | N       | interpret string bytes as
| strings         |        |         | Latin-1
-----------------------------------------------------------------
| binary data     | Y/N    | Y       | if locale is Latin-1: just work
|                 |        |         |
|                 |        |         | if locale is not Latin-1:
|                 |        |         | interpret string bytes using
|                 |        |         | locale encoding
-----------------------------------------------------------------
| binary data     | Y/N    | N       | just work
|                 |        |         |

I think I prefer that the coder take the responsibility of calling
setlocale, but I only think that because it is how C works. I'm used
to that convention.

Thanks,

Mike
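
To make the table rows concrete, here is a minimal sketch of the byte-level effect. It is written in Python rather than Guile and does not use Guile's API; it only assumes that "interpret string bytes as Latin-1" means one character per byte, and that a UTF-8 locale means UTF-8 decoding. The sample string and variable names are made up for illustration. The message does not say what Guile does with bytes that are invalid in the locale encoding, so the error shown at the end is Python's behaviour only.

```python
# "café" encoded as UTF-8: the "é" occupies two bytes (0xC3 0xA9).
raw = "café".encode("utf-8")

# Row "locale-encoded strings, setlocale not called":
# each byte becomes one Latin-1 character, so the text is mangled.
as_latin1 = raw.decode("latin-1")

# Row "locale-encoded strings, setlocale called" (assuming a UTF-8 locale):
# the bytes are decoded with the locale encoding and the text is intact.
as_utf8 = raw.decode("utf-8")

print(len(raw), len(as_latin1), len(as_utf8))  # 5 5 4

# Row "binary data, setlocale not called": Latin-1 maps every byte value
# 0-255 to exactly one character, so arbitrary bytes round-trip unchanged.
blob = bytes(range(256))
round_trip_ok = blob.decode("latin-1").encode("latin-1") == blob

# Row "binary data, setlocale called, non-Latin-1 locale": the same bytes
# are not valid UTF-8, so decoding them with the locale encoding cannot
# preserve them (Python raises; Guile's behaviour is not stated here).
try:
    blob.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

The round trip is why binary data in strings "just works" without setlocale, and the failed decode is why it stops working once a multibyte locale is in effect.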
