Ben Franksen writes:

> Over the last years, unicode has established itself world-wide and firmly
> and is well supported by all the major operating systems. This is why I vote
> for dropping support for older 8-bit encodings that are not unicode
> compatible, thereby allowing e.g. Chinese users to use Darcs with their
> native languages.

Does "just dropping 8-bit support" actually enable that, or does it only work in a .UTF-8 locale? Or does it even work at all? I have trouble imagining how text in an arbitrary 8-bit encoding could be passed verbatim into a wide-character Unicode string and then cast back to an 8-bit encoding so that it comes out the way it went in. 8-bit encodings (including Latin-1) must be recoded to Unicode, or they will probably violate the UTF-8 format (e.g., a non-ASCII Latin-1 character sitting between ASCII characters can never be valid UTF-8, but that sequence is extremely common in Latin-1 text).

Nor do I think you can count on command lines having a .UTF-8 locale. Shift JIS and to some extent EUC-JP remain popular in Japan, and at least my Chinese students frequently use Big5 and the GB family of encodings. All of these have repertoires that are Unicode subsets, but the encodings are different. Users expect to be able to "cat" such files to the terminal and read them, and for that use case they will have a locale that specifies a default charset other than UTF-8. Most terminals are not able to switch encodings on the fly, so this can be extremely inconvenient.

I'm not saying it's not worth doing, but be prepared for quite a bit more work than "just dropping 8-bit support."

_______________________________________________
darcs-users mailing list
darcs-users@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-users
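
To make the recoding point in the message above concrete, here is a minimal sketch in Python (not Darcs code, and not a proposal for how Darcs should do it). The helper names `is_valid_utf8` and `recode_to_utf8` and the sample strings are made up for illustration. It checks that raw Latin-1 and Shift JIS bytes are rejected by a UTF-8 decoder, and that they only become valid UTF-8 once they are decoded with the encoding they were actually written in.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def recode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode from the named legacy encoding, then re-encode as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# ASCII, one non-ASCII Latin-1 character (e-acute, byte 0xE9), then ASCII again.
latin1_bytes = "caf\u00e9 au lait".encode("latin-1")

# A short Japanese string encoded as Shift JIS.
sjis_bytes = "\u65e5\u672c\u8a9e".encode("shift_jis")

if __name__ == "__main__":
    samples = [
        ("Latin-1", latin1_bytes, "latin-1"),
        ("Shift JIS", sjis_bytes, "shift_jis"),
    ]
    for label, raw, encoding in samples:
        recoded = recode_to_utf8(raw, encoding)
        print(label, "raw bytes valid UTF-8:", is_valid_utf8(raw))
        print(label, "recoded bytes valid UTF-8:", is_valid_utf8(recoded))
```

The catch is the `source_encoding` argument: the program has to learn it from somewhere, typically the user's locale, which is exactly the part that "just dropping 8-bit support" does not address.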