On Mon, 29 Sep 2014 04:33:08 +0300, ketmar via Digitalmars-d <[email protected]> wrote:
> On Sun, 28 Sep 2014 19:44:39 +0000
> Uranuz via Digitalmars-d <[email protected]> wrote:
>
> > I speaking language which graphemes coded by 2 bytes
> UCS-4? KOI8? my locale is KOI8, and i HATE D for assuming that everyone
> one the planet using UTF-8 and happy with it. from my POV, almost all
> string decoding is broken. string i got from filesystem? good god, lest
> it not contain anything out of ASCII range! string i got from text
> file? the same. string i must write to text file or stdout? oh, 'cmon,
> what do you mean telling me "п©я─п╦п╡п╣я┌"?! i can't read that!

My friend, we agree here! We must convert the whole world to
UTF-8 eventually to end this madness!

But for now when we write to a terminal, we have to convert to
the system locale, because there are still people who don't
use Unicode. (On Windows consoles the wide-char writing
functions are good enough for NFC strings.)

And a path from the filesystem is actually in no specific
encoding on Unix. We only know it is byte based and uses
ASCII '/' and '\0' as delimiters. On Windows it is ushort
based IIRC.

To make matters more messy, Gtk assumes Unicode, while Qt
assumes the user's locale for file names. And in reality it is
determined by the IO charset at mount time.

-- 
Marco
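
P.S. For anyone wondering where the quoted gibberish comes from, here is a
minimal sketch in Python (not D, and not what Phobos does). It assumes the
text was written as UTF-8 and then displayed by a KOI8-R terminal, which is
consistent with the bytes in the quote. The helper names misread_as_koi8 and
encode_for_locale are made up for illustration; only the standard codecs are
real.

```python
# Sketch only: the function names below are hypothetical, the codecs
# ("utf-8", "koi8_r", "ascii") ship with the Python standard library.


def misread_as_koi8(text: str) -> str:
    """Encode text as UTF-8, then decode those same bytes as KOI8-R.

    This is what a KOI8-R terminal shows when a program writes UTF-8
    bytes to it without transcoding.
    """
    return text.encode("utf-8").decode("koi8_r")


def encode_for_locale(text: str, charset: str) -> bytes:
    """Transcode text to the given charset before writing it out.

    Characters the charset cannot represent are replaced with '?'
    instead of raising an error.
    """
    return text.encode(charset, errors="replace")
```

Calling misread_as_koi8("привет") reproduces the string from the quote, while
encode_for_locale("привет", "koi8_r") yields bytes that a KOI8-R terminal
displays correctly. The charset to pass in would come from the user's locale,
which this sketch does not query.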
