[PDCurses] Windows character sets

William McBrine Sat, 12 May 2007 12:32:20 -0700

In Windows, there are two separate 8-bit character sets: one for GUI apps, which Microsoft calls the "ANSI" code page; and one for the console, known as the "OEM" code page. (On a typical U.S. system, these would be Windows-1252 (a superset of ISO-8859-1) and IBM Code Page 437, respectively.)

In adding wide-character support to PDCurses, I made the existing char string functions, like addstr(), into wrappers for their wide-char counterparts, like addwstr(). They convert between Unicode and the current 8-bit character set via the standard library functions mbtowc(), mbstowcs(), and wcstombs(). And I still think that was the right thing to do -- but only later did I realize that, for Windows, I didn't know whether these functions would use the ANSI or OEM code pages.
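To illustrate the kind of conversion such a wrapper performs, here is a minimal sketch. It is not the PDCurses source: narrow_to_wide() is a hypothetical helper name, and the real wrappers may size and manage their buffers differently. It only shows mbstowcs() turning a char string into a wide string that could then be handed to a function like addwstr().

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not part of PDCurses): convert a char string
   to wide characters using whatever 8-bit character set the C
   library's current locale selects.  Writes at most dst_len - 1
   wide characters plus a terminator.  Returns the number of wide
   characters written, or -1 if src holds an invalid sequence. */
int narrow_to_wide(const char *src, wchar_t *dst, size_t dst_len)
{
    size_t n;

    assert(src != NULL && dst != NULL && dst_len > 0);

    /* mbstowcs() does not write a terminator when it stops because
       the output limit was reached, so reserve room and add one. */
    n = mbstowcs(dst, src, dst_len - 1);
    if (n == (size_t)-1)
        return -1;

    dst[n] = L'\0';
    return (int)n;
}
```

A wrapper in this style would call narrow_to_wide() on its argument and pass the result to the wide-char function. Which code page applies to bytes above 0x7F is decided entirely inside mbstowcs(), which is why the compiler's runtime library matters here.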

As it turns out, different compilers have implemented it in different ways. MSVC and MinGW use the ANSI page; Borland and Watcom use the OEM. I'm not sure there's anything to "fix" here, since neither behavior is incorrect per se, and either might be desired; but it's something to be aware of.
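One rough way to see which behavior a given compiler's library has is to convert a single byte that differs between the two pages. The sketch below is only an illustration under stated assumptions: it assumes the two pages are Windows-1252 and Code Page 437, as on a typical U.S. system, where byte 0xE9 is U+00E9 (e with acute) in the former and U+0398 (capital theta) in the latter. The names probe_e9() and classify_e9() are made up for this example, and the result may also depend on the locale in effect when mbtowc() is called.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical labels for this example only. */
enum page_guess { PAGE_ANSI_1252, PAGE_OEM_437, PAGE_UNKNOWN };

/* Classify the wide character that byte 0xE9 was converted to.
   Assumes the candidates are Windows-1252 and Code Page 437. */
enum page_guess classify_e9(long wc)
{
    if (wc == 0x00E9)       /* e with acute: Windows-1252 */
        return PAGE_ANSI_1252;
    if (wc == 0x0398)       /* capital theta: Code Page 437 */
        return PAGE_OEM_437;
    return PAGE_UNKNOWN;
}

/* Convert the single byte 0xE9 with mbtowc() and return the
   resulting wide character value, or -1 if the conversion fails
   (for example, in a locale where 0xE9 alone is not valid). */
long probe_e9(void)
{
    const char byte[2] = { (char)0xE9, '\0' };
    wchar_t wc = 0;

    mbtowc(NULL, NULL, 0);  /* reset any conversion state */
    if (mbtowc(&wc, byte, 1) != 1)
        return -1;
    return (long)wc;
}
```

Calling classify_e9(probe_e9()) in a small test program built with each compiler gives a quick hint, though a PAGE_UNKNOWN result only means the assumptions above do not hold on that system.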

Note that this only applies to the library built with wide-character support, but without forced UTF-8 mode. The narrow-character build uses the OEM code page, as always; in this case, the char string functions are not wrappers, and the wide-char versions are not available.


--
William McBrine <[EMAIL PROTECTED]>
