On 9/7/06, Paul Prescod <[EMAIL PROTECTED]> wrote:
> 1. On US English Windows, Notepad defaults to an encoding called "ANSI".
> What does "ANSI" map to in European and Asian versions of Windows?
On most Western European configurations, the ANSI Code Page (ACP) has
historically been 1252 (CP1252, or WINDOWS-1252 according to iconv). It
may have changed since then to support the euro symbol. Japanese machines
tend to use CP932 (or MS932), also known as SHIFT-JIS (or close enough).
I don't know off the top of my head which ACPs go with other languages.

I expect Notepad defaults to the ACP encoding whenever a file is detected
as such, or when a new file contains only characters representable in
that code page. Otherwise I expect it defaults to "Unicode" (UTF-16 /
UCS-2). When editing an existing file, it defaults to the detected
encoding, unless "Unicode" is required to save the changes. It uses BOMs
to mark all Unicode encodings, but doesn't require them to be present in
order to detect "Unicode."

http://blogs.msdn.com/michkap/archive/2006/06/14/631016.aspx

> 3. In general, how do modern versions of Linux and other Unix handle this
> issue?

I use en_US.UTF-8, after many years of C or en_US.ISO-8859-1. Due to the
age of my install, this was not the default, but now I use it as
pervasively as possible. These days I set it via GDM; originally I set it
in my shell rc file.

Michael
--
Michael Urman
http://www.tortall.net/mu/blog
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
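
As a rough illustration of the code page and BOM points above, here is a
minimal Python sketch. The helper name sniff_encoding and its
ansi_code_page parameter are hypothetical, made up for this example. It
only looks for a leading BOM and otherwise assumes the ANSI code page, so
it does not attempt the BOM-less "Unicode" detection that Notepad
reportedly performs.

```python
import codecs

# Leading byte-order marks and the Python codec that consumes each one.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(data: bytes, ansi_code_page: str = "cp1252") -> str:
    """Hypothetical helper: choose a codec from a leading BOM.

    Falls back to the assumed ANSI code page when no BOM is present.
    """
    for bom, codec_name in _BOMS:
        if data.startswith(bom):
            return codec_name
    return ansi_code_page


# The same text needs different "ANSI" bytes depending on the code page.
western = "caf\u00e9 \u20ac5".encode("cp1252")   # euro sign is one byte here
japanese = "\u65e5\u672c".encode("cp932")        # two bytes per character

# Reading CP932 bytes with the wrong code page silently produces mojibake.
misread = japanese.decode("cp1252", errors="replace")

# A BOM-marked UTF-16 file is recognised regardless of the ANSI code page.
marked = codecs.BOM_UTF16_LE + "caf\u00e9".encode("utf-16-le")
decoded = marked.decode(sniff_encoding(marked))
```

Passing ansi_code_page="cp932" models a Japanese configuration; the BOM
check is unaffected by that choice.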