On Fri, Aug 30, 2013 at 11:04 AM, Albert-Jan Roskam <fo...@yahoo.com> wrote:
> In Windows, sys.getfilesystemencoding() returns 'mbcs' (multibyte code
> system), which doesn't say very much imho.

Why aren't you using Unicode for the filename? The native encoding for
NTFS is UTF-16, and CPython 2.x uses _wfopen() if you pass it a Unicode
filename:

http://hg.python.org/cpython/file/70274d53c1dd/Objects/fileobject.c#l357
http://msdn.microsoft.com/en-us/library/yeby3zcb(v=vs.90)

Anyway, the "mbcs" codec uses mbcs_encode() and mbcs_decode() from the
codecs module. In CPython 2.x, these call PyUnicode_EncodeMBCS() and
PyUnicode_DecodeMBCS(), which in turn call the Windows API functions
WideCharToMultiByte() and MultiByteToWideChar() for the CP_ACP (ANSI)
codepage. This is a system-defined encoding, such as Windows-1252.

> So I wrote the function below, which returns the codepage as reported by
> the windows chcp command.

chcp.com is a console application. It calls GetConsoleCP(), which simply
returns the current code page of the attached console (running the
command creates a new console if there isn't one to inherit from the
parent). This isn't the function you want. There's already a Python
function that returns the default ANSI codepage:

    >>> import locale
    >>> locale.getpreferredencoding()
    'cp1252'

You can also use ctypes to call the Windows API directly, and then
convert the integer to a string:

    >>> from ctypes import windll
    >>> str(windll.kernel32.GetACP())
    '1252'

> the function returns 850 (codepage 850) when I run it via the command prompt,
> but 1252 (cp1252) when I run it in my IDE (Spyder).

Maybe Spyder communicates with python.exe as a subprocess in a hidden
console, with the console's codepage set to 1252. You can use ctypes to
check windll.kernel32.GetConsoleCP(). If a console is attached, this
will return a nonzero value.
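
As a rough sketch (not part of the original reply), the checks above could
be gathered into one helper so the ANSI and console code pages can be
compared side by side. The name code_page_report() and its dictionary keys
are made up for illustration. The ctypes calls only run on Windows; on
other platforms the Windows-only entries are simply left as None.

```python
import locale
import sys


def code_page_report():
    """Collect the encodings discussed above in one dictionary.

    The function name and the keys are hypothetical, chosen for this sketch.
    Entries that only exist on Windows stay None on other platforms.
    """
    report = {
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(),
        "ansi_codepage": None,
        "console_codepage": None,
    }
    if sys.platform == "win32":
        # Imported here so the sketch still loads on non-Windows systems.
        from ctypes import windll

        # GetACP() returns the system ANSI code page as an integer.
        report["ansi_codepage"] = "cp%d" % windll.kernel32.GetACP()
        # GetConsoleCP() returns 0 when no console is attached.
        console_cp = windll.kernel32.GetConsoleCP()
        if console_cp:
            report["console_codepage"] = "cp%d" % console_cp
    return report


if __name__ == "__main__":
    for name, value in sorted(code_page_report().items()):
        print("%-18s %s" % (name, value))
```

Running this once from cmd.exe and once from inside the IDE should show
whether only the console code page differs between the two environments.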