> On Wed, Dec 19, 2012 at 5:43 AM, Albert-Jan Roskam <fo...@yahoo.com>
> wrote:
>>
>> So MBCS is just a collective noun for whatever happens to be the
>> installed/available codepage of the host computer (at least with
>> CP_ACP)?
>
> To be clear, the "mbcs" encoding in Python uses CP_ACP. MBCS means
> multibyte character set. The term ANSI gets thrown around, too, but
> Windows legacy code pages aren't ANSI standards.
>
>> I didn't know anything about wintypes and I find it quite hard to
>> understand! I am trying to write a ctypes wrapper for
>> WideCharToMultiByte.
>
> Just for the fun of it?

Yes, I am afraid so. ;-)

>> http://pastebin.com/SEr4Wec9
>> The code is kinda verbose, but I hope this makes it easier to read.
>> Does this make sense at all? As for now, the program returns an
>> error code (oddly, zero is an error code here).
>
> Use None for NULL.

Aahh... I was already thinking the prototype didn't match the zeroes.

> You shouldn't encode a string argument you've declared as c_wchar_p
> (i.e. wintypes.LPCWSTR, i.e. type 'Z'). If you initialize to a byte
> string, the setter Z_set calls PyUnicode_FromEncodedObject using the
> "mbcs" encoding (this is the default on Windows, set by
> set_conversion_mode("mbcs", "ignore")). This hands off to
> decode_mbcs, which produces nonsense for a UTF-16LE encoded string.

Ok, yes, that was plain stupid of me.

> GetLastError should be defined already, along with WinError, a
> convenience function that returns an instance of WindowsError. 2.6.4
> source:

Convenient indeed. No need to reinvent the wheel.

> http://hg.python.org/cpython/file/8803c3d61da2/Lib/ctypes/__init__.py#l448
>
> Here's a quick hack that should help you along:
>
>     import ctypes
>     from ctypes import wintypes

As per PEP8, the only time I use from x import * is with ctypes. Do you avoid it here because of name clashes with wintypes?
In general, the module-dot-function notation is nicer (I hate that about R, where bare function names are almost the rule, although one could write things like reshape::melt).

>     _CP_UTF8 = 65001
>     _CP_ACP = 0  # ANSI
>     _LPBOOL = ctypes.POINTER(ctypes.c_long)
>
>     _wideCharToMultiByte = ctypes.windll.kernel32.WideCharToMultiByte
>     _wideCharToMultiByte.restype = ctypes.c_int
>     _wideCharToMultiByte.argtypes = [
>         wintypes.UINT, wintypes.DWORD, wintypes.LPCWSTR, ctypes.c_int,
>         wintypes.LPSTR, ctypes.c_int, wintypes.LPCSTR, _LPBOOL]
>
>     def wide2utf8(fn):
>         codePage = _CP_UTF8
>         dwFlags = 0
>         lpWideCharStr = fn
>         cchWideChar = len(fn)
>         lpMultiByteStr = None
>         cbMultiByte = 0  # zero requests size
>         lpDefaultChar = None
>         lpUsedDefaultChar = None
>         # get size
>         mbcssize = _wideCharToMultiByte(
>             codePage, dwFlags, lpWideCharStr, cchWideChar, lpMultiByteStr,
>             cbMultiByte, lpDefaultChar, lpUsedDefaultChar)
>         if mbcssize <= 0:
>             # no argument: WinError() looks up GetLastError() itself
>             raise ctypes.WinError()
>         lpMultiByteStr = ctypes.create_string_buffer(mbcssize)
>         # convert
>         retcode = _wideCharToMultiByte(
>             codePage, dwFlags, lpWideCharStr, cchWideChar, lpMultiByteStr,
>             mbcssize, lpDefaultChar, lpUsedDefaultChar)
>         if retcode <= 0:
>             # the return value is a byte count, not an error code
>             raise ctypes.WinError()
>         return lpMultiByteStr.value

Awesome, thank you so much! Glad to see that my code was pretty much in the right direction, but I made some silly and some more fundamental mistakes.

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
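
A minimal sketch of the decoding problem discussed in this thread, for illustration only. It does not call the Windows API, so it runs on any platform. The sample string is made up, and cp1252 is an assumed stand-in for the "mbcs" code page (the real ANSI code page depends on the system locale). It shows what comes out when UTF-16LE bytes are decoded with a legacy code page, which is roughly what happens when an encoded byte string is handed to a c_wchar_p argument:

```python
# Portable illustration; cp1252 is an ASSUMED stand-in for the "mbcs" code page.
text = "caf\xe9"  # hypothetical sample file name

# The mistake: encoding to UTF-16LE before passing the argument.
utf16_bytes = text.encode("utf-16-le")

# The byte string then gets decoded with the ANSI code page, not UTF-16LE.
garbled = utf16_bytes.decode("cp1252", "ignore")

# Every second character is now a NUL, so anything that reads up to the
# first NUL terminator would stop after one character.
seen_up_to_first_nul = garbled.split("\x00")[0]

# For comparison, the bytes a correct UTF-8 conversion should yield.
expected_utf8 = text.encode("utf-8")

print(repr(garbled))
print(repr(seen_up_to_first_nul))
print(repr(expected_utf8))
```

On Windows, the same expected_utf8 value can serve as a cross-check for the output of a wrapper such as wide2utf8 when it is given the unencoded unicode string.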