Yesterday I ran into a bug in the C API docs. The top of this page:

    http://docs.python.org/api/unicodeObjects.html

says:

    Py_UNICODE
    This type represents a 16-bit unsigned storage type which is used by
    Python internally as basis for holding Unicode ordinals. On platforms
    where wchar_t is available and also has 16-bits, Py_UNICODE is a
    typedef alias for wchar_t to enhance native platform compatibility.
    On all other platforms, Py_UNICODE is a typedef alias for unsigned
    short.

This is incorrect on some platforms: on Debian, Py_UNICODE turns out to be 32 bits. I'm not sure what the correct text should be: does Python use wchar_t whenever it's available, whether or not it is 16 bits?

I solved my problem by realizing that I was going about things entirely wrong: I should use the Python codecs from C and not worry about what Py_UNICODE contains.

However, I think we should fix the docs to avoid confusing others... or maybe it would be better to document what's in Py_UNICODE and suggest always using the codec methods? I don't have a strong opinion either way.

robey
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
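
For anyone who wants to check their own build, here is a minimal sketch of the two points above, seen from the Python side rather than through the C API itself. It reports how wide the platform's wchar_t is and how large a code point the interpreter can hold, then shows the codec route, which gives the same bytes whatever the internal width. The helper names unicode_storage_report and to_utf8 are made up for this illustration and are not part of any Python API.

```python
import ctypes
import sys


def unicode_storage_report():
    """Return a dict describing this interpreter's Unicode storage."""
    return {
        # Width of the platform's wchar_t in bits
        # (commonly 32 on Linux, 16 on Windows).
        "wchar_t_bits": ctypes.sizeof(ctypes.c_wchar) * 8,
        # Highest code point the interpreter reports: 0xFFFF on a build
        # with 16-bit storage, 0x10FFFF otherwise (always 0x10FFFF on
        # Python 3.3 and later).
        "max_code_point": sys.maxunicode,
    }


def to_utf8(text):
    """Encode text with the UTF-8 codec instead of reading raw storage."""
    return text.encode("utf-8")


if __name__ == "__main__":
    report = unicode_storage_report()
    print("wchar_t bits:   %d" % report["wchar_t_bits"])
    print("max code point: 0x%X" % report["max_code_point"])
    # The euro sign encodes to the same three bytes on every build.
    print(repr(to_utf8(u"\u20ac")))
```

The point of the second helper is the design choice described above: code that goes through a codec never has to know whether the underlying storage is 16 or 32 bits wide.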