David Coles <coles.da...@gmail.com> added the comment:

On Fri, May 6, 2011 at 1:31 PM, Marc-Andre Lemburg
<rep...@bugs.python.org> wrote:
> wchar_t should be fairly portable these days. I think the main
> problem is that we never assumed sizeof(wchar_t) == 1 to be a
> possibility. On Windows, wchar_t was 16 bit and the glibc started
> out with 32 bits.

Well, a 1-byte wchar_t is a bit "ass backwards". I think it's very much
an edge case. :)

> Note that HAVE_USABLE_WCHAR_T is only used to check whether
> Python can use wchar_t as alias for Py_UNICODE. Python's Unicode
> implementation needs Py_UNICODE to be an unsigned type with
> either 2 bytes or 4 bytes. If wchar_t does not provide these
> sizes or is a signed type, Python cannot use it for Py_UNICODE
> and must instead use "unsigned short".

Right. That makes sense. In that case it's probably sensible to keep it
around.

> If the configure script does not detect this case, then a patch
> would be helpful.

Yup. I'll put something together that causes configure to bail out if
HAVE_WCHAR_H is missing or if SIZEOF_WCHAR_T is less than 16 bits.

> Python should not use wchar_t for Py_UNICODE on such platforms
> and instead go with "unsigned short".
>
> I would assume that the wchar_t C lib routines work based on UTF-8
> with sizeof(wchar_t) == 1, so the PyUnicode_*WideChar*() APIs would
> need to be adjusted to work more or less like the UTF-8 codecs.

Yes. Using UTF-8 would be the sensible solution. Sadly, it looks like
all the wide character functions on Android <2.3 are undefined, so in
this case Android saying it has wchar_t support is worse than useless.

On Fri, May 6, 2011 at 1:37 PM, Marc-Andre Lemburg
<rep...@bugs.python.org> wrote:
> With none of the wide-char functions working in Android <2.3, I don't
> think you have a good chance of getting Python 3.x working, unless
> you remove all their uses in the code and replace them with standard
> char* functions.

I agree. In my case I should be able to bump the required version number
without too much fuss. It seems a bit silly to write in support for a
platform that no longer supports said feature.

> The last paragraph doesn't sound very promising either. I wonder
> what they mean with "better representation". The C standard doesn't
> have any better representation for Unicode at the moment.

In C, I guess the only sensible alternative would be UTF-8 char strings
(or maybe using uint32_t), but in Python's case it really depends on how
the underlying OS represents internationalized characters. Perhaps in
other projects you would use an external library like ICU, but that's
outside the scope of my experience. :)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12010>
_______________________________________
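
For illustration only: the usability rule quoted earlier in this message (Py_UNICODE must be an unsigned type that is 2 or 4 bytes wide) could be probed with a small C test along the following lines. This is a rough sketch and not the actual configure check; the function names usable_for_py_unicode and wchar_t_is_usable are made up for this example and do not exist in CPython.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not part of CPython): returns 1 when a type with
 * the given size and signedness satisfies the rule quoted above, i.e.
 * it is unsigned and exactly 2 or 4 bytes wide; returns 0 otherwise. */
int usable_for_py_unicode(size_t size, int is_signed)
{
    return !is_signed && (size == 2 || size == 4);
}

/* Hypothetical probe: applies the rule to this platform's wchar_t.
 * (wchar_t)-1 compares below zero only when wchar_t is a signed type. */
int wchar_t_is_usable(void)
{
    int is_signed = ((wchar_t)-1 < (wchar_t)0);
    return usable_for_py_unicode(sizeof(wchar_t), is_signed);
}
```

On a platform with a 1-byte wchar_t, as described for Android above, wchar_t_is_usable() would return 0, which corresponds to the case where Python falls back to "unsigned short".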