Martin v. Löwis wrote:
> Shane Hathaway wrote:
>> More generally, how should a non-unicode-expert writing Python extension
>> code find out the minimum they need to know about unicode to use the
>> Python unicode API?  The API reference [1] ought to at least have a list
>> of background links.  I had to hunt everywhere.
>
> That, of course, depends on what your background is. Did you know what
> Latin-1 is, when you started? How it relates to code page 1252? What
> UTF-8 is? What an abstract character is, as opposed to a byte sequence
> on the one hand, and to a glyph on the other hand?
>
> Different people need different background, especially if they are
> writing different applications.

Yes, but the first few steps are the same for nearly everyone, and that
is where people need the most help.  In particular:

- The Python docs link to unicode.org, but unicode.org is complicated,
  long-winded, and leaves many questions unanswered.  The Wikipedia
  article is far better.  I wish I had thought to look there instead.

  http://en.wikipedia.org/wiki/Unicode

- The docs should say what happens when a large unicode character (one
  above U+FFFF) winds up in a Py_UNICODE array.  For instance, what is
  len(u'\U00012345')?  1 or 2?  Does the answer depend on the UCS4
  compile-time switch?

- The docs should help developers evaluate whether they need the UCS4
  compile-time switch.  Is UCS2 good enough for Asia?  For math?  For
  hieroglyphics? <wink>

Shane
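
To make the len(u'\U00012345') question concrete, here is a rough sketch of the arithmetic involved. It assumes a current Python 3 interpreter, where the narrow/wide distinction no longer exists, so it only simulates what a 16-bit Py_UNICODE array would have to hold. The helper names utf16_code_units and surrogate_pair are made up for this illustration and are not part of any Python API.

```python
import sys

def utf16_code_units(text):
    """Count the 16-bit units needed to store text as UTF-16.

    This approximates the length a narrow (UCS2) build would report.
    """
    return len(text.encode("utf-16-le")) // 2

def surrogate_pair(code_point):
    """Split a code point above U+FFFF into its (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

ch = "\U00012345"
print("sys.maxunicode:", hex(sys.maxunicode))
print("code points:", len(ch))
print("16-bit units:", utf16_code_units(ch))
print("surrogates:", [hex(u) for u in surrogate_pair(ord(ch))])
```

On the Python 2 builds discussed in this thread, the answer does depend on the compile-time switch: a narrow (UCS2) build stores the character as a surrogate pair and reports a length of 2, while a wide (UCS4) build reports 1. Checking sys.maxunicode (65535 on narrow, 1114111 on wide) tells you which build you are running.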