> The additional field is 8 bits, two bits for each normalization (a
> Yes/Maybe/No value). In Unicode 4.1 only 5 different combinations are
> used, but I don't know if that's true of later versions. As
> _PyUnicode_Database_Records stores only unique records, this also results
> in an increase of the number of records, from 219 to 304. Each record
> looks like this:

If I count correctly, this gives roughly 900 additional bytes. That's fine.

> It doesn't affect behavior or the API much(*), only performance. Current
> test_normalize.py uses a test suite it fetches from UCD, so it
> should be adequate.

I assumed you want to expose it to Python also, as an is_normalized
function. I guess not having such a function is fine if applications can
do normalize(form, s) == s and have that be efficient as long as the
outcome is true (i.e. if it is more expensive only when the string is
not already normalized).

Regards,
Martin
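
For illustration, the normalize(form, s) == s check mentioned above can be written as a small helper. This is only a sketch: the name is_normalized is borrowed from the message as a hypothetical helper, and the code uses nothing beyond unicodedata.normalize. It makes no claim about how the quick-check field would be used internally.

```python
import unicodedata

# The four normalization forms accepted by unicodedata.normalize.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def is_normalized(form, s):
    """Hypothetical helper: report whether s is already in the given form.

    Implemented naively by normalizing and comparing, which is the
    application-level idiom discussed in the message.
    """
    if form not in FORMS:
        raise ValueError(f"unknown normalization form: {form!r}")
    return unicodedata.normalize(form, s) == s


composed = "caf\u00e9"      # precomposed LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

print(is_normalized("NFC", composed))    # True
print(is_normalized("NFC", decomposed))  # False
print(is_normalized("NFD", decomposed))  # True
```

Written this way, the check always pays for a full normalization pass. The point made in the message is that the comparison is acceptable as an idiom only if the library makes the already-normalized case cheap.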