Re: [Python-3000] String comparison

Martin v. Löwis Fri, 08 Jun 2007 10:36:38 -0700

> The additional field is 8 bits, two bits for each normalization (a
> Yes/Maybe/No value). In Unicode 4.1 only 5 different combinations are
> used, but I don't know if that's true of later versions. As
> _PyUnicode_Database_Records stores only unique records, this also results
> in an increase of the number of records, from 219 to 304. Each record
> looks like this:


If I count correctly, this gives roughly 900 additional bytes. That's
fine.
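
To make the quoted description concrete, here is a minimal sketch of how four Yes/Maybe/No values could share one 8-bit field. The form order, the numeric encoding, and the helper names are all assumptions for illustration; the actual record layout is not shown in this message.

```python
# Hypothetical layout: two bits per normalization form, packed into one byte.
FORMS = ("NFD", "NFKD", "NFC", "NFKC")  # order is an assumption
YES, MAYBE, NO = 0, 1, 2                # numeric encoding is an assumption


def pack_quick_check(values):
    """Pack a {form: YES/MAYBE/NO} mapping into a single 8-bit integer."""
    packed = 0
    for i, form in enumerate(FORMS):
        packed |= (values[form] & 0b11) << (2 * i)
    return packed


def unpack_quick_check(packed, form):
    """Extract the two-bit value stored for one normalization form."""
    return (packed >> (2 * FORMS.index(form))) & 0b11


field = pack_quick_check({"NFD": YES, "NFKD": NO, "NFC": MAYBE, "NFKC": NO})
```

Because each form takes two bits, the whole field always fits in one byte per record.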

> It doesn't affect behavior or the API much(*), only performance. Current
> test_normalize.py uses a test suite it fetches from UCD, so it
> should be adequate.

I assumed you also wanted to expose it to Python, as an is_normalized
function. I guess not having such a function is fine if applications
can do normalize(form, s) == s and have that be efficient whenever
the outcome is true (i.e. if it is expensive only when the string is
not normalized).
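
The application-level check would look roughly like this. The helper name is_normalized is only illustrative; the sketch uses nothing beyond unicodedata.normalize, and whether the comparison is cheap for already-normalized input depends on the implementation discussed above.

```python
import unicodedata


def is_normalized(form, s):
    """Return True if s is already in the given normalization form.

    Illustrative helper: it normalizes and compares, so its cost is
    whatever unicodedata.normalize costs for the input.
    """
    return unicodedata.normalize(form, s) == s


composed = "L\u00f6wis"     # o with diaeresis as one code point (NFC)
decomposed = "Lo\u0308wis"  # o followed by a combining diaeresis (NFD)

nfc_ok = is_normalized("NFC", composed)
nfc_bad = is_normalized("NFC", decomposed)
```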

Regards,
Martin