On 6/6/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > FWIW, I don't buy that normalization is expensive, as most strings are
> > in NFC form anyway, and there are fast checks for that (see UAX#15,
> > "Detecting Normalization Forms"). Python does not currently have
> > a fast path for this, but if it's added, then normalizing everything
> > to NFC should be fast.
>
> That would be useful to have, anyway. Would you like to contribute it?

I implemented it for all normalization forms in the most straightforward way I could think of: I added a field to _PyUnicode_DatabaseRecord, generated the data for it in makeunicodedata.py from DerivedNormalizationProps.txt of UCD 4.1, and wrote a function is_normalized that uses it. The function is called from unicodedata_normalized. I made the modifications against py3k-struni. Does this sound reasonable?

I haven't made any contributions to Python before, but I hear that attempting such a hazardous activity involves lots of hard knocks :-)

Where should I send the patch? I saw some patches posted here in other threads, but then again http://www.python.org/dev/patches/ says to use SourceForge.
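
To make the idea concrete, here is a rough Python sketch of the quick check from UAX#15, restricted to NFD. It is not the C patch described above. Python does not expose the quick-check properties from DerivedNormalizationProps.txt, so the helper nfd_quick_check_no below is a hypothetical stand-in that approximates NFD_Quick_Check=No as "has a canonical decomposition, or is a precomposed Hangul syllable". The names is_nfd and to_nfd are likewise made up for illustration. NFC would additionally need the MAYBE answer, which this sketch leaves out.

```python
import unicodedata

# Precomposed Hangul syllables decompose algorithmically, so
# unicodedata.decomposition() reports nothing for them.
HANGUL_FIRST, HANGUL_LAST = 0xAC00, 0xD7A3


def nfd_quick_check_no(ch):
    """Hypothetical stand-in for the NFD_Quick_Check=No property."""
    if HANGUL_FIRST <= ord(ch) <= HANGUL_LAST:
        return True
    decomp = unicodedata.decomposition(ch)
    # Compatibility decompositions are tagged like "<noBreak> 0020";
    # only untagged (canonical) ones matter for NFD.
    return bool(decomp) and not decomp.startswith("<")


def is_nfd(text):
    """Quick check: True if text is already in NFD."""
    last_ccc = 0
    for ch in text:
        ccc = unicodedata.combining(ch)
        # Combining marks must appear in non-decreasing class order.
        if ccc != 0 and last_ccc > ccc:
            return False
        if nfd_quick_check_no(ch):
            return False
        last_ccc = ccc
    return True


def to_nfd(text):
    """Fast path: skip the full normalization when the check passes."""
    return text if is_nfd(text) else unicodedata.normalize("NFD", text)
```

With a per-character lookup like this, already-normalized strings cost one pass over the text and no allocation, which is the case the quoted message expects to be the common one.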