Patches item #1734234, was opened at 2007-06-10 01:45
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1734234&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Modules
Group: Python 2.6
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Rauli Ruohonen (raulir)
Assigned to: Nobody/Anonymous (nobody)
Summary: Fast path for unicodedata.normalize()

Initial Comment:
Implements quick checking of already normalized forms as described in
http://unicode.org/reports/tr15/#Annex8

The patch is against 2.6 SVN trunk. Normalization test passes on both
UCS2 and UCS4 builds on Ubuntu Edgy.

API affected: unicodedata.normalize('NFC', u'a') is u'a' and similar
expressions become true, as the unicode object is not copied when it is
found to be already normalized. The documentation does not specify either
way.

Added memory footprint: A new 8-bit field is added to
_PyUnicode_DatabaseRecord, and the generated _PyUnicode_Database_Records
array grows from 219 records to 304 records. Each record looks like this:

    typedef struct {
        const unsigned char category;
        const unsigned char combining;
        const unsigned char bidirectional;
        const unsigned char mirrored;
        const unsigned char east_asian_width;
        const unsigned char normalization_quick_check;
    } _PyUnicode_DatabaseRecord;

normalization_quick_check is the added field.
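
The fast path described above can be illustrated from the Python side. The following is only a sketch under stated assumptions, not the patch's C code: it targets modern Python 3 (3.8 or later, where unicodedata.is_normalized() is available) rather than the 2.6 trunk the patch was written against, and normalize_fast is a hypothetical helper name introduced here for illustration.

```python
import unicodedata


def normalize_fast(form, text):
    """Hypothetical helper: skip normalization for already normalized text.

    Mirrors the idea of the patch at the Python level: if the string is
    already in the requested form, hand back the very same object
    instead of building a copy.
    """
    # is_normalized() (Python 3.8+) reports whether text is already in
    # the given form, so the common case avoids a full normalization.
    if unicodedata.is_normalized(form, text):
        return text
    return unicodedata.normalize(form, text)


already_nfc = "caf\u00e9"   # precomposed U+00E9
decomposed = "cafe\u0301"   # "e" followed by combining acute U+0301

same = normalize_fast("NFC", already_nfc)
composed = normalize_fast("NFC", decomposed)
```

With this helper, the identity check from the report (written there in Python 2 syntax as unicodedata.normalize('NFC', u'a') is u'a') holds by construction for already normalized input, because the original object is returned. Code should still not depend on object identity from unicodedata.normalize() itself, since, as the report notes, the documentation does not specify it either way.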