On 24/08/2011 04:41, Torsten Becker wrote:
> On Tue, Aug 23, 2011 at 18:27, Victor Stinner
> <victor.stin...@haypocalc.com> wrote:
>> I posted a patch to re-add it:
>> http://bugs.python.org/issue12819#msg142867
>
> Thank you for the patch!  Note that this patch adds the fast path only
> to the helper function which determines the length of the string and
> the maximum character.  The decoding part is still without a fast path
> for ASCII runs.

Ah? If utf8_max_char_size_and_has_errors() reports no error and maxchar is below 128 (e.g. maxchar=127), memcpy() is used. Do you mean that memcpy() is too slow? :-)

maxchar = utf8_max_char_size_and_has_errors(s, size, &unicode_size,
                                            &has_errors);
if (has_errors) {
    ...
}
else {
    unicode = (PyUnicodeObject *)PyUnicode_New(unicode_size, maxchar);
    if (!unicode)
        return NULL;
    /* When the string is ASCII only, just use memcpy and return. */
    if (maxchar < 128) {
        assert(unicode_size == size);
        Py_MEMCPY(PyUnicode_1BYTE_DATA(unicode), s, unicode_size);
        return (PyObject *)unicode;
    }
    ...
}

But yes, my patch only optimizes ASCII-only strings, not "mostly-ASCII" strings (e.g. 100 ASCII characters + 1 latin1 character). That case can be optimized later. I haven't benchmarked my patch.
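
For the "mostly-ASCII" case, here is a rough sketch of what a fast path for a leading ASCII run might look like. It is not taken from the patch: ascii_prefix_length() and copy_ascii_prefix() are hypothetical helper names, and the sketch assumes the destination buffer stores one byte per character (maxchar < 256), so that ASCII bytes can be copied unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of CPython): return the length of the
   leading run of ASCII bytes (values below 0x80) in s[0..size). */
static size_t
ascii_prefix_length(const unsigned char *s, size_t size)
{
    size_t i = 0;
    while (i < size && s[i] < 0x80)
        i++;
    return i;
}

/* Hypothetical helper: copy the leading ASCII run of s into dst with a
   single memcpy() and return the number of bytes copied.  Assumes dst
   holds one byte per character and has room for at least size bytes.
   The caller would resume the regular UTF-8 decoder at s + n. */
static size_t
copy_ascii_prefix(unsigned char *dst, const unsigned char *s, size_t size)
{
    size_t n = ascii_prefix_length(s, size);
    assert(dst != NULL || n == 0);
    if (n > 0)
        memcpy(dst, s, n);
    return n;
}
```

With the input "abc" followed by the two UTF-8 bytes 0xC3 0xA9, copy_ascii_prefix() would copy the 3 ASCII bytes and leave the remaining 2 bytes to the normal decoding loop. Whether this is actually faster than the existing loop would have to be benchmarked.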

Victor