Re: [Python-Dev] PEP 393 Summer of Code Project

Victor Stinner Wed, 24 Aug 2011 01:20:33 -0700

On 24/08/2011 04:41, Torsten Becker wrote:

On Tue, Aug 23, 2011 at 18:27, Victor Stinner
<victor.stin...@haypocalc.com>  wrote:

I posted a patch to re-add it:
http://bugs.python.org/issue12819#msg142867


Thank you for the patch!  Note that this patch adds the fast path only
to the helper function which determines the length of the string and
the maximum character.  The decoding part is still without a fast path
for ASCII runs.

Ah? If utf8_max_char_size_and_has_errors() returns no error and maxchar=127, memcpy() is used. Do you mean that memcpy() is too slow? :-)


maxchar = utf8_max_char_size_and_has_errors(s, size, &unicode_size,
                                            &has_errors);
if (has_errors) {
    ...
}
else {
    unicode = (PyUnicodeObject *)PyUnicode_New(unicode_size, maxchar);
    if (!unicode)
        return NULL;
    /* When the string is ASCII only, just use memcpy and return. */
    if (maxchar < 128) {
        assert(unicode_size == size);
        Py_MEMCPY(PyUnicode_1BYTE_DATA(unicode), s, unicode_size);
        return (PyObject *)unicode;
    }
    ...
}
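
To make the ASCII-only case concrete outside of CPython, here is a minimal standalone sketch of the same idea: scan the input once, and if no byte has its high bit set, copy it with a single memcpy(). The names is_ascii_only() and copy_if_ascii() are hypothetical and are not part of the patch; the eight-bytes-at-a-time check is one possible way to speed up the scan, not necessarily what the helper in the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): returns 1 if every byte in
   s[0..size) is below 0x80, i.e. the buffer is pure ASCII. */
static int is_ascii_only(const unsigned char *s, size_t size)
{
    size_t i = 0;

    /* Check eight bytes at a time.  memcpy() into a local word avoids
       unaligned reads; any high bit set means a non-ASCII byte. */
    for (; i + sizeof(uint64_t) <= size; i += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, s + i, sizeof(word));
        if (word & UINT64_C(0x8080808080808080))
            return 0;
    }
    /* Check the remaining tail byte by byte. */
    for (; i < size; i++) {
        if (s[i] & 0x80)
            return 0;
    }
    return 1;
}

/* Hypothetical wrapper: copies src to dst only when src is pure ASCII.
   Returns 1 after copying, 0 if a non-ASCII byte was found (dst is
   left untouched in that case). */
static int copy_if_ascii(unsigned char *dst, const unsigned char *src,
                         size_t size)
{
    assert(dst != NULL && src != NULL);
    if (!is_ascii_only(src, size))
        return 0;
    memcpy(dst, src, size);
    return 1;
}
```

In this sketch a non-ASCII input simply makes copy_if_ascii() return 0, which is where a real decoder would fall back to its general path.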

But yes, my patch only optimizes ASCII-only strings, not "mostly-ASCII" strings (e.g. 100 ASCII characters + 1 latin1 character). That can be optimized later. I didn't benchmark my patch.
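
For the "mostly-ASCII" case, one possible shape of a later optimization is sketched below: copy each ASCII run with memcpy() and decode only the non-ASCII sequences one by one. This is an untested-for-speed illustration, not the patch; decode_utf8_to_latin1() is a hypothetical name, and the sketch assumes the output uses one byte per character, so it rejects anything above U+00FF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoder (not from the patch): converts UTF-8 in src to one
   byte per character in dst, accepting only code points up to U+00FF.
   dst must have room for at least size bytes.  Returns the number of
   characters written, or -1 on malformed input or a larger code point. */
static long decode_utf8_to_latin1(unsigned char *dst,
                                  const unsigned char *src, size_t size)
{
    size_t i = 0, out = 0;

    assert(dst != NULL && src != NULL);
    while (i < size) {
        /* Fast path: find the end of the current ASCII run and copy it
           in one memcpy() call. */
        size_t run = i;
        while (run < size && src[run] < 0x80)
            run++;
        if (run > i) {
            memcpy(dst + out, src + i, run - i);
            out += run - i;
            i = run;
            continue;
        }
        /* Slow path: only the lead bytes 0xC2 and 0xC3, followed by one
           continuation byte, encode U+0080..U+00FF. */
        if ((src[i] == 0xC2 || src[i] == 0xC3) && i + 1 < size
            && (src[i + 1] & 0xC0) == 0x80) {
            dst[out++] = (unsigned char)(((src[i] & 0x1F) << 6)
                                         | (src[i + 1] & 0x3F));
            i += 2;
        }
        else {
            return -1;
        }
    }
    return (long)out;
}
```

With an input such as 100 ASCII characters followed by one latin1 character, this sketch would do one memcpy() for the run and one slow-path step for the last character. Whether that is actually faster than a plain byte loop would need a benchmark.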


Victor
