On 2011-08-23, at 10:55, Martin v. Löwis wrote:
>> - “The UTF-8 decoding fast path for ASCII only characters was removed
>> and replaced with a memcpy if the entire string is ASCII.”
>> The fast path would still be useful for mostly-ASCII strings, which
>> are extremely common (unless UTF-8 has become a no-op?).
>
> Is it really extremely common to have strings that are mostly-ASCII but
> not completely ASCII? I would agree that pure ASCII strings are
> extremely common.
Mostly-ASCII text is pretty common for Western European languages (French,
for instance, is probably 90 to 95% ASCII). It's also a risk in English, when
the writer "correctly" spells foreign words (résumé and the like).
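
A rough way to check that kind of figure on your own text, as a minimal
sketch: the helper name ascii_fraction and the sample sentence are made up
for illustration, and it counts code points rather than UTF-8 bytes, so it
is not a measurement of anything in the decoder itself.

```python
def ascii_fraction(text: str) -> float:
    """Return the fraction of code points in `text` that are ASCII (< 128)."""
    if not text:
        # Treat the empty string as trivially all-ASCII.
        return 1.0
    ascii_count = sum(1 for ch in text if ord(ch) < 128)
    return ascii_count / len(text)


# Hypothetical sample sentence, not taken from any corpus.
sample = "Le résumé a été envoyé à l'éditeur hier."
print(f"{ascii_fraction(sample):.0%} of the code points are ASCII")

# A single accented character is enough to make the whole string non-ASCII,
# which is the case an all-or-nothing check does not help with.
print(all(ord(ch) < 128 for ch in sample))
```

Running this over a larger body of French text would give a better idea of
whether the 90 to 95% guess holds.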
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev