Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

Martin v. Löwis Thu, 15 Sep 2011 14:41:11 -0700

I like it. If we start with such an optimization, we can also remove data
from strings allocated by the new API (it can be computed: object pointer +
size of the structure). See my email for my proposition of structures:
    Re: [Python-Dev] PEP 393 review
    Thu Aug 25 00:29:19 2011


I agree it is tempting to drop the data pointer. However, I'm not sure
how many different structures we would end up with, and how the aliasing
rules would defeat this (you cannot interpret a struct X* as a struct Y*,
unless either X is the first field of Y or vice versa).
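
To illustrate the aliasing constraint, here is a minimal sketch. The types X
and Y are named after the text above; their fields and the two helper
functions are hypothetical and only serve the illustration. It relies on the
C guarantee that a pointer to a struct, suitably converted, points to its
first member and vice versa.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; the field names are assumptions for illustration. */
typedef struct {
    int state;
    long length;
} X;

typedef struct {
    X base;      /* X is the first field of Y, so Y* may be viewed as X* */
    long extra;
} Y;

/* The first member always sits at offset 0 (C11 static_assert). */
static_assert(offsetof(Y, base) == 0, "X must be the first field of Y");

/* Valid: a pointer to Y, converted, points to its initial member. */
long x_length_of(Y *y)
{
    X *x = (X *)y;
    return x->length;
}

/* Valid only when the X passed in really is the first member of a Y. */
long y_extra_of(X *x)
{
    Y *y = (Y *)x;
    return y->extra;
}
```

Two unrelated structs that merely happen to start with the same fields do
not get this guarantee, which is why nesting matters here.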


Thinking about this, the following may work:
- ASCIIObject: state, length, hash, wstr*, data follow
- SingleBlockUnicode: ASCIIObject, wstr_len,
                      utf8*, utf8_len, data follow
- UnicodeObject: SingleBlockUnicode, data pointer, no data follow

This is essentially your proposal, except that the wstr_len is dropped
for ASCII strings, and that it uses nested structs.

The single-block variants would always be "ready", the full unicode
object is ready only if the data pointer is set.
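
As a rough sketch of the layout described above (not the final PEP 393
structures): the struct names follow the list, but the field types, the
encoding of "state" as a kind tag, and the helpers unicode_data() and
unicode_is_ready() are assumptions made for illustration. The usual Python
object header is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical values for the state field; the real encoding may differ. */
enum { KIND_ASCII, KIND_SINGLE_BLOCK, KIND_FULL };

typedef struct {
    unsigned int state;
    ptrdiff_t length;
    long hash;
    wchar_t *wstr;
    /* for ASCII strings, character data follows the struct */
} ASCIIObject;

typedef struct {
    ASCIIObject base;        /* nested, so the pointer casts are valid */
    ptrdiff_t wstr_len;
    char *utf8;
    ptrdiff_t utf8_len;
    /* for single-block strings, character data follows the struct */
} SingleBlockUnicode;

typedef struct {
    SingleBlockUnicode base;
    void *data;              /* separately allocated; NULL until ready */
} UnicodeObject;

static_assert(offsetof(SingleBlockUnicode, base) == 0, "nested first field");
static_assert(offsetof(UnicodeObject, base) == 0, "nested first field");

/* Compute the data address: object pointer + size of the structure
   for the single-block variants, the stored pointer otherwise. */
void *unicode_data(void *op)
{
    ASCIIObject *a = (ASCIIObject *)op;
    switch (a->state) {
    case KIND_ASCII:
        return (char *)op + sizeof(ASCIIObject);
    case KIND_SINGLE_BLOCK:
        return (char *)op + sizeof(SingleBlockUnicode);
    default:
        return ((UnicodeObject *)op)->data;
    }
}

/* Single-block variants are always ready; the full object is ready
   only once its data pointer has been set. */
int unicode_is_ready(void *op)
{
    return unicode_data(op) != NULL;
}
```

With this nesting, every object can be inspected through an ASCIIObject*
first, and only the full variant pays for the extra pointer.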


I'll try it out, unless somebody can punch a hole into this proposal :-)

Regards,
Martin

