I like it. If we start with such an optimization, we can also remove the data
pointer from strings allocated by the new API (it can be computed: object
pointer + size of the structure). See my email for my proposed structures:
    Re: [Python-Dev] PEP 393 review
    Thu Aug 25 00:29:19 2011

I agree it is tempting to drop the data pointer. However, I'm not sure
how many different structures we would end up with, and how the aliasing
rules would defeat this (you cannot interpret a struct X* as a struct Y*, unless either X is the first field of Y or vice versa).

Thinking about this, the following may work:
- ASCIIObject: state, length, hash, wstr*, data follows
- SingleBlockUnicode: ASCIIObject, wstr_len,
                      utf8*, utf8_len, data follows
- UnicodeObject: SingleBlockUnicode, data pointer, no data follows

This is essentially your proposal, except that the wstr_len is dropped for ASCII strings, and that it uses nested structs.

The single-block variants would always be "ready"; the full unicode object is ready only if the data pointer is set.
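
As a rough sketch of the layout above, assuming C field types that the list does not specify: the struct names follow the list, while the `layout` enum stored in `state`, the helpers `unicode_data`, `unicode_is_ready`, `new_ascii` and `new_full`, and all field types are hypothetical and only illustrate the idea. Because each struct is the first member of the next one, converting between the pointer types stays within the aliasing rules, and the data address of a single-block object is computed rather than stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical tag kept in "state"; the real state bits are not specified here. */
enum layout { LAYOUT_ASCII, LAYOUT_SINGLE_BLOCK, LAYOUT_FULL };

typedef struct {
    enum layout state;
    size_t length;
    long hash;              /* assumed type */
    wchar_t *wstr;
    /* character data follows the struct in the same block */
} ASCIIObject;

typedef struct {
    ASCIIObject base;       /* first member: pointer conversion is valid */
    size_t wstr_len;
    char *utf8;
    size_t utf8_len;
    /* character data follows the struct in the same block */
} SingleBlockUnicode;

typedef struct {
    SingleBlockUnicode base; /* first member again */
    void *data;              /* separate block; NULL until the object is ready */
} UnicodeObject;

/* Data address: computed for single-block objects, stored for full ones. */
static void *unicode_data(ASCIIObject *op)
{
    assert(op != NULL);
    switch (op->state) {
    case LAYOUT_ASCII:
        return op + 1;                        /* object pointer + struct size */
    case LAYOUT_SINGLE_BLOCK:
        return (SingleBlockUnicode *)op + 1;
    default:
        return ((UnicodeObject *)op)->data;
    }
}

/* Single-block variants are always ready; full objects need a data pointer. */
static int unicode_is_ready(ASCIIObject *op)
{
    return unicode_data(op) != NULL;
}

/* Allocate an ASCII string as one block: header followed by the bytes. */
static ASCIIObject *new_ascii(const char *text)
{
    size_t n = strlen(text);
    ASCIIObject *op = malloc(sizeof(ASCIIObject) + n + 1);
    if (op == NULL)
        return NULL;
    op->state = LAYOUT_ASCII;
    op->length = n;
    op->hash = -1;
    op->wstr = NULL;
    memcpy(op + 1, text, n + 1);
    return op;
}

/* Allocate a full object whose data block has not been attached yet. */
static ASCIIObject *new_full(size_t length)
{
    UnicodeObject *op = calloc(1, sizeof(UnicodeObject));
    if (op == NULL)
        return NULL;
    op->base.base.state = LAYOUT_FULL;
    op->base.base.length = length;
    op->base.base.hash = -1;
    op->data = NULL;
    return &op->base.base;
}
```

With this sketch, an object from `new_ascii("abc")` is ready immediately and its data starts at the object address plus `sizeof(ASCIIObject)`, whereas an object from `new_full(3)` reports not ready until its `data` member is set.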

I'll try it out, unless somebody can punch a hole into this proposal :-)

Regards,
Martin
