On Sep 7, 2008, at 6:55 PM, Guido van Rossum wrote:
One possibility that occurs to me is to use a PyVarObject variant that
allocates space for an additional void pointer before the
variable-sized section of the object. The builtin type would leave
that pointer NULL, but subtypes could perform the second allocation
needed to populate it. The question is whether the 4-8 bytes wasted
per object would be worth the fact that only one memory allocation
would be needed.
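For concreteness, the layout described above might look roughly like the following. This is only a sketch in plain C: var_head, plain_str, slot_str, slot_str_new and slot_overhead are hypothetical names standing in for the real object header and string type, not CPython's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a variable-sized object header. */
typedef struct {
    size_t refcnt;
    size_t size;      /* number of characters, excluding the NUL */
} var_head;

/* Plain layout: the characters follow the header directly. */
typedef struct {
    var_head head;
    char data[];      /* variable-sized section */
} plain_str;

/* Proposed variant: one pointer slot before the characters. */
typedef struct {
    var_head head;
    void *extra;      /* NULL for the builtin type; a subtype fills it in */
    char data[];      /* variable-sized section */
} slot_str;

/* Allocate a slot_str holding a copy of s, with the slot left NULL. */
static slot_str *slot_str_new(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    slot_str *o = malloc(sizeof(slot_str) + n + 1);
    if (o == NULL)
        return NULL;
    o->head.refcnt = 1;
    o->head.size = n;
    o->extra = NULL;
    memcpy(o->data, s, n + 1);
    return o;
}

/* Per-object cost of the slot: how far the characters are pushed back. */
static size_t slot_overhead(void)
{
    return offsetof(slot_str, data) - offsetof(plain_str, data);
}
```

On common platforms slot_overhead() comes out to sizeof(void *), which is the 4-8 bytes per object in question.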
I believe that 4-8 bytes is more than the overhead of an extra memory
allocation from the obmalloc heap. It is probably about the same as
the overhead for a memory allocation from the regular malloc heap. So
for short strings (of which there are often a lot) it would be more
expensive; for longer objects it would probably work out just about
the same.
There could be a different approach though, whereby the offset from
the start of the object to the start of the character array wasn't a
constant but a value stored in the class object. (In fact,
tp_basicsize could probably be used for this.) It would slow down
access to the characters a bit though -- a classic time-space
trade-off that would require careful measurement in order to decide
which is better.
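A rough sketch of that per-type offset idea, again in plain C rather than against the real headers. The names type_info, obj_head, base_str, sub_str, str_chars and str_new are all hypothetical, and basicsize here only models the role suggested for tp_basicsize.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type record: bytes that precede the character data. */
typedef struct {
    size_t basicsize;
} type_info;

/* Hypothetical object header pointing back at its type. */
typedef struct {
    const type_info *type;
    size_t size;              /* number of characters, excluding the NUL */
} obj_head;

/* Fixed parts of the builtin type and of a subtype with one extra field. */
typedef struct { obj_head head; } base_str;
typedef struct { obj_head head; void *extra; } sub_str;

static const type_info base_type = { sizeof(base_str) };
static const type_info sub_type  = { sizeof(sub_str) };

/* The offset is read from the type, so every access pays an extra load. */
static char *str_chars(obj_head *o)
{
    return (char *)o + o->type->basicsize;
}

/* One allocation covers the fixed part and the characters. */
static obj_head *str_new(const type_info *t, const char *s)
{
    assert(t != NULL && s != NULL);
    size_t n = strlen(s);
    obj_head *o = calloc(1, t->basicsize + n + 1);  /* zeroes extra fields */
    if (o == NULL)
        return NULL;
    o->type = t;
    o->size = n;
    memcpy(str_chars(o), s, n + 1);
    return o;
}
```

The builtin type pays no extra bytes here; the cost moves into str_chars(), which has to chase the type pointer before it can find the characters.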
Given that you can, today, subclass str in Python, without wasting an
extra 4/8 bytes of memory, or adding anything new to the class object,
why wouldn't anyone who really wanted to make a hypothetical optimized
subclass just use the same mechanism (putting your additional data
*after* the character data) to subclass it in C?
It may be a little tricky, but not exactly rocket science, and given
that all these C subclasses of str are so far hypothetical, just
leaving it as "it's possible" seems perfectly reasonable...
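To illustrate what I mean by putting the data after the characters, here is a minimal sketch in plain C. It does not use the real headers; str_obj, sub_extra, extra_offset, sub_get_extra and sub_new are hypothetical names, and rounding the offset up to sizeof(void *) is my assumption about what alignment the extra data needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the builtin string layout, left unchanged. */
typedef struct {
    size_t refcnt;
    size_t size;      /* number of characters, excluding the NUL */
    char data[];      /* characters at a constant offset, as before */
} str_obj;

/* Hypothetical per-instance data a C subclass wants to carry. */
typedef struct {
    void *payload;
} sub_extra;

/* Offset of the trailing data: end of the characters, rounded up so a
 * pointer stored there is aligned (assumes sizeof(void *) suffices). */
static size_t extra_offset(size_t nchars)
{
    size_t end = offsetof(str_obj, data) + nchars + 1;
    size_t a = sizeof(void *);
    return (end + a - 1) / a * a;
}

/* Locate the trailing data from the object's own length. */
static sub_extra *sub_get_extra(str_obj *o)
{
    return (sub_extra *)((char *)o + extra_offset(o->size));
}

/* One allocation: header, characters, then the subclass data. */
static str_obj *sub_new(const char *s, void *payload)
{
    assert(s != NULL);
    size_t n = strlen(s);
    str_obj *o = malloc(extra_offset(n) + sizeof(sub_extra));
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->size = n;
    memcpy(o->data, s, n + 1);
    sub_get_extra(o)->payload = payload;
    return o;
}
```

The builtin type is untouched and access to the characters stays a constant offset; only the subclass pays, and only when it goes looking for its own data.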
James