On Sep 7, 2008, at 6:55 PM, Guido van Rossum wrote:
One possibility that occurs to me is to use a PyVarObject variant that allocates space for an additional void pointer before the variable sized section of the object. The builtin type would leave that pointer NULL, but subtypes could perform the second allocation needed to populate it.

The question is whether the 4-8 bytes wasted per object would be worth
the fact that only one memory allocation would be needed.

I believe that 4-8 bytes is more than the overhead of an extra memory
allocation from the obmalloc heap. It is probably about the same as
the overhead for a memory allocation from the regular malloc heap. So
for short strings (of which there are often a lot) it would be more
expensive; for longer objects it would probably work out just about
the same.
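
To make the cost concrete, here is a minimal C sketch of the extra-pointer layout. The names (var_head, plain_str, slotted_str, slotted_new, slot_overhead) are hypothetical stand-ins, not CPython's real object layout. It only illustrates that the slot costs one pointer per object whether or not a subtype ever fills it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a PyVarObject-style header (assumption,
   not the real CPython struct). */
typedef struct {
    size_t refcnt;
    size_t size;          /* number of characters in the variable part */
} var_head;

/* Plain layout: the character data follows the header directly. */
typedef struct {
    var_head head;
    char data[];          /* C99 flexible array member */
} plain_str;

/* Layout with one spare pointer slot ahead of the character data. */
typedef struct {
    var_head head;
    void *extra;          /* NULL for the builtin type; a subtype may fill it */
    char data[];
} slotted_str;

/* Allocate a slotted string in a single allocation; the slot starts NULL. */
slotted_str *slotted_new(const char *text)
{
    assert(text != NULL);
    size_t n = strlen(text);
    slotted_str *s = malloc(offsetof(slotted_str, data) + n + 1);
    if (s == NULL)
        return NULL;
    s->head.refcnt = 1;
    s->head.size = n;
    s->extra = NULL;
    memcpy(s->data, text, n + 1);   /* copy the terminating NUL as well */
    return s;
}

/* Bytes added to every object by the spare slot. */
size_t slot_overhead(void)
{
    return offsetof(slotted_str, data) - offsetof(plain_str, data);
}
```

On platforms where size_t and pointers have the same size, slot_overhead() comes out to sizeof(void *), which is the 4-8 bytes under discussion.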

There could be a different approach though, whereby the offset from
the start of the object to the start of the character array wasn't a
constant but a value stored in the class object. (In fact,
tp_basicsize could probably be used for this.) It would slow down
access to the characters a bit though -- a classic time-space
trade-off that would require careful measurement in order to decide
which is better.
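
A rough sketch of that alternative, again with hypothetical names (mini_type, mini_head, tagged_head, mini_chars, mini_new) rather than the real CPython structures: the type records how many bytes precede the character data, and every access goes through the type to find them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature type object; basicsize plays the role that
   tp_basicsize might play in the idea above (assumption). */
typedef struct mini_type {
    const char *name;
    size_t basicsize;     /* bytes from object start to the characters */
} mini_type;

/* Header shared by the base type and its subtypes. */
typedef struct {
    const mini_type *type;
    size_t size;
} mini_head;

/* A subtype that adds a field between the header and the characters. */
typedef struct {
    mini_head head;
    int tag;
} tagged_head;

const mini_type base_type   = { "str",        sizeof(mini_head)   };
const mini_type tagged_type = { "tagged_str", sizeof(tagged_head) };

/* Locate the characters: one extra load through the type per access,
   instead of a compile-time constant offset. */
char *mini_chars(void *obj)
{
    mini_head *h = obj;
    return (char *)obj + h->type->basicsize;
}

/* Allocate header, any subtype fields, and characters in one block. */
void *mini_new(const mini_type *type, const char *text)
{
    assert(type != NULL && text != NULL);
    size_t n = strlen(text);
    mini_head *h = calloc(1, type->basicsize + n + 1);
    if (h == NULL)
        return NULL;
    h->type = type;
    h->size = n;
    memcpy(mini_chars(h), text, n + 1);
    return h;
}
```

The base type pays nothing in space, but mini_chars() has to dereference the type before it can reach the data, which is the time side of the trade-off.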


Given that you can, today, subclass str in Python, without wasting an extra 4/8 bytes of memory, or adding anything new to the class object, why wouldn't anyone who really wanted to make a hypothetical optimized subclass just use the same mechanism (putting your additional data *after* the character data) to subclass it in C?
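
As an illustration of what "after the character data" could look like in C, here is a small sketch. The names (tail_str, align_up, tail_offset, tail_new, tail_extra) are made up for this example and the layout is a simplified stand-in, not the actual CPython str struct; it also assumes that rounding up to sizeof(void *) is sufficient alignment for the trailing slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a variable-sized string object (assumption). */
typedef struct {
    size_t refcnt;
    size_t size;          /* number of characters, excluding the NUL */
    char data[];          /* C99 flexible array member */
} tail_str;

/* Round n up to a multiple of the pointer size. */
static size_t align_up(size_t n)
{
    size_t a = sizeof(void *);
    return (n + a - 1) / a * a;
}

/* Byte offset of the subtype's slot: just past the characters and NUL. */
size_t tail_offset(size_t nchars)
{
    return align_up(offsetof(tail_str, data) + nchars + 1);
}

/* One allocation holds header, characters, and the trailing slot. */
tail_str *tail_new(const char *text, void *extra)
{
    assert(text != NULL);
    size_t n = strlen(text);
    size_t off = tail_offset(n);
    tail_str *s = malloc(off + sizeof(void *));
    if (s == NULL)
        return NULL;
    s->refcnt = 1;
    s->size = n;
    memcpy(s->data, text, n + 1);
    memcpy((char *)s + off, &extra, sizeof extra);
    return s;
}

/* Fetch the slot; its position depends on the string's length. */
void *tail_extra(const tail_str *s)
{
    void *extra;
    memcpy(&extra, (const char *)s + tail_offset(s->size), sizeof extra);
    return extra;
}
```

The base layout is untouched and objects of the base type carry no extra bytes; the price is that the slot's position has to be computed from the length on each access.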

It may be a little tricky, but not exactly rocket science, and given that all these C subclasses of str are so far hypothetical, just leaving it as "it's possible" seems perfectly reasonable...

James
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000