Marc-Andre Lemburg added the comment:

Your microbenchmark is biased towards your patched version: the KEEPALIVE_SIZE_LIMIT only kicks in when you deallocate and then reallocate Unicode objects. The free list used for Unicode objects is also limited to 1024 objects, which isn't all that much. You could tune MAX_UNICODE_FREELIST_SIZE as well.
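For reference, here is a minimal sketch of how that free list operates. The macro names mirror the real ones in unicodeobject.c, but the struct and logic are heavily simplified; this is not the actual CPython code:

    #include <stdlib.h>

    #define KEEPALIVE_SIZE_LIMIT       9    /* keep buffers up to this many
                                               chars (9 was the default) */
    #define MAX_UNICODE_FREELIST_SIZE  1024 /* cache at most this many objects */

    typedef struct unicode_obj {
        struct unicode_obj *next;   /* free-list link */
        size_t length;              /* code units in str */
        unsigned short *str;        /* separately allocated buffer */
    } unicode_obj;

    static unicode_obj *freelist = NULL;
    static size_t freelist_size = 0;

    /* Deallocation: objects are parked on the free list instead of being
     * returned to the allocator; large buffers are always released. */
    static void unicode_dealloc(unicode_obj *u)
    {
        if (freelist_size < MAX_UNICODE_FREELIST_SIZE) {
            if (u->length > KEEPALIVE_SIZE_LIMIT) {
                free(u->str);       /* too big to keep: drop the buffer */
                u->str = NULL;
            }
            u->next = freelist;
            freelist = u;
            freelist_size++;
        }
        else {
            free(u->str);
            free(u);
        }
    }

    /* Allocation: reuse a parked object (and, if possible, its kept
     * buffer) before falling back to malloc. */
    static unicode_obj *unicode_new(size_t length)
    {
        unicode_obj *u;

        if (freelist != NULL) {
            u = freelist;
            freelist = u->next;
            freelist_size--;
            if (u->str != NULL && u->length < length) {
                free(u->str);       /* kept buffer is too small */
                u->str = NULL;
            }
        }
        else {
            u = malloc(sizeof(*u));
            if (u == NULL)
                return NULL;
            u->str = NULL;
        }
        if (u->str == NULL) {
            u->str = malloc((length + 1) * sizeof(*u->str));
            if (u->str == NULL) {
                free(u);
                return NULL;
            }
        }
        u->length = length;
        return u;
    }

A dealloc-then-realloc microbenchmark spends almost all of its time in the two fast paths above, which is why it rewards a larger KEEPALIVE_SIZE_LIMIT so strongly.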
Regarding memory usage: this is difficult to measure in Python, since pymalloc will keep memory chunks allocated even if they are not in use by Python. However, this is a feature of pymalloc and not specific to the Unicode implementation; it can be tuned in pymalloc. To get more realistic memory measurements, you'd have to switch off pymalloc altogether and then run a separate memory-hungry process alongside, forcing the OS to give the process you're measuring only the memory it really needs.

Of course, keeping objects alive in a free list will always use more memory than freeing them altogether and returning the memory to the OS. It's a speed/space tradeoff. The RAM/CPU cost ratio has shifted a lot towards RAM nowadays, so spending more RAM is usually more efficient than spending more CPU time.

Regarding resize: you're right - the string object is a PyVarObject as well and couldn't be changed at the time for backwards-compatibility reasons. You should also note that when I added Unicode to Python 1.6, it was a new and not commonly used type, and codecs were not used much either, so there was no incentive to make resizing strings work better. Later on, other optimizations were added to the Unicode implementation that caused the PyUnicode_Resize() API to also require being able to change the object address. Still, in the common case, it doesn't change the object address.

The reason for using an external buffer for the Unicode object was to leave room for further optimizations, such as sharing buffers between Unicode objects. We never ended up using this, but the design still leaves a lot of room for speedups and better memory efficiency.

Like I already mentioned, PyObjects are also easier to extend at the C level: adding new variables at the end of the object is easy with PyObjects. It's difficult for PyVarObjects, since you always have to take the current size of the object into account, and you always have to go through an indirection to reach the extra variables, because their offset differs from object to object (see the sketch at the end of this message).

How much speedup do you get when you compare the pybench test with KEEPALIVE_SIZE_LIMIT = 200 against your patched version?

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1943>
__________________________________
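As a rough illustration of the PyVarObject point above - illustrative structs only, not the real CPython definitions. (Note that PyUnicode_Resize() takes a PyObject ** precisely because the object address may change.)

    #include <stddef.h>

    /* Fixed-size object (PyObject-style): the character data lives
     * behind a pointer, so new fields can simply be appended to the
     * struct at fixed, compile-time offsets. */
    typedef struct {
        long            refcnt;
        size_t          length;
        unsigned short *str;     /* external buffer */
        long            hash;    /* extra fields append cleanly */
        void           *defenc;
    } fixed_unicode;

    /* Variable-size object (PyVarObject-style): the data is allocated
     * inline at the end of the object, as in the string type. Anything
     * stored behind the tail sits at a per-object offset. */
    typedef struct {
        long   refcnt;
        size_t ob_size;          /* number of items in the tail */
        char   ob_sval[1];       /* variable-length tail */
    } var_string;

    /* A hypothetical extra field placed after the tail can only be
     * reached through pointer arithmetic that depends on ob_size
     * (a real implementation would also have to align this offset): */
    static long *var_extra_field(var_string *s)
    {
        return (long *)(s->ob_sval + s->ob_size + 1 /* trailing NUL */);
    }

The fixed-size layout is what makes it cheap to grow the object over time: a new field like a cached hash is just one more struct member, with no per-object offset computation anywhere.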