Marc-Andre Lemburg added the comment: I don't really see the connection with #1629305.
An optimization that would be worth checking is hooking up the Py_UNICODE pointer to interned Unicode objects if the contents match (e.g. do a quick check based on the hash value and then a memcmp; a sketch of this follows below). That would save memory and the call to the pymalloc allocator.

Another strategy could involve a priority-queue-style cache with the aim of identifying often-used Unicode strings and then reusing them. This could also be enhanced using an offline approach: you first run an application with an instrumented Python interpreter to find the most often used strings, then pre-fill the cache or interned dictionary on the production Python interpreter at startup time.

Coming from a completely different angle, you could also use the Py_UNICODE pointer to share slices of a larger data buffer. A Unicode sub-type could handle this case, keeping a PyObject* reference to the larger buffer so that it doesn't get garbage collected before the Unicode slice (see the second sketch below).

Regarding memory-constrained environments: these should simply switch off all free lists and pymalloc (see the build note below). OTOH, even mobile phones come with gigabytes of RAM nowadays, so it's not really worth the trouble, IMHO.
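To make the interning idea concrete, here is a minimal sketch in plain C rather than real CPython internals. The struct ustr, intern_table and function names are all invented for illustration; the point is the order of the checks: the cheap hash and length comparisons filter out nearly all non-matches before the memcmp, and only on a confirmed match is the private buffer released and the interned one shared.

    /* Hypothetical sketch; none of these names exist in CPython. */
    #include <stdlib.h>
    #include <string.h>
    #include <wchar.h>   /* wchar_t stands in for Py_UNICODE */

    struct ustr {
        size_t len;
        unsigned long hash;   /* assumed precomputed on creation */
        wchar_t *buf;         /* may end up shared with an interned string */
        struct ustr *owner;   /* interned string owning buf, or NULL */
    };

    #define TABLE_SIZE 1024
    static struct ustr *intern_table[TABLE_SIZE];  /* toy intern table */

    /* Try to share s->buf with an equal interned string. */
    static void
    ustr_try_share(struct ustr *s)
    {
        struct ustr *cand = intern_table[s->hash % TABLE_SIZE];
        if (cand != NULL &&
            cand->hash == s->hash &&             /* quick check on the hash */
            cand->len == s->len &&
            memcmp(cand->buf, s->buf,
                   s->len * sizeof(wchar_t)) == 0) {  /* then the memcmp */
            free(s->buf);        /* return the private copy to the allocator */
            s->buf = cand->buf;  /* hook up the shared buffer */
            s->owner = cand;     /* real code would Py_INCREF the owner */
        }
    }

In the interpreter itself the buffer would come from pymalloc and the owner field would be a normal reference, so the saving is both the buffer memory and the allocator call, as noted above.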
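The slice idea can be sketched the same way, again in plain C with invented names; the explicit refcnt field stands in for Py_INCREF/Py_DECREF on the PyObject* reference that the Unicode sub-type would hold:

    /* Hypothetical sketch of buffer-sharing slices. */
    #include <stdlib.h>
    #include <wchar.h>

    struct ubuf {             /* the larger data buffer */
        int refcnt;
        size_t len;
        wchar_t data[];       /* flexible array member holding the text */
    };

    struct uslice {           /* the "Unicode sub-type" of the idea */
        struct ubuf *parent;  /* keeps the big buffer alive */
        wchar_t *start;       /* points into parent->data; no copy made */
        size_t len;
    };

    /* Create a slice; offset and len are assumed to be in bounds. */
    static struct uslice *
    uslice_new(struct ubuf *parent, size_t offset, size_t len)
    {
        struct uslice *s = malloc(sizeof(*s));
        if (s == NULL)
            return NULL;
        parent->refcnt++;     /* Py_INCREF(parent) in real code */
        s->parent = parent;
        s->start = parent->data + offset;
        s->len = len;
        return s;
    }

    static void
    uslice_free(struct uslice *s)
    {
        if (--s->parent->refcnt == 0)   /* Py_DECREF(parent) */
            free(s->parent);
        free(s);
    }

Because each slice holds a reference, the parent buffer can only be freed after the last slice into it is gone, which is exactly the lifetime guarantee the sub-type needs.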
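As for switching things off: --without-pymalloc is an existing configure switch, while disabling the Unicode free list means building with its size limit set to 0. The macro name below is from memory of the 2.x Objects/unicodeobject.c sources and worth double-checking:

    ./configure --without-pymalloc CFLAGS="-DMAX_UNICODE_FREELIST_SIZE=0"
    make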