Re: [Python-3000] PyUnicodeObject implementation

Antoine Pitrou Sun, 07 Sep 2008 06:53:25 -0700

Stefan Behnel <stefan_ml <at> behnel.de> writes:
> 
> From a Cython perspective, I find the lack of efficient subclassing after such
> a change particularly striking. That seriously bit me in Py2 when I tried
> making XML text content a bit more intelligent in lxml (i.e. make it remember
> what XML element it originated from).


I've used a library that had adopted this kind of behaviour (I think it was
BeautifulSoup). After using it several times in a row, I noticed that my
program's memory consumption had exploded. The problem was that the library was
returning objects which looked innocently like strings but internally kept a
reference to a multi-megabyte HTML tree. The solution was to convert them
explicitly to str before storing them for later use, which defeated the point of
having a str-derived type.
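
To make the pitfall concrete, here is a minimal sketch. The names Tree and
TaggedText are made up for illustration; this is not BeautifulSoup's actual
code, only the general pattern of a str subclass that keeps a back-reference.

```python
import gc
import weakref


class Tree:
    """Stand-in for a large parsed document (hypothetical)."""

    def __init__(self, payload):
        self.payload = payload


class TaggedText(str):
    """A str subclass that remembers where it came from (hypothetical)."""

    def __new__(cls, value, origin):
        self = super().__new__(cls, value)
        self.origin = origin  # strong reference: keeps the whole tree alive
        return self


tree = Tree("x" * 1000)
tree_ref = weakref.ref(tree)  # lets us observe whether the tree is still alive

text = TaggedText("hello", tree)
del tree
gc.collect()
# The string-like object is the only thing still holding the tree.
tree_alive_via_subclass = tree_ref() is not None

plain = str(text)  # explicit conversion gives a plain str without the reference
del text
gc.collect()
tree_alive_after_conversion = tree_ref() is not None
```

Anything that stores `text` as if it were an ordinary string silently pins the
tree; only the explicit `str()` call lets it go.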

In these cases I think it's much friendlier to the user of the API to use
composition rather than inheritance. Or, more simply, just return a raw string
and let the user keep the context separately if they want to.
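
A rough sketch of what I mean by composition, again with hypothetical names
(TextNode and Element are not taken from lxml or any other library): the
context lives in a visible wrapper, and the text itself stays a plain str.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class TextNode:
    """Pairs a plain string with its origin (hypothetical wrapper)."""

    text: str    # an ordinary str, safe to store anywhere
    origin: Any  # the back-reference is explicit, not hidden inside a str


class Element:
    """Stand-in for an XML/HTML element (hypothetical)."""

    def __init__(self, tag, content):
        self.tag = tag
        self._content = content

    def text_node(self):
        # For callers who want the context along with the text.
        return TextNode(self._content, self)

    def text(self):
        # For callers who only want the raw string.
        return self._content


elem = Element("p", "hello")
node = elem.text_node()
cache = [node.text]  # stores only the plain string, not the element
```

Here the user has to write `node.text` to get a string, so it is obvious when
the context is being carried along and when it is not.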


PS: What do you call "efficient subclassing"? If you look at the current
implementation of unicode_subtype_new() in unicodeobject.c, it isn't very
efficient (everything, including the raw data buffer, is allocated twice).


