Re: [Python-Dev] PEP 393: Flexible String Representation

Stefan Behnel Fri, 28 Jan 2011 23:52:12 -0800

"Martin v. Löwis", 24.01.2011 21:17:

I have been thinking about Unicode representation for some time now.
This was triggered, on the one hand, by discussions with Glyph Lefkowitz
(who complained that his server app consumes too much memory), and Carl
Friedrich Bolz (who profiled Python applications to determine that
Unicode strings are among the top consumers of memory in Python).
On the other hand, this was triggered by the discussion on supporting
surrogates in the library better.


I'd like to propose PEP 393, which takes a different approach,
addressing both problems simultaneously: by getting a flexible
representation (one that can be either 1, 2, or 4 bytes), we can
support the full range of Unicode on all systems, but still use
only one byte per character for strings that are pure ASCII (which
will be the majority of strings for the majority of users).

You'll find the PEP at

http://www.python.org/dev/peps/pep-0393/

After much discussion, I'm +1 for this PEP. Implementation and benchmarks
are pending, but there are strong indicators that it will bring relief for
the memory overhead of most applications without leading to a major
degradation performance-wise. Not for Python code anyway, and I'll try to
make sure Cython extensions won't notice much when switching to CPython 3.3.
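
To illustrate the idea in the proposal quoted above, here is a minimal
pure-Python sketch of how a per-string character width could be chosen. It
is not the PEP's implementation. The function names `char_width` and
`payload_bytes` are hypothetical, and the 0xFF and 0xFFFF cut-offs are
assumptions of this sketch; the PEP text is the authority on the exact
rules. It assumes an interpreter where `len()` counts code points.

```python
def char_width(s):
    """Bytes per character for the narrowest width that fits every code point.

    Assumption of this sketch: widths are 1, 2 or 4 bytes, with cut-offs
    at 0xFF and 0xFFFF.
    """
    highest = max(map(ord, s), default=0)  # empty string falls back to 0
    if highest <= 0xFF:
        return 1
    if highest <= 0xFFFF:
        return 2
    return 4


def payload_bytes(s):
    """Size of the character data alone, ignoring any object header."""
    return len(s) * char_width(s)


# Pure ASCII, Latin-1, BMP and non-BMP samples.
for sample in ("hello", "h\u00e9llo", "\u20ac100", "\U0001F40D"):
    print(ascii(sample), char_width(sample), payload_bytes(sample))
```

The point of the sketch is that a single wide character widens the whole
string, while pure ASCII text stays at one byte per character.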


Martin, this is a smart way of doing it.

Stefan

