Re: [Python-Dev] PEP 393: Flexible String Representation

Stefan Behnel Thu, 27 Jan 2011 13:46:56 -0800

James Y Knight, 27.01.2011 21:26:

On Jan 27, 2011, at 2:06 PM, Stefan Behnel wrote:

"Martin v. Löwis", 24.01.2011 21:17:

The Py_UNICODE type is still supported but deprecated. It is always
defined as a typedef for wchar_t, so the wstr representation can
double as Py_UNICODE representation.


It's too bad this isn't initialised by default, though. Py_UNICODE is
the only representation that can be used efficiently from C code and
Cython relies on it for fast text processing. This proposal will
therefore likely have a pretty negative performance impact on
extensions written in Cython as the compiler could no longer expect
this representation to be available instantaneously.


But the whole point of the exercise is so that it doesn't have to store
a 4byte-per-char representation when a 1byte-per-char rep would do.

I am well aware of that. But I'm arguing that the current simpler internal
representation has had its advantages for CPython as a platform.

If cython wants to work most efficiently with this proposal, it should
learn to deal with the three possible raw representations.

I agree. After all, CPython is lucky to have it available. It wouldn't be
the first time that we duplicate looping code based on the input type.
However, like the looping code, it will also complicate all indexing code
at runtime as it always needs to test which of the representations is
current before it can read a character. Currently, all of this is a compile
time decision. This will necessarily have a performance impact.
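
To illustrate the difference between the two cases, here is a rough sketch
in C. It is not taken from PEP 393, CPython or Cython: the flex_str struct,
the KIND_* constants and both function names are hypothetical, made up only
to show the shape of the code. Looping code can test the representation
once and then run one of three duplicated loops, whereas indexing code has
to repeat the test on every single read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a string with three possible raw layouts.
   None of these names come from PEP 393 or from CPython. */
typedef enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 } flex_kind;

typedef struct {
    flex_kind kind;    /* width of one character in bytes */
    size_t length;     /* number of characters */
    const void *data;  /* uint8_t, uint16_t or uint32_t items */
} flex_str;

/* Indexing: the representation is tested at runtime on every read. */
static uint32_t flex_read_char(const flex_str *s, size_t i)
{
    assert(i < s->length);
    switch (s->kind) {
    case KIND_1BYTE: return ((const uint8_t *)s->data)[i];
    case KIND_2BYTE: return ((const uint16_t *)s->data)[i];
    default:         return ((const uint32_t *)s->data)[i];
    }
}

/* Looping: test once up front, then run one of three duplicated loops. */
static size_t flex_count_char(const flex_str *s, uint32_t ch)
{
    size_t i, n = 0;
    switch (s->kind) {
    case KIND_1BYTE: {
        const uint8_t *p = s->data;
        for (i = 0; i < s->length; i++) n += (p[i] == ch);
        break;
    }
    case KIND_2BYTE: {
        const uint16_t *p = s->data;
        for (i = 0; i < s->length; i++) n += (p[i] == ch);
        break;
    }
    default: {
        const uint32_t *p = s->data;
        for (i = 0; i < s->length; i++) n += (p[i] == ch);
        break;
    }
    }
    return n;
}
```

With a single fixed-width representation, both functions collapse to a
plain array access whose element type is known when the extension is
compiled, so neither the switch nor the duplicated loops are needed.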


Stefan

