Re: [Python-Dev] The future of the wchar_t cache

Steve Dower Mon, 22 Oct 2018 06:51:01 -0700

On 22Oct2018 0928, Victor Stinner wrote:

Also, I'm proposing keeping the 'kind' as UCS-2 when the string is
created from UCS-2 data that is likely to be used as UCS-2.


Oh. That's a major change in the PEP 393 design. You would have to
modify many functions in CPython. Currently, PEP 393 requires that
a string always use the most efficient storage, and many optimizations
and code paths rely on that assumption.

I don't know that it requires that many modifications - those functions
already have to handle UCS-2 content anyway (e.g. if I get a path from
scandir() that includes a non-ASCII character), and they're only using
the assumption of most efficient storage to determine the resulting
storage size of a string operation (which I'm proposing should also be
UCS-2 when the source strings are UCS-2, since that's the best indicator
we have that it'll be used as UCS-2 later, as well as being the current
implementation :) ).
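
For anyone following along, the "most efficient storage" behaviour under
discussion can be observed from Python. This is only a rough sketch,
assuming CPython 3.3 or later and using sys.getsizeof() as a proxy for
the internal 'kind'; bytes_per_char() and slice_is_narrowed() are
hypothetical helper names for illustration, not CPython APIs:

```python
import sys


def bytes_per_char(ch: str) -> int:
    """Estimate how many bytes CPython stores per code point of `ch`.

    Compares the size of a long and a short repetition of the same
    character, so the fixed object header cancels out.
    """
    long_size = sys.getsizeof(ch * 1001)
    short_size = sys.getsizeof(ch * 1)
    return (long_size - short_size) // 1000


def slice_is_narrowed() -> bool:
    """Check that slicing the ASCII part out of a UCS-2 string yields
    an object the same size as a string that was ASCII from the start.
    """
    mixed = "a" * 1000 + "\u0141"   # one non-Latin-1 char forces 2 bytes/char
    ascii_only = "a" * 1000
    return sys.getsizeof(mixed[:1000]) == sys.getsizeof(ascii_only)


if __name__ == "__main__":
    for sample in ("a", "\u0141", "\U0001F600"):
        print(ascii(sample), bytes_per_char(sample))
    print("slice narrowed:", slice_is_narrowed())
```

If slice_is_narrowed() returns True, the result of the slice was stored
in the narrowest form even though its source string was UCS-2, which is
the invariant the proposal above would relax for strings created from
UCS-2 data.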

I'm against this change.

Moreover, it's hard to guess how a string will be used later...

Agreed. There are some heuristics we can use, but it's definitely only a
guess. That's the nature of this problem - guessing that it *won't* be
used as UCS-2 later on is also a guess.


Cheers,
Steve