Re: [Python-Dev] Internal representation of strings and Micropython

Nick Coghlan Thu, 05 Jun 2014 05:22:30 -0700

On 5 June 2014 22:01, Paul Sokolovsky <pmis...@gmail.com> wrote:
>> Aside from
>> some of the POSIX locale handling issues on Linux, many of the
>> concerns are with the usability of bytes and bytearray, not with str -
>> that's why binary interpolation is coming back in 3.5, and there will
>> likely be other usability tweaks for those types as well.
>
> All these changes are what let me dream on and speculate on
> possibility that Python4 could offer an encoding-neutral string type
> (which means based on bytes), while move unicode back to an explicit
> type to be used explicitly only when needed (bloated frameworks like
> Django can force users to it anyway, but that will be forcing on
> framework level, not on language level, against which people rebel.)
> People can dream, right?


If you don't model strings as arrays of code points, or at least
assume a particular universal encoding (like UTF-8), you have to give
up string concatenation in order to tolerate arbitrary encodings -
otherwise you end up with unintelligible data that nobody can decode
because it switches encodings without notice. Assuming a universal
encoding is a viable model if your OS guarantees it (Mac OS X does,
for example, so Python 3 assumes UTF-8 for all OS interfaces there),
but Linux currently has no such guarantee. Many runtimes just decide
they don't care and assume UTF-8 anyway. Python 3 may even join them
some day, due to the problems caused by trusting the locale encoding
to be correct, but the startup code will need non-trivial changes for
that to happen - the C.UTF-8 locale may even become widespread before
we get there.
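
To make the concatenation problem concrete, here is a minimal sketch
(the sample word and the try_decode helper are just illustrative
names, not anything from CPython) of what happens when byte strings
in two different encodings are joined, compared with decoding each
piece to code points first:

```python
# The same word, encoded two different ways.
latin1_part = "caf\u00e9".encode("latin-1")  # b'caf\xe9'
utf8_part = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'

# Joining the raw bytes works mechanically, but the result
# silently switches encodings halfway through.
mixed = latin1_part + b" " + utf8_part


def try_decode(data, encoding):
    """Hypothetical helper: return decoded text, or None on failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# As UTF-8 the data is invalid: 0xE9 starts a multi-byte sequence
# but is followed by a space rather than a continuation byte.
as_utf8 = try_decode(mixed, "utf-8")

# As Latin-1 it "succeeds" (every byte maps to a code point),
# but the second word comes out as mojibake.
as_latin1 = try_decode(mixed, "latin-1")

# Decoding each piece with its own encoding first yields code
# points, which concatenate without ambiguity.
text = latin1_part.decode("latin-1") + " " + utf8_part.decode("utf-8")

print(repr(as_utf8))
print(repr(as_latin1))
print(repr(text))
```

Neither single-encoding decode recovers the original text, which is
the "nobody can decode it" situation described above.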

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
