Re: [Python-Dev] PEP 393 Summer of Code Project

Nick Coghlan Wed, 31 Aug 2011 15:47:01 -0700

On Thu, Sep 1, 2011 at 8:02 AM, Terry Reedy <tjre...@udel.edu> wrote:
> On 8/31/2011 1:10 PM, Guido van Rossum wrote:
>> Ok, I dig this, to some extent. However saying it is UCS-2 is equally
>> bad.
>
> As I said on the tracker, our narrow builds are in-between (while moving
> closer to UTF-16), and both terms are deceptive, at least to some.


We should probably just explicitly document that the internal
representation in narrow builds is a UCS-2/UTF-16 hybrid - like
UTF-16, it can handle the full code point space, but, like UCS-2, it
allows code unit sequences (such as lone surrogates) that strict
UTF-16 would reject.
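
A rough sketch of that distinction, for anyone who wants to poke at it.
It assumes a current CPython (3.4 or later), where narrow builds no
longer exist, so it illustrates the code-unit behaviour through the
codecs rather than through the internal representation itself. The
helper name utf16_code_units is made up for this example.

```python
import struct

def utf16_code_units(text):
    """Return the 16-bit code units for text, letting lone surrogates through.

    Hypothetical helper for illustration only; 'surrogatepass' mimics the
    permissive, UCS-2-like side of the hybrid described above.
    """
    raw = text.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))

# Like UTF-16: a code point outside the BMP becomes a surrogate pair.
astral = "\U0001F600"
pair = utf16_code_units(astral)

# Like UCS-2: a lone surrogate is still a storable code unit...
lone = "\ud83d"
lone_units = utf16_code_units(lone)

# ...but the strict UTF-16 codec refuses to encode it.
try:
    lone.encode("utf-16-le")
    strict_accepts_lone = True
except UnicodeEncodeError:
    strict_accepts_lone = False
```

Here pair holds the two surrogate code units for the astral character,
while the strict encode of the lone surrogate fails.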

Perhaps we should also finally split strings out to a dedicated
section on the same tier as Sequence types in the library reference.
Yes, they're sequences, but they're also so much more than that. Try
as you might, you're unlikely to succeed in ducktyping strings the
way you can sequences, mappings, files, numbers and other interfaces.
Needing a "real string" is even more common than needing a "real
dict", especially after the efforts to make most parts of the
interpreter that previously cared about the latter distinction accept
arbitrary mapping objects.
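
As a small illustration of the "real string" point (a sketch, not an
exhaustive survey): str.format_map will take any object implementing
the mapping protocol, while str.join insists on actual str instances
and rejects collections.UserString. The Defaults class below is
invented for the example.

```python
from collections import UserString
from collections.abc import Mapping

class Defaults(Mapping):
    """Hypothetical minimal mapping that is not a dict subclass."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

# A duck-typed mapping is accepted where a dict would be.
greeting = "hello {name}".format_map(Defaults({"name": "world"}))

# A duck-typed string is not accepted where a str is required.
fake = UserString("abc")
try:
    "-".join([fake, "def"])
    join_accepted = True
except TypeError:
    join_accepted = False
```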

I've created http://bugs.python.org/issue12874, suggesting that the
"Sequence Types" and "memoryview type" sections could be usefully
rearranged as:

    Sequence Types - list, tuple, range
    Text Data - str
    Binary Data - bytes, bytearray, memoryview

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia