Re: [Python-3000] String comparison

2007-06-08 Thread Stephen J. Turnbull
Rauli Ruohonen writes:
> The ones it absolutely prohibits in interchange are surrogates.
Excuse me? Surrogates are code points with a specific interpretation if it is "purported that the stream is in UTF-16". Otherwise, Unicode 4.0 explicitly says that there is nothing illegal about an isolate

Re: [Python-3000] PEP 3127 (integer literal syntax) -- any takers?

2007-06-08 Thread Collin Winter
On 6/8/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> Separately, the 2to3 tool needs a fixer for this (and it should also
> accept the new notations in its input).
I wrote a num_literals fixer when the debate over this feature was still in progress. It's checked in, but I need to sync it with

Re: [Python-3000] PEP 3127 (integer literal syntax) -- any takers?

2007-06-08 Thread Collin Winter
On 6/8/07, Collin Winter <[EMAIL PROTECTED]> wrote:
> On 6/8/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > Separately, the 2to3 tool needs a fixer for this (and it should also
> > accept the new notations in its input).
>
> I wrote a num_literals fixer when the debate over this feature was
>

[Python-3000] PEP 3127 (integer literal syntax) -- any takers?

2007-06-08 Thread Guido van Rossum
PEP 3127 (Integer Literal Support and Syntax) introduces new notations for octal and binary integers. This isn't implemented yet. Are there any takers? It shouldn't be particularly complicated. Separately, the 2to3 tool needs a fixer for this (and it should also accept the new notations in its inp
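For readers unfamiliar with the notations under discussion, here is a small sketch of how they look in Python 3 as later released. It is only an illustration of the syntax, not the implementation being requested in this message; the variable names are made up for the example.

```python
# Octal and binary integer literals in the PEP 3127 notation.
octal_value = 0o17      # octal: 1*8 + 7
binary_value = 0b101    # binary: 4 + 0 + 1
hex_value = 0x1F        # hexadecimal notation is unchanged

# int() with an explicit base parses the same digit strings.
assert octal_value == int("17", 8)
assert binary_value == int("101", 2)

# oct() and bin() produce strings in the new notation.
octal_text = oct(octal_value)
binary_text = bin(binary_value)
```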

Re: [Python-3000] Support for PEP 3131

2007-06-08 Thread Martin v. Löwis
> This keeps getting characterized as only a security argument, but
> it's much deeper; it's a basic code comprehension issue.
Despite you repeating this over and over, I still honestly, sincerely do not understand the concern. You might be technically correct, but I feel that the cases where thes

Re: [Python-3000] Unicode IDs -- why NFC? Why allow ligatures?

2007-06-08 Thread Martin v. Löwis
> I hope this helps in the discussion.
Indeed it does. When I find the time, I'll propose a change to the PEP to do NFKC.
Regards, Martin
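The difference between NFC and NFKC that the thread title asks about can be seen with the standard `unicodedata` module. This is a minimal sketch using the "fi" ligature as a sample character; it shows what the two forms do to a string and says nothing about how the PEP would apply them to identifiers.

```python
import unicodedata

ligature = "\ufb01"      # LATIN SMALL LIGATURE FI
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# NFC only composes canonically equivalent sequences, so the ligature survives.
nfc_ligature = unicodedata.normalize("NFC", ligature)

# NFKC also applies compatibility mappings, so the ligature becomes "fi".
nfkc_ligature = unicodedata.normalize("NFKC", ligature)

# Both forms compose the accented letter into the single code point U+00E9.
nfc_accent = unicodedata.normalize("NFC", decomposed)
nfkc_accent = unicodedata.normalize("NFKC", decomposed)
```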

Re: [Python-3000] String comparison

2007-06-08 Thread Martin v. Löwis
> The additional field is 8 bits, two bits for each normalization (a
> Yes/Maybe/No value). In Unicode 4.1 only 5 different combinations are
> used, but I don't know if that's true of later versions. As
> _PyUnicode_Database_Records stores only unique records, this also results
> in an increase of

Re: [Python-3000] Unicode IDs -- why NFC? Why allow ligatures?

2007-06-08 Thread Andrew McNabb
On Thu, Jun 07, 2007 at 06:50:57PM -0400, Jim Jewett wrote:
> On 6/7/07, Andrew McNabb <[EMAIL PROTECTED]> wrote:
> > On Wed, Jun 06, 2007 at 07:06:05PM -0400, Jim Jewett wrote:
> > > (There were mixed opinions on Technical symbols, and no one has spoken
> > > up yet about the half-dozen Croatian

Re: [Python-3000] String comparison

2007-06-08 Thread Jim Jewett
On 6/8/07, Rauli Ruohonen <[EMAIL PROTECTED]> wrote:
> The additional field is 8 bits, two bits for each normalization (a
> Yes/Maybe/No value). In Unicode 4.1 only 5 different combinations are
> used, but I don't know if that's true of later versions.
There are no "Maybe" values for the Decompose

Re: [Python-3000] String comparison

2007-06-08 Thread Rauli Ruohonen
On 6/8/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> AFAIK, the only strings the Unicode standard absolutely prohibits
> emitting are those containing code points guaranteed not to be
> characters by the standard.
The ones it absolutely prohibits in interchange are surrogates. They are also
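As a side illustration of the surrogate question, the sketch below shows how present-day Python 3 treats a lone surrogate when encoding to UTF-8, and how UTF-16 uses a surrogate pair for a character outside the Basic Multilingual Plane. It reflects current interpreter behaviour, which postdates this thread, and does not settle what the Unicode standard itself permits.

```python
lone_surrogate = "\ud800"  # an isolated high surrogate code point

# The strict UTF-8 encoder refuses to emit a lone surrogate.
try:
    lone_surrogate.encode("utf-8")
    rejected = False
except UnicodeEncodeError:
    rejected = True

# In UTF-16, U+10000 is written as the surrogate pair D800 DC00.
pair_bytes = "\U00010000".encode("utf-16-be")
```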

Re: [Python-3000] String comparison

2007-06-08 Thread Rauli Ruohonen
On 6/8/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> In principle, yes. What's the cost of the additional field in terms of
> a size increase? If you just need another bit, could that fit into
> _PyUnicode_TypeRecord.flags instead?
The additional field is 8 bits, two bits for each normalizati
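To make the "8 bits, two bits per normalization form" idea concrete, here is a rough sketch of packing four Yes/Maybe/No values into one byte. The numeric codes, the order of the forms, and the function names are all assumptions made for this example; the message does not specify the actual layout.

```python
# Hypothetical codes and bit order; the thread only states "two bits for each normalization".
YES, MAYBE, NO = 0, 1, 2
FORMS = ("NFD", "NFKD", "NFC", "NFKC")

def pack_quick_check(values):
    """Pack one Yes/Maybe/No code per normalization form into a single byte."""
    packed = 0
    for index, form in enumerate(FORMS):
        packed |= values[form] << (2 * index)
    return packed

def unpack_quick_check(packed, form):
    """Extract the two-bit code for one normalization form."""
    shift = 2 * FORMS.index(form)
    return (packed >> shift) & 0b11

# Example record: not allowed in decomposed forms, fine in NFC, "maybe" in NFKC.
example = {"NFD": NO, "NFKD": NO, "NFC": YES, "NFKC": MAYBE}
packed_example = pack_quick_check(example)
```

Because each form needs only three states, two bits per form are enough and the whole field fits in a single byte.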

Re: [Python-3000] String comparison

2007-06-08 Thread Stephen J. Turnbull
Guido van Rossum writes:
> If you want to have an abstraction that guarantees you'll never see
> an unnormalized text string you should design a library for doing so.
OK.
> (*) It looks like such a library will not have a way to talk about
> "\u0308" at all, since it is considered unnormaliz