Steve Howell writes:
> So I'm +1 on the unquoted third option, that canonically
> equivalent, but differently encoded, Unicode characters are allowed
> yet treated as different.
>
> Am I stretching the analogy too far?
Yes. By definition, that is nonconformant to the standard.
Canonically
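To make the point about canonical equivalence concrete, here is a minimal sketch using Python's standard `unicodedata` module. The sample word and the variable names are illustrative assumptions, not taken from the thread. It shows two spellings that differ as code point sequences but that NFC maps to the same string.

```python
import unicodedata

# Two spellings of the same text (illustrative example, not from the thread).
composed = "caf\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

# As raw code point sequences the two strings differ...
different_raw = composed != decomposed

# ...but they are canonically equivalent, so NFC maps both to one form.
same_after_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(different_raw, same_after_nfc)
```

Treating the two spellings as different identifiers would amount to comparing the raw strings; respecting canonical equivalence amounts to comparing the normalized ones.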
> I think "obvious" referred to the reasoning, not the outcome.
>
> I can tell that the decision was "NFC, anything goes", but I don't see why.
I think I'm repeating myself: Because UAX 31 says so. That's it. There
is a standard that experts in the domain have specified, and PEP 3131
follows it.
"Martin v. Löwis" writes:
> > TR 15, section 19, numbered paragraph 3
> > """
> > Higher-level processes that transform or compare strings, or that
> > perform other higher-level functions, must respect canonical
> > equivalence or problems will result.
> > """
>
> That's not a mandatory
Jim Jewett writes:
> On 6/5/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
>
> > It seems to me that what UAX#31 is saying is "Distinguishing (or not)
> > between 0035 DIGIT FIVE and 2075 SUPERSCRIPT FIVE should be
> > equivalent to distinguishing (or not) between LATIN CAPITAL
> > LETTER A a
Jim Jewett writes:
> > Not sure what the proposal is here. If people say "we want the PEP to do
> > NFKC", I understand that as "instead of saying NFC, it should say
> > NFKC", which in turn means "all identifiers are converted into the
> > normal form NFKC while parsing".
>
> I would prefer t
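As a rough illustration of what "saying NFKC instead of NFC" would mean for identifiers, the sketch below normalizes two sample names under both forms. The helper `normalize_identifier` and the sample names are hypothetical; only `unicodedata.normalize` is a real library call.

```python
import unicodedata

def normalize_identifier(name, form="NFC"):
    """Hypothetical helper: normalize an identifier to the given form."""
    return unicodedata.normalize(form, name)

superscript = "x\u00b2"   # 'x' followed by U+00B2 SUPERSCRIPT TWO
ligature = "\ufb01le"     # U+FB01 LATIN SMALL LIGATURE FI followed by 'le'

for name in (superscript, ligature):
    nfc = normalize_identifier(name, "NFC")
    nfkc = normalize_identifier(name, "NFKC")
    # NFC leaves compatibility characters alone; NFKC folds them away.
    print(ascii(name), ascii(nfc), ascii(nfkc))
```

Under NFC both names keep their compatibility characters, while NFKC folds them into plain ASCII spellings, so the choice of form decides whether such names collide with their ASCII look-alikes.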
Rauli Ruohonen writes:
> On 6/5/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> > I'd love to get rid of full-width ASCII and halfwidth kana (via
> > compatibility decomposition).
>
> If you do forbid compatibility characters in identifiers, then they
> should be flagged as an error, n
--- Jim Jewett <[EMAIL PROTECTED]> wrote:
>
> Ideally, either that equivalence would also include
> compatibility, or
> else characters whose compatibility and canonical
> equivalents are
> different would be banned for use in identifiers.
>
Current Python has the precedent that color/colour
a
--- Ka-Ping Yee <[EMAIL PROTECTED]> wrote:
>
> > > B. Should the default behaviour accept only
> ASCII identifiers, or
> > >should it accept identifiers containing
> non-ASCII characters?
> >
> > Added as an open issue.
> [...]
Martin, I hope you close out this issue, and just make
a firm,
On 6/5/07, Ka-Ping Yee <[EMAIL PROTECTED]> wrote:
> > > G. Should source code be required to be in normalized
> > > form?
...
> To your earlier question of "what about non-UTF-8 files", I
> imagine that the normalization restriction would apply to the
> decoded characters. That is, once you know
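A small sketch of the idea that a normalization requirement would apply to the decoded characters rather than to the bytes on disk. The byte strings and the helper `is_nfc` are assumptions made for illustration; the thread does not specify how such a check would be implemented.

```python
import unicodedata

def is_nfc(text):
    """Hypothetical check: True if the decoded text is already in NFC."""
    return unicodedata.normalize("NFC", text) == text

# The same source text stored in two different encodings (illustrative).
latin1_source = b"caf\xe9"
utf8_source = "caf\u00e9".encode("utf-8")
# A UTF-8 file that spells the accent with a combining character.
utf8_decomposed = "cafe\u0301".encode("utf-8")

from_latin1 = latin1_source.decode("latin-1")
from_utf8 = utf8_source.decode("utf-8")
from_decomposed = utf8_decomposed.decode("utf-8")

# After decoding, the file encoding no longer matters to the check.
print(from_latin1 == from_utf8, is_nfc(from_latin1), is_nfc(from_decomposed))
```

The first two files decode to identical code points and pass, whatever their encoding; the third is valid UTF-8 but would be rejected by a normalized-source rule.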
> > A. Should identifiers be allowed to contain any Unicode letter?
>
> Not an open issue; the PEP has been accepted.
The items listed under "A." are concerns that I wanted to be noted
in the PEP, so thanks for listing them.
> > B. Should the default behaviour accept only ASCII identifiers, or
>
On 6/5/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > I'd rather see them here than in SF, SF is a pain to use.
> >
> > But unless the bugs prevent you from proceeding, you could also ignore them.
>
> The first bug that I reported to
On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> I'd rather see them here than in SF, SF is a pain to use.
>
> But unless the bugs prevent you from proceeding, you could also ignore them.
The first bug that I reported today (the one about `make`) stops me
from running the test suite. So, ca
I'd rather see them here than in SF, SF is a pain to use.
But unless the bugs prevent you from proceeding, you could also ignore them.
There are 96 failing unit tests right now in that branch -- no need to
report all of them.
--Guido
On 6/5/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
>
Hi again,
I just found yet another bug in py3k-struni branch. This one is about the
pdb module.
Should I start to report these bugs to the bug tracker, instead? At
this pace, I will flood the mailing list. :)
-- Alexandre
Python 3.0x (py3k-struni, Jun 5 2007, 18:41:44)
[GCC 4.1.2 (Ubuntu 4.1.2-0u
Feel free to mail me a patch to fix it.
On 6/5/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I found another bug to report. It seems there is a bug in
> subprocess.py that makes help() fail.
>
> -- Alexandre
>
> Python 3.0x (py3k-struni, Jun 5 2007, 18:41:44)
> [GCC 4.1.2 (Ubuntu
Hi,
I found another bug to report. It seems there is a bug in
subprocess.py that makes help() fail.
-- Alexandre
Python 3.0x (py3k-struni, Jun 5 2007, 18:41:44)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> help(open)
Tr
On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> >1. Python will lose the ability to make a reliable round trip to
> > a human-readable display on screen or on paper.
> Correct. Was already the case, though, because of comments
> and string literals.
But these are usually less
Alexandre Vassalotti wrote:
> On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>> If "make clean" makes the problem go away, it's usually because there
>> were old .pyc files with incompatible byte code. We don't change the
>> .pyc magic number for each change to the compiler.
>
> Nope. It i
On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> If "make clean" makes the problem go away, it's usually because there
> were old .pyc files with incompatible byte code. We don't change the
> .pyc magic number for each change to the compiler.
Nope. It is still not working. I just did the f
If "make clean" makes the problem go away, it's usually because there
were old .pyc files with incompatible byte code. We don't change the
.pyc magic number for each change to the compiler.
--Guido
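For readers unfamiliar with the .pyc magic number, here is a minimal sketch of how it can be inspected in a current CPython, using `importlib.util.MAGIC_NUMBER` and `py_compile`. The helper `pyc_matches_interpreter`, the file names, and the fake stale header are assumptions for illustration; none of this reflects the py3k-struni branch as it was in 2007.

```python
import importlib.util
import os
import py_compile
import tempfile

def pyc_matches_interpreter(pyc_path):
    """Hypothetical helper: compare a .pyc header with this interpreter's magic."""
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.py")
    with open(src, "w") as f:
        f.write("x = 1\n")

    # Compile to an explicit path; py_compile.compile returns that path.
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "demo.pyc"))
    fresh_ok = pyc_matches_interpreter(pyc)

    # Simulate a stale file by overwriting the header with a made-up magic.
    with open(pyc, "r+b") as f:
        f.write(b"\x00\x00\r\n")
    stale_ok = pyc_matches_interpreter(pyc)

print(fresh_ok, stale_ok)
```

A check like this only catches files whose magic number differs. When the compiler changes without a magic number bump, as described above, old .pyc files pass the check and can still contain incompatible byte code, which is why "make clean" is the usual remedy.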
On 6/5/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> Hi,
>
> On Ubuntu linux, when I try ru
Hi,
On Ubuntu linux, when I try to run make in the py3k-struni branch I get
a weird error about split(). However, I don't get this error when I
run ``make clean; make''.
Thanks,
-- Alexandre
% make
Traceback (most recent call last):
File "./setup.py", line 6, in <module>
import sys, os, imp, re, opt
Greg Ewing schrieb:
> Steve Howell wrote:
>
>> einfugen = in joints ()
>
> Maybe "join in" (as a verb)?
It's actually "insert" (into the list).
Regards,
Martin
___
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/
>> > Unicode does say pretty clearly that (at least) canonical equivalents
>> > must be treated the same.
>
>> Chapter and verse, please?
>
> I am pretty sure this list is not exhaustive, but it may be helpful:
>
> The Identifiers Annex http://www.unicode.org/reports/tr31/
Ah, that's in the con
> Here's a summary of some of the remaining open issues and unaddressed
> arguments regarding PEP 3131. These are the ones I'm familiar with,
> so I don't claim this to be complete. I hope it helps give some
> perspective on this huge thread, though.
Thanks, I added them all to the PEP. Not sure
On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Jim Jewett schrieb:
> > On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> >> > Always normalizing would have the advantage of simplicity (no
> >> > matter what the encoding, the result is the same), and I think
> >> > that is the rea
On 6/5/07, Rauli Ruohonen <[EMAIL PROTECTED]> wrote:
> On 6/5/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> > I'd love to get rid of full-width ASCII and halfwidth kana (via
> > compatibility decomposition).
> If you do forbid compatibility characters in identifiers, then they
> should be f
Jim Jewett schrieb:
> On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>> > Always normalizing would have the advantage of simplicity (no
>> > matter what the encoding, the result is the same), and I think
>> > that is the real path of least surprise if you sum over all
>> > surprises.
>
>>
On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > I'd love to get rid of full-width ASCII and halfwidth kana (via
> > compatibility decomposition). Native Japanese speakers often use them
> > interchangeably with the "proper" versions when correcting typos and
> > updating numbers in a se
On 6/5/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> It seems to me that what UAX#31 is saying is "Distinguishing (or not)
> between 0035 DIGIT FIVE and 2075 SUPERSCRIPT FIVE should be
> equivalent to distinguishing (or not) between LATIN CAPITAL
> LETTER A and LATIN SMALL LETTER A." I don't kno
> I'd love to get rid of full-width ASCII and halfwidth kana (via
> compatibility decomposition). Native Japanese speakers often use them
> interchangeably with the "proper" versions when correcting typos and
> updating numbers in a series. Ugly, to say the least. I don't think
> that native Japa
Eric V. Smith wrote:
> Talin wrote:
>> Other kinds of customization require replacing a much larger chunk of
>> code. Changing the "underscores" and "check-unused" behavior requires
>> overriding 'vformat', which means replacing the entire template string
>> parser. I figured that there would
Nick Coghlan wrote:
> Talin wrote:
>> What I wanted to avoid in the PEP was having to specify how all of
>> these different parts fit together and the exact nature of the
>> parameters being passed between them.
>>
>> And I think that even if we do break up vformat this way, we still end
>> up w
On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > Always normalizing would have the advantage of simplicity (no
> > matter what the encoding, the result is the same), and I think
> > that is the real path of least surprise if you sum over all
> > surprises.
> I'd like to repeat that this
On 6/5/07, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> Talin <[EMAIL PROTECTED]> wrote:
> > I haven't heard anyone whose native language is RTL
> > lobbying for support of their language.
...
> I don't believe I've read anything saying "since bidi is hard,
> let's not do unicode at all".
Not in
Talin wrote:
> Other kinds of customization require replacing a much larger chunk of
> code. Changing the "underscores" and "check-unused" behavior requires
> overriding 'vformat', which means replacing the entire template string
> parser. I figured that there would be a lot of people who might
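For comparison, here is a sketch of the kind of customization being discussed, written against `string.Formatter` as it exists in today's standard library, where `check_unused_args` is a separate hook and `vformat` does not need to be replaced. This is not necessarily the API shape under discussion in the thread at the time; the class name `StrictFormatter` and the error message are hypothetical.

```python
import string

class StrictFormatter(string.Formatter):
    """Hypothetical subclass that rejects arguments the template never uses."""

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds the positional indexes and keyword names that
        # the template actually referenced.
        unused_positional = [i for i in range(len(args)) if i not in used_args]
        unused_named = [k for k in kwargs if k not in used_args]
        if unused_positional or unused_named:
            raise ValueError(
                f"unused arguments: {unused_positional + unused_named!r}"
            )

fmt = StrictFormatter()
print(fmt.format("{0} and {name}", 1, name="x"))

try:
    fmt.format("{0}", 1, extra="ignored by the template")
except ValueError as exc:
    print("rejected:", exc)
```

Keeping the unused-argument check as its own overridable method means a subclass can change that one policy without touching the template string parser.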
Talin wrote:
> What I wanted to avoid in the PEP was having to specify how all of these
> different parts fit together and the exact nature of the parameters
> being passed between them.
>
> And I think that even if we do break up vformat this way, we still end
> up with people having to replac
On 6/5/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> I'd love to get rid of full-width ASCII and halfwidth kana (via
> compatibility decomposition).
If you do forbid compatibility characters in identifiers, then they
should be flagged as an error, not converted silently. NFC, on the
other h
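The "flag as an error, do not convert silently" option could look roughly like the sketch below. It relies on `unicodedata.decomposition`, which returns a mapping prefixed with a tag such as `<wide>` or `<narrow>` for characters that have a compatibility decomposition. The function names and the error message are hypothetical, and this is only one possible reading of the proposal.

```python
import unicodedata

def find_compat_chars(name):
    """Return the characters in name that have a compatibility decomposition."""
    return [
        ch for ch in name
        # Compatibility mappings are tagged, e.g. "<wide> 0041".
        if unicodedata.decomposition(ch).startswith("<")
    ]

def check_identifier(name):
    """Hypothetical check: reject compatibility characters instead of folding them."""
    offenders = find_compat_chars(name)
    if offenders:
        described = ", ".join(
            f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in offenders
        )
        raise ValueError(f"compatibility characters in identifier: {described}")
    return name

print(check_identifier("abc"))
try:
    check_identifier("\uff21bc")   # starts with a full-width 'A'
except ValueError as exc:
    print("rejected:", exc)
```

Characters with only a canonical decomposition, such as a precomposed accented letter, carry no tag and pass this check, which matches the distinction between NFC and compatibility folding drawn in the message.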
Jim Jewett writes:
> > The PEP assumes NFC, but I haven't really understood why, unless that
> > is required for compatibility with other systems (in which case, it
> > should be made explicit).
"Martin v. Löwis" writes:
> It's because UAX#31 tells us to use NFC, in section 5
>
> "General
> The path of least surprise for legacy encodings might be for
> the codecs to produce whatever is closest to the original encoding
> if possible. I.e. what was one code point would remain one code
> point, and if that's not possible then normalize. I don't know if
> this is any different from alwa
> 1) My first proposal is that someone - one of the PEP 3131 advocates
> probably - create a set of patches, or possibly a branch, that
> implements unicode identifiers in whatever manner they think is
> appropriate. Write some actual code instead of just talking about it.
I'm working on that.
Talin <[EMAIL PROTECTED]> wrote:
> In other words - instead of endless discussions of hypotheticals, let
> people vote with their feet. Because I can already tell that as far as
> this mailing list goes, there will never be a consensus on this issue,
> due to basic value differences.
If the un
41 matches