> I implemented it for all normalizations in the most straightforward way I
> could think of, which was adding a field to _PyUnicode_DatabaseRecord,
> generating data for it in makeunicodedata.py from
> DerivedNormalizationProps.txt of UCD 4.1, and writing a function
> is_normalized which uses it.
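A minimal sketch of the kind of check described above, not the patch itself. It assumes a modern Python 3, where unicodedata.is_normalized is available from 3.8 onward; the helper name quick_is_normalized and the sample strings are hypothetical.

```python
import unicodedata


def quick_is_normalized(form, text):
    """Return True if text is already in the given normalization form."""
    # unicodedata.is_normalized exists in Python 3.8+; on older versions
    # fall back to a full normalize-and-compare (slower, same answer).
    check = getattr(unicodedata, "is_normalized", None)
    if check is not None:
        return check(form, text)
    return unicodedata.normalize(form, text) == text


# Two spellings of the same name (illustrative data, not from the thread):
composed = "L\u00f6wis"      # "o with diaeresis" as a single code point
decomposed = "Lo\u0308wis"   # "o" followed by U+0308 COMBINING DIAERESIS
```

Here quick_is_normalized("NFC", composed) holds while quick_is_normalized("NFC", decomposed) does not, which is the case a fast path is meant to detect before doing a full normalization.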
On 6/7/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> >> The os.environ.get() method probably should return a unicode string. (?)
> >
> > Indeed -- care to contribute a patch?
>
> Ideally, such a patch would make use of the Win32 Unicode API for
> environment variables on Windows. People had al
>> The os.environ.get() method probably should return a unicode string. (?)
>
> Indeed -- care to contribute a patch?
Ideally, such a patch would make use of the Win32 Unicode API for
environment variables on Windows. People had already been complaining
that they can't have "funny characters" in
Rauli Ruohonen writes:
Stephen wrote:
> > I think the default case should be that text operations produce the
> > expected result in the text domain, even at the expense of array
> > invariants.
>
> If you really want that, then you need a type for sequences of graphemes.
No. "Text" != "
Guido van Rossum wrote:
>> The os.environ.get() method probably should return a unicode string. (?)
>
> Indeed -- care to contribute a patch?
I thought you might ask that. :-)
It looks like the os.py module imports an 'environ' dictionary from various
sources depending on the platform.
po
On 6/6/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > FWIW, I don't buy that normalization is expensive, as most strings are
> > in NFC form anyway, and there are fast checks for that (see UAX#15,
> > "Detecting Normalization Forms"). Python does not currently have
> > a fast path for this, b
On 6/7/07, Ron Adam <[EMAIL PROTECTED]> wrote:
> Martin v. Löwis wrote:
>
> > FWIW, for me the build error goes away when I unset
> > LANG, so that the error occurs during build definitely
> > *is* a locale issue.
>
> Yes, and to pin it down a bit further...
>
> This avoids the problem by setting t
Martin v. Löwis wrote:
> FWIW, for me the build error goes away when I unset
> LANG, so that the error occurs during build definitely
> *is* a locale issue.
Yes, and to pin it down a bit further...
This avoids the problem by setting the language to the default "C" which is
a unicode string and
On 6/8/07, Jim Jewett <[EMAIL PROTECTED]> wrote:
> How would you expect them to work on arrays of code points?
Just like they do with Python 2.5 unicode objects, as long as the
"array of code points" is str, not e.g. a numpy array or tuple of ints,
which I don't expect to grow string methods :-)
BTW, from now on this is PEP 3135. http://python.org/dev/peps/pep-3135/
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe:
On 6/7/07, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 02:31 PM 6/6/2007 -0700, Guido van Rossum wrote:
> >I wonder if this may meet the needs for your PEP 3124? In
> >particular, earlier on, you wrote:
> >
> >>Btw, PEP 3124 needs a way to receive the same class object at more or
> >>less the
Done. Committed to r55817.
On 6/7/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> Looks great -- can you check it in yourself?
>
> On 6/7/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> > I found a way to fix the bug; look at the attached patch. Although, I
> > am not sure it was correct wa
At 02:31 PM 6/6/2007 -0700, Guido van Rossum wrote:
>I wonder if this may meet the needs for your PEP 3124? In
>particular, earlier on, you wrote:
>
>>Btw, PEP 3124 needs a way to receive the same class object at more or
>>less the same moment, although in the form of a callback rather than
>>a c
On 6/7/07, Georg Brandl <[EMAIL PROTECTED]> wrote:
> Nick Coghlan schrieb:
> > Guido van Rossum wrote:
> >> A few PEPs with numbers < 400 are now targeting Python 3000, e.g. PEP
> >> 367 (new super) and PEP 344 (exception chaining). Are there any
> >> others? I propose that we renumber these to num
On 6/7/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> FWIW, for me the build error goes away when I unset
> LANG, so that the error occurs during build definitely
> *is* a locale issue.
Ah! You're right. I needed to do a `make clean` before, though.
My LANG variable was set to "en_CA.UTF-8".
Thanks for finding the issue!
On this one I think subprocess.py should be changed to allow None
(like all the other open() functions).
I'll check it in.
--Guido
On 6/7/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > Feel free to m
On 6/7/07, Rauli Ruohonen <[EMAIL PROTECTED]> wrote:
> ... I will use XML character references to denote code points here.
> Wherever you see such a thing in this e-mail, replace it in your
> mind with the corresponding code point *immediately*. E.g.
> len(r'&#x00c5;') == 1, but len(r'\u00c5') == 6.
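The raw-string half of that example can be checked directly. This is a small sketch assuming Python 3 string semantics, which may differ from the interpreter the thread was written against; the variable names are hypothetical.

```python
# In a raw literal the backslash escape is not processed, so the literal
# holds the six characters: backslash, u, 0, 0, c, 5.
raw = r'\u00c5'

# In a normal literal the escape is processed into one code point,
# U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE.
cooked = '\u00c5'

raw_length = len(raw)
cooked_length = len(cooked)
```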
Looks great -- can you check it in yourself?
On 6/7/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> I found a way to fix the bug; look at the attached patch. Although, I
> am not sure it was the correct way to fix it. The problem was due to str8
> that is recognized as an instance of `str'.
>
>
On 6/7/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > It's time to look at the original traceback (attached as "tb", after
> > fixing the formatting problems). It looks like any call to
> > encodings.normalize_encoding() causes this problem.
>
> One problem with normalize_encoding is that it
> I've run across the same zero arg split error a while back when attempting
> to run 'make test'. Below was the solution I came up with. Is there going
> to be a unicode equivalent to the str.translate() method?
The unicode type supports translate since 2.0.
Regards,
Martin
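As an illustration of the reply above, a minimal sketch of translate on a text string, assuming Python 3, where str is the Unicode type. The mapping and the sample string are hypothetical.

```python
# For text strings, translate takes a mapping from code point (int) to a
# replacement string, another code point, or None to delete the character.
table = {
    ord("-"): "_",   # replace hyphen with underscore
    ord(" "): "_",   # replace space with underscore
    ord("!"): None,  # delete exclamation marks
}

result = "latin-1 codec!".translate(table)
```

With this table the result is "latin_1_codec".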
> It's time to look at the original traceback (attached as "tb", after
> fixing the formatting problems). It looks like any call to
> encodings.normalize_encoding() causes this problem.
One problem with normalize_encoding is that it might do
encoding = encoding.encode('latin-1')
return '_'.jo
Guido van Rossum wrote:
> On 6/7/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
>> On 6/7/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>>> It's time to look at the original traceback (attached as "tb", after
>>> fixing the formatting problems). It looks like any call to
>>> encodings.n
On 6/7/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> On 6/7/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > It's time to look at the original traceback (attached as "tb", after
> > fixing the formatting problems). It looks like any call to
> > encodings.normalize_encoding() causes this
On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
Feel free to mail me a patch to fix it.
Since you asked so politely, here a patch for you. :)
On 6/5/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I found another bug to report. It seems there is a bug in
> subprocess.py
I found a way to fix the bug; look at the attached patch. Although, I
am not sure it was the correct way to fix it. The problem was due to str8
that is recognized as an instance of `str'.
-- Alexandre
On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
On 6/5/07, Alexandre Vassalotti <[EMAIL PR
On 6/7/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> It's time to look at the original traceback (attached as "tb", after
> fixing the formatting problems). It looks like any call to
> encodings.normalize_encoding() causes this problem.
Don't know if it will help to know that, but it seems add
"Stephen J. Turnbull" <[EMAIL PROTECTED]> wrote:
> Josiah Carlson writes:
>
> > Maybe I'm missing something, but it seems to me that there might be a
> > simple solution. Don't normalize any identifiers or strings.
>
> That's not a solution, that's denying that there's a problem.
For core Py
On 6/7/07, Neal Norwitz <[EMAIL PROTECTED]> wrote:
> On 6/7/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > > tokenize.TokenError: ('EOF in multi-line statement', (315, 0))
> >
> > I analyzed that a bit further, and found that
> > Lib/distutils/unixccompiler.py:214 reads
> >
> > if not isinsta
On 6/7/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > tokenize.TokenError: ('EOF in multi-line statement', (315, 0))
>
> I analyzed that a bit further, and found that
> Lib/distutils/unixccompiler.py:214 reads
>
> if not isinstance(output_dir, (str, type(None)):
>
> This is a syntax error; a
> tokenize.TokenError: ('EOF in multi-line statement', (315, 0))
I analyzed that a bit further, and found that
Lib/distutils/unixccompiler.py:214 reads
if not isinstance(output_dir, (str, type(None)):
This is a syntax error; a closing parenthesis is missing.
tokenize.py chokes at the EOF as the
On 6/7/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote:
On 6/7/07, Ron Adam <[EMAIL PROTECTED]> wrote:
> Well not the bug yet, but I did find the file. :-)
>
> The following clears it so make will work.
>
> rm ./build/lib.linux-i686-3.0/_struct.so
>
> So maybe something to do with Module
On 6/7/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> What bothers me about the "sequence of code points" way of thinking is
> that len("Löwis") is nondeterministic.
It doesn't have to be, *for this specific example*. After what I've
read so far, I'm okay with normalization happening on the
On 6/7/07, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> I apologize for mistyping the example. *I* *was* talking about a
> string literal containing Unicode characters.
Then I misunderstood you too. To avoid such problems, I will use XML
character references to denote code points here. Wherev
> Then you wouldn't even be able to iterate over or index strings anymore,
> as that could produce such "invalid" strings, which would need to
> generate exceptions if you really want to ban them.
I don't think that's right: iterating over the string should
presumably generate an iteration of v
On 6/7/07, Ron Adam <[EMAIL PROTECTED]> wrote:
> Well not the bug yet, but I did find the file. :-)
>
> The following clears it so make will work.
>
> rm ./build/lib.linux-i686-3.0/_struct.so
>
> So maybe something to do with Modules/_struct.c, or would it be something
> else that uses it?
R
On 6/6/07, Neal Norwitz <[EMAIL PROTECTED]> wrote:
> This probably means there is a problem with marshalling the byte code
> out. The first run compiles the .pyc files. Theoretically this
> writes out the same thing in memory. This isn't always the case
> though (ie, when there are bugs).
>
> A
On 6/6/07, Greg Ewing <[EMAIL PROTECTED]> wrote:
> Are you suggesting that this should be done on the fly
> when comparing strings? Or that all strings should be
> stored in canonicalised form?
Preferably the second; store them canonicalized.
> I can see some big cans of worms being opened up by
On 6/5/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> >> > Unicode does say pretty clearly that (at least) canonical
> >> > equivalents must be treated the same.
On reflection, what it actually says is that you may not assume they
are different. They can be different in the same way that two
Guido van Rossum writes:
> No it cannot. We are talking about \u escapes, not about a string
> literal containing Unicode characters ("Löwis").
Ah, good point.
I apologize for mistyping the example. *I* *was* talking about a
string literal containing Unicode characters. However, on my
termin
Nick Coghlan schrieb:
> Guido van Rossum wrote:
>> A few PEPs with numbers < 400 are now targeting Python 3000, e.g. PEP
>> 367 (new super) and PEP 344 (exception chaining). Are there any
>> others? I propose that we renumber these to numbers in the 3100+
>> range. I can see two forms of renaming:
On Jun 6, 2007, at 9:21 PM, Chris Monson wrote:
> Renumbering, +1; using the next 31xx number, +1.
>
> Renumbering +1
> Leaving (old PEP number) in place as a stripped down PEP that just
> points to the new number: +1
I don't want to (accidentally)
Guido van Rossum wrote:
> A few PEPs with numbers < 400 are now targeting Python 3000, e.g. PEP
> 367 (new super) and PEP 344 (exception chaining). Are there any
> others? I propose that we renumber these to numbers in the 3100+
> range. I can see two forms of renaming:
>
> (a) 344 -> 3344 and 367
Neal Norwitz wrote:
> On 6/5/07, Ron Adam <[EMAIL PROTECTED]> wrote:
>> Alexandre Vassalotti wrote:
>> > On 6/5/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>> >> If "make clean" makes the problem go away, it's usually because there
>> >> were old .pyc files with incompatible byte code. We don't
Josiah Carlson writes:
> Maybe I'm missing something, but it seems to me that there might be a
> simple solution. Don't normalize any identifiers or strings.
That's not a solution, that's denying that there's a problem.
> Hear me out for a moment. People type what they want.
You're thinkin
When I originally tried to check in rev 55797, I got this exception:
Traceback (most recent call last):
File "/data/repos/projects/hooks/checkwhitespace.py", line 50, in ?
run_app(main)
File "/usr/lib/python2.3/site-packages/svn/core.py", line 33, in run_app
return apply(func, (pool,)