On Tue, Jun 22, 2010 at 11:58:57AM +0900, Stephen J. Turnbull wrote:
Toshio Kuratomi writes:
One comment here -- you can also have URIs that aren't decodable into their
true textual meaning using a single encoding.
Apache will happily serve out URIs that have utf-8, shift-jis,
On Jun 21, 2010, at 10:58 PM, Stephen J. Turnbull wrote:
The RFC says that URIs are text, and therefore they can (and IMO
should) be operated on as text in the stdlib.
No, *blue* is the best color for a shed.
Oops, wait, let me try that again.
While I broadly agree with this statement, it
Michael Urman writes:
It is somewhat troublesome that there doesn't appear to be an obvious
built-in idempotent-when-possible function that gives back the
provided bytes/str,
If you want something idempotent, it's already the case that
bytes(b'abc') = b'abc'. What might be desirable is to
There's an entry in whatsnew for 2.7 to the effect of The UserDict class is
now a new-style class.
I had thought there was a conscious decision to not change any existing classes
from old-style to new-style. IIRC, Martin had championed this idea and had
rejected all proposals to make
On 21 Jun, 2010, at 22:25, Antoine Pitrou wrote:
Le lundi 21 juin 2010 à 21:13 +0100, Michael Foord a écrit :
If OS X is a supported and important platform for Python then fixing all
problems that it reveals (or being willing to) should definitely not be
a pre-requisite of providing a
On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
This is a common pain-point for porting software to 3.x - you had a string,
it kinda worked most of the time before, but now you need to keep track of
text too and the functions which seemed to work on bytes no longer do.
Thanks Glyph.
P.J. Eby writes:
I know, it's a hard thing to wrap one's head around, since on the
surface it sounds like unicode is the programmer's savior.
I don't need to wrap my head around it. It's been deeply embedded,
point first, and the nasty barbs ensure that I have no desire to pull
it back out.
Glyph Lefkowitz writes:
On Jun 21, 2010, at 10:58 PM, Stephen J. Turnbull wrote:
Note also that the complete solution argument cuts both ways. Eg, a
complete solution should implement UTS 39 confusables detection[1]
and IDNA[2]. Good luck doing that with bytes!
And good luck
hello,
how can i simply add new functions to module after its initialization
(Py_InitModule())? I'm missing something like
PyModule_AddCFunction().
thank you
L.
___
Python-Dev mailing list
Python-Dev@python.org
how can i simply add new functions to module after its initialization
(Py_InitModule())? I'm missing something like
PyModule_AddCFunction().
This type of question really belongs to python-list aka
comp.lang.python which I CC-d now. Please keep the discussion on that
list.
Cheers,
Daniel
--
On Tue, Jun 22, 2010 at 4:49 PM, Stephen J. Turnbull step...@xemacs.org wrote:
Which works if and only if your outputs are truly unicode-able.
With PEP 383, they always are, as long as you allow Unicode to be
decoded to the same garbage your bytes-based program would have
produced anyway.
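For readers who have not used PEP 383, the round trip described above can be sketched roughly as follows. This is only an illustration, assuming Python 3 and UTF-8 as the nominal encoding; the sample bytes are made up and are not from the thread.

```python
# Illustrative input: 0xff can never occur in valid UTF-8.
raw = b"caf\xff"

# The surrogateescape error handler maps each undecodable byte to a lone
# surrogate code point (0xff becomes U+DCFF) instead of raising an error.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into the
# original byte, so the "garbage" survives unchanged.
round_tripped = text.encode("utf-8", errors="surrogateescape")

print(ascii(text))
print(round_tripped == raw)
```

The decoded string is not meaningful text where the escaped byte sits, but it can be passed around as str and written back out byte for byte.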
On Tue, 22 Jun 2010 11:46:27 am Terry Reedy wrote:
3. Unicode disclaims direct representation of glyphic variants
(though again, exceptions were made for Asian acceptance). For
example, in English, mechanically printed 'a' and 'g' are different
from manually printed 'a' and 'g'. Representing
Toshio Kuratomi writes:
I'll definitely buy that. Would urljoin(b_base, b_subdir) = bytes and
urljoin(u_base, u_subdir) = unicode be acceptable though?
Probably.
But it doesn't matter what I say, since Guido has defined that as
polymorphism and approved it in principle.
(I think,
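As a rough illustration of the polymorphism being discussed: later Python 3 releases of urllib.parse behave along these lines, accepting either all-bytes or all-str arguments and returning the same type. The URLs below are hypothetical, and this shows current library behavior rather than anything that existed when this message was written.

```python
from urllib.parse import urljoin

# Hypothetical inputs chosen for illustration.
b_base, b_subdir = b"http://example.com/a/", b"sub/page"
u_base, u_subdir = "http://example.com/a/", "sub/page"

bytes_result = urljoin(b_base, b_subdir)  # bytes in, bytes out
text_result = urljoin(u_base, u_subdir)   # str in, str out

# Mixing the two types is rejected rather than silently coerced.
try:
    urljoin(b_base, u_subdir)
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(type(bytes_result).__name__, type(text_result).__name__, mixed_ok)
```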
Nick Coghlan writes:
On Tue, Jun 22, 2010 at 4:49 PM, Stephen J. Turnbull step...@xemacs.org
wrote:
Which works if and only if your outputs are truly unicode-able.
With PEP 383, they always are, as long as you allow Unicode to be
decoded to the same garbage your bytes-based
On Tue, Jun 22, 2010 at 2:21 AM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
I had thought there was a conscious decision to not change any existing
classes from old-style to new-style.
I thought so as well. Changing any class from old-style to new-style
risks breaking applications in
2010/6/22 Raymond Hettinger raymond.hettin...@gmail.com:
There's an entry in whatsnew for 2.7 to the effect of The UserDict class is
now a new-style class.
I had thought there was a conscious decision to not change any existing
classes from old-style to new-style. IIRC, Martin had championed
On Tue, Jun 22, 2010 at 2:40 PM, Fred Drake fdr...@acm.org wrote:
On Tue, Jun 22, 2010 at 2:21 AM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
I had thought there was a conscious decision to not change any existing
classes from old-style to new-style.
I thought so as well. Changing
On Tue, Jun 22, 2010 at 00:28, Stephen J. Turnbull step...@xemacs.org wrote:
Michael Urman writes:
It is somewhat troublesome that there doesn't appear to be an obvious
built-in idempotent-when-possible function that gives back the
provided bytes/str,
If you want something idempotent,
[Just addressing one little issue here; generally I'm just happy that
we're discussing this issue in such detail from so many points of
view.]
On Mon, Jun 21, 2010 at 10:50 PM, Toshio Kuratomi a.bad...@gmail.com wrote:
[...] Would urljoin(b_base, b_subdir) = bytes and
urljoin(u_base, u_subdir) =
Jesse Noller wrote:
On Jun 19, 2010, at 10:13 AM, Tres Seaver tsea...@palladion.com wrote:
Nothing is set in stone; if something is incredibly painful, or worse
yet broken, then someone needs to file a bug, bring it to this list,
or bring up a
On 22 Jun, 2010, at 3:38, Alexander Belopolsky wrote:
On Mon, Jun 21, 2010 at 6:16 PM, Martin v. Löwis mar...@v.loewis.de wrote:
The test_posix failure is a regression from 2.6 (but it only shows up on
some machines - it is caused by a fairly braindead implementation of a
couple of posix
It looks like simplejson 2.1.0 and 2.1.1 have been released:
http://bob.pythonmac.org/archives/2010/03/10/simplejson-210/
http://bob.pythonmac.org/archives/2010/03/31/simplejson-211/
It looks like any changes that didn't come from the Python tree didn't
go into the Python tree, either.
I guess
2010/6/22 Dirkjan Ochtman dirk...@ochtman.nl:
I guess we can't put these changes into 2.7 anymore? How can we make
this better next time?
Never have externally maintained packages.
--
Regards,
Benjamin
On Jun 22, 2010, at 5:48 AM, Benjamin Peterson wrote:
2010/6/22 Raymond Hettinger raymond.hettin...@gmail.com:
There's an entry in whatsnew for 2.7 to the effect of The UserDict class is
now a new-style class.
I had thought there was a conscious decision to not change any existing
classes
On Tue, Jun 22, 2010 at 6:31 AM, Stephen J. Turnbull step...@xemacs.org wrote:
Toshio Kuratomi writes:
I'll definitely buy that. Would urljoin(b_base, b_subdir) = bytes and
urljoin(u_base, u_subdir) = unicode be acceptable though?
Probably.
But it doesn't matter what I say, since
On Tue, Jun 22, 2010 at 12:39 PM, Ronald Oussoren
ronaldousso...@mac.com wrote:
..
Both are valid fixes, both have both advantages and disadvantages.
Your proposal:
* Reverts to the behavior in 2.6
* Ensures that posix.getgroups and posix.setgroups are internally consistent
It is also very
2010/6/22 Raymond Hettinger raymond.hettin...@gmail.com:
On Jun 22, 2010, at 5:48 AM, Benjamin Peterson wrote:
2010/6/22 Raymond Hettinger raymond.hettin...@gmail.com:
There's an entry in whatsnew for 2.7 to the effect of The UserDict class is
now a new-style class.
I had thought there was
Alexander Belopolsky alexander.belopol...@gmail.com wrote:
On Mon, Jun 21, 2010 at 9:26 PM, Bill Janssen jans...@parc.com wrote:
..
Though, isn't that behavior of urllib.proxy_bypass another bug?
I don't know. Ask Ronald.
Hmmm. I brought up the System Preferences panel on my Mac, and
On Tue, Jun 22, 2010 at 08:31:13PM +0900, Stephen J. Turnbull wrote:
Toshio Kuratomi writes:
unicode handling redesign. I'm stating my reading of the RFC not to defend
the use case Philip has, but because I think that the outlook that non-text
uris (before being percentencoded) are
On Jun 22, 2010, at 10:08 AM, Benjamin Peterson wrote:
. There was a typo in
abc.py which prevented it from raising errors when non new-style class
objects were passed in.
For 2.x, that was probably a good thing, a happy accident
that made it possible to register existing mapping classes
as a
On Mon, Jun 21, 2010 at 10:28 PM, Stephen J. Turnbull
step...@xemacs.org wrote:
Michael Urman writes:
It is somewhat troublesome that there doesn't appear to be an obvious
built-in idempotent-when-possible function that gives back the
provided bytes/str,
If you want something
On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:
Similarly I'd expect (from experience) that a programmer using
Python to want to take the same approach, sticking with unencoded
data in nearly all situations.
Yeah. This is a real issue I have with the direction Python3 went: it
pushes you
Guido van Rossum wrote:
[Just addressing one little issue here; generally I'm just happy that
we're discussing this issue in such detail from so many points of
view.]
On Mon, Jun 21, 2010 at 10:50 PM, Toshio Kuratomi a.bad...@gmail.com wrote:
[...] Would urljoin(b_base, b_subdir) = bytes
[cc'ing Bob on his gmail address; didn't have any other address handy
so I don't know if this will actually get to him]
On Tue, Jun 22, 2010 at 09:54, Dirkjan Ochtman dirk...@ochtman.nl wrote:
It looks like simplejson 2.1.0 and 2.1.1 have been released:
On Tuesday, June 22, 2010, Brett Cannon br...@python.org wrote:
[cc'ing Bob on his gmail address; didn't have any other address handy
so I don't know if this will actually get to him]
On Tue, Jun 22, 2010 at 09:54, Dirkjan Ochtman dirk...@ochtman.nl wrote:
It looks like simplejson 2.1.0 and
On 6/22/2010 1:22 AM, Glyph Lefkowitz wrote:
The thing that I have heard in passing from a couple of folks with
experience in this area is that some older software in Asia would
present characters differently if they were originally encoded in a
Japanese encoding versus a Chinese encoding, even
On 6/22/2010 9:24 AM, Michael Urman wrote:
By idempotent-when-possible, I mean to_bytes(str_or_bytes, encoding,
errors) that would pass an instance of bytes through, or encode an
instance of str. And of course a to_str that performs similarly,
passing str through and decoding bytes. While
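A minimal sketch of the two helpers described here, assuming Python 3 and treating the names to_bytes and to_str, the argument order, and the default values as hypothetical (only the pass-through-or-convert behavior comes from the message):

```python
def to_bytes(value, encoding="utf-8", errors="strict"):
    """Return bytes unchanged; encode str with the given encoding."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def to_str(value, encoding="utf-8", errors="strict"):
    """Return str unchanged; decode bytes with the given encoding."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


# Applying either helper twice gives the same result as applying it once.
once = to_bytes("abc")
twice = to_bytes(once)
print(once == twice, to_str(twice))
```

Raising TypeError for other types is one possible design choice; the message itself does not say what should happen in that case.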
On 6/22/2010 12:53 PM, Guido van Rossum wrote:
On Mon, Jun 21, 2010 at 11:47 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
This is a common pain-point for porting software to 3.x - you had a
string, it kinda worked most of
On 6/22/2010 6:52 AM, Steven D'Aprano wrote:
On Tue, 22 Jun 2010 11:46:27 am Terry Reedy wrote:
3. Unicode disclaims direct representation of glyphic variants
(though again, exceptions were made for Asian acceptance). For
example, in English, mechanically printed 'a' and 'g' are different
from
Hello,
The method in question: http://docs.python.org/library/cgi.html#cgi.escape
http://svn.python.org/view/python/tags/r265/Lib/cgi.py?view=markup # at
the bottom
Convert the characters '&', '<' and '>' in string s to HTML-safe sequences.
Use this if you need to display text that might contain
On Tue, Jun 22, 2010 at 1:07 PM, James Y Knight f...@fuhm.net wrote:
The surrogateescape method is a nice workaround for this, but I can't help
thinking that it might've been better to just treat stuff as
possibly-invalid-but-probably-utf8 byte-strings from input, through
processing, to
Tres, I am a Python3 enthusiast and realist. I did not expect major
adoption for about 3 years (more optimistic than the 5 years of some).
If you are feeling pressured to 'move' to Python3, it is not from me. I
am sure you will do so on your own, perhaps even with enthusiasm, when
it will be
On Tue, Jun 22, 2010 at 9:37 AM, Tres Seaver tsea...@palladion.com wrote:
Any turdiness (which I am *not* arguing for) is a natural consequence
of the kinds of backward incompatibilities which were *not* ruled out
for Python 3, along with the (early, now waning) build it and they will
come
Craig Younkins cyounk...@gmail.com wrote:
cgi.escape never escapes single quote characters, which can easily lead to a
Cross-Site Scripting (XSS) vulnerability. This seems to be known by many,
but a quick search reveals many are using cgi.escape for HTML attribute
escaping.
Did you file a
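To make the concern concrete, here is a small sketch using html.escape, a later standard-library function that is not mentioned in this thread. The attacker-controlled string is invented for illustration. With quote=False only '&', '<' and '>' are converted, which leaves the single quote untouched; with the default quote=True both quote characters are converted as well.

```python
import html

# Hypothetical attacker-controlled value destined for a single-quoted
# HTML attribute such as <a title='...'>.
user_input = "' onmouseover='alert(1)"

# Only &, < and > are converted, so the single quotes pass through.
unsafe = html.escape(user_input, quote=False)

# The default also converts " and ', which is what attribute values need.
safe = html.escape(user_input)

print(unsafe)
print(safe)
```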
On Wed, Jun 23, 2010 at 6:09 AM, M.-A. Lemburg m...@egenix.com wrote:
return constant.encode('utf-8')
So now you can write x.split(literal_as('', x)).
This polymorphism is what we used in Python2 a lot to write
code that works for both Unicode and 8-bit strings.
Unfortunately,
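Only the last line of literal_as and the call shape are visible in the quoted text, so the following is an assumed reconstruction for Python 3, not the original code. The parameter names and the comma separator are chosen for illustration.

```python
def literal_as(constant, variable):
    """Return the str literal `constant` in the same type as `variable`."""
    if isinstance(variable, bytes):
        # ASCII-only literals encode identically in UTF-8.
        return constant.encode('utf-8')
    return constant


# The same call works for text and for bytes.
for x in ("a,b,c", b"a,b,c"):
    print(x.split(literal_as(',', x)))
```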
This effectively substitutes getgrouplist called on the current user
for getgroups. In 3.x, I believe the correct action will be to
provide direct access to getgrouplist which, while not POSIX (yet?),
is widely available.
As a policy, adding non-POSIX functions to the posix module is
On Tue, Jun 22, 2010 at 12:56 PM, Benjamin Peterson benja...@python.org wrote:
Never have externally maintained packages.
Seriously! I concur with this.
Fortunately, it's not a real problem in this case.
There's the (maintained) simplejson package, and the unmaintained json
package. And
On Wed, Jun 23, 2010 at 2:17 AM, Guido van Rossum gu...@python.org wrote:
(1) Literals.
If you write something like x.split('') you are implicitly assuming x
is text. I don't see a very clean way to overcome this; you'll have to
implement some kind of type check e.g.
x.split('') if
On Wed, Jun 23, 2010 at 4:09 AM, M.-A. Lemburg m...@egenix.com wrote:
It would be great if we could have something like the above as
builtin method:
x.split(''.as(x))
As per my other message, another possible (and reasonably intuitive)
spelling would be:
x.split(x.coerce(''))
Writing it
Benjamin Peterson wrote:
IIRC this was because UserDict tries to be a MutableMapping but abcs
require new style classes.
Are there any use cases for UserList and UserDict in new
code, now that list and dict can be subclassed?
If not, I don't think it would be a big problem if they
were left
On 23/06/2010 00:03, Greg Ewing wrote:
Benjamin Peterson wrote:
IIRC this was because UserDict tries to be a MutableMapping but abcs
require new style classes.
Are there any use cases for UserList and UserDict in new
code, now that list and dict can be subclassed?
Inheriting from list or
On 22/06/2010 22:40, Robert Collins wrote:
On Wed, Jun 23, 2010 at 6:09 AM, M.-A. Lemburg m...@egenix.com wrote:
return constant.encode('utf-8')
So now you can write x.split(literal_as('', x)).
This polymorphism is what we used in Python2 a lot to write
code that works
On Jun 22, 2010, at 3:59 PM, Michael Foord wrote:
On 23/06/2010 00:03, Greg Ewing wrote:
Benjamin Peterson wrote:
IIRC this was because UserDict tries to be a MutableMapping but abcs
require new style classes.
Are there any use cases for UserList and UserDict in new
code, now that list
On 22/06/2010 19:07, James Y Knight wrote:
On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:
Similarly I'd expect (from experience) that a programmer using Python
to want to take the same approach, sticking with unencoded data in
nearly all situations.
Yeah. This is a real issue I have with
On Tue, Jun 22, 2010 at 11:17 AM, Guido van Rossum gu...@python.org wrote:
(2) Data sources.
These can be functions that produce new data from non-string data,
e.g. str(int), read it from a named file, etc. An example is read()
vs. write(): it's easy to create a (hypothetical) polymorphic
At 07:41 AM 6/23/2010 +1000, Nick Coghlan wrote:
Then my example above could be made polymorphic (for ASCII compatible
encodings) by writing:
[x for x in seq if x.endswith(x.coerce(b))]
I'm trying to see downsides to this idea, and I'm not really seeing
any (well, other than 2.7 being almost
On Jun 22, 2010, at 12:53 PM, Guido van Rossum wrote:
On Mon, Jun 21, 2010 at 11:47 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
This is a common pain-point for porting software to 3.x - you had a
string, it kinda worked
On Jun 22, 2010, at 2:07 PM, James Y Knight wrote:
Yeah. This is a real issue I have with the direction Python3 went: it pushes
you into decoding everything to unicode early, even when you don't care --
all you really wanted to do is pass it from one API to another, with some
well-defined
On Jun 22, 2010, at 7:23 PM, Ian Bicking wrote:
This is a place where bytes+encoding might also have some benefit. XML is
someplace where you might load a bunch of data but only touch a little bit of
it, and the amount of data is frequently large enough that the efficiencies
are
On Tue, Jun 22, 2010 at 15:32, Terry Reedy tjre...@udel.edu wrote:
On 6/22/2010 9:24 AM, Michael Urman wrote:
These are trivial functions;
I just don't fully understand why the capability isn't baked in.
Possible reasons: They are special purpose functions easily built on the
basic functions
On Tue, Jun 22, 2010 at 4:23 PM, Ian Bicking i...@colorstudy.com wrote:
This reminds me of the optimization ElementTree and lxml made in Python 2
(not sure what they do in Python 3?) where they use str when a string is
ASCII to avoid the memory and performance overhead of unicode.
An
On Wed, Jun 23, 2010 at 12:25 PM, Glyph Lefkowitz
gl...@twistedmatrix.com wrote:
I can also appreciate what's been said in this thread a bunch of times: to my
knowledge, nobody has actually shown a profile of an application where
encoding is significant overhead. I believe that encoding
Bill Janssen jans...@parc.com wrote:
Considering that we've just released 2.7rc2, there are an awful lot of
red buildbots for 2.7. In fact, I don't remember having seen a green
buildbot for OS X and 2.7. Shouldn't these be fixed?
Thanks to some action by Ronald, my two PPC OS X buildbots
On Tue, Jun 22, 2010 at 7:17 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
Benjamin fixed the UserDict and ABC problem earlier today in r82155.
It is now the same as it was in Py2.6.
Thanks, Benjamin!
-Fred
--
Fred L. Drake, Jr. fdrake at gmail.com
A storm broke loose in my