R. David Murray wrote:
Having such a poly_str type would probably make my life easier.
A thought on this poly_str type: perhaps it could be
called ascii, since that's what it would have to be
restricted to, and have
a'xxx'
as a literal syntax for it, seeing as literals seem to
be one of
On Mon, Jun 28, 2010 at 08:28:45PM +1200, Greg Ewing wrote:
A thought on this poly_str type: perhaps it could be
called ascii, since that's what it would have to be
restricted to, and have
a'xxx'
as a literal syntax for it, seeing as literals seem to
be one of its main use cases.
This
On Mon, 28 Jun 2010 13:55:26 +0530, Senthil Kumaran orsent...@gmail.com wrote:
On Mon, Jun 28, 2010 at 08:28:45PM +1200, Greg Ewing wrote:
Thinking way outside the square, and probably the pale
as well, maybe @ could be pressed into service as an
infix operator, with
s...@i
On Mon, Jun 28, 2010 at 6:28 PM, Greg Ewing greg.ew...@canterbury.ac.nz wrote:
R. David Murray wrote:
Having such a poly_str type would probably make my life easier.
A thought on this poly_str type: perhaps it could be
called ascii, since that's what it would have to be
restricted to, and
On Sat, 26 Jun 2010 23:49:11 -0400
P.J. Eby p...@telecommunity.com wrote:
Remember, bytes and strings already have to detect mixed-type
operations.
Not in Python 3. They just raise a TypeError on bad
(mixed-type) arguments.
Regards
Antoine.
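To make the point about Python 3 concrete, here is a minimal sketch showing that same-type concatenation works while a mixed str/bytes pair raises TypeError. The helper name try_concat is made up for this illustration and does not come from the thread.

```python
def try_concat(left, right):
    """Return left + right, or the TypeError raised by a mixed-type pair."""
    try:
        return left + right
    except TypeError as exc:
        # Python 3 does no implicit str <-> bytes coercion.
        return exc


same_text = try_concat("abc", "def")
same_bytes = try_concat(b"abc", b"def")
mixed = try_concat("abc", b"def")

print(repr(same_text), repr(same_bytes), type(mixed).__name__)
```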
P.J. Eby writes:
At 12:42 PM 6/26/2010 +0900, Stephen J. Turnbull wrote:
What I'm saying here is that if bytes are the signal of validity, and
the stdlib functions preserve validity, then it's better to have the
stdlib functions object to unicode data as an argument. Compare the
At 03:53 PM 6/27/2010 +1000, Nick Coghlan wrote:
We could talk about this even longer, but the most effective way
forward is going to be a patch that improves the URL parsing
situation.
Certainly, it's the only practical solution for the immediate problems in 3.2.
I only mentioned that I hate
I've been watching this discussion with intense interest, but have
been so lagged in following the thread that I haven't replied.
I got caught up today
On Sun, 27 Jun 2010 15:53:59 +1000, Nick Coghlan ncogh...@gmail.com wrote:
The difference is that we have three classes of algorithm here:
At 12:42 PM 6/26/2010 +0900, Stephen J. Turnbull wrote:
What I'm saying here is that if bytes are the signal of validity, and
the stdlib functions preserve validity, then it's better to have the
stdlib functions object to unicode data as an argument. Compare the
alternative: it returns a
On Sun, Jun 27, 2010 at 4:17 AM, P.J. Eby p...@telecommunity.com wrote:
The idea that I'm proposing is that the basic string and byte types should
defer to user-defined string types for mixed type operations, so that
polymorphism of string-manipulation functions is the *default* case, rather
At 12:43 PM 6/27/2010 +1000, Nick Coghlan wrote:
While full support for third party strings and
byte sequence implementations is an interesting idea, I think it's
overkill for the specific problem of making it easier to write
str/bytes agnostic functions for tasks like URL parsing.
OTOH, to
On Sun, Jun 27, 2010 at 1:49 PM, P.J. Eby p...@telecommunity.com wrote:
I just hate the idea that functions taking strings should have to be
*rewritten* to be explicitly type-agnostic. It seems *so* un-Pythonic...
like if all the bitmasking functions you'd ever written using 32-bit int
Guido van Rossum writes:
On Thu, Jun 24, 2010 at 1:12 AM, Stephen J. Turnbull step...@xemacs.org
wrote:
Understood, but both the majority of str/bytes methods and several
existing APIs (e.g. many in the os module, like os.listdir()) do it
this way.
Understood.
Also, IMO a
P.J. Eby writes:
This doesn't have to be in the functions; it can be in the
*types*. Mixed-type string operations have to do type checking and
upcasting already, but if the protocol were open, you could make an
encoded-bytes type that would handle the error checking.
Don't you
At 04:49 PM 6/25/2010 +0900, Stephen J. Turnbull wrote:
P.J. Eby writes:
This doesn't have to be in the functions; it can be in the
*types*. Mixed-type string operations have to do type checking and
upcasting already, but if the protocol were open, you could make an
encoded-bytes type
On Fri, Jun 25, 2010 at 2:05 AM, Stephen J. Turnbull step...@xemacs.org wrote:
But join('x', 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make
sense to me.
So, actually, I *don't* understand what you mean by needing LBYL.
Consider docutils. Some folks assert that URIs *are* bytes and
Ian Bicking writes:
I don't get what you are arguing against. Are you worried that if
we make URL code polymorphic that this will mean some code will
treat URLs as bytes, and that code will be incompatible with URLs
as text? No one is arguing we remove text support from any of
these
At 01:18 AM 6/26/2010 +0900, Stephen J. Turnbull wrote:
It seems to me what is wanted here is something like Perl's taint
mechanism, for *both* kinds of strings. Am I missing something?
You could certainly view it as a kind of tainting. The part where
the type would be bytes-based is indeed
P.J. Eby writes:
it's just that if you already have the bytes, and all you want to
do is tag them (e.g. the WSGI headers case), the extra encoding
step seems pointless.
Well, I'll have to concede that unless and until I get involved in the
WSGI development effort. <wink>
But with your
Guido van Rossum writes:
For example: how we can make the suite of functions used for URL
processing more polymorphic, so that each developer can choose for
herself how URLs need to be treated in her application.
While you have come down on the side of polymorphism (as opposed to
separate
On Tue, Jun 22, 2010 at 20:07, James Y Knight f...@fuhm.net wrote:
Yeah. This is a real issue I have with the direction Python3 went: it pushes
you into decoding everything to unicode early, even when you don't care --
Well, yes, maybe even if *you* don't care. But often the functions you
need
Lennart Regebro wrote:
On Tue, Jun 22, 2010 at 20:07, James Y Knight f...@fuhm.net wrote:
Yeah. This is a real issue I have with the direction Python3 went: it pushes
you into decoding everything to unicode early, even when you don't care --
Well, yes, maybe even if *you* don't care. But
On 24/06/2010 11:58, M.-A. Lemburg wrote:
Lennart Regebro wrote:
On Tue, Jun 22, 2010 at 20:07, James Y Knight f...@fuhm.net wrote:
Yeah. This is a real issue I have with the direction Python3 went: it pushes
you into decoding everything to unicode early, even when you don't care --
On Thu, Jun 24, 2010 at 1:12 AM, Stephen J. Turnbull step...@xemacs.org wrote:
Guido van Rossum writes:
For example: how we can make the suite of functions used for URL
processing more polymorphic, so that each developer can choose for
herself how URLs need to be treated in her
On Fri, Jun 25, 2010 at 12:33 AM, Guido van Rossum gu...@python.org wrote:
Also, IMO a polymorphic function should *not* accept *mixed*
bytes/text input -- join('x', b'y') should be rejected. But join('x',
'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me.
A policy of allowing
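The rule quoted above (same-type input gives same-type output, mixed input is rejected) could be sketched roughly as follows. This join is a hypothetical stand-in written for illustration, not an existing stdlib function, and the '/' separator is taken from the quoted example.

```python
def join(base, part):
    """Join two pieces with '/', returning the same type as the inputs."""
    if isinstance(base, str) and isinstance(part, str):
        sep = "/"
    elif isinstance(base, bytes) and isinstance(part, bytes):
        sep = b"/"
    else:
        # Mixed str/bytes input is refused instead of being coerced.
        raise TypeError("join() arguments must be both str or both bytes")
    return base + sep + part


try:
    join("x", b"y")
except TypeError:
    mixed_rejected = True
else:
    mixed_rejected = False
```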
On Thu, Jun 24, 2010 at 8:25 AM, Nick Coghlan ncogh...@gmail.com wrote:
On Fri, Jun 25, 2010 at 12:33 AM, Guido van Rossum gu...@python.org wrote:
Also, IMO a polymorphic function should *not* accept *mixed*
bytes/text input -- join('x', b'y') should be rejected. But join('x',
'y') -> 'x/y' and
P.J. Eby a écrit :
[...] stdlib constants are almost always ASCII,
and the main use cases for ebytes would involve ascii-extended encodings.)
Then, how about a new ascii string literal? This would produce a special kind
of string that would coerce to a normal string when mixed with a str,
At 05:12 PM 6/24/2010 +0900, Stephen J. Turnbull wrote:
Guido van Rossum writes:
For example: how we can make the suite of functions used for URL
processing more polymorphic, so that each developer can choose for
herself how URLs need to be treated in her application.
While you have come
On Fri, Jun 25, 2010 at 3:07 AM, P.J. Eby p...@telecommunity.com wrote:
(Btw, in some earlier emails, Stephen, you implied that this could be fixed
with codecs -- but it can't, because the problem isn't with the bytes
containing invalid Unicode, it's with the Unicode containing invalid bytes
On Fri, Jun 25, 2010 at 1:41 AM, Guido van Rossum gu...@python.org wrote:
I don't think we should abuse sum for this. A simple idiom to get the
*empty* string of a particular type is x[:0] so you could write
something like this to concatenate a list or strings or bytes:
xs[:0].join(xs). Note
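A small sketch of the x[:0] idiom, for reference. It takes the empty value from the first element of the list, which is an assumption of this sketch about how the idiom would be applied; the function name concat is likewise made up here.

```python
def concat(xs):
    """Concatenate a non-empty list of str, or a non-empty list of bytes."""
    if not xs:
        # With no elements there is nothing to take the empty value from.
        raise ValueError("cannot infer the element type of an empty list")
    empty = xs[0][:0]  # '' for str items, b'' for bytes items
    return empty.join(xs)
```

The same function body then serves both types without naming either of them.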
Ian Bicking writes:
Just for perspective, I don't know if I've ever wanted to deal with a URL
like that.
Ditto, I do many times a day for Japanese media sites and Wikipedia.
I know how it is supposed to work, and I know what a browser does
with that, but so many tools will clean that
James Y Knight writes:
The surrogateescape method is a nice workaround for this, but I can't
help thinking that it might've been better to just treat stuff as
possibly-invalid-but-probably-utf8 byte-strings from input, through
processing, to output.
This is the world we already
Nick Coghlan wrote:
On Wed, Jun 23, 2010 at 4:09 AM, M.-A. Lemburg m...@egenix.com wrote:
It would be great if we could have something like the above as
builtin method:
x.split(''.as(x))
As per my other message, another possible (and reasonably intuitive)
spelling would be:
On Wed, Jun 23, 2010 at 7:18 PM, M.-A. Lemburg m...@egenix.com wrote:
Note that the point of using a builtin method was to get
better performance. Such type adaptions are often needed in
loops, so adding a few extra Python function calls just to
convert a str object to a bytes object or
At 08:34 PM 6/22/2010 -0400, Glyph Lefkowitz wrote:
I suspect the practical problem here is that there's no CharacterString ABC
That, and the absence of a string coercion protocol so that mixing
your custom string with standard strings will do the right thing for
your intended use.
Stephen J. Turnbull wrote:
We do need str-based implementations of modules like urllib.
Why would that be? URLs aren't text, and never will be. The fact that
to the eye they may seem to be text-ish doesn't make them text. This
*is* a case where
On Wed, Jun 23, 2010 at 8:30 AM, Tres Seaver tsea...@palladion.com wrote:
Stephen J. Turnbull wrote:
We do need str-based implementations of modules like urllib.
Why would that be? URLs aren't text, and never will be. The fact that
to the eye they may seem to be text-ish doesn't make them
On Jun 23, 2010, at 08:43 AM, Guido van Rossum wrote:
So I propose that we drop the discussion "are URLs text or bytes" and
try to find something more pragmatic to discuss.
email has exactly the same question, and the answer is yes. <wink>
For example: how we can make the suite of functions used
Tres Seaver tsea...@palladion.com wrote:
Stephen J. Turnbull wrote:
We do need str-based implementations of modules like urllib.
Why would that be? URLs aren't text, and never will be. The fact that
to the eye they may seem to be text-ish doesn't make them text. This
URLs are exactly
On Wed, Jun 23, 2010 at 10:30 AM, Tres Seaver tsea...@palladion.com wrote:
Stephen J. Turnbull wrote:
We do need str-based implementations of modules like urllib.
Why would that be? URLs aren't text, and never will be. The fact that
to the eye they may seem to be text-ish doesn't make
Guido van Rossum gu...@python.org wrote:
So I propose that we drop the discussion "are URLs text or bytes" and
try to find something more pragmatic to discuss.
For example: how we can make the suite of functions used for URL
processing more polymorphic, so that each developer can choose for
Oops, I forgot some important quoting (important for the algorithm,
maybe not actually for the discussion)...
from urllib.parse import urlsplit, urlunsplit
import encodings.idna
# urllib.parse.quote both always returns str, and is not as
# conservative in quoting as required here...
def
On Jun 22, 2010, at 8:57 PM, Robert Collins wrote:
bzr has a cache of decoded strings in it precisely because decode is
slow. We accept slowness encoding to the user's locale because that's
typically much less data to examine than we've examined while
generating the commit/diff/whatever. We
Bill Janssen wrote:
The bigger problem seems to be that we're revisiting the design
discussion about urllib.parse from the summer of 2008. See
http://bugs.python.org/issue3300 if you want to recall how we hashed
this out 2 years ago. I didn't
On Wed, 23 Jun 2010 14:23:33 -0400
Tres Seaver tsea...@palladion.com wrote:
Perhaps such decisions need revisiting in light of subsequent experience
/ pain / learning. E.g:
- - the repeated inability of the web-sig to converge on appropriate
semantics for a Python3-compatible version of
On Wed, Jun 23, 2010 at 09:36:45PM +0200, Antoine Pitrou wrote:
On Wed, 23 Jun 2010 14:23:33 -0400
Tres Seaver tsea...@palladion.com wrote:
- - the slow adoption / porting rate of major web frameworks and libraries
to Python 3.
Some of the major web frameworks and libraries have a ton
On Wed, 23 Jun 2010 17:30:22 -0400
Toshio Kuratomi a.bad...@gmail.com wrote:
Note that this assumption seems optimistic to me. I started talking to Graham
Dumpleton, author of mod_wsgi a couple years back because mod_wsgi and paste
do decoding of bytes to unicode at different layers which
On Wed, Jun 23, 2010 at 11:35:12PM +0200, Antoine Pitrou wrote:
On Wed, 23 Jun 2010 17:30:22 -0400
Toshio Kuratomi a.bad...@gmail.com wrote:
Note that this assumption seems optimistic to me. I started talking to
Graham
Dumpleton, author of mod_wsgi a couple years back because mod_wsgi
On Tue, Jun 22, 2010 at 11:58:57AM +0900, Stephen J. Turnbull wrote:
Toshio Kuratomi writes:
One comment here -- you can also have uri's that aren't decodable into
their
true textual meaning using a single encoding.
Apache will happily serve out uris that have utf-8, shift-jis,
On Jun 21, 2010, at 10:58 PM, Stephen J. Turnbull wrote:
The RFC says that URIs are text, and therefore they can (and IMO
should) be operated on as text in the stdlib.
No, *blue* is the best color for a shed.
Oops, wait, let me try that again.
While I broadly agree with this statement, it
On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
This is a common pain-point for porting software to 3.x - you had a string,
it kinda worked most of the time before, but now you need to keep track of
text too and the functions which seemed to work on bytes no longer do.
Thanks Glyph.
Glyph Lefkowitz writes:
On Jun 21, 2010, at 10:58 PM, Stephen J. Turnbull wrote:
Note also that the complete solution argument cuts both ways. Eg, a
complete solution should implement UTS 39 confusables detection[1]
and IDNA[2]. Good luck doing that with bytes!
And good luck
Toshio Kuratomi writes:
I'll definitely buy that. Would urljoin(b_base, b_subdir) = bytes and
urljoin(u_base, u_subdir) = unicode be acceptable though?
Probably.
But it doesn't matter what I say, since Guido has defined that as
polymorphism and approved it in principle.
(I think,
[Just addressing one little issue here; generally I'm just happy that
we're discussing this issue in such detail from so many points of
view.]
On Mon, Jun 21, 2010 at 10:50 PM, Toshio Kuratomi a.bad...@gmail.com wrote:
[...] Would urljoin(b_base, b_subdir) = bytes and
urljoin(u_base, u_subdir) =
On Tue, Jun 22, 2010 at 6:31 AM, Stephen J. Turnbull step...@xemacs.org wrote:
Toshio Kuratomi writes:
I'll definitely buy that. Would urljoin(b_base, b_subdir) = bytes and
urljoin(u_base, u_subdir) = unicode be acceptable though?
Probably.
But it doesn't matter what I say, since
On Tue, Jun 22, 2010 at 08:31:13PM +0900, Stephen J. Turnbull wrote:
Toshio Kuratomi writes:
unicode handling redesign. I'm stating my reading of the RFC not to defend
the use case Philip has, but because I think that the outlook that non-text
uris (before being percentencoded) are
On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:
Similarly I'd expect (from experience) that a programmer using
Python to want to take the same approach, sticking with unencoded
data in nearly all situations.
Yeah. This is a real issue I have with the direction Python3 went: it
pushes you
Guido van Rossum wrote:
[Just addressing one little issue here; generally I'm just happy that
we're discussing this issue in such detail from so many points of
view.]
On Mon, Jun 21, 2010 at 10:50 PM, Toshio Kuratomi a.bad...@gmail.com wrote:
[...] Would urljoin(b_base, b_subdir) = bytes
On 6/22/2010 1:22 AM, Glyph Lefkowitz wrote:
The thing that I have heard in passing from a couple of folks with
experience in this area is that some older software in Asia would
present characters differently if they were originally encoded in a
Japanese encoding versus a Chinese encoding, even
On 6/22/2010 12:53 PM, Guido van Rossum wrote:
On Mon, Jun 21, 2010 at 11:47 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
This is a common pain-point for porting software to 3.x - you had a
string, it kinda worked most of
On Tue, Jun 22, 2010 at 1:07 PM, James Y Knight f...@fuhm.net wrote:
The surrogateescape method is a nice workaround for this, but I can't help
thinking that it might've been better to just treat stuff as
possibly-invalid-but-probably-utf8 byte-strings from input, through
processing, to
On Wed, Jun 23, 2010 at 6:09 AM, M.-A. Lemburg m...@egenix.com wrote:
return constant.encode('utf-8')
So now you can write x.split(literal_as('', x)).
This polymorphism is what we used in Python2 a lot to write
code that works for both Unicode and 8-bit strings.
Unfortunately,
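Only the final return line and the name literal_as survive in the snippet above, so the following is a guess at the shape of such a helper, not a reconstruction of the original. The '&' separator in the usage lines is also an assumption.

```python
def literal_as(constant, like):
    """Return the str `constant` in the same type as `like` (str or bytes)."""
    if isinstance(like, bytes):
        return constant.encode("utf-8")
    if isinstance(like, str):
        return constant
    raise TypeError("expected str or bytes, got %s" % type(like).__name__)


text_parts = "a&b".split(literal_as("&", "a&b"))
bytes_parts = b"a&b".split(literal_as("&", b"a&b"))
```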
On Wed, Jun 23, 2010 at 2:17 AM, Guido van Rossum gu...@python.org wrote:
(1) Literals.
If you write something like x.split('') you are implicitly assuming x
is text. I don't see a very clean way to overcome this; you'll have to
implement some kind of type check e.g.
x.split('') if
On Wed, Jun 23, 2010 at 4:09 AM, M.-A. Lemburg m...@egenix.com wrote:
It would be great if we could have something like the above as
builtin method:
x.split(''.as(x))
As per my other message, another possible (and reasonably intuitive)
spelling would be:
x.split(x.coerce(''))
Writing it
On 22/06/2010 22:40, Robert Collins wrote:
On Wed, Jun 23, 2010 at 6:09 AM, M.-A. Lemburg m...@egenix.com wrote:
return constant.encode('utf-8')
So now you can write x.split(literal_as('', x)).
This polymorphism is what we used in Python2 a lot to write
code that works
On 22/06/2010 19:07, James Y Knight wrote:
On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:
Similarly I'd expect (from experience) that a programmer using Python
to want to take the same approach, sticking with unencoded data in
nearly all situations.
Yeah. This is a real issue I have with
On Tue, Jun 22, 2010 at 11:17 AM, Guido van Rossum gu...@python.org wrote:
(2) Data sources.
These can be functions that produce new data from non-string data,
e.g. str(int), read it from a named file, etc. An example is read()
vs. write(): it's easy to create a (hypothetical) polymorphic
At 07:41 AM 6/23/2010 +1000, Nick Coghlan wrote:
Then my example above could be made polymorphic (for ASCII compatible
encodings) by writing:
[x for x in seq if x.endswith(x.coerce(b))]
I'm trying to see downsides to this idea, and I'm not really seeing
any (well, other than 2.7 being almost
On Jun 22, 2010, at 12:53 PM, Guido van Rossum wrote:
On Mon, Jun 21, 2010 at 11:47 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
This is a common pain-point for porting software to 3.x - you had a
string, it kinda worked
On Jun 22, 2010, at 2:07 PM, James Y Knight wrote:
Yeah. This is a real issue I have with the direction Python3 went: it pushes
you into decoding everything to unicode early, even when you don't care --
all you really wanted to do is pass it from one API to another, with some
well-defined
On Jun 22, 2010, at 7:23 PM, Ian Bicking wrote:
This is a place where bytes+encoding might also have some benefit. XML is
someplace where you might load a bunch of data but only touch a little bit of
it, and the amount of data is frequently large enough that the efficiencies
are
On Tue, Jun 22, 2010 at 4:23 PM, Ian Bicking i...@colorstudy.com wrote:
This reminds me of the optimization ElementTree and lxml made in Python 2
(not sure what they do in Python 3?) where they use str when a string is
ASCII to avoid the memory and performance overhead of unicode.
An
On Wed, Jun 23, 2010 at 12:25 PM, Glyph Lefkowitz
gl...@twistedmatrix.com wrote:
I can also appreciate what's been said in this thread a bunch of times: to my
knowledge, nobody has actually shown a profile of an application where
encoding is significant overhead. I believe that encoding
Robert Collins writes:
Also, url's are bytestrings - by definition;
Eh? RFC 3896 explicitly says
A URI is an identifier consisting of a sequence of characters
matching the syntax rule named URI in Section 3.
(where the phrase sequence of characters appears in all ancestors I
found
2010/6/21 Stephen J. Turnbull step...@xemacs.org:
IMO, the UI is right. Something like the above ought to work.
Right. That said, many times when you want to do urlparse etc they
might be binary, and you might want binary. So maybe the methods
should work with both?
--
Lennart Regebro:
On Mon, Jun 21, 2010 at 12:30 PM, P.J. Eby p...@telecommunity.com wrote:
I also find it weird that there seem to be two camps on this subject, one of
which claims that All Is Well And There Is No Problem -- but I do not recall
seeing anyone who was in the What do I do; this doesn't seem ready
Lennart Regebro writes:
2010/6/21 Stephen J. Turnbull step...@xemacs.org:
IMO, the UI is right. Something like the above ought to work.
Right. That said, many times when you want to do urlparse etc they
might be binary, and you might want binary. So maybe the methods
should work
At 10:51 PM 6/21/2010 +1000, Nick Coghlan wrote:
It may be that there are places where we need to rewrite standard
library algorithms to be bytes/str neutral (e.g. by using length one
slices instead of indexing). It may be that there are more APIs that
need to grow encoding keyword arguments
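The remark about length-one slices can be illustrated with a short sketch: indexing a bytes object yields an int, while slicing keeps the type, so slice-based code behaves the same for str and bytes. The helper name split_first is invented for this example.

```python
def split_first(data):
    """Split data into (first item, rest) without changing its type."""
    # data[0] would be an int for bytes; data[:1] stays bytes (or str).
    return data[:1], data[1:]


indexed_text = "abc"[0]    # a length-one str
indexed_bytes = b"abc"[0]  # an int, not a length-one bytes object
```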
On 21/06/2010 17:46, P.J. Eby wrote:
At 10:51 PM 6/21/2010 +1000, Nick Coghlan wrote:
It may be that there are places where we need to rewrite standard
library algorithms to be bytes/str neutral (e.g. by using length one
slices instead of indexing). It may be that there are more APIs that
need
At 01:08 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:
But if you need that everywhere, what's so hard about
def urljoin_wrapper(base, subdir):
    return urljoin(str(base, 'latin-1'), subdir).encode('latin-1')
Now, note how that pattern fails as soon as you want to use
non-ISO-8859-1
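For readers who want to try the latin-1 round trip being discussed, here is one possible runnable variant. It decodes both arguments, which is an assumption of this sketch and differs from the quoted one-liner; the name urljoin_bytes and the example URL are hypothetical. The round trip is lossless only because latin-1 maps each byte 0-255 to exactly one code point.

```python
from urllib.parse import urljoin


def urljoin_bytes(base, subdir):
    """Join two bytes URLs by passing them through str as latin-1."""
    text = urljoin(base.decode("latin-1"), subdir.decode("latin-1"))
    # Every code point produced above is below 256, so this cannot fail.
    return text.encode("latin-1")


joined = urljoin_bytes(b"http://example.com/a/", b"b\xe0")
```

The bytes come back unchanged, but while the value is a str it holds latin-1 "characters" that may not reflect the real encoding of the data.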
On Tue, Jun 22, 2010 at 01:08:53AM +0900, Stephen J. Turnbull wrote:
Lennart Regebro writes:
2010/6/21 Stephen J. Turnbull step...@xemacs.org:
IMO, the UI is right. Something like the above ought to work.
Right. That said, many times when you want to do urlparse etc they
might
On 6/20/2010 11:56 PM, Terry Reedy wrote:
The specific example is
urllib.parse.parse_qsl('a=b%e0')
[('a', 'b�')]
where the character after 'b' is white ? in dark diamond, indicating an
error.
parse_qsl() splits that input on '=' and sends each piece to
urllib.parse.unquote
unquote()
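The behaviour described in this message can be reproduced with the current urllib.parse, as a rough check: by default the percent-escape is decoded as UTF-8 with errors replaced, while an explicit encoding or unquote_to_bytes keeps the byte. The variable names below are only for illustration.

```python
from urllib.parse import parse_qsl, unquote_to_bytes

# Default: decode as UTF-8 with errors="replace"; %e0 alone is invalid UTF-8.
replaced = parse_qsl("a=b%e0")

# Asking for latin-1 instead maps the byte 0xE0 to U+00E0.
as_latin1 = parse_qsl("a=b%e0", encoding="latin-1")

# Staying in bytes avoids choosing an encoding at all.
raw = unquote_to_bytes("b%e0")
```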
On Mon, Jun 21, 2010 at 9:46 AM, P.J. Eby p...@telecommunity.com wrote:
At 10:51 PM 6/21/2010 +1000, Nick Coghlan wrote:
It may be that there are places where we need to rewrite standard
library algorithms to be bytes/str neutral (e.g. by using length one
slices instead of indexing). It may
At 05:49 PM 6/21/2010 +0100, Michael Foord wrote:
Why is your proposed bstr wrapper not practical to implement outside
the core and use in your own libraries and frameworks?
__contains__ doesn't have a converse operation, so you can't code a
type that works around this (Python 3.1 shown):
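The original example is cut off above, so what follows is only an independent sketch of the asymmetry being described: a custom type can take part in "abc" + obj through __radd__, but there is no reflected hook for obj in "abc", so str.__contains__ raises TypeError. The class name Tagged is hypothetical.

```python
class Tagged:
    """Toy string wrapper used only to show the operator asymmetry."""

    def __init__(self, text):
        self.text = text

    def __radd__(self, other):
        # Reached for `"abc" + Tagged(...)`, since str cannot add a Tagged.
        return Tagged(other + self.text)


joined = "abc" + Tagged("def")

try:
    Tagged("b") in "abc"  # dispatches to str.__contains__, no reverse hook
except TypeError as exc:
    contains_error = exc
else:
    contains_error = None
```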
On 6/21/2010 8:51 AM, Nick Coghlan wrote:
I don't know that the "all is well" camp actually exists. The camp
that I do see existing is the one that says "without a bug report,
inconsistencies in the standard library's unicode handling won't get
fixed."
The issues picked up by the regression test
At 12:56 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
One comment here -- you can also have uri's that aren't decodable into their
true textual meaning using a single encoding.
Apache will happily serve out uris that have utf-8, shift-jis, and euc-jp
components inside of their path but the
At 10:29 AM 6/21/2010 -0700, Guido van Rossum wrote:
Perhaps there are more situations where a polymorphic API would be
helpful. Such APIs are not always so easy to implement, because they
have to be careful with literals or other constants (and even more so
mutable state) used internally -- but
2010/6/21 Stephen J. Turnbull step...@xemacs.org:
Robert Collins writes:
Also, url's are bytestrings - by definition;
Eh? RFC 3896 explicitly says
?Definitions of Managed Objects for the DS3/E3 Interface Type
Perhaps you mean 3986 ? :)
A URI is an identifier consisting of a sequence
On 6/21/2010 1:29 PM, P.J. Eby wrote:
At 05:49 PM 6/21/2010 +0100, Michael Foord wrote:
Why is your proposed bstr wrapper not practical to implement outside
the core and use in your own libraries and frameworks?
__contains__ doesn't have a converse operation, so you can't code a type
that
On 6/21/2010 1:29 PM, Guido van Rossum wrote:
Actually, the big problem with Python 2 is that if you mix str and
unicode, things work or crash depending on whether any of the str
objects involved contain non-ASCII bytes.
If one API decides to upgrade to Unicode, the result, when passed to
Toshio Kuratomi writes:
One comment here -- you can also have uri's that aren't decodable into their
true textual meaning using a single encoding.
Apache will happily serve out uris that have utf-8, shift-jis, and
euc-jp components inside of their path but the textual
representation
Robert Collins writes:
Perhaps you mean 3986 ? :)
Thank you for the correction.
A URI is an identifier consisting of a sequence of characters
matching the syntax rule named URI in Section 3.
(where the phrase sequence of characters appears in all ancestors I
found back to
On Jun 21, 2010, at 2:17 PM, P.J. Eby wrote:
One issue I remember from my enterprise days is some of the Asian-language
developers at NTT/Verio explaining to me that unicode doesn't actually solve
certain issues -- that there are use cases where you really *do* need bytes
plus encoding in
On Sun, 20 Jun 2010 14:40:56 -0400
P.J. Eby p...@telecommunity.com wrote:
Actually, I would say that it's more that (in the network protocol
case) we *have* bytes, some of which we would like to *treat* as
text, yet do not wish to constantly convert back and forth to
full-blown unicode
2010/6/20 Antoine Pitrou solip...@pitrou.net:
On Sun, 20 Jun 2010 14:40:56 -0400
P.J. Eby p...@telecommunity.com wrote:
Actually, I would say that it's more that (in the network protocol
case) we *have* bytes, some of which we would like to *treat* as
text, yet do not wish to constantly
Also, url's are bytestrings - by definition; if the standard library
has made them unicode objects in 3, I expect a lot of pain in the
webserver space.
-Rob
___
Python-Dev mailing list
Python-Dev@python.org
On 6/20/2010 5:55 PM, Benjamin Peterson wrote:
2010/6/20 Antoine Pitrou solip...@pitrou.net:
On Sun, 20 Jun 2010 14:40:56 -0400
P.J. Eby p...@telecommunity.com wrote:
Actually, I would say that it's more that (in the network protocol
case) we *have* bytes, some of which we would like to
At 07:33 PM 6/20/2010 -0400, Terry Reedy wrote:
Do you have in mind any tools that could and should operate on both,
but do not?
From http://mail.python.org/pipermail/web-sig/2009-September/004105.html :
The problem which arises is that unquoting of URLs in Python 3.X
stdlib can only be done
At 11:47 PM 6/20/2010 +0200, Antoine Pitrou wrote:
On Sun, 20 Jun 2010 14:40:56 -0400
P.J. Eby p...@telecommunity.com wrote:
Actually, I would say that it's more that (in the network protocol
case) we *have* bytes, some of which we would like to *treat* as
text, yet do not wish to constantly
On 6/20/2010 9:33 PM, P.J. Eby wrote:
At 07:33 PM 6/20/2010 -0400, Terry Reedy wrote:
Do you have in mind any tools that could and should operate on both,
but do not?
From http://mail.python.org/pipermail/web-sig/2009-September/004105.html :
Thanks for the concrete examples in this and your