Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Stephen J. Turnbull
M.-A. Lemburg [EMAIL PROTECTED] writes: James Y Knight wrote: Nice and simple. Albeit, too simple. The above approach would basically remove the possibility to easily create bytes() from literals in Py3k, since literals in Py3k create Unicode objects,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Just van Rossum
Guido van Rossum wrote: If bytes support the buffer interface, we get another interesting issue -- regular expressions over bytes. Brr. We already have that: >>> import re, array >>> re.search('\2', array.array('B', [1, 2, 3, 4])).group() array('B', [2]) Not sure whether to blame array
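For readers following along today, regular expressions over bytes can be tried directly. This is a minimal sketch assuming a present-day Python 3 interpreter, whose bytes type postdates this thread; the variable names are illustrative and not taken from the discussion.

```python
import re

data = bytes([1, 2, 3, 4])

# A bytes pattern can search any bytes-like object.
match = re.search(b"\x02", data)
found = match.group()
offset = re.search(b"\x02", bytearray(data)).start()

# Python 3 refuses to mix a text (str) pattern with bytes input.
try:
    re.search("\x02", data)
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

print(found, offset, mixed_allowed)
```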

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Bengt Richter
On Tue, 14 Feb 2006 12:31:07 -0700, Neil Schemenauer [EMAIL PROTECTED] wrote: On Mon, Feb 13, 2006 at 08:07:49PM -0800, Guido van Rossum wrote: On 2/13/06, Neil Schemenauer [EMAIL PROTECTED] wrote: "\x80".encode('latin-1') But in 2.5 we can't change that to return a bytes object without

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Bengt Richter
On Tue, 14 Feb 2006 15:14:07 -0800, Guido van Rossum [EMAIL PROTECTED] wrote: On 2/14/06, M.-A. Lemburg [EMAIL PROTECTED] wrote: Guido van Rossum wrote: As Phillip guessed, I was indeed thinking about introducing bytes() sooner than that, perhaps even in 2.5 (though I don't want anything

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Jim Jewett
On 2/14/06, Neil Schemenauer wrote: People could spell it bytes(s.encode('latin-1')) Guido wrote: At the cost of an extra copying step. I asked: ... why not just add some smarts to the bytes constructor? Guido wrote: ... the VM usually keeps an extra reference on the stack so the refcount

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Josiah Carlson
Ron Adam [EMAIL PROTECTED] wrote: Greg Ewing wrote: Ron Adam wrote: b = bytes(0L) -> bytes([0,0,0,0]) No, bytes(0L) -- TypeError because 0L doesn't implement the iterator protocol or the buffer interface. It wouldn't need it if it was a direct C memory copy. Yes it would.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Thomas Wouters
On Wed, Feb 15, 2006 at 01:38:41PM -0500, Jim Jewett wrote: On 2/14/06, Neil Schemenauer wrote: People could spell it bytes(s.encode('latin-1')) Guido wrote: At the cost of an extra copying step. I asked: ... why not just add some smarts to the bytes constructor? Guido wrote:

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Greg Ewing
Ron Adam wrote: I was presuming it would be done in C code and it will just need a pointer to the first byte, memchr(), and then read n bytes directly into a new memory range via memcpy(). If the object supports the buffer interface, it can be done that way. But if not, it would seem to

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Ron Adam
Greg Ewing wrote: I think you don't understand what an encoding is. Unicode strings don't *have* an encoding, because they're not encoded! Encoding is what happens when you go from a unicode string to something else. Ah.. ok, my mental picture was a bit off. I had this reversed somewhat.
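The point that encoding only happens on the way from text to bytes can be illustrated as follows. This is a sketch assuming present-day Python 3, where str is the Unicode text type; the sample string is made up for illustration.

```python
text = "caf\u00e9"  # four code points; no encoding is attached to the text

# Encoding is the step from text to bytes, and the result depends on the codec.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")

# Decoding with the matching codec recovers the same text either way.
back_from_utf8 = as_utf8.decode("utf-8")
back_from_latin1 = as_latin1.decode("latin-1")

print(len(text), len(as_utf8), len(as_latin1))
```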

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Aahz
On Tue, Feb 14, 2006, Guido van Rossum wrote: Anyway, I'm now convinced that bytes should act as an array of ints, where the ints are restricted to range(0, 256) but have type int. range(0, 255)? -- Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/ 19. A language that

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-15 Thread Bob Ippolito
On Feb 15, 2006, at 6:35 PM, Aahz wrote: On Tue, Feb 14, 2006, Guido van Rossum wrote: Anyway, I'm now convinced that bytes should act as an array of ints, where the ints are restricted to range(0, 256) but have type int. range(0, 255)? No, Guido was correct. range(0, 256) is [0, 1, 2,
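The off-by-one in question is easy to check. A minimal sketch, assuming a present-day Python 3 interpreter (its bytes type postdates this thread); the variable names are illustrative only.

```python
# range() excludes its stop value, so range(0, 256) covers 0..255.
values = list(range(0, 256))
first, last, count = values[0], values[-1], len(values)

# range(0, 255) would stop at 254 and miss the top byte value.
short_last = list(range(0, 255))[-1]

# In Python 3, indexing a bytes object yields a plain int.
item = bytes(values)[255]

print(first, last, count, short_last, type(item).__name__)
```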

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Thomas Wouters
On Mon, Feb 13, 2006 at 03:44:27PM -0800, Guido van Rossum wrote: But adding an encoding doesn't help. The str.encode() method always assumes that the string itself is ASCII-encoded, and that's not good enough: >>> "abc".encode("latin-1") 'abc' >>> "abc".decode("latin-1") u'abc'

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Greg Ewing
Guido van Rossum wrote: I also wonder if having a b"..." literal would just add more confusion -- bytes are not characters, but b"..." makes it appear as if they are. I'm inclined to agree. Bytes objects are more likely to be used for things which are *not* characters -- if they're characters,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Greg Ewing
Guido van Rossum wrote: There's also the consideration for APIs that, informally, accept either a string or a sequence of objects. My preference these days is not to design APIs that way. It's never necessary and it avoids a lot of problems. Greg

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Nick Coghlan
Guido van Rossum wrote: In general I've come to appreciate that there are two ways of converting an object of type A to an object of type B: ask an A instance to convert itself to a B, or ask the type B to create a new instance from an A. And the difference between the two isn't even always

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Adam Olsen
On 2/14/06, Martin v. Löwis [EMAIL PROTECTED] wrote: Adam Olsen wrote: What would that imply for repr()? To support eval(repr(x)) I don't think eval(repr(x)) needs to be supported for the bytes type. However, if that is desirable, it should return something like bytes([1,2,3]) I'm

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Michael Hudson
Greg Ewing [EMAIL PROTECTED] writes: Guido van Rossum wrote: There's also the consideration for APIs that, informally, accept either a string or a sequence of objects. My preference these days is not to design APIs that way. It's never necessary and it avoids a lot of problems. Oh yes.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Barry Warsaw
On Feb 14, 2006, at 6:35 AM, Greg Ewing wrote: Barry Warsaw wrote: This makes me think I want an unsigned byte type, which b[0] would return. Come to think of it, this is something I don't remember seeing discussed. I've been thinking that bytes[i] would return an integer, but is the

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread James Y Knight
On Feb 14, 2006, at 1:52 AM, Martin v. Löwis wrote: Phillip J. Eby wrote: I was just pointing out that since byte strings are bytes by definition, then simply putting those bytes in a bytes() object doesn't alter the existing encoding. So, using latin-1 when converting a string to

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Phillip J. Eby
At 11:08 AM 2/14/2006 -0500, James Y Knight wrote: On Feb 14, 2006, at 1:52 AM, Martin v. Löwis wrote: Phillip J. Eby wrote: I was just pointing out that since byte strings are bytes by definition, then simply putting those bytes in a bytes() object doesn't alter the existing encoding. So,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread M.-A. Lemburg
James Y Knight wrote: Kill the encoding argument, and you're left with: Python2.X: - bytes(bytes_object) -> copy constructor - bytes(str_object) -> copy the bytes from the str to the bytes object - bytes(sequence_of_ints) -> make bytes with the values of the ints, error on overflow

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Josiah Carlson
James Y Knight [EMAIL PROTECTED] wrote: I like it, it makes sense. Unicode strings are simply not allowed as arguments to the byte constructor. Thinking about it, why would it be otherwise? And if you're mixing str-strings and unicode-strings, that means the str-strings you're sometimes

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread M.-A. Lemburg
Guido van Rossum wrote: On 2/13/06, M.-A. Lemburg [EMAIL PROTECTED] wrote: Guido van Rossum wrote: It'd be cruel and unusual punishment though to have to write bytes("abc", "Latin-1") I propose that the default encoding (for basestring instances) ought to be ascii just like everywhere else.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread James Y Knight
On Feb 14, 2006, at 11:47 AM, M.-A. Lemburg wrote: The above approach would basically remove the possibility to easily create bytes() from literals in Py3k, since literals in Py3k create Unicode objects, e.g. bytes("123") would not work in Py3k. That is true. And I think that is correct. There

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread James Y Knight
On Feb 14, 2006, at 11:25 AM, Phillip J. Eby wrote: At 11:08 AM 2/14/2006 -0500, James Y Knight wrote: I like it, it makes sense. Unicode strings are simply not allowed as arguments to the byte constructor. Thinking about it, why would it be otherwise? And if you're mixing str-strings and

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/14/06, Thomas Wouters [EMAIL PROTECTED] wrote: On Mon, Feb 13, 2006 at 03:44:27PM -0800, Guido van Rossum wrote: But adding an encoding doesn't help. The str.encode() method always assumes that the string itself is ASCII-encoded, and that's not good enough: >>> "abc".encode("latin-1")

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/14/06, Adam Olsen [EMAIL PROTECTED] wrote: I'm starting to wonder, do we really need anything fancy? Wouldn't it be sufficient to have a way to compactly store 8-bit integers? In 2.x we could convert unicode like this: bytes(ord(c) for c in uIt'sencode('utf-8')) Yuck.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/13/06, Barry Warsaw [EMAIL PROTECTED] wrote: This makes me think I want an unsigned byte type, which b[0] would return. In another thread I think someone mentioned something about fixed width integral types, such that you could have an object that was guaranteed to be 8-bits wide,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/13/06, Adam Olsen [EMAIL PROTECTED] wrote: What would that imply for repr()? To support eval(repr(x)) it would have to produce whatever format the source code includes to begin with. I'm not sure that's a requirement. (I do think that in 2.x, str(bytes(s)) == s should hold as long as

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: I didn't mean that it was the only purpose. In Python 2.x, practical code has to sometimes deal with string-like objects. That is,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/14/06, Barry Warsaw [EMAIL PROTECTED] wrote: A related question: what would bytes([104, 101, 108, 108, 111, 8004]) return? An exception hopefully. Absolutely. I also think you'd want bytes([x for x in some_bytes_object]) to return an object equal to the original. You mean if
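Both expectations in this exchange can be checked against the bytes type that eventually shipped. A sketch assuming present-day Python 3; `try_bytes` is a hypothetical helper introduced only for this illustration.

```python
def try_bytes(ints):
    """Return bytes(ints), or the ValueError raised for an out-of-range item."""
    try:
        return bytes(ints)
    except ValueError as exc:
        return exc

ok = try_bytes([104, 101, 108, 108, 111])
bad = try_bytes([104, 101, 108, 108, 111, 8004])  # 8004 does not fit in a byte

# Iterating a bytes object yields ints, so rebuilding from them gives an equal object.
round_trip = bytes([x for x in ok])

print(ok, type(bad).__name__, round_trip == ok)
```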

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/14/06, Neil Schemenauer [EMAIL PROTECTED] wrote: People could spell it bytes(s.encode('latin-1')) in order to make it work in 2.X. That spelling would provide a way of ensuring the type of the return value. At the cost of an extra copying step. [Guido] You missed the part where I said

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Barry Warsaw
On Tue, 2006-02-14 at 15:13 -0800, Guido van Rossum wrote: So I'm taking that the specific properties you want to model are the overflow behavior, right? N-bit unsigned is defined as arithmetic mod 2**N; N-bit signed is a bit more tricky to define but similar. These never overflow but
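The wrap-around behavior described here can be sketched in a few lines. The function names are hypothetical, and the signed variant assumes two's-complement wrapping, which the message does not spell out.

```python
def wrap_unsigned(value, bits):
    """Reduce value modulo 2**bits, giving a result in 0 .. 2**bits - 1."""
    return value % (1 << bits)

def wrap_signed(value, bits):
    """Wrap value into -2**(bits-1) .. 2**(bits-1) - 1 (two's complement assumed)."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half

print(wrap_unsigned(255 + 1, 8), wrap_unsigned(-1, 8))
print(wrap_signed(127 + 1, 8), wrap_signed(-129, 8))
```

Because Python's `%` always returns a non-negative result for a positive modulus, no special casing is needed for negative inputs.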

[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Jim Jewett
On 2/14/06, Neil Schemenauer nas at arctrix.com wrote: People could spell it bytes(s.encode('latin-1')) in order to make it work in 2.X. Guido wrote: At the cost of an extra copying step. That sounds like an implementation issue. If it is important enough to matter, then why not just add

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Guido van Rossum
On 2/14/06, Jim Jewett [EMAIL PROTECTED] wrote: On 2/14/06, Neil Schemenauer nas at arctrix.com wrote: People could spell it bytes(s.encode('latin-1')) in order to make it work in 2.X. Guido wrote: At the cost of an extra copying step. That sounds like an implementation issue. If it is

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Greg Ewing
Guido van Rossum wrote: The only remaining question is what if anything to do with an encoding argument when the first argument is of type str...) From what you said earlier about str in 2.x being interpretable as a unicode string which contains only ascii, it seems to me that if you say

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Greg Ewing
Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: What would bytes("abc\xf0", "latin-1") *mean*? I'm saying that XXX would be the same encoding as you specified.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Ron Adam
Greg Ewing wrote: Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: What would bytes("abc\xf0", "latin-1") *mean*? I'm saying that XXX would be the same encoding

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-14 Thread Greg Ewing
Ron Adam wrote: My first impression and thoughts were: (and seems incorrect now) bytes(object) -> byte sequence of object's value. Basically a memory dump of object's value. As I understand the current intentions, this is correct. The bytes constructor would have two different

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
One recommendation: for starters, I'd much rather see the bytes type standardized without a literal notation. There should be lots of ways to create bytes objects from string objects, with specific explicit encodings, and those should suffice, at least initially. I also wonder if having a

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread M.-A. Lemburg
Guido van Rossum wrote: One recommendation: for starters, I'd much rather see the bytes type standardized without a literal notation. There should be lots of ways to create bytes objects from string objects, with specific explicit encodings, and those should suffice, at least initially.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: One recommendation: for starters, I'd much rather see the bytes type standardized without a literal notation. There should be lots of ways to create bytes objects from string objects, with specific explicit encodings, and those should

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: One recommendation: for starters, I'd much rather see the bytes type standardized without a literal notation. There should be lots of ways to create bytes objects from string objects,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread M.-A. Lemburg
Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: One recommendation: for starters, I'd much rather see the bytes type standardized without a literal notation. There should be lots of ways to create bytes

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 10:55 PM 2/13/2006 +0100, M.-A. Lemburg wrote: Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: One recommendation: for starters, I'd much rather see the bytes type standardized without a literal notation.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread M.-A. Lemburg
Phillip J. Eby wrote: Why not just have the constructor be: bytes(initializer [,encoding]) Where initializer must be either an iterable of suitable integers, or a unicode/string object. If the latter (i.e., it's a basestring), the encoding argument would then be required. Then,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, M.-A. Lemburg [EMAIL PROTECTED] wrote: Guido van Rossum wrote: It'd be cruel and unusual punishment though to have to write bytes("abc", "Latin-1") I propose that the default encoding (for basestring instances) ought to be ascii just like everywhere else. (Meaning, it should

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: Actually, I thought we were talking about adding bytes() in 2.5. I was. However, now that you've brought this up, it actually makes perfect sense to just use latin-1 as the effective encoding for both strings and unicode. In Python 2.x,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 12:03 AM 2/14/2006 +0100, M.-A. Lemburg wrote: The conversion from Unicode to bytes is different in this respect, since you are converting from a bigger type to a smaller one. Choosing latin-1 as default for this conversion would give you all 8 bits, instead of just 7 bits that ASCII provides.
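The 8-bit versus 7-bit distinction can be demonstrated directly. A minimal sketch assuming present-day Python 3, in which str holds Unicode text; the variable names are illustrative.

```python
# The first 256 Unicode code points, as one text string.
code_points = "".join(chr(i) for i in range(256))

# latin-1 maps each of them to the byte with the same numeric value.
latin1 = code_points.encode("latin-1")
same_values = list(latin1) == list(range(256))

# ASCII only covers the lower half (7 bits).
ascii_part = code_points[:128].encode("ascii")
try:
    chr(0x80).encode("ascii")
    ascii_handles_0x80 = True
except UnicodeEncodeError:
    ascii_handles_0x80 = False

print(same_values, len(ascii_part), ascii_handles_0x80)
```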

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: At 12:03 AM 2/14/2006 +0100, M.-A. Lemburg wrote: The conversion from Unicode to bytes is different in this respect, since you are converting from a bigger type to a smaller one. Choosing latin-1 as default for this conversion would give you

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Michael Foord
Phillip J. Eby wrote: [snip..] In fact, the 'encoding' argument seems useless in the case of str objects, and it seems it should default to latin-1 for unicode objects. The only -1 for having an implicit encode that behaves differently to other implicit encodes/decodes that happen in

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Michael Foord [EMAIL PROTECTED] wrote: Phillip J. Eby wrote: [snip..] In fact, the 'encoding' argument seems useless in the case of str objects, and it seems it should default to latin-1 for unicode objects. The only -1 for having an implicit encode that behaves differently

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Barry Warsaw
On Mon, 2006-02-13 at 15:44 -0800, Guido van Rossum wrote: The right way to look at this is, as Phillip says, to consider conversion between str and bytes as not an encoding but a data type change *only*. That sounds right to me too. -Barry

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Michael Foord
Guido van Rossum wrote: On 2/13/06, Michael Foord [EMAIL PROTECTED] wrote: Phillip J. Eby wrote: [snip..] In fact, the 'encoding' argument seems useless in the case of str objects, and it seems it should default to latin-1 for unicode objects. The only -1 for having an

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 03:23 PM 2/13/2006 -0800, Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: The only use I see for having an encoding for a 'str' would be to allow confirming that the input string in fact is valid for that encoding. So, bytes(some_str,'ascii') would be an

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Michael Foord [EMAIL PROTECTED] wrote: Sorry - I meant for the unicode to bytes case. A default encoding that behaves differently to the current implicit encodes/decodes would be confusing IMHO. And I am in agreement with you there (I think only Phillip argued otherwise). I

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: I didn't mean that it was the only purpose. In Python 2.x, practical code has to sometimes deal with string-like objects. That is, code that takes either strings or unicode. If such code calls bytes(), it's going to want to include an

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread James Y Knight
On Feb 13, 2006, at 7:09 PM, Guido van Rossum wrote: On 2/13/06, Michael Foord [EMAIL PROTECTED] wrote: Sorry - I meant for the unicode to bytes case. A default encoding that behaves differently to the current implicit encodes/decodes would be confusing IMHO. And I am in agreement

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, James Y Knight [EMAIL PROTECTED] wrote: So, in python2.X, you have: - bytes("\x80"), you get a bytestring with a single byte of value 0x80 (when no encoding is specified, and the object is a str, it doesn't try to encode it at all). - bytes("\x80", encoding="latin-1"), you get an error,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Neil Schemenauer
Guido van Rossum [EMAIL PROTECTED] wrote: In py3k, when the str object is eliminated, then what do you have? Perhaps - bytes("\x80"), you get an error, encoding is required. There is no such thing as default encoding anymore, as there's no str object. - bytes("\x80", encoding="latin-1"), you get a

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Fred L. Drake, Jr.
On Monday 13 February 2006 21:52, Neil Schemenauer wrote: Also, I think it would be useful to introduce byte array literals at the same time as the bytes object. That would allow people to use byte arrays without having to get involved with all the silly string encoding confusion. bytes([0,

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Neil Schemenauer [EMAIL PROTECTED] wrote: Guido van Rossum [EMAIL PROTECTED] wrote: In py3k, when the str object is eliminated, then what do you have? Perhaps - bytes("\x80"), you get an error, encoding is required. There is no such thing as default encoding anymore, as there's

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Barry Warsaw
On Feb 13, 2006, at 7:29 PM, Guido van Rossum wrote: There's one property that bytes, str and unicode all share: type(x[0]) == type(x), at least as long as len(x) >= 1. This is perhaps the ultimate test for string-ness. But not perfect, since of course other containers can contain objects of

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: On 2/13/06, Phillip J. Eby [EMAIL PROTECTED] wrote: I didn't mean that it was the only purpose. In Python 2.x, practical code has to sometimes deal with string-like objects. That is, code that takes either strings or unicode. If such

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
M.-A. Lemburg wrote: We're talking about Py3k here: "abc" will be a Unicode string, so why restrict the conversion to 7 bits when you can have 8 bits without any conversion problems ? YAGNI. If you have a need for byte string in source code, it will typically be random bytes, which can be nicely

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
Phillip J. Eby wrote: I was just pointing out that since byte strings are bytes by definition, then simply putting those bytes in a bytes() object doesn't alter the existing encoding. So, using latin-1 when converting a string to bytes actually seems like the One Obvious Way to do it.

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
Guido van Rossum wrote: In py3k, when the str object is eliminated, then what do you have? Perhaps - bytes("\x80"), you get an error, encoding is required. There is no such thing as default encoding anymore, as there's no str object. - bytes("\x80", encoding="latin-1"), you get a bytestring with a
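For comparison, this is how the constructor behaves in the Python 3 that was eventually released, which may differ in detail from what was being proposed in this message. A small sketch; the variable names are illustrative.

```python
# Text without an encoding is rejected outright.
try:
    bytes("\x80")
    needs_encoding = False
except TypeError:
    needs_encoding = True

# With an explicit encoding, the result depends on the codec chosen.
as_latin1 = bytes("\x80", encoding="latin-1")
as_utf8 = bytes("\x80", encoding="utf-8")

print(needs_encoding, as_latin1, as_utf8)
```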

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread James Y Knight
On Feb 14, 2006, at 12:20 AM, Phillip J. Eby wrote: bytes(map(ord, str_or_unicode)) In other words, without an encoding, bytes() should simply treat str and unicode objects *as if they were a sequence of integers*, and produce an error when an integer is out of range. This is a
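The "sequence of integers" reading proposed here can be tried out as follows. A sketch assuming present-day Python 3, where only a Unicode str exists; `bytes_from_code_points` is a hypothetical name used for illustration.

```python
def bytes_from_code_points(text):
    """Treat text as a sequence of integers (its code points) and pack them into bytes."""
    return bytes(map(ord, text))

# Code points below 256 pass straight through.
small = bytes_from_code_points("abc\xf0")

# A code point of 256 or more does not fit in a byte and raises ValueError.
try:
    bytes_from_code_points("\u20ac")
    overflowed = False
except ValueError:
    overflowed = True

print(small, overflowed)
```

For code points below 256 this gives the same result as encoding with latin-1, which is why the two approaches were discussed side by side.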

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
Adam Olsen wrote: What would that imply for repr()? To support eval(repr(x)) I don't think eval(repr(x)) needs to be supported for the bytes type. However, if that is desirable, it should return something like bytes([1,2,3]) Regards, Martin
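As a point of reference, the round trip under discussion can be checked against the bytes type in present-day Python 3, whose repr uses a literal notation rather than the constructor form suggested here. A minimal sketch with illustrative names:

```python
original = bytes([1, 2, 3])

# repr() produces a literal; evaluating it rebuilds an equal object.
text_form = repr(original)
rebuilt = eval(text_form)

# The constructor spelling suggested in the message evaluates to the same value.
explicit = eval("bytes([1,2,3])")

print(text_form, rebuilt == original, explicit == original)
```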

[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-11 Thread Bengt Richter
On Fri, 10 Feb 2006 21:35:26 -0800, Guido van Rossum [EMAIL PROTECTED] wrote: On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer [EMAIL PROTECTED] The backwards compatibility problems *seem* to be relatively minor. I only found one instance of breakage in the standard library.