Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Ron Adam
Greg Ewing wrote: Ron Adam wrote: This uses syntax to determine the direction of encoding. It would be easier and clearer to just require two arguments or a tuple. u = unicode(b, 'encode', 'base64') b = bytes(u, 'decode', 'base64') The point of the exercise was to avoid

Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Greg Ewing
Stephen J. Turnbull wrote: Doesn't that make base64 non-text by analogy to other look but don't touch strings like a .gz or vmlinuz? No, because I can take a piece of base64 encoded data and use a text editor to manually paste it in with some other text (e.g. a plain-text (not MIME) mail

Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Greg Ewing
Ron Adam wrote: This would apply to codecs that could return either bytes or strings, or strings or unicode, or bytes or unicode. I'd need to see some concrete examples of such codecs before being convinced that they exist, or that they couldn't just as well return a fixed type that you

Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Ron Adam
Greg Ewing wrote: Ron Adam wrote: This would apply to codecs that could return either bytes or strings, or strings or unicode, or bytes or unicode. I'd need to see some concrete examples of such codecs before being convinced that they exist, or that they couldn't just as well return a

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Ron Adam
Josiah Carlson wrote: Greg Ewing [EMAIL PROTECTED] wrote: u = unicode(b) u = unicode(b, 'utf8') b = bytes['utf8'](u) u = unicode['base64'](b) # encoding b = bytes(u, 'base64') # decoding u2 = unicode['piglatin'](u1) # encoding u1 = unicode(u2, 'piglatin') #

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Just van Rossum
Ron Adam wrote: Josiah Carlson wrote: Greg Ewing [EMAIL PROTECTED] wrote: u = unicode(b) u = unicode(b, 'utf8') b = bytes['utf8'](u) u = unicode['base64'](b) # encoding b = bytes(u, 'base64') # decoding u2 = unicode['piglatin'](u1) # encoding u1 =

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Greg Ewing
Ron Adam wrote: This uses syntax to determine the direction of encoding. It would be easier and clearer to just require two arguments or a tuple. u = unicode(b, 'encode', 'base64') b = bytes(u, 'decode', 'base64') The point of the exercise was to avoid using the terms

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Greg Ewing
Stephen J. Turnbull wrote: What you presumably meant was what would you consider the proper type for (P)CDATA? No, I mean the whole thing, including all the ... tags etc. Like you see when you load an XML file into a text editor. (BTW, doesn't the fact that you *can* load an XML file into what

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Stephen J. Turnbull
Greg == Greg Ewing [EMAIL PROTECTED] writes: Greg (BTW, doesn't the fact that you *can* load an XML file into Greg what we call a text editor say something?) Why not answer that question for yourself, and then turn that answer into a description of text semantics? For me, it says that,

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Donovan Baarda
On Tue, 2006-02-28 at 15:23 -0800, Bill Janssen wrote: Greg Ewing wrote: Bill Janssen wrote: bytes - base64 - text text - de-base64 - bytes It's nice to hear I'm not out of step with the entire world on this. :-) Well, I can certainly understand the bytes-base64-bytes side of

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Nick Coghlan
Bill Janssen wrote: Greg Ewing wrote: Bill Janssen wrote: bytes - base64 - text text - de-base64 - bytes It's nice to hear I'm not out of step with the entire world on this. :-) Well, I can certainly understand the bytes-base64-bytes side of thing too. The text produced is specified as

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Ron Adam
Nick Coghlan wrote: All the unicode codecs, on the other hand, use encode to get from characters to bytes and decode to get from bytes to characters. So if bytes objects *did* have an encode method, it should still result in a unicode object, just the same as a decode method does (because

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Chermside, Michael
Ron Adam writes: While playing around with the example bytes class I noticed code reads much better when I use methods called tounicode and tostring. [...] I'm not suggesting we start using to-type everywhere, just where it might make things clearer over decode and encode. +1 I always

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Bill Janssen
Huh... just joining here but surely you don't mean a text string that doesn't use every character available in a particular encoding is really bytes... it's still a text string... No, once it's in a particular encoding it's bytes, no longer text. As you say, Keep these two concepts separate

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Scott David Daniels
Chermside, Michael wrote: ... I will say that if there were no legacy I'd prefer the tounicode() and tostring() (but shouldn't it be 'tobytes()' instead?) names for Python 3.0. Wouldn't 'tobytes' and 'totext' be better for 3.0 where text == unicode? -- Scott David Daniels [EMAIL PROTECTED]

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Michael Chermside
I wrote: ... I will say that if there were no legacy I'd prefer the tounicode() and tostring() (but shouldn't it be 'tobytes()' instead?) names for Python 3.0. Scott Daniels replied: Wouldn't 'tobytes' and 'totext' be better for 3.0 where text == unicode? Um... yes. Sorry, I'm not completely

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Greg Ewing
Nick Coghlan wrote: ascii_bytes = orig_bytes.decode(base64).encode(ascii) orig_bytes = ascii_bytes.decode(ascii).encode(base64) The only slightly odd aspect is that this inverts the conventional meaning of base64 encoding and decoding, -1. Whatever we do, we shouldn't design

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Greg Ewing
Bill Janssen wrote: No, once it's in a particular encoding it's bytes, no longer text. The point at issue is whether the characters produced by base64 are in a particular encoding. According to my reading of the RFC, they're not. -- Greg Ewing, Computer Science Dept,

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Greg Ewing
Ron Adam wrote: While playing around with the example bytes class I noticed code reads much better when I use methods called tounicode and tostring. b64ustring = b.tounicode('base64') b = bytes(b64ustring, 'base64') I don't like that, because it creates a dependency (conceptually,

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Michael Urman
[My apologies Greg; I meant to send this to the whole list. I really need a list-reply button in GMail. ] On 3/1/06, Greg Ewing [EMAIL PROTECTED] wrote: I don't like that, because it creates a dependency (conceptually, at least) between the bytes type and the unicode type. I only find half of

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Ron Adam
Greg Ewing wrote: Ron Adam wrote: While playing around with the example bytes class I noticed code reads much better when I use methods called tounicode and tostring. b64ustring = b.tounicode('base64') b = bytes(b64ustring, 'base64') I don't like that, because it creates a

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Josiah Carlson
Greg Ewing [EMAIL PROTECTED] wrote: u = unicode(b) u = unicode(b, 'utf8') b = bytes['utf8'](u) u = unicode['base64'](b) # encoding b = bytes(u, 'base64') # decoding u2 = unicode['piglatin'](u1) # encoding u1 = unicode(u2, 'piglatin') # decoding Your provided

Re: [Python-Dev] bytes.from_hex()

2006-02-28 Thread Greg Ewing
Bill Janssen wrote: Well, I can certainly understand the bytes-base64-bytes side of thing too. The text produced is specified as using a 65-character subset of US-ASCII, so that's really bytes. But it then goes on to say that these same characters are also a subset of EBCDIC. So it seems to
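
For comparison, a minimal sketch of how this bytes-versus-text question looks in present-day Python 3 (not in any of the 2006 proposals discussed in this thread): the standard `base64` module works bytes-to-bytes, and getting text takes an explicit ASCII decode. The variable names are illustrative only.

```python
import base64

raw = b"\x00\xff"                     # arbitrary binary data
encoded = base64.b64encode(raw)       # bytes in, bytes out
as_text = encoded.decode("ascii")     # explicit step to obtain a str

# b64decode accepts either the ASCII bytes or the equivalent str.
round_trip = base64.b64decode(as_text)
print(type(encoded).__name__, type(as_text).__name__, round_trip == raw)
```

The explicit `decode("ascii")` is where the "65 characters" become text in the sense argued for above; until then the result is treated as bytes.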

Re: [Python-Dev] bytes.from_hex()

2006-02-27 Thread Greg Ewing
Bill Janssen wrote: I use it quite a bit for image processing (converting to and from the data: URL form), and various checksum applications (converting SHA into a string). Aha! We have a customer! For those cases, would you find it more convenient for the result to be text or bytes in Py3k?
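
The two use cases mentioned here (checksums as strings, and data: URLs) can be sketched in present-day Python 3 as follows. This assumes the modern `bytes.fromhex()` / `bytes.hex()` methods rather than anything proposed in the thread; `payload` and the MIME type are placeholders chosen for illustration.

```python
import base64
import hashlib

payload = b"example payload"              # stand-in for image or file bytes

digest = hashlib.sha1(payload).digest()   # raw 20-byte checksum
hex_text = digest.hex()                   # bytes -> str of hex digits
restored = bytes.fromhex(hex_text)        # str of hex digits -> bytes

# A data: URL is text, so the base64 output is decoded to str first.
b64_text = base64.b64encode(payload).decode("ascii")
data_url = "data:application/octet-stream;base64," + b64_text
print(hex_text)
print(data_url)
```

In both cases the result that gets pasted into other text ends up as `str`, which matches the "text" answer to the question asked above.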

Re: [Python-Dev] bytes.from_hex()

2006-02-25 Thread Stephen J. Turnbull
Ron == Ron Adam [EMAIL PROTECTED] writes: Ron So, lets consider a codec and a coding as being two Ron different things where a codec is a character sub set of Ron unicode characters expressed in a native format. And a Ron coding is *not* a subset of the unicode character set,

Re: [Python-Dev] bytes.from_hex()

2006-02-25 Thread Greg Ewing
Stephen J. Turnbull wrote: The reason that Python source code is text is that the primary producers/consumers of Python source code are human beings, not compilers I disagree with primary -- I think human and computer use of source code have equal importance. Because of the fact that Python

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Stephen J. Turnbull
Ron == Ron Adam [EMAIL PROTECTED] writes: Ron We could call it transform or translate if needed. You're still losing the directionality, which is my primary objection to recode. The absence of directionality is precisely why recode is used in that sense for i18n work. There really isn't a

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Stephen J. Turnbull
Greg == Greg Ewing [EMAIL PROTECTED] writes: Greg Stephen J. Turnbull wrote: No, base64 isn't a wire protocol. It's a family[...]. Greg Yes, and it's up to the programmer to choose those code Greg units (i.e. pick an encoding for the characters) that will, Greg in fact,

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Ron Adam
* The following reply is a rather longer than I intended explanation of why codings (and how they differ) like 'rot' aren't the same thing as pure unicode codecs and probably should be treated differently. If you already understand that, then I suggest skipping this. But if you like detailed

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Greg Ewing
Stephen J. Turnbull wrote: the kind of text for which Unicode was designed is normally produced and consumed by people, who wll pt up w/ ll knds f nnsns. Base64 decoders will not put up with the same kinds of nonsense that people will. The Python compiler won't put up with that sort of

Re: [Python-Dev] bytes.from_hex()

2006-02-23 Thread Greg Ewing
Stephen J. Turnbull wrote: Please define character, and explain how its semantics map to Python's unicode objects. One of the 65 abstract entities referred to in the RFC and represented in that RFC by certain visual glyphs. There is a subset of the Unicode code points that are conventionally

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
Greg == Greg Ewing [EMAIL PROTECTED] writes: Greg Stephen J. Turnbull wrote: What I advocate for Python is to require that the standard base64 codec be defined only on bytes, and always produce bytes. Greg I don't understand that. It seems quite clear to me that Greg

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Stephen J. Turnbull wrote: Base64 is a (family of) wire protocol(s). It's not clear to me that it makes sense to say that the alphabets used by baseNN encodings are composed of characters, Take a look at http://en.wikipedia.org/wiki/Base64 where it says ...base64 is a binary to text

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread James Y Knight
On Feb 22, 2006, at 6:35 AM, Greg Ewing wrote: I'm thinking of convenience, too. Keep in mind that in Py3k, 'unicode' will be called 'str' (or something equally neutral like 'text') and you will rarely have to deal explicitly with unicode codings, this being done mostly for you by the I/O

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Terry Reedy
Greg Ewing [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Efficiency is an implementation concern. It is also a user concern, especially if inefficiency overruns memory limits. In Py3k, strings which contain only ascii or latin-1 might be stored as 1 byte per character, in

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Ron Adam
Terry Reedy wrote: Greg Ewing [EMAIL PROTECTED] wrote in message Which is why I think that only *unicode* codings should be available through the .encode and .decode interface. Or alternatively there should be something more explicit like .unicode_encode and .unicode_decode that is thus

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Terry Reedy wrote: Greg Ewing [EMAIL PROTECTED] wrote in message Efficiency is an implementation concern. It is also a user concern, especially if inefficiency overruns memory limits. Sure, but what I mean is that it's better to find what's conceptually right and then look for an efficient

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Ron Adam wrote: While I prefer constructors with an explicit encode argument, and use a recode() method for 'like to like' coding. Then the whole encode/decode confusion goes away. I'd be happy with that, too. -- Greg Ewing, Computer Science Dept

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
James Y Knight wrote: Some MIME sections might have a base64 Content-Transfer-Encoding, others might be 8bit encoded, others might be 7bit encoded, others might be quoted- printable encoded. I stand corrected -- in that situation you would have to encode the characters before combining

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
Greg == Greg Ewing [EMAIL PROTECTED] writes: Greg Stephen J. Turnbull wrote: Base64 is a (family of) wire protocol(s). It's not clear to me that it makes sense to say that the alphabets used by baseNN encodings are composed of characters, Greg Take a look at [this that

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
Ron == Ron Adam [EMAIL PROTECTED] writes: Ron Terry Reedy wrote: I prefer the shorter names and using recode, for instance, for bytes to bytes. Ron While I prefer constructors with an explicit encode argument, Ron and use a recode() method for 'like to like' coding.

Re: [Python-Dev] bytes.from_hex()

2006-02-21 Thread Greg Ewing
Stephen J. Turnbull wrote: What I advocate for Python is to require that the standard base64 codec be defined only on bytes, and always produce bytes. I don't understand that. It seems quite clear to me that base64 encoding (in the general sense of encoding, not the unicode sense) takes binary

Re: [Python-Dev] bytes.from_hex()

2006-02-21 Thread Barry Warsaw
On Sun, 2006-02-19 at 23:30 +0900, Stephen J. Turnbull wrote: M == M.-A. Lemburg [EMAIL PROTECTED] writes: M * for Unicode codecs the original form is Unicode, the derived M form is, in most cases, a string First of all, that's Martin's point! Second, almost all Americans, a

Re: [Python-Dev] bytes.from_hex()

2006-02-21 Thread Greg Ewing
Josiah Carlson wrote: It doesn't seem strange to you to need to encode data twice to be able to have a usable sequence of characters which can be embedded in an effectively 7-bit email; I'm talking about a 3.0 world where all strings are unicode and the unicode - external coding is for the

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Stephen J. Turnbull
Martin == Martin v Löwis [EMAIL PROTECTED] writes: Martin Stephen J. Turnbull wrote: Bengt The characters in b could be encoded in plain ascii, or Bengt utf16le, you have to know. Which base64 are you thinking about? Both RFC 3548 and RFC 2045 (MIME) specify subsets of

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-20 Thread Bengt Richter
On Sat, 18 Feb 2006 23:33:15 +0100, Thomas Wouters [EMAIL PROTECTED] wrote: On Sat, Feb 18, 2006 at 01:21:18PM +0100, M.-A. Lemburg wrote: [...] - The return value for the non-unicode encodings depends on the value of the encoding argument. Not really: you'll always get a basestring

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Stephen J. Turnbull
Josiah == Josiah Carlson [EMAIL PROTECTED] writes: Josiah I try to internalize it by not thinking of strings as Josiah encoded data, but as binary data, and unicode as text. I Josiah then remind myself that unicode isn't native on-disk or Josiah cross-network (which stores and

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Martin v. Löwis
Stephen J. Turnbull wrote: Martin For an example where base64 is *not* necessarily Martin ASCII-encoded, see the binary data type in XML Martin Schema. There, base64 is embedded into an XML document, Martin and uses the encoding of the entire XML document. As a Martin

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Stephen J. Turnbull
Martin == Martin v Löwis [EMAIL PROTECTED] writes: Martin Please do take a look. It is the only way: If you were to Martin embed base64 *bytes* into character data content of an XML Martin element, the resulting XML file might not be well-formed Martin anymore (if the encoding of

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Bob Ippolito
On Feb 20, 2006, at 7:25 PM, Stephen J. Turnbull wrote: Martin == Martin v Löwis [EMAIL PROTECTED] writes: Martin Please do take a look. It is the only way: If you were to Martin embed base64 *bytes* into character data content of an XML Martin element, the resulting XML file

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Michael Hudson
M.-A. Lemburg [EMAIL PROTECTED] writes: Martin v. Löwis wrote: M.-A. Lemburg wrote: True. However, note that the .encode()/.decode() methods on strings and Unicode narrow down the possible return types. The corresponding .bytes methods should only allow bytes and Unicode. I forgot that:

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
Ian == Ian Bicking [EMAIL PROTECTED] writes: Ian Encodings cover up eclectic interfaces, where those Ian interfaces fit a basic pattern -- data in, data out. Isn't filter the word you're looking for? I think you've just made a very strong case that this is a slippery slope that we

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
M == M.-A. Lemburg [EMAIL PROTECTED] writes: M Martin v. Löwis wrote: No. The reason to ban string.decode and bytes.encode is that it confuses users. M Instead of starting to ban everything that can potentially M confuse a few users, we should educate those users and tell

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
M == M.-A. Lemburg [EMAIL PROTECTED] writes: M The main reason is symmetry and the fact that strings and M Unicode should be as similar as possible in order to simplify M the task of moving from one to the other. Those are perfectly compatible with Martin's suggestion. M Still,

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
Josiah == Josiah Carlson [EMAIL PROTECTED] writes: Josiah The question remains: is str.decode() returning a string Josiah or unicode depending on the argument passed, when the Josiah argument quite literally names the codec involved, Josiah difficult to understand? I don't

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
Bob == Bob Ippolito [EMAIL PROTECTED] writes: Bob On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: But you aren't always getting *unicode* text from the decoding of bytes, and you may be encoding bytes *to* bytes: Please note that I presumed that you can indeed assume that

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
Bengt == Bengt Richter [EMAIL PROTECTED] writes: Bengt The characters in b could be encoded in plain ascii, or Bengt utf16le, you have to know. Which base64 are you thinking about? Both RFC 3548 and RFC 2045 (MIME) specify subsets of US-ASCII explicitly. -- School of Systems and

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Martin v. Löwis
Stephen J. Turnbull wrote: BTW, what use cases do you have in mind for Unicode - Unicode decoding? I think rot13 falls into that category: it is a transformation on text, not on bytes. For other odd cases: base64 goes Unicode-bytes in the *decode* direction, not in the encode direction. Some
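
As an illustration of rot13 as a text-to-text transformation, here is a small sketch using the `codecs` module in present-day Python 3 (3.4 or later is assumed for the error behaviour). It shows where the language eventually landed, not what any poster in this thread proposed.

```python
import codecs

# rot_13 maps str to str, so it goes through codecs.encode/decode.
scrambled = codecs.encode("Hello, world", "rot_13")
restored = codecs.decode(scrambled, "rot_13")

# str.encode() is reserved for text -> bytes codecs and refuses rot_13.
try:
    "Hello".encode("rot_13")
except LookupError:
    rejected = True
else:
    rejected = False

print(scrambled, restored, rejected)
```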

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Martin v. Löwis
Stephen J. Turnbull wrote: Do you do any of the user education *about codec use* that you recommend? The people I try to teach about coding invariably find it difficult to understand. The problem is that the near-universal intuition is that for human-usable text is pretty much anything *but

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Martin v. Löwis
Stephen J. Turnbull wrote: Bengt The characters in b could be encoded in plain ascii, or Bengt utf16le, you have to know. Which base64 are you thinking about? Both RFC 3548 and RFC 2045 (MIME) specify subsets of US-ASCII explicitly. Unfortunately, it is ambiguous as to whether they

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Bob Ippolito
On Feb 19, 2006, at 10:55 AM, Martin v. Löwis wrote: Stephen J. Turnbull wrote: BTW, what use cases do you have in mind for Unicode - Unicode decoding? I think rot13 falls into that category: it is a transformation on text, not on bytes. The current implementation is a transformation on

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Josiah Carlson
Stephen J. Turnbull [EMAIL PROTECTED] wrote: Josiah == Josiah Carlson [EMAIL PROTECTED] writes: Josiah The question remains: is str.decode() returning a string Josiah or unicode depending on the argument passed, when the Josiah argument quite literally names the codec

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Josiah Carlson wrote: Bob Ippolito [EMAIL PROTECTED] wrote: On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: Greg Ewing [EMAIL PROTECTED] wrote: Stephen J. Turnbull wrote: Guido == Guido van Rossum [EMAIL PROTECTED] writes: Guido - b = bytes(t, enc); t = text(b, enc) +1 The coding
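
The constructor flavor quoted from Guido above, `b = bytes(t, enc); t = text(b, enc)`, can be tried directly in present-day Python 3, where the text type is spelled `str`. This is a sketch of that one API shape only; the sample string is arbitrary.

```python
t = "Löwis"                 # text
b = bytes(t, "utf-8")       # text -> bytes via the bytes constructor
t2 = str(b, "utf-8")        # bytes -> text via the str constructor

print(b, t2)
```

The direction is carried by the target type's name, so neither "encode" nor "decode" appears in the call.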

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Martin v. Löwis
Aahz wrote: The problem is that they don't understand that Martin v. L?wis is not Unicode -- once all strings are Unicode, this is guaranteed to work. This specific call, yes. I don't think the problem will go away as long as both encode and decode are available for both strings and byte

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Josiah Carlson
Ron Adam [EMAIL PROTECTED] wrote: Josiah Carlson wrote: Bengt Richter had a good idea with bytes.recode() for strictly bytes transformations (and the equivalent for text), though it is ambiguous as to the direction; are we encoding or decoding with bytes.recode()? In my opinion, this is

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Martin v. Löwis wrote: How are users confused? Users do py> "Martin v. Löwis".encode("utf-8") Traceback (most recent call last): File "<stdin>", line 1, in ? UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: ordinal not in range(128) because they want to convert the
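
The confusing traceback quoted here comes from Python 2 implicitly decoding a byte string as ASCII before encoding it. A rough reconstruction in present-day Python 3, assuming the original byte string was Latin-1 encoded (which the reported byte 0xf6 suggests), makes that hidden decode step explicit:

```python
name = "Martin v. Löwis"
latin1 = name.encode("latin-1")      # the bytes a Python 2 str would have held

# Python 2's str.encode() first did the equivalent of this implicit decode:
try:
    latin1.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc

print(error)

# Python 3 removed the methods that made the implicit step possible.
print(hasattr(bytes, "encode"), hasattr(str, "decode"))
```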

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Thomas Wouters
On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote: I've already explained why we have .encode() and .decode() methods on strings and Unicode many times. I've also explained the misunderstanding that codecs can only do Unicode-string conversions. And I've explained that the

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Martin v. Löwis wrote: M.-A. Lemburg wrote: Just because some codecs don't fit into the string.decode() or bytes.encode() scenario doesn't mean that these codecs are useless or that the methods should be banned. No. The reason to ban string.decode and bytes.encode is that it confuses

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Michael Hudson
This posting is entirely tangential. Be warned. Martin v. Löwis [EMAIL PROTECTED] writes: It's worse than that. The return *type* depends on the *value* of the argument. I think there is little precedence for that: There's one extremely significant example where the *value* of something

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Josiah Carlson wrote: Ron Adam [EMAIL PROTECTED] wrote: Josiah Carlson wrote: Bengt Richter had a good idea with bytes.recode() for strictly bytes transformations (and the equivalent for text), though it is ambiguous as to the direction; are we encoding or decoding with bytes.recode()? In

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Thomas Wouters wrote: On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote: I've already explained why we have .encode() and .decode() methods on strings and Unicode many times. I've also explained the misunderstanding that codecs can only do Unicode-string conversions. And I've

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Adam Olsen
On 2/18/06, Josiah Carlson [EMAIL PROTECTED] wrote: Look at what we've currently got going for data transformations in the standard library to see what these removals will do: base64 module, binascii module, binhex module, uu module, ... Do we want or need to add another top-level module for

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Aahz
On Sat, Feb 18, 2006, Ron Adam wrote: I like the bytes.recode() idea a lot. +1 It seems to me it's a far more useful idea than encoding and decoding by overloading and could do both and more. It has a lot of potential to be an intermediate step for encoding as well as being used for many

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread M.-A. Lemburg
Aahz wrote: On Sat, Feb 18, 2006, Ron Adam wrote: I like the bytes.recode() idea a lot. +1 It seems to me it's a far more useful idea than encoding and decoding by overloading and could do both and more. It has a lot of potential to be an intermediate step for encoding as well as being

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Martin v. Löwis
M.-A. Lemburg wrote: I've already explained why we have .encode() and .decode() methods on strings and Unicode many times. I've also explained the misunderstanding that codecs can only do Unicode-string conversions. And I've explained that the .encode() and .decode() method *do* check the

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Martin v. Löwis
Michael Hudson wrote: There's one extremely significant example where the *value* of something impacts on the type of something else: functions. The types of everything involved in str([1]) and len([1]) are the same but the results are different. This shows up in PyPy's type annotation; most

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Martin v. Löwis wrote: M.-A. Lemburg wrote: I've already explained why we have .encode() and .decode() methods on strings and Unicode many times. I've also explained the misunderstanding that codecs can only do Unicode-string conversions. And I've explained that the .encode() and .decode()

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Martin v. Löwis
M.-A. Lemburg wrote: True. However, note that the .encode()/.decode() methods on strings and Unicode narrow down the possible return types. The corresponding .bytes methods should only allow bytes and Unicode. I forgot that: what is the rationale for that restriction? To assure that only

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Martin v. Löwis wrote: M.-A. Lemburg wrote: True. However, note that the .encode()/.decode() methods on strings and Unicode narrow down the possible return types. The corresponding .bytes methods should only allow bytes and Unicode. I forgot that: what is the rationale for that restriction?

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Josiah Carlson
Ron Adam [EMAIL PROTECTED] wrote: Josiah Carlson wrote: [snip] Again, the problem is ambiguity; what does bytes.recode(something) mean? Are we encoding _to_ something, or are we decoding _from_ something? This was just an example of one way that might work, but here are my thoughts on

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Aahz wrote: On Sat, Feb 18, 2006, Ron Adam wrote: I like the bytes.recode() idea a lot. +1 It seems to me it's a far more useful idea than encoding and decoding by overloading and could do both and more. It has a lot of potential to be an intermediate step for encoding as well as being

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Thomas Wouters
On Sat, Feb 18, 2006 at 01:21:18PM +0100, M.-A. Lemburg wrote: It's by no means a Perl attitude. In your eyes, perhaps. It certainly feels that way to me (or I wouldn't have said it :). Perl happens to be full of general constructs that were added because they were easy to add, or they were

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Terry Reedy
Josiah Carlson [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Again, the problem is ambiguity; what does bytes.recode(something) mean? Are we encoding _to_ something, or are we decoding _from_ something? Are we going to need to embed the direction in the encoding/decoding name

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Josiah Carlson wrote: Ron Adam [EMAIL PROTECTED] wrote: Josiah Carlson wrote: [snip] Again, the problem is ambiguity; what does bytes.recode(something) mean? Are we encoding _to_ something, or are we decoding _from_ something? This was just an example of one way that might work, but here

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Josiah Carlson
Ron Adam [EMAIL PROTECTED] wrote: Josiah Carlson wrote: Ron Adam [EMAIL PROTECTED] wrote: Josiah Carlson wrote: [snip] Again, the problem is ambiguity; what does bytes.recode(something) mean? Are we encoding _to_ something, or are we decoding _from_ something? This was just an

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Josiah Carlson wrote: Ron Adam [EMAIL PROTECTED] wrote: Except that ambiguates it even further. Is encodings.tounicode() encoding, or decoding? According to everything you have said so far, it would be decoding. But if I am decoding binary data, why should it be spending any time as a

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Martin v. Löwis
Josiah Carlson wrote: I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex', and likely a few others that the two of you may be arguing against should stay as encodings, because strictly speaking, they are defined as encodings of data. They may not be encodings of _unicode_

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Jason Orendorff
On 2/15/06, Guido van Rossum [EMAIL PROTECTED] wrote: Actually users trying to figure out Unicode would probably be better served if bytes.encode() and text.decode() did not exist.[...]It would be better if the signature of text.encode() always returned a bytes object. But why deny the bytes

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Bob Ippolito
On Feb 16, 2006, at 9:20 PM, Josiah Carlson wrote: Greg Ewing [EMAIL PROTECTED] wrote: Josiah Carlson wrote: They may not be encodings of _unicode_ data, But if they're not encodings of unicode data, what business do they have being available through someunicodestring.encode(...)? I

Re: [Python-Dev] bytes.from_hex()

2006-02-17 Thread Stephen J. Turnbull
Guido van Rossum [EMAIL PROTECTED] writes: I'd say there are two symmetric API flavors possible (t and b are text and bytes objects, respectively, where text is a string type, either str or unicode; enc is an encoding name): -

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread M.-A. Lemburg
Martin v. Löwis wrote: Josiah Carlson wrote: I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex', and likely a few others that the two of you may be arguing against should stay as encodings, because strictly speaking, they are defined as encodings of data. They may not

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Bengt Richter
On Fri, 17 Feb 2006 00:33:49 +0100, "Martin v. Löwis" [EMAIL PROTECTED] wrote: Josiah Carlson wrote: I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex', and likely a few others that the two of you may be arguing against should stay as encodings,

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Martin v. Löwis
M.-A. Lemburg wrote: Just because some codecs don't fit into the string.decode() or bytes.encode() scenario doesn't mean that these codecs are useless or that the methods should be banned. No. The reason to ban string.decode and bytes.encode is that it confuses users. Regards, Martin

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Josiah Carlson
Martin v. Löwis [EMAIL PROTECTED] wrote: M.-A. Lemburg wrote: Just because some codecs don't fit into the string.decode() or bytes.encode() scenario doesn't mean that these codecs are useless or that the methods should be banned. No. The reason to ban string.decode and bytes.encode is

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Bengt Richter
On Fri, 17 Feb 2006 21:35:25 +0100, "Martin v. Löwis" [EMAIL PROTECTED] wrote: M.-A. Lemburg wrote: Just because some codecs don't fit into the string.decode() or bytes.encode() scenario doesn't mean that these codecs are useless or that the methods should be banned.

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Martin v. Löwis
Josiah Carlson wrote: How are users confused? Users do py> "Martin v. Löwis".encode("utf-8") Traceback (most recent call last): File "<stdin>", line 1, in ? UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: ordinal not in range(128) because they want to convert the string to
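
For readers on Python 3, the failure in the traceback above can be approximated with a minimal sketch. It assumes the byte string was Latin-1 encoded (which is consistent with byte 0xf6 at position 11) and writes out the ASCII decode that Python 2's str.encode() performed implicitly. The variable names are illustrative only.

```python
# Assumption: the original str held Latin-1 bytes, so "ö" is the single byte 0xf6.
raw = "Martin v. Löwis".encode("latin-1")

# Python 2's str.encode("utf-8") first decoded the byte string with the
# default ASCII codec; here that hidden step is written out explicitly.
try:
    raw.decode("ascii").encode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The exception comes from the decode step, not the encode step, which is why a call named encode() reports a UnicodeDecodeError.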

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Ian Bicking
Martin v. Löwis wrote: Users do py> "Martin v. Löwis".encode("utf-8") Traceback (most recent call last): File "<stdin>", line 1, in ? UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: ordinal not in range(128) because they want to convert the string to Unicode, and they

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Josiah Carlson
Martin v. Löwis [EMAIL PROTECTED] wrote: Josiah Carlson wrote: How are users confused? Users do py> "Martin v. Löwis".encode("utf-8") Traceback (most recent call last): File "<stdin>", line 1, in ? UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: ordinal not in

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Martin v. Löwis
Ian Bicking wrote: That str.encode(unicode_encoding) implicitly decodes strings seems like a flaw in the unicode encodings, quite separate from the existence of str.encode. I for one really like s.encode('zlib').encode('base64') -- and if the zlib encoding raised an error when it was passed a
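
As a rough illustration of the chained form quoted above, here is a sketch of the same pipeline in Python 3, where the bytes-to-bytes codecs are reached through the codecs module rather than through a method on the string. The sample data and variable names are assumptions made for the example.

```python
import codecs

# Sample payload; any bytes object would do.
data = b"spam " * 20

# Compress, then base64-encode: the counterpart of
# s.encode('zlib').encode('base64') from the quoted message.
packed = codecs.encode(codecs.encode(data, "zlib"), "base64")

# Reverse the two steps in the opposite order.
restored = codecs.decode(codecs.decode(packed, "base64"), "zlib")

print(restored == data)
```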

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Ian Bicking
Josiah Carlson wrote: If some users can't understand this (passing different arguments to a function may produce different output), It's worse than that. The return *type* depends on the *value* of the argument. I think there is little precedent for that: normally, the return values depend on
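
The point about the return type depending on the argument value can be seen in a small Python 3 sketch using codecs.decode(), which accepts both text codecs and bytes-to-bytes codecs. This is only an illustration of the concern, not code from the thread; the variable names are made up for the example.

```python
import codecs

# Same function, same input type; only the codec name differs.
as_text = codecs.decode(b"hello", "utf-8")       # text codec: yields str
as_bytes = codecs.decode(b"68656c6c6f", "hex")   # bytes-to-bytes codec: yields bytes

print(type(as_text).__name__, type(as_bytes).__name__)
```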
