On Sat, 18 Feb 2006 23:33:15 +0100, Thomas Wouters [EMAIL PROTECTED] wrote:
On Sat, Feb 18, 2006 at 01:21:18PM +0100, M.-A. Lemburg wrote:
[...]
- The return value for the non-unicode encodings depends on the value of
the encoding argument.
Not really: you'll always get a basestring
Aahz wrote:
The problem is that they don't understand that "Martin v. Löwis" is not
Unicode -- once all strings are Unicode, this is guaranteed to work.
This specific call, yes. I don't think the problem will go away as long
as both encode and decode are available for both strings and byte
Martin v. Löwis wrote:
How are users confused?
Users do
py> "Martin v. Löwis".encode("utf-8")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
ordinal not in range(128)
because they want to convert the
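The failure above is the Python 2 behaviour: str.encode() with a Unicode codec first decodes the byte string using the default ASCII codec, which blows up on the 0xf6 ('ö') byte. A minimal sketch of the Python 3 semantics the thread is arguing toward, where str.encode() always returns bytes and bytes.decode() always returns str:

```python
# Python 3 semantics: encode goes str -> bytes, decode goes bytes -> str.
name = "Martin v. Löwis"

encoded = name.encode("utf-8")        # always returns bytes
assert isinstance(encoded, bytes)
assert encoded[11:13] == b"\xc3\xb6"  # 'ö' (position 11) is two bytes in UTF-8

decoded = encoded.decode("utf-8")     # always returns str
assert decoded == name
```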
On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote:
I've already explained why we have .encode() and .decode()
methods on strings and Unicode many times. I've also
explained the misunderstanding that codecs can only do
Unicode-string conversions. And I've explained that
the
Martin v. Löwis wrote:
M.-A. Lemburg wrote:
Just because some codecs don't fit into the string.decode()
or bytes.encode() scenario doesn't mean that these codecs are
useless or that the methods should be banned.
No. The reason to ban string.decode and bytes.encode is that
it confuses
Thomas Wouters wrote:
On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote:
I've already explained why we have .encode() and .decode()
methods on strings and Unicode many times. I've also
explained the misunderstanding that codecs can only do
Unicode-string conversions. And I've
M.-A. Lemburg wrote:
I've already explained why we have .encode() and .decode()
methods on strings and Unicode many times. I've also
explained the misunderstanding that codecs can only do
Unicode-string conversions. And I've explained that
the .encode() and .decode() methods *do* check the
Martin v. Löwis wrote:
M.-A. Lemburg wrote:
I've already explained why we have .encode() and .decode()
methods on strings and Unicode many times. I've also
explained the misunderstanding that codecs can only do
Unicode-string conversions. And I've explained that
the .encode() and .decode()
M.-A. Lemburg wrote:
True. However, note that the .encode()/.decode() methods on
strings and Unicode narrow down the possible return types.
The corresponding .bytes methods should only allow bytes and
Unicode.
I forgot that: what is the rationale for that restriction?
To assure that only
Martin v. Löwis wrote:
M.-A. Lemburg wrote:
True. However, note that the .encode()/.decode() methods on
strings and Unicode narrow down the possible return types.
The corresponding .bytes methods should only allow bytes and
Unicode.
I forgot that: what is the rationale for that restriction?
On Sat, Feb 18, 2006 at 01:21:18PM +0100, M.-A. Lemburg wrote:
It's by no means a Perl attitude.
In your eyes, perhaps. It certainly feels that way to me (or I wouldn't have
said it :). Perl happens to be full of general constructs that were added
because they were easy to add, or they were
Josiah Carlson wrote:
I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex',
and likely a few others that the two of you may be arguing against
should stay as encodings, because strictly speaking, they are defined as
encodings of data. They may not be encodings of _unicode_
On 2/15/06, Guido van Rossum [EMAIL PROTECTED] wrote:
Actually users trying to figure out Unicode would probably be better served if bytes.encode() and text.decode() did not exist. [...] It would be better if the signature of text.encode() always returned a
bytes object. But why deny the bytes
On Feb 16, 2006, at 9:20 PM, Josiah Carlson wrote:
Greg Ewing [EMAIL PROTECTED] wrote:
Josiah Carlson wrote:
They may not be encodings of _unicode_ data,
But if they're not encodings of unicode data, what
business do they have being available through
someunicodestring.encode(...)?
I
Martin v. Löwis wrote:
Josiah Carlson wrote:
I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex',
and likely a few others that the two of you may be arguing against
should stay as encodings, because strictly speaking, they are defined as
encodings of data. They may not
On Fri, 17 Feb 2006 00:33:49 +0100, "Martin v. Löwis"
[EMAIL PROTECTED] wrote:
Josiah Carlson wrote:
I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex',
and likely a few others that the two of you may be arguing against
should stay as encodings,
M.-A. Lemburg wrote:
Just because some codecs don't fit into the string.decode()
or bytes.encode() scenario doesn't mean that these codecs are
useless or that the methods should be banned.
No. The reason to ban string.decode and bytes.encode is that
it confuses users.
Regards,
Martin
Martin v. Löwis [EMAIL PROTECTED] wrote:
M.-A. Lemburg wrote:
Just because some codecs don't fit into the string.decode()
or bytes.encode() scenario doesn't mean that these codecs are
useless or that the methods should be banned.
No. The reason to ban string.decode and bytes.encode is
On Fri, 17 Feb 2006 21:35:25 +0100, "Martin v. Löwis"
[EMAIL PROTECTED] wrote:
M.-A. Lemburg wrote:
Just because some codecs don't fit into the string.decode()
or bytes.encode() scenario doesn't mean that these codecs are
useless or that the methods should be banned.
Josiah Carlson wrote:
How are users confused?
Users do
py> "Martin v. Löwis".encode("utf-8")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
ordinal not in range(128)
because they want to convert the string to
Martin v. Löwis wrote:
Users do
py> "Martin v. Löwis".encode("utf-8")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
ordinal not in range(128)
because they want to convert the string to Unicode, and they
Martin v. Löwis [EMAIL PROTECTED] wrote:
Josiah Carlson wrote:
How are users confused?
Users do
py> "Martin v. Löwis".encode("utf-8")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
ordinal not in
Ian Bicking wrote:
That str.encode(unicode_encoding) implicitly decodes strings seems like
a flaw in the unicode encodings, quite separate from the existence of
str.encode. I for one really like s.encode('zlib').encode('base64') --
and if the zlib encoding raised an error when it was passed a
Josiah Carlson wrote:
If some users
can't understand this (passing different arguments to a function may
produce different output),
It's worse than that. The return *type* depends on the *value* of
the argument. I think there is little precedent for that: normally,
the return values depend on
Martin v. Löwis wrote:
Ian Bicking wrote:
That str.encode(unicode_encoding) implicitly decodes strings seems like
a flaw in the unicode encodings, quite separate from the existence of
str.encode. I for one really like s.encode('zlib').encode('base64') --
and if the zlib encoding raised an
Ian Bicking wrote:
Maybe it isn't worse, but the real alternative is:
import zlib
import base64
base64.b64encode(zlib.compress(s))
Encodings cover up eclectic interfaces, where those interfaces fit a
basic pattern -- data in, data out.
So should I write
3.1415.encode("sin")
or
Martin v. Löwis wrote:
Maybe it isn't worse, but the real alternative is:
import zlib
import base64
base64.b64encode(zlib.compress(s))
Encodings cover up eclectic interfaces, where those interfaces fit a
basic pattern -- data in, data out.
So should I write
3.1415.encode("sin")
On Feb 17, 2006, at 4:20 PM, Martin v. Löwis wrote:
Ian Bicking wrote:
Maybe it isn't worse, but the real alternative is:
import zlib
import base64
base64.b64encode(zlib.compress(s))
Encodings cover up eclectic interfaces, where those interfaces fit a
basic pattern -- data in,
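Ian Bicking's module-based "real alternative" appears only in truncated form above; spelled out and runnable (the sample payload is illustrative), the two-step pipe through zlib and base64 and its inverse are:

```python
import base64
import zlib

data = b"hello world " * 40  # illustrative payload

# the module-based alternative to s.encode('zlib').encode('base64'):
wire = base64.b64encode(zlib.compress(data))

# the reverse pipeline restores the original bytes
assert zlib.decompress(base64.b64decode(wire)) == data
```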
On Fri, Feb 17, 2006, Martin v. Löwis wrote:
Josiah Carlson wrote:
How are users confused?
Users do
py> "Martin v. Löwis".encode("utf-8")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
ordinal not in
Josiah Carlson wrote:
They may not be encodings of _unicode_ data,
But if they're not encodings of unicode data, what
business do they have being available through
someunicodestring.encode(...)?
Greg
___
Python-Dev mailing list
Python-Dev@python.org
Instead of byte literals, how about a classmethod bytes.from_hex(), which works like this:
# two equivalent things
expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, 131, 79, 229, 201, 46, 106])
It's
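Jason's proposal did eventually land in Python 3, spelled bytes.fromhex() (no underscore); a runnable check that his "two equivalent things", plus Martin's binascii.unhexlify duplicate, all agree:

```python
import binascii

hex_digest = '5c535024cac5199153e3834fe5c92e6a'

a = bytes.fromhex(hex_digest)                       # the classmethod
b = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83,   # the literal list
           227, 131, 79, 229, 201, 46, 106])
c = binascii.unhexlify(hex_digest)                  # the binascii duplicate

assert a == b == c
assert len(a) == 16  # an MD5 digest is 16 bytes
```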
On 2/15/06, Jason Orendorff [EMAIL PROTECTED] wrote:
Instead of byte literals, how about a classmethod bytes.from_hex(), which
works like this:
# two equivalent things
expected_md5_hash =
bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
expected_md5_hash = bytes([92, 83, 80, 36,
Jason Orendorff wrote:
Instead of byte literals, how about a classmethod bytes.from_hex(), which
works like this:
# two equivalent things
expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
I hope this will also be equivalent:
expected_md5_hash = bytes.from_hex('5c
Jason Orendorff wrote:
expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
This looks good, although it duplicates
expected_md5_hash = binascii.unhexlify('5c535024cac5199153e3834fe5c92e6a')
Regards,
Martin
Jason Orendorff wrote:
Instead of byte literals, how about a classmethod bytes.from_hex(), which
works like this:
# two equivalent things
expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227,
131,
On Wed, 2006-02-15 at 14:01 -0500, Jason Orendorff wrote:
Instead of byte literals, how about a classmethod bytes.from_hex(),
which works like this:
# two equivalent things
expected_md5_hash =
bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
expected_md5_hash = bytes([92, 83, 80,
On 2/15/06, M.-A. Lemburg [EMAIL PROTECTED] wrote:
Jason Orendorff wrote:
Also the pseudo-encodings ('hex', 'rot13',
'zip', 'uu', etc.) generally scare me.
Those are not pseudo-encodings, they are regular codecs.
It's a common misunderstanding that codecs are only seen as serving
the
Jason Orendorff wrote:
Also the pseudo-encodings ('hex',
'rot13', 'zip', 'uu', etc.) generally scare me.
I think these will have to cease being implemented as
encodings in 3.0. They should really never have been
in the first place.
--
Greg Ewing, Computer Science Dept,
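As it turned out, rot13 survived into Python 3 as a str-to-str codec, but it is reachable only through the codecs module functions, not through str.encode(), which is reserved for codecs that produce bytes; a quick sketch:

```python
import codecs

# rot13 maps str -> str, so it goes through codecs.encode/codecs.decode
assert codecs.encode("Martin", "rot13") == "Znegva"
assert codecs.decode("Znegva", "rot13") == "Martin"

# str.encode() remains str -> bytes only
assert isinstance("Martin".encode("utf-8"), bytes)
```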
Greg Ewing [EMAIL PROTECTED] wrote:
Jason Orendorff wrote:
Also the pseudo-encodings ('hex',
'rot13', 'zip', 'uu', etc.) generally scare me.
I think these will have to cease being implemented as
encodings in 3.0. They should really never have been
in the first place.
I would agree
39 matches