Neil Schemenauer wrote:
> Ron Adam <[EMAIL PROTECTED]> wrote:
>> Why was it decided that the unicode encoding argument should be ignored
>> if the first argument is a string?  Wouldn't an exception be better
>> rather than give the impression it does something when it doesn't?
>
> From the PEP:
>
>     There is no sane meaning that the encoding can have in that
>     case.  str objects *are* byte arrays and they know nothing about
>     the encoding of character data they contain.  We need to assume
>     that the programmer has provided str object that already uses
>     the desired encoding.
>
> Raising an exception would be a valid option.  However, passing the
> string through unchanged makes the transition from str to bytes
> easier.
>
> Neil

I guess I'm concerned that if the string isn't already in the specified
encoding, it could pass through without complaint and not be encoded as
expected.

    >>> b.bytes(u'abc', 'hex-codec')
    bytes([54, 49, 54, 50, 54, 51])
    >>> b.bytes('abc', 'hex-codec')
    bytes([97, 98, 99])     # not hex

If this were in a function, I would need to do a check of some sort
anyway, or cast to unicode beforehand, or encode beforehand.
Unfortunately, that negates the advantage of having the codec argument
in bytes.

    def hexabyte(s):
        s = unicode(s)
        return bytes(s, 'hex-codec')

or

    def hexabyte(s):
        s = s.encode('hex-codec')
        return bytes(s)

It seems to me that if you are specifying a codec for bytes, then you
will not be expecting to get an already encoded string, and if you do,
it may not be in the codec you want, since you are probably not
specifying the default codec.

Ron
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
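
For illustration only, the exception-raising alternative discussed in this thread could be sketched roughly as below. This is not the PEP's behavior or anyone's proposed implementation. The helper name `strict_bytes` is hypothetical, and the sketch assumes Python 3 naming, where `str` plays the role of the thread's `unicode` and `bytes` plays the role of the thread's already-encoded `str`. It uses 'utf-16-le' rather than 'hex-codec' so that it runs on a current interpreter.

```python
def strict_bytes(obj, encoding=None):
    """Hypothetical helper: build bytes, refusing a codec for encoded input."""
    if isinstance(obj, str):
        # Text input: a codec is required to turn characters into bytes.
        if encoding is None:
            raise TypeError("text input requires an encoding")
        return obj.encode(encoding)
    if isinstance(obj, (bytes, bytearray)):
        # Already-encoded input: a codec argument has no sane meaning here,
        # so complain instead of silently passing the data through.
        if encoding is not None:
            raise TypeError("encoding given for already-encoded input")
        return bytes(obj)
    raise TypeError("expected str, bytes, or bytearray")


if __name__ == "__main__":
    print(strict_bytes("abc", "utf-16-le"))  # text is encoded with the codec
    print(strict_bytes(b"abc"))              # encoded data passes through
    try:
        strict_bytes(b"abc", "utf-16-le")    # codec plus encoded data
    except TypeError as exc:
        print("rejected:", exc)
```

With a check like this, a function such as hexabyte would not need its own cast or pre-encoding step to guard against input that was silently passed through. The cost is that code passing already-encoded strings together with a codec would have to be changed.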