Jason Orendorff wrote:
> Instead of byte literals, how about a classmethod bytes.from_hex(), which
> works like this:
>
>   # two equivalent things
>   expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
>   expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227,
>                              131, 79, 229, 201, 46, 106])
>
> It's just a nicety; the former fits my brain a little better.  This would
> work fine both in 2.5 and in 3.0.
>
> I thought about unicode.encode('hex'), but obviously it will continue to
> return a str in 2.x, not bytes.  Also the pseudo-encodings ('hex', 'rot13',
> 'zip', 'uu', etc.) generally scare me.
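For readers following along on Python 3, the two spellings in the quoted proposal can be checked directly. This is only a sketch: it uses bytes.fromhex(), the name under which a hex constructor exists in Python 3, rather than the from_hex() spelling proposed above, and the variable names are made up for illustration.

```python
# Sketch: compare the hex-string form and the list-of-ordinals form.
# Assumption: Python 3, where the classmethod is spelled bytes.fromhex().
hex_form = bytes.fromhex('5c535024cac5199153e3834fe5c92e6a')
list_form = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227,
                   131, 79, 229, 201, 46, 106])

# Both spellings build the same 16-byte value.
print(hex_form == list_form)
print(len(hex_form))
```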

Those are not pseudo-encodings, they are regular codecs.

It's a common misunderstanding that codecs only serve to convert
between Unicode and strings.

The codec system is deliberately designed to be general enough
to also work with many other types, e.g. it is easily possible to
write a codec that converts between the hex literal sequence you
have above and a list of ordinals:

""" Hex string codec

    Converts between a list of ordinals and a two byte hex literal
    string.

    Usage:
    >>> codecs.encode([1,2,3], 'hexstring')
    '010203'
    >>> codecs.decode(_, 'hexstring')
    [1, 2, 3]

    (c) 2006, Marc-Andre Lemburg.

"""
import codecs

class Codec(codecs.Codec):

    def encode(self, input, errors='strict'):

        """ Convert hex ordinal list to hex literal string.
        """
        if not isinstance(input, list):
            raise TypeError('expected list of integers')
        return (
            ''.join(['%02x' % x for x in input]),
            len(input))

    def decode(self, input, errors='strict'):

        """ Convert hex literal string to hex ordinal list.
        """
        if not isinstance(input, str):
            raise TypeError('expected string of hex literals')
        size = len(input)
        if size % 2 != 0:
            raise TypeError('input string has uneven length')
        return (
            [int(input[(i<<1):(i<<1)+2], 16)
             for i in range(size >> 1)],
            size)

class StreamWriter(Codec, codecs.StreamWriter):
    pass

class StreamReader(Codec, codecs.StreamReader):
    pass

def getregentry():
    return (Codec().encode, Codec().decode, StreamReader, StreamWriter)

> And now that bytes and text are
> going to be two very different types, they're even weirder than before.
> Consider:
>
>   text.encode('utf-8') ==> bytes
>   text.encode('rot13') ==> text
>   bytes.encode('zip') ==> bytes
>   bytes.encode('uu') ==> text (?)
>
> This state of affairs seems kind of crazy to me.

Really? It all depends on what you use the codecs for. Going
through the .encode() and .decode() methods, as above, is not the
only way you can make use of them.

To get full access to the codecs, you'll have to use the codecs
module.
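
As a rough illustration of going through the codecs module rather than the .encode()/.decode() methods, here is a sketch for a current Python 3. The function names hex_encode, hex_decode and find_hexstring are hypothetical, and registering via codecs.register() with a codecs.CodecInfo object is one way to hook such a codec in today, not necessarily how the module above was meant to be installed.

```python
import codecs

def hex_encode(input, errors='strict'):
    # Hypothetical helper: list of ordinals -> hex literal string.
    if not isinstance(input, list):
        raise TypeError('expected list of integers')
    return ''.join('%02x' % x for x in input), len(input)

def hex_decode(input, errors='strict'):
    # Hypothetical helper: hex literal string -> list of ordinals.
    if not isinstance(input, str):
        raise TypeError('expected string of hex literals')
    if len(input) % 2 != 0:
        raise TypeError('input string has uneven length')
    ordinals = [int(input[i:i + 2], 16) for i in range(0, len(input), 2)]
    return ordinals, len(input)

def find_hexstring(name):
    # Search function: answer only for the illustrative name 'hexstring'.
    if name == 'hexstring':
        return codecs.CodecInfo(encode=hex_encode, decode=hex_decode,
                                name='hexstring')
    return None

codecs.register(find_hexstring)

# The module-level functions accept objects that have no .encode() method.
encoded = codecs.encode([1, 2, 3], 'hexstring')
decoded = codecs.decode(encoded, 'hexstring')
print(encoded, decoded)

# A text-to-text codec from the standard library, reached the same way.
rotated = codecs.encode('hello', 'rot_13')
print(rotated)
```

The point of the sketch is only that codecs.encode() and codecs.decode() look up the codec and pass the object through; which input and output types make sense is up to the codec.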

> Actually users trying to figure out Unicode would probably be better served
> if bytes.encode() and text.decode() did not exist.

You're missing the point: the .encode() and .decode() methods
are merely interfaces to the registered codecs. Whether they
make sense for a certain codec depends on the codec, not the
methods that interface to it, and again, codecs do not
only exist to convert between Unicode and strings.

-- 
Marc-Andre Lemburg
eGenix.com
(#1, Feb 15 2006)