Re: unicode and hashlib

Bryan Olson Mon, 01 Dec 2008 05:55:46 -0800

Jeff H wrote:

[...] So once I have character strings transformed
internally to unicode objects, I should encode them in 'utf-8' before
attempting to do things that guess at the proper way to encode them
for further processing (i.e. hashlib).

It looks like hashlib in Python 3 will not even attempt to digest a unicode object. Trying to hash 'abcdefg' in Python 3.0rc3, I get:


  TypeError: object supporting the buffer API required

I think that's good behavior, except that the error message is likely to send beginners to look up the obscure buffer interface before they find they just need mystring.encode('utf8') or bytes(mystring, 'utf8').
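
For anyone landing here from a search, a minimal Python 3 sketch of the point above. The variable names are mine, and the exact wording of the TypeError differs between Python releases, so the code only prints whatever message it gets:

```python
import hashlib

text = 'abcdefg'  # a str (Unicode text) in Python 3

# Passing str straight to hashlib is rejected with a TypeError.
rejected = False
try:
    hashlib.md5(text)
except TypeError as exc:
    rejected = True
    print('TypeError:', exc)

# Encode to bytes first; both spellings produce the same bytes object.
data = text.encode('utf-8')
assert data == bytes(text, 'utf-8')

digest = hashlib.md5(data).hexdigest()
print(digest)
```

The encoding is part of what gets hashed, so whoever verifies the digest has to use the same one.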

>>> a='André'
>>> b=unicode(a,'cp1252')
>>> b
u'Andr\xc3\xa9'
>>> hashlib.md5(b.encode('utf-8')).hexdigest()
'b4e5418a36bc4badfc47deb657a2b50c'

Incidentally, MD5 has fallen and SHA-1 is falling. Python's hashlib also includes the stronger SHA-2 family.
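
Switching to SHA-2 is a one-word change. A rough sketch, assuming Python 3; sha256_hex is a hypothetical helper name of mine, not something hashlib provides:

```python
import hashlib

def sha256_hex(text, encoding='utf-8'):
    """Encode text with the given encoding and return its SHA-256 hex digest.

    Hypothetical helper for illustration only; hashlib itself supplies
    just hashlib.sha256().
    """
    return hashlib.sha256(text.encode(encoding)).hexdigest()

print(sha256_hex('André'))

# The same text under a different encoding is a different byte string,
# so it hashes to a different digest.
print(sha256_hex('André') == sha256_hex('André', 'cp1252'))
```

hashlib.sha224, hashlib.sha384 and hashlib.sha512 are used the same way.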



--
--Bryan
