On Thu, May 22, 2008 at 12:14 PM, GTXY20 <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> I will be dealing with an address list where I might have the following:
>
> Name    SSN
> John    111111111
> John    111111111
> Jane    222222222
> Jill    333333333
>
> What I need to do is parse the address list and then create a unique random
> unidentifiable value for the SSN field like so:
>
> Name    SSNrandomvalue
> John    1a1b1c1d1
> John    1a1b1c1d1
> Jane    2a2b2c2d2
> Jill    3a3b3c3d3
>
> The unique random value does not have to follow this convention but it needs
> to be unique so that I can relate it back to the original SSN when needed.
> As opposed to using the random module I was thinking that it would be better
> to use either sha or md5. Just curious as to thoughts on the correct
> approach.
>
> Thank you in advance.
>
> G.

Both SHA and MD5 are intended to be one-way functions: you cannot recover the
argument from the digest they return. For example (taken from
http://www.python.org/doc/current/lib/module-hashlib.html; the b prefix is
needed on Python 3, where hashlib accepts only bytes):

    >>> import hashlib
    >>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
    'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

There's no way to take the value 'a4337...' and return "Nobody insp..",
because infinitely many possible input strings have to map into the 224-bit
output space that sha224 provides, so the digest alone does not determine
which input produced it.

If you want to be able to recover the SSN, you should probably look at
cryptography. Here's a link that might interest you:
http://www.amk.ca/python/code/crypto.html

Tony R.
aka Taser
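
One way to get a stable, unique token per SSN while still being able to relate it back is sketched below. This is not the encryption approach suggested above: it swaps in a keyed hash (HMAC-SHA256, standard library only) plus a lookup table kept on the side. The key matters because there are only 10^9 possible nine-digit SSNs, so an unkeyed sha or md5 digest could be matched by hashing every candidate. The names SECRET_KEY, pseudonymize and build_tables, and the truncation to 16 hex characters, are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key. In real use it would be generated once and stored
# somewhere safe, otherwise the tokens change on every run.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize(ssn, key=SECRET_KEY):
    """Return a keyed digest of the SSN, truncated to 16 hex characters."""
    digest = hmac.new(key, ssn.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:16]


def build_tables(rows, key=SECRET_KEY):
    """Return (masked_rows, token_to_ssn) for an iterable of (name, ssn) pairs."""
    masked_rows = []
    token_to_ssn = {}
    for name, ssn in rows:
        token = pseudonymize(ssn, key)
        # The same SSN always yields the same token; refuse to continue if two
        # different SSNs ever collide on the truncated token.
        if token_to_ssn.setdefault(token, ssn) != ssn:
            raise ValueError("token collision; use a longer token")
        masked_rows.append((name, token))
    return masked_rows, token_to_ssn


rows = [
    ("John", "111111111"),
    ("John", "111111111"),
    ("Jane", "222222222"),
    ("Jill", "333333333"),
]
masked_rows, token_to_ssn = build_tables(rows)
for name, token in masked_rows:
    print(name, token)
```

Both John rows receive the same token, and token_to_ssn[token] gives back the original SSN, so that table (and the key) would need to be protected as carefully as the SSNs themselves.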