On 30 Mar 2007, Greg Perry wrote:

> Here's one that has me stumped.
>
> I am writing a forensic analysis tool that takes either a file or a
> directory as input, then calculates a hash digest based on the contents
> of each file.
>
> I have created an instance of the hashlib class:
>
> m = hashlib.md5()
>
> I then load in a file in binary mode:
>
> f = open("c:\python25\python.exe", "rb")
>
> According to the docs, the hashlib update function will update the hash
> object with the string arg. So:
>
> m.update(f.read())
> m.hexdigest()
>
> The md5 hash is not correct for the file.

Odd. It's correct for me:

In Python:

>>> import hashlib
>>> m = hashlib.md5()
>>> f = open("c:\python25\python.exe", "rb")
>>> m.update(f.read())
>>> m.hexdigest()
'7e7c8ae25d268636a3794f16c0c21d7c'

Now, check against the md5 as calculated by the md5sum utility:

>md5sum c:\Python25\python.exe
\7e7c8ae25d268636a3794f16c0c21d7c *c:\\Python25\\python.exe

> f.seek(0)
> hashlib.md5(f.read()).hexdigest()

No difference here:

>>> f.close()
>>> f = open("c:\python25\python.exe", "rb")
>>> hashlib.md5(f.read()).hexdigest()
'7e7c8ae25d268636a3794f16c0c21d7c'

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
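
For a tool that hashes many files, some of them large, it may be worth feeding the file to the hash object in pieces rather than with a single f.read(). The following is only a minimal sketch of that idea, not anything taken from the original tool: the function name file_md5 and the chunk_size parameter are made up for illustration, and it is written for a current Python rather than 2.5.

```python
import hashlib

def file_md5(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    `file_md5` and `chunk_size` are hypothetical names for this sketch.
    """
    digest = hashlib.md5()
    # Open in binary mode, as in the thread, so the bytes are hashed unmodified.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            # Successive update() calls hash the concatenation of the chunks.
            digest.update(chunk)
    return digest.hexdigest()
```

Calling file_md5(r"c:\python25\python.exe") should give the same digest as hashing f.read() in one go, since repeated update() calls are equivalent to a single call on the concatenated data. The raw string (r"...") is used so the backslashes in the path are never treated as escape sequences. Because the function opens the file itself, it always starts reading at the beginning; a file object that has already been read to the end returns nothing more from read(), and hashing that would give the digest of empty input instead.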