On Wed, Jan 31, 2018 at 6:21 AM, Peter Pearson <pkpearson@nowhere.invalid> wrote:
> On Tue, 30 Jan 2018 11:24:07 +0100, jak <ple...@nospam.tnx> wrote:
>>     with open(fname, "rb") as fh:
>>         for data in fh.read(m.block_size * blocks):
>>             m.update(data)
>>     return m.hexdigest()
>>
>
> I believe your "for data in fh.read" loop just reads the first block of
> the file and loops over the bytes in that block (calling m.update once
> for each byte, probably the least efficient approach imaginable),
> omitting the remainder of the file. That's why you start getting the
> right answer when the first block is big enough to encompass the whole
> file.
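For illustration, what that loop actually iterates over (a quick sketch,
not from the original thread): in Python 3, the bytes object returned by
read() iterates as integers, which hashlib's update() rejects with a
TypeError; in Python 2 it iterates as one-character strings, so at best
you hash the first block one character at a time.

>>> chunk = b"spam"   # stand-in for whatever fh.read(block_size) returned
>>> list(chunk)       # Python 3: iterating bytes yields ints, not bytes
[115, 112, 97, 109]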
Correct analysis. Generally, if you want to read a file in chunks, the
easiest way is this:

while "moar data":
    data = fh.read(block_size)
    if not data:
        break
    m.update(data)

That should get you the correct result regardless of your block size, and
then you can tweak the block size to toy with performance.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
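For the archive, a fuller self-contained version of that pattern (a sketch
only: hashlib.md5, the function name, and the 65536-byte default block size
are placeholder choices, not anything specified upthread):

import hashlib

def file_digest(fname, block_size=65536):
    """Hash a file in fixed-size chunks so memory use stays flat."""
    m = hashlib.md5()
    with open(fname, "rb") as fh:
        while True:                 # loop until read() returns b""
            data = fh.read(block_size)
            if not data:            # empty bytes means end of file
                break
            m.update(data)
    return m.hexdigest()

# e.g. print(file_digest("some_file.bin"))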