On Wed, Jan 31, 2018 at 6:21 AM, Peter Pearson
<[email protected]> wrote:
> On Tue, 30 Jan 2018 11:24:07 +0100, jak <[email protected]> wrote:
>>     with open(fname, "rb") as fh:
>>         for data in fh.read(m.block_size * blocks):
>>             m.update(data)
>>         return m.hexdigest()
>>
>
> I believe your "for data in fh.read" loop just reads the first block of
> the file and loops over the bytes in that block (calling m.update once
> for each byte, probably the least efficient approach imaginable),
> omitting the remainder of the file. That's why you start getting the
> right answer when the first block is big enough to encompass the whole
> file.
Correct analysis.
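Concretely: fh.read(n) returns ONE bytes object of up to n bytes, and
iterating over that object walks the individual bytes inside it; the rest
of the file never gets read. A quick self-contained illustration (using
io.BytesIO as a stand-in for a real file handle, so it runs as-is):

import io

fh = io.BytesIO(b"abcdef" * 10)   # stand-in for an open binary file
chunk = fh.read(16)               # one read() call -> a single bytes object
for piece in chunk:               # loops over the bytes inside that chunk,
    pass                          # not over further chunks of the file
print(len(chunk), fh.tell())      # 16 16 -- the remaining bytes were never read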
Generally, if you want to read a file in chunks, the easiest way is this:
while "moar data":
data = fh.read(block_size)
if not data: break
m.update(data)
That should get you the correct result regardless of your block size,
and then you can tweak the block size to toy with performance.
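Putting it together, here's a minimal sketch of the whole function with
that loop dropped in (hashlib.md5 and the name hash_file are my
assumptions; substitute whatever digest and signature you're actually
using):

import hashlib

def hash_file(fname, blocks=128):
    # hashlib.md5 assumed here; any hashlib constructor works the same way
    m = hashlib.md5()
    block_size = m.block_size * blocks
    with open(fname, "rb") as fh:
        while "moar data":
            data = fh.read(block_size)
            if not data:
                break
            m.update(data)
    return m.hexdigest()

The digest now covers the whole file no matter how big it is, and
block_size only changes how many read() calls that takes.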
ChrisA