New issue 2752: Incorrect results of intensive read() results passing to cpyext
I've noticed that my program, which hashes intensively using the
[pyblake2](https://github.com/dchest/pyblake2) extension, starts giving wrong
results at some point. I haven't been able to establish the exact cause,
but I've been able to create a [test
case](https://github.com/mgorny/pypy-blake2-testcase) that easily reproduces it.
The exact code is:
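import io
import pyblake2
# `sum` is the expected hex digest of test.txt
def sub(i):
    bufsize = 4096
    cs = pyblake2.blake2b()
    with io.open('test.txt', 'rb') as f:
        for block in iter(lambda: f.read(bufsize), b''):
            cs.update(block)
    assert cs.hexdigest() == sum, i
for x in map(sub, range(10000)):
    pass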
With this test case, PyPy reliably fails (generates an incorrect checksum) at
A few observations based on testing:
1. The issue affects PyPy2 only. PyPy3 and CPython work fine.
2. I can reproduce a similar problem with pyblake2 and pysha3, but not with e.g.
pycryptodome (which is also a C extension) or the built-in hash functions.
3. Some random changes to the code (e.g. replacing io.open() with open()) cause the
failing iteration number to change.
4. If I do a single `f.read()` instead of the loop, I can't get it to
fail (even with an increased iteration count).
5. If I do `f.read()` without an argument in a loop, it fails at iteration 852.
6. Changing `bufsize` and the iteration count also changes the result, with no
clear correlation. E.g. with a bufsize of 506000 and 10000 iterations, it doesn't
fail. With 100000 iterations, it fails at iteration 1488...
In other words, I really have no clue what might be happening here. I'm
attaching the test script and file for completeness.
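
One way to narrow down observations 4 to 6 might be a small harness that reports the first failing iteration for a given buffer size. This is only a sketch under assumptions, not the attached test script: the helper names (`reference_digest`, `first_failing_iteration`, `demo`) are made up for illustration, the reference digest is taken from a single `f.read()` on the assumption that this variant is reliable (as in observation 4), and `hashlib.sha256` plus a generated temporary file stand in for `pyblake2.blake2b` and `test.txt` so that the snippet runs anywhere.

```python
import hashlib
import io
import os
import tempfile


def reference_digest(path, make_hash):
    """Hash the whole file with a single read() (the variant from observation 4)."""
    cs = make_hash()
    with io.open(path, 'rb') as f:
        cs.update(f.read())
    return cs.hexdigest()


def first_failing_iteration(path, make_hash, bufsize, iterations):
    """Return the first iteration whose chunked digest differs, or None."""
    expected = reference_digest(path, make_hash)
    for i in range(iterations):
        cs = make_hash()
        with io.open(path, 'rb') as f:
            # Same chunked read loop as in the test case.
            for block in iter(lambda: f.read(bufsize), b''):
                cs.update(block)
        if cs.hexdigest() != expected:
            return i
    return None


def demo():
    # Hypothetical stand-in data; the report uses its own test.txt.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(b'x' * 100000)
        # hashlib.sha256 is a stand-in; pass pyblake2.blake2b to exercise cpyext.
        return [first_failing_iteration(path, hashlib.sha256, size, 20)
                for size in (512, 4096, 65536)]
    finally:
        os.remove(path)
```

Passing `pyblake2.blake2b` as `make_hash` and raising `iterations` should correspond to the failing setup. On an interpreter without the problem, every entry returned by `demo()` is `None`.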