New issue 2752: Incorrect results of intensive read() results passing to cpyext
https://bitbucket.org/pypy/pypy/issues/2752/incorrect-results-of-intensive-read
Michał Górny:

I've noticed that my program, which hashes intensively using the [pyblake2](https://github.com/dchest/pyblake2) extension, starts giving wrong results at some point. I haven't been able to establish the exact cause, but I've been able to create a [test case](https://github.com/mgorny/pypy-blake2-testcase) that easily reproduces the problem. The exact code is:

```
#!python
import io
import pyblake2

bufsize = 4096
# Expected BLAKE2b hex digest of test.txt
sum = 'ccefcd101b08863339602f7fdf2edd1d77ef05a970c36dbd7a560d33f957f81b15cfcac10114f8fca0d7c318b6aaa294220e3fcf4f88e6e3bd7840f121ff3b65'

def sub(i):
    cs = pyblake2.blake2b()
    with io.open('test.txt', 'rb') as f:
        # Read the file in bufsize-byte blocks until EOF
        for block in iter(lambda: f.read(bufsize), b''):
            cs.update(block)
    # On failure, the assertion message is the iteration number
    assert cs.hexdigest() == sum, i

for x in map(sub, range(10000)):
    pass
```

With this test case, PyPy reliably fails (generates an incorrect checksum) at iteration 94.

A few observations based on testing:

1. The issue affects PyPy2 only. PyPy3 and CPython work fine.
2. I can reproduce a similar problem with pyblake2 and pysha3, but not with e.g. pycryptodome (which is also a C extension) or the builtin hash functions.
3. Some random changes to the code (e.g. replacing `io.open()` with `open()`) cause the failing iteration number to change.
4. If I do a single `f.read()` instead of the loop, I am not able to get it to fail (even with an increased iteration count).
5. If I do `f.read()` without an argument in a loop, it fails at iteration 852.
6. Changing `bufsize` and the iteration count also changes the result, with no clear correlation. E.g. with a bufsize of 506000 and 10000 iterations, it doesn't fail. With 100000 iterations, it fails at iteration 1488...

In other words, I really have no clue what might be happening here. I'm attaching the test script and file for completeness.

_______________________________________________
pypy-issue mailing list
pypy-issue@python.org
https://mail.python.org/mailman/listinfo/pypy-issue
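
Observations 2 and 4 suggest comparing hash implementations and read patterns side by side. The following is a minimal sketch of such a harness, not part of the original test case: `chunked_digest` and `first_mismatch` are hypothetical helper names, and using `hashlib.sha256` as the control hash is an assumption made here for illustration. Instead of hard-coding the expected checksum, it treats the digest of the first pass as the reference and reports the first later pass that disagrees.

```python
import hashlib
import io


def chunked_digest(path, make_hasher, bufsize=4096):
    """Hash the file at `path` in `bufsize`-byte blocks, as the test case does."""
    hasher = make_hasher()
    with io.open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read() until it returns b''
        for block in iter(lambda: f.read(bufsize), b''):
            hasher.update(block)
    return hasher.hexdigest()


def first_mismatch(path, make_hasher, iterations, bufsize=4096):
    """Return the first iteration whose digest differs from iteration 0.

    Returns None if all iterations agree.
    """
    reference = chunked_digest(path, make_hasher, bufsize)
    for i in range(1, iterations):
        if chunked_digest(path, make_hasher, bufsize) != reference:
            return i
    return None
```

A possible use would be to call `first_mismatch('test.txt', hashlib.sha256, 10000)` as a control and `first_mismatch('test.txt', pyblake2.blake2b, 10000)` for the extension under test, then vary `bufsize` to compare read patterns. If only the extension run reports a mismatch, that would point at the path between `read()` and the C extension rather than at the file contents.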