New issue 2752: Incorrect results of intensive read() results passing to cpyext
https://bitbucket.org/pypy/pypy/issues/2752/incorrect-results-of-intensive-read

Michał Górny:

I've noticed that my program, which hashes intensively using the 
[pyblake2](https://github.com/dchest/pyblake2) extension, starts giving wrong 
results at some point. I haven't been able to establish the exact cause, 
but I've been able to create a [test 
case](https://github.com/mgorny/pypy-blake2-testcase) that easily reproduces 
the problem.

The exact code is:


```
#!python

import io
import pyblake2

bufsize = 4096
sum = 'ccefcd101b08863339602f7fdf2edd1d77ef05a970c36dbd7a560d33f957f81b15cfcac10114f8fca0d7c318b6aaa294220e3fcf4f88e6e3bd7840f121ff3b65'

def sub(i):
    cs = pyblake2.blake2b()
    with io.open('test.txt', 'rb') as f:
        for block in iter(lambda: f.read(bufsize), b''):
            cs.update(block)
    assert cs.hexdigest() == sum, i

for x in map(sub, range(10000)):
    pass
```

With this test case, PyPy reliably fails (generates an incorrect checksum) at 
iteration 94.
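To report the failing iteration without relying on the assertion traceback, the loop can be wrapped in a small helper. This is only a minimal sketch and not part of the attached test case: the names `chunked_hexdigest` and `first_failing_iteration` are hypothetical, and `hashlib.sha256` appears in the usage note purely as a stand-in so the snippet runs without pyblake2 installed.

```python
import io


def chunked_hexdigest(path, make_hash, bufsize=4096):
    """Hash the file at `path`, feeding it to the hash in bufsize-byte blocks."""
    cs = make_hash()
    with io.open(path, 'rb') as f:
        # Same read loop as in the test case: read until an empty block.
        for block in iter(lambda: f.read(bufsize), b''):
            cs.update(block)
    return cs.hexdigest()


def first_failing_iteration(path, make_hash, expected, iterations=10000):
    """Return the first iteration whose digest differs from `expected`, or None."""
    for i in range(iterations):
        if chunked_hexdigest(path, make_hash) != expected:
            return i
    return None
```

With pyblake2 available, `first_failing_iteration('test.txt', pyblake2.blake2b, sum)` would correspond to the loop above; passing `hashlib.sha256` with a matching expected digest exercises the same code path with a built-in hash.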

A few observations based on testing:
1. The issue affects PyPy2 only. PyPy3 and CPython work fine.
2. I can reproduce a similar problem with pyblake2 and pysha3, but not with 
e.g. pycryptodome (which is also a C extension) or the built-in hash functions.
3. Some random changes to the code (e.g. replacing io.open() with open()) cause 
the failing iteration number to change.
4. If I do a single `f.read()` instead of the loop, I am not able to get it to 
fail (even with an increased iteration count).
5. If I do `f.read()` without argument in a loop, it fails at iteration 852.
6. Changing `bufsize` and the iteration count also changes the result, with no 
clear correlation. E.g. with a `bufsize` of 506000 and 10000 iterations, it 
doesn't fail. With 100000 iterations, it fails at iteration 1488...
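
For reference, the three read strategies compared in observations 4 to 6 can be sketched roughly as follows. This is an illustration under assumptions rather than the attached script: the function names are hypothetical, and `make_hash` is any hash constructor (e.g. `pyblake2.blake2b`, or `hashlib.sha256` as a stand-in when the extension is not installed).

```python
import io


def digest_sized_loop(path, make_hash, bufsize=4096):
    """Original test case: fixed-size reads in a loop until EOF."""
    cs = make_hash()
    with io.open(path, 'rb') as f:
        for block in iter(lambda: f.read(bufsize), b''):
            cs.update(block)
    return cs.hexdigest()


def digest_single_read(path, make_hash):
    """Observation 4: one f.read() call and one update(), no loop."""
    cs = make_hash()
    with io.open(path, 'rb') as f:
        cs.update(f.read())
    return cs.hexdigest()


def digest_unsized_loop(path, make_hash):
    """Observation 5: f.read() without an argument, still inside the loop."""
    cs = make_hash()
    with io.open(path, 'rb') as f:
        # The first read returns the whole file, the second returns b''.
        for block in iter(lambda: f.read(), b''):
            cs.update(block)
    return cs.hexdigest()
```

All three functions should produce the same digest for the same file; on an affected interpreter, only the looped variants would be expected to diverge after enough iterations.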

In other words, I really have no clue what might be happening here. I'm 
attaching the test script and file for completeness.


