New issue 2240: file.__iter__() for pipes slower under PyPy 4.0.1 than CPython 2.7.10
https://bitbucket.org/pypy/pypy/issues/2240/file__iter__-for-pipes-slower-under-pypy
Richard Barrell:

When running `for line in f: …`, where `f` is a pipe such as a subprocess.Popen object's stdout, CPython 2.7.10 reads 10 KiB of data with each read(2) syscall, but PyPy 4.0.1 reads only 1 byte with each read(2) syscall. As a result, PyPy 4.0.1 is about 40-50 times slower than CPython 2.7.10 in this use case.

To work around this in my program, I resorted to implementing an equivalent of file.__iter__() manually in pure Python. :(

```
$ # testing with CPython 2.7.10 first:
$ python --version
Python 2.7.10
$ python lines.py generate
created textfile.txt
$ sha256sum lines.py
61d35907ce030172fdfe8c06c48b8ad475b571e6f1c345195381aa7195e173ed  lines.py
$ time python lines.py file_iter
('there are', 1048576, 'lines')
took 124.87ms with (file.__iter__)

real    0m0.139s
user    0m0.131s
sys     0m0.022s
$ time python lines.py to_lines
('there are', 1048576, 'lines')
took 433.11ms with to_lines()

real    0m0.451s
user    0m0.444s
sys     0m0.021s

$ # now testing with pypy 4.0.1:
$ pypy --version
Python 2.7.10 (5f8302b8bf9f53056e40426f10c72151564e5b19, Jan 15 2016, 18:28:10)
[PyPy 4.0.1 with GCC 4.9.3]
$ pypy lines.py generate
created textfile.txt
$ sha256sum lines.py
61d35907ce030172fdfe8c06c48b8ad475b571e6f1c345195381aa7195e173ed  lines.py
$ time pypy lines.py file_iter
('there are', 1048576, 'lines')
took 5583.92ms with (file.__iter__)

real    0m5.741s
user    0m1.603s
sys     0m4.229s
$ time pypy lines.py to_lines
('there are', 1048576, 'lines')
took 94.72ms with to_lines()

real    0m0.242s
user    0m0.207s
sys     0m0.049s
```

_______________________________________________
pypy-issue mailing list
pypy-issue@python.org
https://mail.python.org/mailman/listinfo/pypy-issue
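For readers who have not seen lines.py: the report does not include the source of `to_lines()`, so the following is only a guess at the kind of pure-Python workaround described, not the reporter's actual code. It is a minimal sketch that assumes the pipe exposes `fileno()`, that lines are separated by `\n`, and that a 10 KiB chunk size (chosen to mirror the CPython behaviour quoted in the report) is acceptable. The helper `count_lines_from_child` is hypothetical and exists only to exercise the generator.

```python
import os
import subprocess
import sys


def to_lines(pipe, chunk_size=10 * 1024):
    """Yield lines (as bytes, newline included) from a pipe.

    Reads with os.read() in chunks of up to chunk_size bytes, so each
    read(2) syscall can return many lines instead of a single byte.
    Assumption: nothing else reads from `pipe` through its buffered API.
    """
    fd = pipe.fileno()
    pending = b""  # bytes received so far that do not yet end in a newline
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:  # empty read means EOF
            break
        pending += chunk
        lines = pending.split(b"\n")
        pending = lines.pop()  # last piece is an incomplete line (maybe empty)
        for line in lines:
            yield line + b"\n"
    if pending:  # final line without a trailing newline
        yield pending


def count_lines_from_child(n):
    """Hypothetical demo: spawn a child that prints n lines and count them."""
    child_code = "import sys\nfor i in range(int(sys.argv[1])): print(i)"
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code, str(n)],
        stdout=subprocess.PIPE,
    )
    count = sum(1 for _ in to_lines(proc.stdout))
    proc.stdout.close()
    proc.wait()
    return count
```

Because the generator reads straight from the file descriptor, it should not be mixed with calls such as `pipe.readline()` on the same object; any data already held in the file object's own buffer would be skipped.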