mmaenpaa <[email protected]> added the comment: It seems that Pypy's jit has real problems when dealing with large number of regular expressions that are used often. CPython and Pypy without jit don't have problems and they seem to scale linearly when increasing number of regular expressions.
Steps to reproduce the slowdown when creating and using many regular expressions at the same time:

1. Create and compile n random regular expressions.
2. Create a predetermined number of random strings.
3. Measure the time it takes to test every generated string against every regular expression with re.match.

When testing with 20000 random strings, PyPy starts to slow down when dealing with more than 10 regular expressions. With more than 25 regular expressions, PyPy is actually faster with the JIT disabled.

The attached test program was run with PyPy 2.2.1 and CPython 2.7.3 on 64-bit Debian Wheezy. The JIT was disabled with the "--jit off" command line parameter.

$ python run-regexp-bug.py 20000
random strings: 20000

   n   pypy(s)  cpython(s)  pypy-nojit(s)
   1     0.006      0.010          0.020
   5     0.029      0.039          0.065
  10     0.054      0.078          0.125
  25     0.403      0.197          0.298
  50     0.789      0.391          0.616
  75     1.405      0.576          0.944
 100     2.049      0.763          1.193
 200     7.454      1.606          2.459
 300    18.311      2.313          3.816
 400    30.136      3.079          4.944
 500    44.634      3.849          6.282
 600    63.083      4.871          8.198
 700    79.475      5.488          8.837
 800    98.820      6.152         10.070
 900   122.998      7.306         11.248
1000   143.714      8.037         12.766
1100   171.839      8.479         13.758
1200   200.731      9.274         15.087

----------
nosy: +mmaenpaa
status: unread -> chatting

________________________________________
PyPy bug tracker <[email protected]>
<https://bugs.pypy.org/issue1347>
________________________________________
_______________________________________________
pypy-issue mailing list
[email protected]
https://mail.python.org/mailman/listinfo/pypy-issue
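For readers without the attachment, the three steps above can be sketched roughly as follows. This is a minimal reconstruction, not the actual run-regexp-bug.py: the helper names, the pattern shape (random lowercase literals), and the default counts are all illustrative assumptions.

```python
# Rough sketch of the reported benchmark (NOT the attached run-regexp-bug.py).
# Patterns here are random lowercase literals; the real script may generate
# richer regex syntax.
import random
import re
import string
import time


def random_pattern(length=8):
    # Illustrative helper: build a random literal pattern of lowercase letters.
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def benchmark(n_patterns, n_strings=20000):
    # Step 1: create and compile n random regular expressions.
    patterns = [re.compile(random_pattern()) for _ in range(n_patterns)]
    # Step 2: create a predetermined number of random strings.
    strings = [random_pattern() for _ in range(n_strings)]
    # Step 3: time matching every string against every pattern with re.match.
    start = time.time()
    for s in strings:
        for p in patterns:
            p.match(s)
    return time.time() - start


if __name__ == "__main__":
    for n in (1, 5, 10, 25, 50):
        print("%4d  %.3f" % (n, benchmark(n)))
```

Running this under PyPy with and without --jit off should show whether the per-pattern cost grows superlinearly once the pattern count passes a few dozen, as in the table above.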
