New issue 2777: re: incorrect behaviour for long patterns that are used repeatedly (possible JIT bug?)
https://bitbucket.org/pypy/pypy/issues/2777/re-incorrect-behaviour-for-long-patterns

Andrew Stepanov:

I've observed that the `re` module gives incorrect results for very long patterns that are used repeatedly (possibly a JIT bug?).

The following code raises an error on both pypy2 and pypy3 (latest version from mercurial), although it passes on CPython 3.5:

```
import re

pattern = ".a" * 2500
text = "a" * 6000
match = re.compile(pattern).match
for idx in range(len(text) - len(pattern) + 1):
    substr = text[idx:idx+len(pattern)]
    if match(substr) is None:
        raise RuntimeError("This shouldn't have happened at {}".format(idx))
```

```
Traceback (most recent call last):
  File "pypy_re_bug.py", line 9, in <module>
    raise RuntimeError("This shouldn't have happened at {}".format(idx))
RuntimeError: This shouldn't have happened at 632
```

This also happens for other long patterns (I tried `pattern = "." * 5000`, `pattern = "a" * 5000` and random strings from the `{".", "a"}` alphabet of lengths >= 5000).

The exact number of iterations before the error occurs can vary slightly, e.g. if I move `match = re.compile(pattern).match` inside the loop, I get the exception at iteration 668 on pypy3 and 643 on pypy2. The code works fine for shorter patterns.
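
For anyone who wants to check the other pattern variants mentioned above in one run, here is a minimal sketch. The helper names `first_failing_offset` and `candidate_patterns`, the fixed random seed, and the reuse of the 6000-character text for every variant are my own assumptions for illustration, not part of the original script. Every pattern is built only from `.` and `a`, so each pattern character should match exactly one `a` and every window should match.

```python
import random
import re


def first_failing_offset(pattern, text):
    """Return the first window offset where `pattern` fails to match, or None.

    Hypothetical helper: slides a window of len(pattern) over `text`,
    the same way the loop in the report does.
    """
    match = re.compile(pattern).match
    width = len(pattern)
    for idx in range(len(text) - width + 1):
        if match(text[idx:idx + width]) is None:
            return idx
    return None


def candidate_patterns(length=5000, seed=0):
    """Build the pattern variants from the report (the seed is an assumption)."""
    rng = random.Random(seed)
    return {
        '".a" repeated': ".a" * (length // 2),
        '"." repeated': "." * length,
        '"a" repeated': "a" * length,
        'random over {".", "a"}': "".join(rng.choice(".a") for _ in range(length)),
    }


if __name__ == "__main__":
    text = "a" * 6000
    for label, pattern in candidate_patterns().items():
        # None means every window matched; an integer is the first bad offset.
        print(label, "->", first_failing_offset(pattern, text))
```

On an interpreter without the bug, every line should report `None`; an integer would be the first offset at which the match unexpectedly failed.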