[issue23690] re functions never release GIL
STINNER Victor added the comment:

> Aren't Python strings immutable?

Yes. But the re module supports more types than just str and bytes. For example, bytearray is also accepted:

>>> re.match(b'^abc', b'abc')
<_sre.SRE_Match object; span=(0, 3), match=b'abc'>
>>> re.match(b'^abc', bytearray(b'abc'))
<_sre.SRE_Match object; span=(0, 3), match=b'abc'>

> Also, match functions still permit execution of signal handlers, which can execute any Python code.

Correct, signal handlers are called. If you mutate the string currently used in the pattern matching, you can probably crash Python. I hope that nobody does such ugly things in Python signal handlers :-)

> If GIL is needed during matching, can it be released temporarily to permit thread switching?

It's possible to modify the _sre module to release the GIL in some cases: release the GIL for immutable strings, and keep the GIL for mutable strings.

To do this, you have to audit the source code. First, ensure that no global variable is used. For example, the state must not be shared (it's ok: it's allocated on the stack, and thread stacks are not shared).

If you start to release the GIL, you have to search for all functions which must be called with the GIL held: memory allocators, for example, but also all functions manipulating Python objects. Hint: search for PyObject*. For example, getslice() must be called with the GIL held.

Since the GIL is a lock, you should benchmark to ensure that sequences of acquiring and releasing the GIL don't kill performance, both with a single thread and with multiple threads. Anyway, a benchmark will be needed.

To be clear: I'm *not* interested in optimizing the _sre module to release the GIL (to support parallel execution).

--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23690
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
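The benchmark asked for in this message could start from something like the sketch below. It is only a rough harness, not a rigorous benchmark: the pattern, the subject string, and the helper names (run_matches, time_sequential, time_threaded) are all assumptions made up for illustration. It times the same number of matches once in a single thread and once split across several threads.

```python
import re
import threading
import time

# Hypothetical workload: nested quantifiers make the failing match slow.
PATTERN = re.compile(r"(a+)+$")
SUBJECT = "a" * 18 + "b"


def run_matches(count):
    """Run the same slow, failing match `count` times."""
    for _ in range(count):
        PATTERN.match(SUBJECT)


def time_sequential(total):
    """Wall-clock time for `total` matches in the calling thread."""
    begin = time.perf_counter()
    run_matches(total)
    return time.perf_counter() - begin


def time_threaded(total, workers):
    """Wall-clock time for roughly `total` matches split over `workers` threads."""
    per_worker = max(1, total // workers)
    threads = [
        threading.Thread(target=run_matches, args=(per_worker,))
        for _ in range(workers)
    ]
    begin = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - begin


if __name__ == "__main__":
    print(f"1 thread : {time_sequential(8):.3f}s")
    print(f"4 threads: {time_threaded(8, 4):.3f}s")
```

If matching holds the GIL throughout, the two timings would be expected to come out close to each other. A modified _sre would need to show a gain in the threaded case without a regression in the single-threaded one.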
[issue23690] re functions never release GIL
Evgeny Kapun added the comment:

Aren't Python strings immutable? Also, match functions still permit execution of signal handlers, which can execute any Python code. If GIL is needed during matching, can it be released temporarily to permit thread switching?
[issue23690] re functions never release GIL
New submission from Evgeny Kapun:

It looks like the functions in the re module (match, fullmatch and so on) don't release the GIL, even though these operations can take a long time. As a result, other threads can't run while a pattern is being matched, and no thread switching happens either.

--
components: Regular Expressions
messages: 238316
nosy: abacabadabacaba, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re functions never release GIL
type: resource usage
versions: Python 3.4
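One way to observe the behaviour described in this report is sketched below. It is a minimal sketch: the helper name ticks_during_match, the backtracking-heavy pattern, and the 1 ms tick interval are assumptions chosen for illustration and do not come from the report. A background thread counts how often it gets to run while re.match is busy in the main thread.

```python
import re
import threading
import time


def ticks_during_match(pattern, subject, interval=0.001):
    """Count how often a background thread runs while re.match executes."""
    ticks = 0
    started = threading.Event()
    done = threading.Event()

    def ticker():
        nonlocal ticks
        started.set()
        while not done.is_set():
            ticks += 1
            time.sleep(interval)

    worker = threading.Thread(target=ticker)
    worker.start()
    started.wait()  # make sure the ticker is running before matching

    begin = time.perf_counter()
    result = re.match(pattern, subject)
    elapsed = time.perf_counter() - begin

    done.set()
    worker.join()
    return ticks, elapsed, result


if __name__ == "__main__":
    # Nested quantifiers force heavy backtracking on a subject that cannot match.
    ticks, elapsed, result = ticks_during_match(r"(a+)+$", "a" * 20 + "b")
    print(f"match took {elapsed:.3f}s; background thread ticked {ticks} times")
```

If the match holds the GIL for its whole duration, the tick count should stay far below elapsed / interval. The exact numbers depend on the interpreter build and the machine.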
[issue23690] re functions never release GIL
STINNER Victor added the comment:

Releasing the GIL would require redesigning the _sre module. For example, getstring() gets a view of a Python string; it doesn't copy the string. So we must hold the GIL, otherwise the Python string could be modified by other threads. Copying a very long string may be slower than just matching the pattern :-/

During the pattern matching, other Python functions are called, and these functions require the GIL to be held. Example: PyObject_Malloc().

--
nosy: +haypo
type: resource usage -> performance
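The view-versus-copy distinction can be shown from pure Python. This is only an analogy, using memoryview and bytearray as stand-ins for what getstring() does in C; the two helper names are made up for this sketch. A view sees later mutations of the underlying buffer, while a copy does not.

```python
def view_and_copy_after_mutation(data):
    """Return what a zero-copy view and a copy each see after a mutation."""
    buf = bytearray(data)
    view = memoryview(buf)   # zero-copy view of the mutable buffer
    snapshot = bytes(buf)    # independent copy, costs O(len(buf))

    buf[0] = ord("x")        # in-place mutation, allowed while the view exists

    seen_by_view = bytes(view)
    view.release()
    return seen_by_view, snapshot


def resize_blocked_while_viewed(data):
    """Return True if resizing a bytearray fails while a view is exported."""
    buf = bytearray(data)
    with memoryview(buf):
        try:
            buf.append(0)    # a resize would invalidate the exported buffer
        except BufferError:
            return True
    return False


if __name__ == "__main__":
    print(view_and_copy_after_mutation(b"abc"))
    print(resize_blocked_while_viewed(b"abc"))
```

The exported buffer protects against resizing but not against in-place writes, so a matcher working on a view of a bytearray could still see the data change underneath it if another thread were allowed to run.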