[issue23690] re functions never release GIL

2015-03-17 Thread STINNER Victor

STINNER Victor added the comment:

> Aren't Python strings immutable?

Yes. But the re module supports more types than just str and bytes. For 
example, bytearray is also accepted:

>>> re.match(b'^abc', b'abc')
<_sre.SRE_Match object; span=(0, 3), match=b'abc'>
>>> re.match(b'^abc', bytearray(b'abc'))
<_sre.SRE_Match object; span=(0, 3), match=b'abc'>
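
To make the mutability point concrete, here is a minimal sketch using only the
public re API. The variable names are illustrative and not from the original
message; it shows that the same bytes pattern accepts both buffer types, but
only the bytearray can change underneath the matcher.

```python
import re

pattern = re.compile(b"^abc")

immutable = b"abc"
mutable = bytearray(b"abc")

# Both buffer types are accepted by the same bytes pattern.
match_on_bytes = pattern.match(immutable)
match_on_bytearray = pattern.match(mutable)

# A bytearray can be changed in place, so the buffer the matcher reads
# is not guaranteed to stay the same from one moment to the next.
mutable[0] = ord("x")
result_after_mutation = pattern.match(mutable)  # buffer now starts with b"x"

# bytes offers no in-place mutation: item assignment raises TypeError.
try:
    immutable[0] = ord("x")
    bytes_is_mutable = True
except TypeError:
    bytes_is_mutable = False
```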

> Also, match functions still permit execution of signal handlers, which can 
> execute any Python code.

Correct, signal handlers are called. If you mutate the string currently used in 
the pattern matching, you can probably crash Python. I hope that nobody does 
such ugly things in Python signal handlers :-)

> If GIL is needed during matching, can it be released temporarily to permit 
> thread switching?

It's possible to modify the _sre module to release the GIL in some cases: 
release the GIL for immutable strings, and keep the GIL for mutable strings. 
To do this, you have to audit the source code. First, ensure that no global 
variable is used. For example, the state must not be shared (it's ok: it's 
allocated on the stack, and thread stacks are not shared).
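
The immutable/mutable split could be expressed as a simple type check. The
sketch below is purely illustrative: could_release_gil() is a hypothetical
helper, not part of _sre or of any CPython API, and it is deliberately
conservative (exact str and bytes only).

```python
def could_release_gil(subject):
    """Hypothetical policy check; not part of _sre or any CPython API."""
    # Exact type checks on purpose: a subclass of str or bytes could carry
    # extra behaviour, so this sketch conservatively excludes subclasses.
    return type(subject) in (str, bytes)


examples = ["abc", b"abc", bytearray(b"abc"), memoryview(b"abc")]
decisions = [could_release_gil(obj) for obj in examples]
# decisions == [True, True, False, False]
```

The memoryview is rejected even though it wraps a bytes object here, because
the check has no cheap way to know what a view refers to in general.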

If you start to release the GIL, you have to search for all functions which 
must be called with the GIL held. For example, memory allocators, but also all 
functions manipulating Python objects. Hint: search for PyObject*. For example, 
getslice() must be called with the GIL held.

Since the GIL is a lock, you should benchmark to ensure that repeatedly 
acquiring and releasing the GIL doesn't kill performance, both with a single 
thread and with multiple threads. Either way, a benchmark will be needed.
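
A benchmark of the kind described above might start from something like the
following sketch. The pattern, subject string and function names are
assumptions chosen for illustration; the idea is only to compare the
wall-clock time of the same matching work run sequentially and run in
several threads.

```python
import re
import threading
import time

PATTERN = re.compile(r"(a+)+$")
SUBJECT = "a" * 16 + "b"  # forces backtracking, but still finishes quickly


def work(repeat):
    # The match always fails; only the time spent in the matcher matters.
    for _ in range(repeat):
        PATTERN.match(SUBJECT)


def time_sequential(jobs, repeat):
    start = time.perf_counter()
    for _ in range(jobs):
        work(repeat)
    return time.perf_counter() - start


def time_threaded(jobs, repeat):
    threads = [threading.Thread(target=work, args=(repeat,))
               for _ in range(jobs)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", time_sequential(4, 20))
    print("threaded:  ", time_threaded(4, 20))
```

If the matcher keeps the GIL for the whole call, the two timings would be
expected to be close to each other; any change to _sre would need to show an
improvement in the threaded case without a regression in the sequential one.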

To be clear: I'm *not* interested in optimizing the _sre module to release the 
GIL (to support parallel execution).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23690
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23690] re functions never release GIL

2015-03-17 Thread Evgeny Kapun

Evgeny Kapun added the comment:

Aren't Python strings immutable?

Also, match functions still permit execution of signal handlers, which can 
execute any Python code.

If GIL is needed during matching, can it be released temporarily to permit 
thread switching?




[issue23690] re functions never release GIL

2015-03-17 Thread Evgeny Kapun

New submission from Evgeny Kapun:

It looks like the functions in the re module (match, fullmatch and so on) don't 
release the GIL, even though these operations can take a long time. As a 
result, other threads can't run while a pattern is being matched, and thread 
switching doesn't happen either.
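
One way to observe the reported behaviour is sketched below, using only the
standard library. The function name, the pattern and the default size are
assumptions for illustration: a background thread tries to tick every
millisecond while the calling thread runs a match that backtracks heavily
before failing.

```python
import re
import threading
import time


def ticks_during_match(n=20):
    """Run a slow match while a background thread tries to tick."""
    ticks = 0
    stop = threading.Event()

    def ticker():
        nonlocal ticks
        while not stop.is_set():
            ticks += 1
            time.sleep(0.001)

    worker = threading.Thread(target=ticker)
    worker.start()
    time.sleep(0.05)  # give the ticker time to start running
    before = ticks

    start = time.perf_counter()
    # Nested quantifiers plus a trailing "b" force exponential backtracking.
    result = re.match(r"(a+)+$", "a" * n + "b")
    elapsed = time.perf_counter() - start
    during = ticks - before

    stop.set()
    worker.join()
    return result, elapsed, during


if __name__ == "__main__":
    result, elapsed, during = ticks_during_match()
    print("match:", result, "seconds:", elapsed, "ticks during match:", during)
```

If the GIL is held for the whole call, the tick count during the match should
stay near zero even as the elapsed time grows with n.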

--
components: Regular Expressions
messages: 238316
nosy: abacabadabacaba, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re functions never release GIL
type: resource usage
versions: Python 3.4




[issue23690] re functions never release GIL

2015-03-17 Thread STINNER Victor

STINNER Victor added the comment:

Supporting GIL release would require redesigning the _sre module.

For example, getstring() gets a view of a Python string; it doesn't copy the 
string. So we must hold the GIL, otherwise the Python string could be modified 
by other threads. Copying a very long string may be slower than just matching 
the pattern :-/
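
The cost trade-off between copying and matching can be measured roughly with
a sketch like the one below. The function name, buffer size and pattern are
assumptions for illustration; it times a full copy of a large bytearray
against a match that can fail after looking at the first byte.

```python
import re
import time


def copy_vs_match(size=10_000_000):
    buf = bytearray(b"x" * size)
    pattern = re.compile(b"^abc")

    # Cost of taking an immutable snapshot of the whole buffer.
    start = time.perf_counter()
    snapshot = bytes(buf)
    copy_time = time.perf_counter() - start

    # Cost of matching directly against the original buffer.
    start = time.perf_counter()
    result = pattern.match(buf)
    match_time = time.perf_counter() - start

    return len(snapshot), result, copy_time, match_time


if __name__ == "__main__":
    print(copy_vs_match())
```

The copy always touches every byte, while the match may stop early, so a
copy-then-release strategy would not pay off for every pattern and input.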

During pattern matching, other Python functions are called, and these functions 
require the GIL to be held. Example: PyObject_Malloc().

--
nosy: +haypo
type: resource usage -> performance
