Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-08 Thread Antoine Pitrou
On Thu, 07 Jan 2010 22:11:36 +0100, Martin v. Löwis wrote:
>
> Even if we do use the new API, and correctly, it still might be
> confusing if the contents of the buffer change underneath.

Well, no more confusing than when you compute a SHA1 hash or zlib-
compress the buffer, is it?

Regards

Antoine


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-08 Thread Guido van Rossum
On Fri, Jan 8, 2010 at 6:27 AM, Antoine Pitrou solip...@pitrou.net wrote:
> On Thu, 07 Jan 2010 22:11:36 +0100, Martin v. Löwis wrote:
>
>> Even if we do use the new API, and correctly, it still might be
>> confusing if the contents of the buffer change underneath.
>
> Well, no more confusing than when you compute a SHA1 hash or zlib-
> compress the buffer, is it?

That depends. Algorithms that make exactly one pass over the buffer
will run fine (maybe producing a meaningless result). But the regex
matcher may scan the buffer repeatedly (for backtracking purposes) and
it would take considerable analysis to prove that this cannot mess up its
internal data structures if the data underneath changes. (I give it a
decent chance that it's fine, but since it was written without ever
considering this possibility I'm not 100% sure.)
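
One way to sidestep the multi-pass concern, sketched here purely as an illustration and not as what the regex module does: match against a private immutable copy, so a later change to the original buffer cannot affect a scan that backtracks. `snapshot_search` is a hypothetical helper name, and the copy costs O(n) time and memory.

```python
import re


def snapshot_search(pattern, buf):
    """Search a private, immutable copy of buf (hypothetical helper)."""
    data = bytes(buf)  # the copy is taken once, before matching starts
    return re.search(pattern, data)


buf = bytearray(b"aaaab")
# "(a+)+b" makes the matcher revisit the same bytes while backtracking.
match = snapshot_search(rb"(a+)+b", buf)

# Mutating the original afterwards does not disturb the result,
# because the match refers to the snapshot, not to buf.
buf[:] = b"zzzzz"
```

The snapshot is an ordinary bytes object, so it falls into the "immutable string" case discussed elsewhere in the thread.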

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Stefan Behnel

Guido van Rossum, 07.01.2010 05:29:

> A better rule would be: you may access the memory buffer in a PyString
> or PyUnicode object with the GIL released as long as you own a
> reference to the string object. Everything else is out of bounds (or
> not worth the bother).


Is that a yes regarding the OP's original question about releasing the 
GIL during regexp searches?


Stefan



Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Antoine Pitrou
MRAB python at mrabarnett.plus.com writes:
 
> I know that it needs to have the GIL during memory-management calls, but
> does it for calls like Py_UNICODE_TOLOWER or PyErr_SetString? Is there
> an easy way to find out?

There is no easy way to do so. The only safe way is to examine all the
functions or macros you want to call with the GIL released, and assess whether
it is safe to call them. As already pointed out, no reference count should be
changed, and generally no mutable container should be accessed, except if that
container is known not to be referenced anywhere else (that would be the case
for e.g. a list that your function has created and is busy populating).

I agree that releasing the GIL during non-trivial regex searches is worth
researching, so please don't give up immediately :-)

Regards

Antoine Pitrou.




Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Martin v. Löwis
>> A better rule would be: you may access the memory buffer in a PyString
>> or PyUnicode object with the GIL released as long as you own a
>> reference to the string object. Everything else is out of bounds (or
>> not worth the bother).
>
> Is that a yes regarding the OP's original question about releasing the
> GIL during regexp searches?

No, because the regex engine may also operate on buffers that start
moving around when you release the GIL.

Regards,
Martin


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Martin v. Löwis
> I've been wondering whether it's possible to release the GIL in the
> regex engine during matching.

I don't think that's possible. The regex engine can also operate on
objects whose representation may move in memory when you don't hold
the GIL (e.g. buffers that get mutated). Even if they stay in place -
if their contents change, regex results may be confusing.

Regards,
Martin


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread James Y Knight

On Jan 7, 2010, at 3:27 PM, Martin v. Löwis wrote:


>> I've been wondering whether it's possible to release the GIL in the
>> regex engine during matching.
>
> I don't think that's possible. The regex engine can also operate on
> objects whose representation may move in memory when you don't hold
> the GIL (e.g. buffers that get mutated). Even if they stay in place -
> if their contents change, regex results may be confusing.


It is probably worthwhile to optimize for the common case of using
the regexp engine on an immutable object of type str or bytes, and
allow releasing the GIL in *that* case, even if you have to keep it
for the general case.
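
A rough sketch of that dispatch, written in Python only to show the decision. The real check would live in the C engine; `may_release_gil` and `search` are hypothetical names, and testing for the exact types (rather than subclasses) is a deliberately conservative assumption.

```python
import re


def may_release_gil(subject):
    """Hypothetical policy: only exact str/bytes are known to be immutable."""
    return type(subject) in (str, bytes)


def search(pattern, subject):
    """Report which path a C engine could take, then run the normal search."""
    path = "gil-released" if may_release_gil(subject) else "gil-held"
    return path, re.search(pattern, subject)


fast_path, fast_match = search(r"b+", "abbc")
slow_path, slow_match = search(rb"b+", bytearray(b"abbc"))
```

Mutable buffers such as bytearray simply stay on the existing path, so nothing changes for the general case.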


James


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Martin v. Löwis
>>> I've been wondering whether it's possible to release the GIL in the
>>> regex engine during matching.
>>
>> I don't think that's possible. The regex engine can also operate on
>> objects whose representation may move in memory when you don't hold
>> the GIL (e.g. buffers that get mutated). Even if they stay in place -
>> if their contents change, regex results may be confusing.
>
> It is probably worthwhile to optimize for the common case of using
> the regexp engine on an immutable object of type str or bytes, and
> allow releasing the GIL in *that* case, even if you have to keep it for
> the general case.

Right. This problem was the one that I thought of first.

Thinking about these things is fairly difficult (for me, at least), so
I think I could only tell whether I consider a patch that releases the
GIL around matching under selected circumstances to be thread-safe if I
had the patch available. I don't see any obvious reason why it could not
work (assuming Guido's list of conditions holds, i.e. you are holding
references to everything you access).

Regards,
Martin


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Antoine Pitrou
Martin v. Löwis martin at v.loewis.de writes:
 
> I don't think that's possible. The regex engine can also operate on
> objects whose representation may move in memory when you don't hold
> the GIL (e.g. buffers that get mutated).

Why is it a problem? If we get a buffer through the new buffer API, the object
should ensure that the representation isn't moved away until the buffer is 
released.
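
This guarantee is visible from Python with bytearray and memoryview. The snippet below is only an illustration of the buffer-export behaviour, not of what the regex engine does: while a view is exported, an operation that would resize (and possibly move) the storage is refused with BufferError.

```python
buf = bytearray(b"spam")
view = memoryview(buf)  # acquires the buffer through the new buffer API

try:
    buf.extend(b" and eggs")  # would resize, so the storage could move
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()            # give the buffer back
buf.extend(b" and eggs")  # resizing is allowed again
```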

Regards

Antoine.




Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Martin v. Löwis
> I've been wondering whether it's possible to release the GIL in the
> regex engine during matching.

Ok, here is another problem: SRE_OP_REPEAT uses PyObject_MALLOC,
which requires the GIL (it then also may call PyErr_NoMemory,
which also requires the GIL).

Regards,
Martin



Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-07 Thread Martin v. Löwis
>> I don't think that's possible. The regex engine can also operate on
>> objects whose representation may move in memory when you don't hold
>> the GIL (e.g. buffers that get mutated).
>
> Why is it a problem? If we get a buffer through the new buffer API, the object
> should ensure that the representation isn't moved away until the buffer is
> released.

In 2.7, we currently get the buffer with bf_getreadbuffer. In 3.x, we have

/* Release the buffer immediately --- possibly dangerous
   but doing something else would require some re-factoring
*/
PyBuffer_Release(&view);


Even if we do use the new API, and correctly, it still might be
confusing if the contents of the buffer change underneath.
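
A small illustration of that last point, using bytearray and memoryview from Python rather than anything in the regex engine: holding the buffer pins its size, but in-place writes still go through, so a reader sees different bytes at different times.

```python
buf = bytearray(b"spam")
view = memoryview(buf)  # buffer is held; the storage cannot be resized

before = bytes(view)    # what a first pass over the buffer would see
buf[0] = ord("S")       # in-place write: no resize, so it is allowed
after = bytes(view)     # what a second pass over the same view would see

view.release()
```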

Regards,
Martin


[Python-Dev] GIL required for _all_ Python calls?

2010-01-06 Thread MRAB

Hi,

I've been wondering whether it's possible to release the GIL in the
regex engine during matching.

I know that it needs to have the GIL during memory-management calls, but
does it for calls like Py_UNICODE_TOLOWER or PyErr_SetString? Is there
an easy way to find out? Or is it just a case of checking the source
files for mentions of the GIL? The header file for PyList_New, for
example, doesn't mention it!

Thanks


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-06 Thread John Arbash Meinel
MRAB wrote:
> Hi,
>
> I've been wondering whether it's possible to release the GIL in the
> regex engine during matching.
>
> I know that it needs to have the GIL during memory-management calls, but
> does it for calls like Py_UNICODE_TOLOWER or PyErr_SetString? Is there
> an easy way to find out? Or is it just a case of checking the source
> files for mentions of the GIL? The header file for PyList_New, for
> example, doesn't mention it!
>
> Thanks

Anything that calls Py_INCREF or Py_DECREF must hold the GIL, or you may
get concurrent updates of the value, and then the final value is wrong
(two threads each compute 5+1 and store 6, rather than 7, and when they
decref, you end up at 4 rather than back at 5).
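
The arithmetic can be replayed deterministically. This is a single-threaded simulation of one unlucky interleaving, not real threading code, and the function names are made up for the illustration:

```python
def interleaved_incref(refcount):
    """Two unsynchronized increments whose load/store steps interleave."""
    a = refcount      # thread A loads the count
    b = refcount      # thread B loads it before A has stored
    refcount = a + 1  # A stores count + 1
    refcount = b + 1  # B stores count + 1 as well; A's increment is lost
    return refcount


def serialized_incref(refcount):
    """The same two increments with the GIL serializing them."""
    refcount += 1
    refcount += 1
    return refcount


racy = interleaved_incref(5)            # one increment went missing
safe = serialized_incref(5)             # both increments counted
after_decrefs = racy - 2                # both threads later decref once each
```

Starting from 5, the count ends one too low after both decrefs, which for a real object can mean it is freed while still in use.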

AFAIK, the only things that don't require the GIL are macro functions,
like PyString_AS_STRING or PyTuple_SET_ITEM. PyErr_SetString, for
example, will be increfing and setting the exception state, so certainly
needs the GIL to be held.

John
=:->



Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-06 Thread Benjamin Peterson
2010/1/6 John Arbash Meinel john.arbash.mei...@gmail.com:
> Anything that calls Py_INCREF or Py_DECREF must hold the GIL, or you may
> get concurrent updates of the value, and then the final value is wrong
> (two threads each compute 5+1 and store 6, rather than 7, and when they
> decref, you end up at 4 rather than back at 5).

Correct.


> AFAIK, the only things that don't require the GIL are macro functions,
> like PyString_AS_STRING or PyTuple_SET_ITEM. PyErr_SetString, for
> example, will be increfing and setting the exception state, so certainly
> needs the GIL to be held.

As a general rule, I would say, no Py* macros are safe without the GIL
either (the exception being Py_END_ALLOW_THREADS), since they mutate
Python objects which must be protected.



-- 
Regards,
Benjamin


Re: [Python-Dev] GIL required for _all_ Python calls?

2010-01-06 Thread Guido van Rossum
On Wed, Jan 6, 2010 at 7:32 PM, Benjamin Peterson benja...@python.org wrote:
> 2010/1/6 John Arbash Meinel john.arbash.mei...@gmail.com:
> > AFAIK, the only things that don't require the GIL are macro functions,
> > like PyString_AS_STRING or PyTuple_SET_ITEM. PyErr_SetString, for
> > example, will be increfing and setting the exception state, so certainly
> > needs the GIL to be held.
>
> As a general rule, I would say, no Py* macros are safe without the GIL
> either (the exception being Py_END_ALLOW_THREADS), since they mutate
> Python objects which must be protected.

That's keeping it on the safe side, since there are some macros like
PyString_AS_STRING() that are also safe, *if* you own at least
one reference to the string object.

At the same time, "no Py* macros" is not quite strong enough, since if
you called PyString_AS_STRING() before releasing the GIL but you don't
own a reference to the string object, the string might be deallocated
behind your back by another thread.

A better rule would be: you may access the memory buffer in a PyString
or PyUnicode object with the GIL released as long as you own a
reference to the string object. Everything else is out of bounds (or
not worth the bother).
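
The same pattern can be observed from Python, as a loose illustration only. The hashlib documentation says the GIL may be released while hashing larger inputs; the `digest` helper and the 1 MiB size are assumptions made for this sketch. Each call owns a reference to the immutable bytes object for as long as its buffer is being read.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

data = b"x" * (1 << 20)  # immutable bytes object shared by all workers


def digest(buf):
    # The argument keeps a reference to buf alive for the whole call, so
    # its memory cannot be freed while the hash function is reading it.
    return hashlib.sha256(buf).hexdigest()


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(digest, [data] * 4))
```

All four workers read the same buffer concurrently and produce the same digest, because nothing can mutate or free it underneath them.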

--
--Guido van Rossum (python.org/~guido)