Re: [Python-Dev] Segfault in python 2.5

2006-10-19 Thread Steve Holden
Mike Klaas wrote:
 On 10/18/06, Tim Peters [EMAIL PROTECTED] wrote:
[...]
 Shouldn't the thread state generally be the same anyway? (I seem to
 recall some gloomy warning against resuming generators in separate
 threads).
 
Is this an indication that generators aren't thread-safe?

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Segfault in python 2.5

2006-10-18 Thread Mike Klaas
[http://sourceforge.net/tracker/index.php?func=detail&aid=1579370&group_id=5470&atid=105470]

Hello,

I've managed to provoke a segfault in python2.5 (occasionally it is just an
"invalid argument to internal function" error).  I've posted a
traceback and a general idea of what the code consists of in the
sourceforge entry.  Unfortunately, I've been attempting for hours to
reduce the problem to a completely self-contained script, but it is
resisting my efforts due to timing problems.

Should I continue in that vein, or is it more useful to provide more
detailed results from gdb?

Thanks,
-Mike


Re: [Python-Dev] Segfault in python 2.5

2006-10-18 Thread Michael Hudson
Mike Klaas [EMAIL PROTECTED] writes:

 [http://sourceforge.net/tracker/index.php?func=detail&aid=1579370&group_id=5470&atid=105470]

 Hello,

 I've managed to provoke a segfault in python2.5 (occasionally it is just an
 "invalid argument to internal function" error).  I've posted a
 traceback and a general idea of what the code consists of in the
 sourceforge entry. 

I've been reading the bug report with interest, but unless I can
reproduce it it's mighty hard for me to debug, as I'm sure you know.

 Unfortunately, I've been attempting for hours to
 reduce the problem to a completely self-contained script, but it is
 resisting my efforts due to timing problems.

 Should I continue in that vein, or is it more useful to provide more
 detailed results from gdb?

Well, I don't think that there's much point in posting masses of
details from gdb.  You might want to try trying to fix the bug
yourself I guess, trying to figure out where the bad pointers come
from, etc.

Are you absolutely sure that the fault does not lie with any extension
modules you may be using?  Memory scribbling bugs have been known to
cause arbitrarily confusing problems...

Cheers,
mwh

-- 
  I'm not sure that the ability to create routing diagrams 
  similar to pretzels with mad cow disease is actually a 
  marketable skill. -- Steve Levin
   -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html


Re: [Python-Dev] Segfault in python 2.5

2006-10-18 Thread Jack Jansen

On 18-Oct-2006, at 22:08 , Michael Hudson wrote:
 Unfortunately, I've been attempting for hours to
 reduce the problem to a completely self-contained script, but it is
 resisting my efforts due to timing problems.

Has anyone ever tried to use helgrind (the valgrind module, not the  
heavy metal band:-) on Python?
--
Jack Jansen, [EMAIL PROTECTED], http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma  
Goldman




Re: [Python-Dev] Segfault in python 2.5

2006-10-18 Thread Mike Klaas
On 10/18/06, Michael Hudson [EMAIL PROTECTED] wrote:
 Mike Klaas [EMAIL PROTECTED] writes:

 I've been reading the bug report with interest, but unless I can
 reproduce it it's mighty hard for me to debug, as I'm sure you know.

Indeed.

  Unfortunately, I've been attempting for hours to
  reduce the problem to a completely self-contained script, but it is
  resisting my efforts due to timing problems.
 
  Should I continue in that vein, or is it more useful to provide more
  detailed results from gdb?

 Well, I don't think that there's much point in posting masses of
 details from gdb.  You might want to try trying to fix the bug
 yourself I guess, trying to figure out where the bad pointers come
 from, etc.

I've peered at the code, but my knowledge of the python core is
superficial at best.  The fact that it is occurring as a result of a
long string of garbage collection/dealloc/etc. and involves threading
lowers my confidence further.   That said, I'm beginning to think that
to reproduce this in a standalone script will require understanding
the problem in greater depth regardless...

 Are you absolutely sure that the fault does not lie with any extension
 modules you may be using?  Memory scribbling bugs have been known to
 cause arbitrarily confusing problems...

I've had sufficient experience being arbitrarily confused to never be
sure about such things, but I am quite confident.  The script I posted
in the bug report is all stock python save for the operation in 's.
That operation is pickling and unpickling (using pickle, not cPickle)
a somewhat complicated pure-python instance several times.  It's doing
nothing with the actual instance--it just happens to take the right
amount of time to trigger the segfault.  It's still not perfect--this
trimmed-down version segfaults only sporadically, while the original
python script segfaults reliably.

-Mike


Re: [Python-Dev] Segfault in python 2.5

2006-10-18 Thread Tim Peters
[Michael Hudson]
 I've been reading the bug report with interest, but unless I can
 reproduce it it's mighty hard for me to debug, as I'm sure you know.

[Mike Klaas]
 Indeed.

Note that I just attached a much simpler pure-Python script that fails
very quickly, on Windows, using a debug build.  Read the new comment
to learn why both Windows and debug build are essential to it
failing reliably and quickly ;-)

 Unfortunately, I've been attempting for hours to reduce the problem to a
 completely self-contained script, but it is resisting my efforts due to
 timing problems.

Yes, but you did good!  This is still just an educated guess on my
part, but my education here is hard to match ;-):  this new business
of generators deciding to clean up after themselves if they're left
hanging appears to have made it possible for a generator to hold on to
a frame whose thread state has been free()'d, after the thread that
created the generator has gone away.  Then when the generator gets
collected as trash, the new exception-based "clean up abandoned
generator" gimmick tries to access the generator's frame's thread
state, but that's just a raw C struct (not a Python object with
reachability-based lifetime), and the thread free()'d that struct when
the thread went away.  The important timing-based vagary here is
whether dead-thread cleanup gets performed before the main thread
tries to clean up the trash generator.

 I've peered at the code, but my knowledge of the python core is
 superficial at best.  The fact that it is occurring as a result of a
 long string of garbage collection/dealloc/etc. and involves threading
 lowers my confidence further.   That said, I'm beginning to think that
 to reproduce this in a standalone script will require understanding
 the problem in greater depth regardless...

Or upgrade to Windows ;-)

 Are you absolutely sure that the fault does not lie with any extension
 modules you may be using?  Memory scribbling bugs have been known to
 cause arbitrarily confusing problems...

Unless I've changed the symptom, it's been reduced to minimal pure
Python.  It does require a thread T, and creating a generator in T,
where the generator object's lifetime is controlled by the main
thread, and where T vanishes before the generator has exited of its
own accord.
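
For readers who want to see the shape of that scenario, here is a minimal
sketch in present-day Python 3 syntax.  It is not the script attached to the
bug report; the names `run_scenario`, `worker`, `holder` and `log` are made up
for illustration, and the generator is closed explicitly so the outcome does
not depend on garbage-collection timing.  On a current CPython this should
simply run; it only shows that the generator's cleanup executes in a thread
other than the one that created it.

```python
import threading


def getdocs(log):
    try:
        yield None
    finally:
        # Runs when the abandoned generator is closed; record which thread
        # is executing the cleanup.
        log.append(threading.current_thread().name)


def run_scenario():
    holder = []    # keeps the generator alive after thread T has exited
    log = []       # thread name seen by the generator's cleanup code
    creator = []   # thread name that created the generator

    def worker():
        gen = getdocs(log)
        next(gen)              # suspend the generator inside the try block
        holder.append(gen)     # lifetime is now controlled by the main thread
        creator.append(threading.current_thread().name)

    t = threading.Thread(target=worker, name="T")
    t.start()
    t.join()                   # T has vanished; its generator is still suspended

    gen = holder.pop()
    gen.close()                # cleanup happens here, outside thread T
    return creator[0], log[0]


if __name__ == "__main__":
    print(run_scenario())
```

The first element of the returned pair is the creating thread ("T"); the
second is whichever thread called `run_scenario()`.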

Offhand I don't know how to repair it.  Thread states /aren't/ Python
objects, and there's no provision for a thread state to outlive the
thread it represents.

 I've had sufficient experience being arbitrarily confused to never be
 sure about such things, but I am quite confident.  The script I posted
 in the bug report is all stock python save for the operation in 's.
 That operation is pickling and unpickling (using pickle, not cPickle)
 a somewhat complicated pure-python instance several times.

FYI, in my whittled script, your `getdocs()` became simply:

    def getdocs():
        while True:
            yield None

and it's called only once, via self.docIter.next().  In fact, the
`while True:` isn't needed there either (given that it's only resumed
once now).
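
A small sketch of why the loop makes no difference when the generator is
resumed only once, written for Python 3 (where `gen.next()` is spelled
`next(gen)`).  The names `getdocs_loop`, `getdocs_once` and
`state_after_one_resume` are illustrative only and do not come from the bug
report.

```python
import inspect


def getdocs_loop():
    while True:
        yield None


def getdocs_once():
    yield None


def state_after_one_resume(genfunc):
    """Resume a fresh generator once and report what it yielded and its state."""
    gen = genfunc()
    value = next(gen)
    return value, inspect.getgeneratorstate(gen)


if __name__ == "__main__":
    print(state_after_one_resume(getdocs_loop))
    print(state_after_one_resume(getdocs_once))
```

Either way the generator is left suspended at its first `yield` with a live
frame, which is the state that matters for the crash.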


Re: [Python-Dev] Segfault in python 2.5

2006-10-18 Thread Mike Klaas
On 10/18/06, Tim Peters [EMAIL PROTECTED] wrote:
 [Mike Klaas]
  Indeed.

 Note that I just attached a much simpler pure-Python script that fails
 very quickly, on Windows, using a debug build.  Read the new comment
 to learn why both Windows and debug build are essential to it
 failing reliably and quickly ;-)

Thanks!  Next time I find a bug, installing Windows will  certainly be
my first step <g>.


 Yes, but you did good!  This is still just an educated guess on my
 part, but my education here is hard to match ;-):  this new business
 of generators deciding to clean up after themselves if they're left
 hanging appears to have made it possible for a generator to hold on to
 a frame whose thread state has been free()'d, after the thread that
 created the generator has gone away.  Then when the generator gets
 collected as trash, the new exception-based clean up abandoned
 generator gimmick tries to access the generator's frame's thread
 state, but that's just a raw C struct (not a Python object with
 reachability-based lifetime), and the thread free()'d that struct when
 the thread went away.  The important timing-based vagary here is
 whether dead-thread cleanup gets performed before the main thread
 tries to clean up the trash generator.

Indeed--and normally it doesn't happen that way.  My/your script never
crashes on the first iteration because the thread's target is the
generator and thus it gets DECREF'd before the thread terminates.  But
the exception from the first iteration holds on to a reference to the
frame/generator so when it gets cleaned up (in the second iteration,
due to a new exception overwriting it) the generator is freed after
the thread is destroyed.  At least, I think...
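
The reference chain described above (exception -> traceback -> frame ->
generator) can be observed from pure Python.  The sketch below is written for
Python 3 and assumes CPython's reference counting; in 2.5 the exception was
retained implicitly by the interpreter, whereas here it is kept in a local
variable on purpose.  `failing_step` and `demo` are hypothetical names.

```python
import gc
import weakref


def getdocs():
    yield None


def failing_step(refs):
    gen = getdocs()
    next(gen)                        # leave the generator suspended
    refs.append(weakref.ref(gen))    # lets us watch the generator's lifetime
    raise RuntimeError("simulated failure")


def demo():
    refs = []
    saved = None
    try:
        failing_step(refs)
    except RuntimeError as exc:
        saved = exc                  # traceback keeps failing_step's frame alive
    alive_while_held = refs[0]() is not None
    saved = None                     # drop exception, traceback, frame, generator
    gc.collect()
    alive_after_drop = refs[0]() is not None
    return alive_while_held, alive_after_drop


if __name__ == "__main__":
    print(demo())
```

The generator outlives the function that created it for as long as the
exception is held, and is only freed once that exception is replaced or
dropped.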


 Offhand I don't know how to repair it.  Thread states /aren't/ Python
 objects, and there's no provision for a thread state to outlive the
 thread it represents.

Take this with a grain of salt, but ISTM that the problem can be
repaired by resetting the generator's frame threadstate to the current
threadstate:

(in genobject.c:gen_send_ex():80)
    Py_XINCREF(tstate->frame);
    assert(f->f_back == NULL);
    f->f_back = tstate->frame;
+   f->f_tstate = tstate;

    gen->gi_running = 1;
    result = PyEval_EvalFrameEx(f, exc);
    gen->gi_running = 0;

Shouldn't the thread state generally be the same anyway? (I seem to
recall some gloomy warning against resuming generators in separate
threads).

This solution is surely wrong--if f_tstate != tstate, then the
generator _is_ being resumed in another thread and so the generated
traceback will be wrong (among other issues which surely occur by
fudging a frame's threadstate).  Perhaps it could be set conditionally
by gen_close before signalling the exception?  A lie, but a smaller
lie than a segfault.  We could advertise that the exception occurring
from generator .close() isn't guaranteed to have an accurate traceback
in this case.
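
To illustrate the case where the resuming thread differs from the creating
one, here is a small Python 3 sketch.  It says nothing about the C-level
f_tstate field; it only shows that a generator body runs in whichever thread
resumes it.  `report_thread` and `resume_in_thread` are made-up names.

```python
import threading


def report_thread():
    # Each resumption reports the thread that is currently running the body.
    while True:
        yield threading.current_thread().name


def resume_in_thread(gen, name):
    """Resume gen once from a new thread called `name`; return what it yields."""
    out = []
    t = threading.Thread(target=lambda: out.append(next(gen)), name=name)
    t.start()
    t.join()
    return out[0]


if __name__ == "__main__":
    gen = report_thread()            # created in the main thread
    print(resume_in_thread(gen, "resumer-1"))
    print(resume_in_thread(gen, "resumer-2"))
```

Each call reports the name of the thread doing the resuming, not the thread
that created the generator.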

Take all this with a grain of un-core-savvy salt.

Thanks again for investigating this, Tim,
-Mike