So, I have been working with multithreaded stuff, even done some fixes.
There is this function, slp_kill_tasks_with_stacks, which is supposed to kill
tasklets with stacks. It turns out, that for all threads but the main thread,
it kills _all_ tasklets, because such tasklets each have their own cstate
object. Yes, all tasklets on worker threads are linked into the cstate
circular list, in order to have them killed when a thread goes away. This is
achieved through the function tasklet_ensure_linkage(). Whether this is
required, is another topic.
Now, the last thing this function does to a tasklet thus killed is to change
its cstate->tstate to the initial tstate. That's right, it makes it so that it
appears to have come from the main thread. I have been experimenting with
setting this to NULl instead, to indicate that it comes from a dead thread.
Anyway, this is when I found out that slp_kill_tasks_with_stacks actually does
not work at all. When this function is entered, when the thtread state is
being cleared, tstate->ts.main (and tstate->ts.current) is already NULL, having
been cleared here, in tasklet_end():
/* capture all exceptions */
if (ismain) {
/*
* Main wants to exit. We clean up, but leave the
* runnables chain intact.
*/
ts->st.main = NULL;
Py_DECREF(retval);
retval = schedule_task_destruct(task, task);
Py_DECREF(task);
return retval;
}
So, all the gymnastics that slp_kill_tasks_with_stacks does with trying to
place tasklets before the "current" tasklet and so on are senseless since thre
is no current tasklet. In parcicular: When a tasklet that is to be killed is
linked into the current queue, it becomes the "main" tasklet and also the
"current" tasklet. PyTasklet_Kill then attempts to kill the current tasklet.
The kill function attempts to get the current frame, but then this code kicks
in:
/*
* silently do nothing if the tasklet is dead.
* simple raising would kill ourself in this case.
*/
if (slp_get_frame(task) == NULL)
Py_RETURN_NONE;
And
/* CAUTION: This function returns a borrowed reference */
PyFrameObject *
slp_get_frame(PyTaskletObject *task)
{
PyThreadState *ts = PyThreadState_GET();
return ts->st.current == task ? ts->frame : task->f.frame;
}
So, slp_get_frame returns the ts->frame which already is NULL.
In effect, it leaves the killed tasklet then in a strange state, where its next
and prev members point to itself, but it has a NULL frame.
So, how can we fix tasklet killing on exit? It seems necessary, at least for
threads with real C state in order to release any special C state that may be
pendin on the C stack.
First, off, the code in tasklet_end, which clears the main tasklet, is
incompatible with the kill_task_with_cstate.
Secondly, the latter function does appear very fragile. What happens if a
tasklet that is being killed refuses to die? Or creates new tasklets? I
wonder if we should have a super exception, one that is not catchable as a
system_exit, to ensure that they die.... Or maybe just require programs not to
handle tasklet exit or do any magic tasklet stuff during such a handle clause.
But that's a different story.
What I think should happen is probably something like this. Assuming that
kill_tasks_with_cstate can be called with the man and current tasklets still
valid:
1) All runnable tasklets are removed from the runnable queue and decrefed,
possibly resulting in them dying on their own or being killed, leaving only the
main tasklet in the queue.
2) Then proceed with killing each tasklet. Now that there is only one in
the queue left, there is no need to worry about the queue ordering.
Anyway, enough ranting. I'm not sure what to do here. It is annoying that the
main tasklet is cleared like this. But perhaps it is necessary. Perhaps it is
necessary to create a new temporary main tasklet... Who knows.
K
_______________________________________________
Stackless mailing list
[email protected]
http://www.stackless.com/mailman/listinfo/stackless