Re: [Freebob-devel] [linux-audio-dev] ieee1394 deadlock on RT kernels

Pieter Palmers Mon, 26 Jun 2006 13:36:29 -0700

Lee Revell wrote:

On Mon, 2006-06-26 at 21:44 +0200, Pieter Palmers wrote:

Lee Revell wrote:
On Mon, 2006-06-26 at 21:05 +0200, Pieter Palmers wrote:
Lee Revell wrote:
On Mon, 2006-06-26 at 16:51 +0200, Pieter Palmers wrote:
Of course. My Monday-morning bad temper is over by now, and I hope I didn't transfer it to any of you. I'll provide the panic, one way or another.
Can you reproduce the problem on a non-RT kernel?
No, it only occurs with RT kernels, and only with those configured for PREEMPT_RT. If I use PREEMPT_DESKTOP, there is no problem (with threaded IRQs etc.; I only switched the preemption level in the kernel config).
I've uploaded the photos of the panic here:
http://freebob.sourceforge.net/old/img_3378.jpg (without flash)
http://freebob.sourceforge.net/old/img_3377.jpg (with flash)
Both are of suboptimal quality unfortunately, but all info is readable on one or the other.
Can you add debug printk's before and after tasklet_kill() in
ohci1394_unregister_iso_tasklet to see where it locks up?
That's the first thing I did: the printk before tasklet_kill succeeds, the one right after the tasklet_kill doesn't.


OK that's what I suspected.

It seems that the -rt patch changes tasklet_kill:

Unpatched 2.6.17:

void tasklet_kill(struct tasklet_struct *t)
{
        if (in_interrupt())
                printk("Attempt to kill tasklet from interrupt\n");

        while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
                do
                        yield();
                while (test_bit(TASKLET_STATE_SCHED, &t->state));
        }
        tasklet_unlock_wait(t);
        clear_bit(TASKLET_STATE_SCHED, &t->state);
}

2.6.17-rt:

void tasklet_kill(struct tasklet_struct *t)
{
        if (in_interrupt())
                printk("Attempt to kill tasklet from interrupt\n");

        while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
                do
                        msleep(1);
                while (test_bit(TASKLET_STATE_SCHED, &t->state));
        }
        tasklet_unlock_wait(t);
        clear_bit(TASKLET_STATE_SCHED, &t->state);
}

You should ask Ingo & the other -rt developers what the intent of this
change was.  Obviously it loops forever waiting for the state bit to
change.


because you are not allowed to yield() in an RT context?

I wish I had been a little more elaborate in my initial mail, as it would have saved us some time and communication troubles (on my part, that is). I already spotted the msleep() change in the patch, and I already tried reverting it. That gives you a nice new panic message, something like 'BUG: yield()'ing in ...'.

I'm wondering why a patched kernel that is not configured for 'complete preemption' works fine. This change is present there too, so it probably has something to do with the msleep() implementation.

Another strange thing is: why doesn't the tasklet finish, so that it can be 'unscheduled'? I have my IRQ priorities higher than any other RT threads, so I would expect that the tasklet can finish. Or is tasklet_kill not preemptible? That would be very strange, as I would expect that busy waiting on something in a non-preemptible code path on a single-CPU system always deadlocks.
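
To make the question concrete, here is a small userspace model of the wait loop in tasklet_kill(). It is only a sketch, not kernel code: state, schedule_tasklet(), run_tasklet() and model_tasklet_kill() are hypothetical names, the softirq_can_run flag is an assumption standing in for "whoever runs the tasklet gets CPU time", and the poll budget exists only so the demonstration terminates. It does not claim to identify the actual cause of the hang.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SCHED_BIT 1u            /* stands in for TASKLET_STATE_SCHED */

static atomic_uint state;       /* hypothetical model of t->state */

/* Model of scheduling the tasklet: mark it as pending. */
static void schedule_tasklet(void)
{
    atomic_fetch_or(&state, SCHED_BIT);
}

/* Model of the softirq side: running the tasklet clears the bit. */
static void run_tasklet(void)
{
    atomic_fetch_and(&state, ~SCHED_BIT);
}

/*
 * Model of the tasklet_kill() wait loop. Returns true if the kill
 * completed, false if the poll budget ran out (the model's "hang").
 */
static bool model_tasklet_kill(bool softirq_can_run, int max_polls)
{
    int polls = 0;

    assert(max_polls > 0);

    /* Like test_and_set_bit(): sets the bit, yields its old value. */
    while (atomic_fetch_or(&state, SCHED_BIT) & SCHED_BIT) {
        do {
            if (++polls > max_polls)
                return false;       /* nobody ever cleared the bit */
            /* Stands in for yield()/msleep(1): the pending tasklet
             * only makes progress if something else may run here. */
            if (softirq_can_run)
                run_tasklet();
        } while (atomic_load(&state) & SCHED_BIT);
    }

    /* The bit is now held by the killer; release it again. */
    atomic_fetch_and(&state, ~SCHED_BIT);
    return true;
}
```

With a pending tasklet (schedule_tasklet() called first), model_tasklet_kill(true, 1000) returns true after one poll, while model_tasklet_kill(false, 1000) uses up its budget and returns false. In this model, the loop itself is harmless; it only fails to terminate when the code that clears the bit never gets to run.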



Greets,

Pieter
