Lee Revell wrote:
On Mon, 2006-06-26 at 21:44 +0200, Pieter Palmers wrote:
Lee Revell wrote:
On Mon, 2006-06-26 at 21:05 +0200, Pieter Palmers wrote:
Lee Revell wrote:
On Mon, 2006-06-26 at 16:51 +0200, Pieter Palmers wrote:
Of course. My Monday-morning bad temper is over by now, and I hope I didn't transfer it to any of you. I'll provide the panic, one way or another.

Can you reproduce the problem on a non-RT kernel?

No, it only occurs with RT kernels, and only with those configured for PREEMPT_RT. If I use PREEMPT_DESKTOP, there is no problem (with threaded IRQs etc.; I only switched the preemption level in the kernel config).

I've uploaded the photos of the panic here:
http://freebob.sourceforge.net/old/img_3378.jpg (without flash)
http://freebob.sourceforge.net/old/img_3377.jpg (with flash)

Both are of suboptimal quality, unfortunately, but all the info is readable on one or the other.

Can you add debug printk's before and after tasklet_kill() in
ohci1394_unregister_iso_tasklet to see where it locks up?

That's the first thing I did: the printk before tasklet_kill succeeds, the one right after the tasklet_kill doesn't.

OK that's what I suspected.

It seems that the -rt patch changes tasklet_kill:

Unpatched 2.6.17:

void tasklet_kill(struct tasklet_struct *t)
{
        if (in_interrupt())
                printk("Attempt to kill tasklet from interrupt\n");

        while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
                do
                        yield();
                while (test_bit(TASKLET_STATE_SCHED, &t->state));
        }
        tasklet_unlock_wait(t);
        clear_bit(TASKLET_STATE_SCHED, &t->state);
}

2.6.17-rt:

void tasklet_kill(struct tasklet_struct *t)
{
        if (in_interrupt())
                printk("Attempt to kill tasklet from interrupt\n");

        while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
                do
                        msleep(1);
                while (test_bit(TASKLET_STATE_SCHED, &t->state));
        }
        tasklet_unlock_wait(t);
        clear_bit(TASKLET_STATE_SCHED, &t->state);
}

You should ask Ingo & the other -rt developers what the intent of this
change was.  Obviously it loops forever waiting for the state bit to
change.


Because you are not allowed to yield() in an RT context?

I wish I had been a little more elaborate in my initial mail, as it would have saved us some time and some communication trouble (on my part, that is). I had already spotted the msleep() change in the patch, and I had already tried reverting it. That gives you a nice new panic message, something like 'BUG: yield()'ing in ...'.

I'm wondering why a kernel that is patched, but not configured for 'complete preemption', works fine. The change is present there too, so it probably has something to do with the msleep() implementation.

Another strange thing is: why doesn't the tasklet finish, so that it can be 'unscheduled'? I have my IRQ priorities higher than any other RT threads, so I would expect that the tasklet can finish. Or is tasklet_kill non-preemptible? That would be very strange, as I would expect that busy-waiting on something in a non-preemptible code path on a single-CPU system always deadlocks.
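
To make the question concrete, here is a minimal user-space sketch of the wait loop in tasklet_kill(). It is not kernel code and not a diagnosis: the names model_tasklet, model_run_tasklet and model_tasklet_kill, the runner_gets_cpu flag and the max_spins bound are all made up for illustration, and tasklet_unlock_wait() is left out. It only shows that the loop can terminate solely when whatever runs the tasklet gets the CPU and clears the SCHED bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct tasklet_struct: only the SCHED bit. */
struct model_tasklet {
        atomic_bool sched;      /* models TASKLET_STATE_SCHED */
};

/* Models the softirq side: the tasklet runs and its SCHED bit is cleared. */
static void model_run_tasklet(struct model_tasklet *t)
{
        atomic_store(&t->sched, false);
}

/*
 * Models the wait loop of tasklet_kill().
 *
 * runner_gets_cpu: assumption knob; true if the context that runs the
 *                  tasklet is scheduled while we wait (what yield() or
 *                  msleep(1) is supposed to allow).
 * max_spins:       bound added only so this sketch terminates; the real
 *                  loop has no such bound.
 *
 * Returns true if the kill completed, false if we gave up waiting.
 */
static bool model_tasklet_kill(struct model_tasklet *t, bool runner_gets_cpu,
                               int max_spins)
{
        int spins = 0;

        assert(max_spins >= 0);

        /* atomic_exchange() returns the old value, like test_and_set_bit(). */
        while (atomic_exchange(&t->sched, true)) {
                do {
                        if (spins++ >= max_spins)
                                return false;   /* the kernel would keep waiting */
                        if (runner_gets_cpu)
                                model_run_tasklet(t);
                } while (atomic_load(&t->sched));
        }

        /* Models the final clear_bit(TASKLET_STATE_SCHED, ...). */
        atomic_store(&t->sched, false);
        return true;
}
```

With the SCHED bit initially set, model_tasklet_kill(&t, true, 1000) returns true, while model_tasklet_kill(&t, false, 1000) gives up and returns false with the bit still set. The second case corresponds to the situation I am asking about, where the tasklet never gets to run while tasklet_kill waits.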


Greets,

Pieter
