Dear kdb developers and supporters,
I'm trying to fix a bug related to breakpoints in SMP mode.
I'm currently running kdb on a Tiger 4. I simply set a breakpoint on the
default_idle function (the idle function called by the Linux scheduler)
and type the "go" command. After a short while, kdb hangs permanently.
Running this test with an In-Target Probe tool, what I observe is really
surprising:
- All the CPUs enter kdb at about the same time (within 10 ms of each
  other) because of the break instruction inserted at the requested
  location. That is expected.
- Each CPU then modifies the state of the others and sends them an IPI
  without any spinlock protection.
- A CPU already running kdb (because of a break condition) may re-enter
  kdb when it receives an IPI. This behavior is dangerous.
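
To make the last two points concrete, here is a minimal user-space sketch
of the kind of entry arbitration I have in mind. It is not kdb code: the
names debugger_owner, in_debugger, debugger_enter and debugger_leave, and
the value of NR_CPUS, are all hypothetical, and it uses C11 atomics rather
than kernel primitives. The idea is that exactly one CPU wins ownership,
and that a CPU already inside the debugger ignores a late IPI instead of
re-entering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS  4      /* hypothetical CPU count for this sketch */
#define NO_OWNER (-1)   /* no CPU currently controls the debugger */

/* Hypothetical names: none of these are actual kdb symbols. */
static atomic_int  debugger_owner = NO_OWNER;
static atomic_bool in_debugger[NR_CPUS];   /* per-CPU re-entry guard */

enum entry_result { ENTRY_MASTER, ENTRY_SLAVE, ENTRY_REENTRANT };

/* Called on breakpoint or IPI; decides the role of the calling CPU. */
static enum entry_result debugger_enter(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    /* A CPU that is already inside must not enter a second time. */
    if (atomic_exchange(&in_debugger[cpu], true))
        return ENTRY_REENTRANT;

    /* Only the first CPU to get here becomes the master. */
    int expected = NO_OWNER;
    if (atomic_compare_exchange_strong(&debugger_owner, &expected, cpu))
        return ENTRY_MASTER;

    return ENTRY_SLAVE;
}

/* Called when the CPU resumes normal execution. */
static void debugger_leave(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    /* Release ownership only if this CPU actually holds it. */
    int expected = cpu;
    atomic_compare_exchange_strong(&debugger_owner, &expected, NO_OWNER);
    atomic_store(&in_debugger[cpu], false);
}
```

With such a scheme, only the CPU that gets ENTRY_MASTER would be allowed
to change the state of the other CPUs and send them IPIs.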
In the end, kdb deadlocks: the running CPU (the one not in HOLD state)
spins on kdb_printf_lock, while the other ones are in HOLD state (looping
in kdb_main_loop).
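
As a debugging aid for this kind of hang, one possible approach is a
bounded lock acquisition that gives up instead of spinning forever, so
the caller can at least report which lock is stuck. The following is only
a user-space sketch under my own assumptions: print_lock stands in for
kdb_printf_lock, and print_lock_acquire_bounded and print_lock_release are
hypothetical names, not existing kdb functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for kdb_printf_lock in this sketch. */
static atomic_flag print_lock = ATOMIC_FLAG_INIT;

/*
 * Try to take the lock, spinning at most max_spins times.
 * Returns true if the lock was acquired, false if the bound was hit.
 */
static bool print_lock_acquire_bounded(unsigned long max_spins)
{
    assert(max_spins > 0);

    for (unsigned long i = 0; i < max_spins; i++) {
        /* test_and_set returns the previous value: false means we got it. */
        if (!atomic_flag_test_and_set_explicit(&print_lock,
                                               memory_order_acquire))
            return true;
    }
    return false;   /* still held by someone else: likely deadlock */
}

static void print_lock_release(void)
{
    atomic_flag_clear_explicit(&print_lock, memory_order_release);
}
```

A false return would not fix the deadlock by itself, but it would turn a
silent hang into something that can be diagnosed.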
My conclusion is that kdb's SMP support needs a global review.
We should reconsider spinlock management, protection against IPIs, and
management of the CPU state table, so that breakpoints work correctly in
SMP mode.
I look forward to your comments.
Best regards,
Francois WELLENREITER