Cliff Wickman (on Tue, 28 Oct 2008 14:08:06 -0500) wrote:
>Hi Jay, Keith, Dmitri,
>
>> Keith wrote:
>> I don't see how this can work. kdba_arch_init() currently maps
>> KDB_VECTOR to kdb_interrupt() on each cpu as that cpu is brought up.
>> IOW, KDB_VECTOR ends up being defined on ALL cpus before any KDB
>> interrupt is sent.
>>
>> Your kdb_takeover_vector() only maps KDB_VECTOR to kdb_interrupt() on
>> the CURRENT cpu, then sends KDB_VECTOR to all the other cpus. How do
>> the other cpus know what to do with KDB_VECTOR when they receive it?
>> They will have no definition for KDB_VECTOR so they will receive an
>> unexpected interrupt.
>
>When a vector arrives at any cpu it's going to index into the idt_table
>for a handler.
>There's only one idt_table, shared by all cpu's.
>Am I missing something fundamental?

My mistake, I was thinking that setting an interrupt gate actually
modified a per-cpu register - wrong.

However there is a separate problem with your patch. You now wait in
smp_kdb_stop() until all cpus are in KDB. If any cpu is completely
hung so it cannot be interrupted then smp_kdb_stop() will never return
and KDB will now appear to hang.
The existing code avoids this by

  kdb() -> smp_kdb_stop() - issue KDB_VECTOR as normal interrupt but do not
                            wait for cpus
  kdb() -> kdba_main_loop()
  kdba_main_loop() -> kdb_save_running()
  kdb_save_running() -> kdb_main_loop()
  kdb_main_loop() -> kdb_wait_for_cpus()

kdb_wait_for_cpus() waits until the other cpus are in KDB. If a cpu
does not respond to KDB_VECTOR after a few seconds then
kdb_wait_for_cpus() hits the missing cpus with NMI.

This two step approach (send KDB_VECTOR as normal interrupt, wait then
send NMI) is used because NMI can be serviced at any time, even when
the target cpu is in the middle of servicing an interrupt. This can
result in incomplete register state which leads to broken backtraces.
IOW, sending NMI first would actually make debugging harder.
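
To make the two step logic concrete, here is a small userspace model of
it. This is only a sketch, not the KDB source: struct cpu_model,
send_kdb_vector(), wait_for_cpus(), enter_kdb_model() and the poll count
are all hypothetical names invented for illustration, and a "hung" cpu
is modelled simply as one that ignores the normal interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu model; these are not the real KDB structures. */
struct cpu_model {
    bool responds_to_irq;  /* false models a cpu hung with interrupts off */
    bool in_kdb;           /* set once the cpu has entered the debugger */
    bool entered_via_nmi;  /* set if only the NMI got it there */
};

/* Step 1: send the normal interrupt to the other cpus and do not wait. */
static void send_kdb_vector(struct cpu_model *cpus, size_t n, size_t master)
{
    for (size_t i = 0; i < n; i++) {
        if (i != master && cpus[i].responds_to_irq)
            cpus[i].in_kdb = true;  /* a healthy cpu enters on its own */
    }
}

/* Count the cpus that have not yet entered the debugger. */
static size_t count_missing(const struct cpu_model *cpus, size_t n)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].in_kdb)
            missing++;
    }
    return missing;
}

/*
 * Step 2: wait for a bounded number of polls, then hit whatever is
 * still missing with NMI. Returns the number of NMIs sent. In this
 * single-threaded model nothing changes between polls; the loop only
 * stands in for the "few seconds" timeout.
 */
static size_t wait_for_cpus(struct cpu_model *cpus, size_t n,
                            unsigned max_polls)
{
    for (unsigned poll = 0; poll < max_polls; poll++) {
        if (count_missing(cpus, n) == 0)
            return 0;  /* everyone arrived via the normal interrupt */
    }

    size_t nmis = 0;
    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].in_kdb) {
            cpus[i].in_kdb = true;
            cpus[i].entered_via_nmi = true;
            nmis++;
        }
    }
    return nmis;
}

/* The master enters first, sends the interrupt, and only then waits. */
static size_t enter_kdb_model(struct cpu_model *cpus, size_t n,
                              size_t master, unsigned max_polls)
{
    assert(master < n);
    cpus[master].in_kdb = true;
    send_kdb_vector(cpus, n, master);
    return wait_for_cpus(cpus, n, max_polls);
}
```

The point of the model is that the wait is bounded and sits apart from
the send, so one unresponsive cpu costs a timeout and an NMI instead of
hanging the debugger, and healthy cpus never see an NMI at all.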

Given the above logic, if you are going to take over an existing
interrupt vector then the vector needs to be acquired near the start of
kdb() and released near the end of kdb(), and only on the master cpu.

Note: there is no overwhelming need for KDB_VECTOR to have a high
priority. As long as it is received within a few seconds then all is
well.
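
The acquire/release rule above (take the vector near the start of kdb(),
give it back near the end, master cpu only) could be shaped roughly as
follows. Again this is a userspace sketch under stated assumptions, not
a patch: idt_model, DEMO_VECTOR, takeover_vector(), release_vector(),
kdb_model() and record_handler() are hypothetical names, and the single
shared table stands in for the one idt_table mentioned earlier in the
thread.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*handler_t)(void);

#define NVECTORS    256
#define DEMO_VECTOR 0xf0u  /* hypothetical vector number, for illustration */

/* One table shared by every cpu in the model. */
static handler_t idt_model[NVECTORS];

static void original_handler(void) { /* whoever owned the vector before */ }
static void kdb_interrupt_model(void) { /* stand-in for the KDB handler */ }

/* Install the KDB handler and return the previous owner of the vector. */
static handler_t takeover_vector(unsigned vec)
{
    assert(vec < NVECTORS);
    handler_t old = idt_model[vec];
    idt_model[vec] = kdb_interrupt_model;
    return old;
}

/* Hand the vector back to its previous owner. */
static void release_vector(unsigned vec, handler_t old)
{
    assert(vec < NVECTORS);
    idt_model[vec] = old;
}

/* Records which handler owns the vector while the debugger body runs. */
static handler_t seen_during;
static void record_handler(void)
{
    seen_during = idt_model[DEMO_VECTOR];
}

/*
 * Model of the debugger entry point. Only the master cpu touches the
 * shared table; the other cpus arrive through the vector the master
 * already installed and must neither install nor restore it.
 */
static void kdb_model(int is_master, void (*body)(void))
{
    handler_t saved = NULL;

    if (is_master)
        saved = takeover_vector(DEMO_VECTOR);  /* near the start */

    if (body)
        body();  /* main loop, waiting for cpus, and so on */

    if (is_master)
        release_vector(DEMO_VECTOR, saved);    /* near the end */
}
```

Because the table is shared, doing the takeover on more than one cpu
would risk saving the KDB handler as the "old" owner and never restoring
the real one, which is one reason to confine it to the master.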