Hi,

I'm not sure whether this is already known; at least I failed to find any related report. The way faults are handled on x86 during debugger memory access breaks the preemption counter of the interrupted task. Try issuing "print *(int *)0" on a CONFIG_PREEMPT kernel and then continue execution: you will get endless "scheduling while atomic" warnings.
Reason: Before kgdb touches any memory on the target, kgdb_fault_setjmp is invoked and stores the caller state in kgdb_fault_jmp_regs. If a fault occurs later on, kgdb_notify detects that this is a fixable fault (kgdb_may_fault) and restores the previous state immediately. But because atomic_notifier_call_chain wraps the invocation of the kgdb_notify handler in rcu_read_lock/unlock, the premature fixup return leaves no chance to release this lock (i.e. to restore the preemption counter).

Looking at this issue from an outsider's perspective, I wondered why the fault return context isn't patched instead of jumping back directly. Somehow this looks cleaner and more robust to me. So I hacked up the proof of concept below, and it actually solved my problem with CONFIG_PREEMPT without obvious regressions (so far).

I saw that e.g. MIPS jumps back from fixup_exception. While this seems to be immune to the issue I found, it still does not conform to the way fixable faults are normally handled in the kernel and may break on future changes as well.

So, if there are no hidden pitfalls, I would suggest refactoring this part for all archs. At least for x86 I could offer to work out a patch (as time permits).

Jan

--- linux-2.6.17.13.orig/arch/i386/kernel/kgdb.c
+++ linux-2.6.17.13/arch/i386/kernel/kgdb.c
@@ -311,7 +311,14 @@ static int kgdb_notify(struct notifier_b
 	/* Bad memory access? */
 	if (cmd == DIE_PAGE_FAULT_NO_CONTEXT && atomic_read(&debugger_active)
 			&& kgdb_may_fault) {
-		kgdb_fault_longjmp(kgdb_fault_jmp_regs);
+		//kgdb_fault_longjmp(kgdb_fault_jmp_regs);
+		regs->ebx = kgdb_fault_jmp_regs[0];
+		regs->esi = kgdb_fault_jmp_regs[1];
+		regs->edi = kgdb_fault_jmp_regs[2];
+		regs->ebp = kgdb_fault_jmp_regs[3];
+		regs->esp = kgdb_fault_jmp_regs[4];
+		regs->eip = kgdb_fault_jmp_regs[5];
+		regs->eax = 1;
 		return NOTIFY_STOP;
 	} else if (cmd == DIE_PAGE_FAULT)
 		/* A normal page fault, ignore. */
_______________________________________________ Kgdb-bugreport mailing list Kgdb-bugreport@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport