[Kgdb-bugreport] [PATCH] Further fix to single step user->kernel
I solved the GPF problem I mentioned in an earlier e-mail (see the information below the first patch).

The patch below is needed so that the single step exception bits are cleared. The easiest approach was to use the predefined exception handler routine to do the work of executing the continue when KGDB is requested to step from kernel->user space.

If there are no objections, I'll apply this patch after the review period.

Signed-off-by: Jason Wessel <[EMAIL PROTECTED]>

Index: linux-2.6.21/arch/i386/kernel/kgdb.c
===
--- linux-2.6.21.orig/arch/i386/kernel/kgdb.c
+++ linux-2.6.21/arch/i386/kernel/kgdb.c
@@ -273,6 +273,7 @@ static int kgdb_notify(struct notifier_b
          * eat the exception and continue the process
          */
         printk(KERN_ERR "KGDB: trap/step from kernel to user space, resuming...\n");
+        kgdb_arch_handle_exception(args->trapnr, args->signr, args->err, "c","",regs);
         return NOTIFY_STOP;
     } else if (cmd == DIE_NMI_IPI || cmd == DIE_NMI || user_mode(regs) ||
                (cmd == DIE_DEBUG && atomic_read(&debugger_active)))
Index: linux-2.6.21/arch/x86_64/kernel/kgdb.c
===
--- linux-2.6.21.orig/arch/x86_64/kernel/kgdb.c
+++ linux-2.6.21/arch/x86_64/kernel/kgdb.c
@@ -295,6 +295,7 @@ static int kgdb_notify(struct notifier_b
          * eat the exception and continue the process
          */
         printk(KERN_ERR "KGDB: trap/step from kernel to user space, resuming...\n");
+        kgdb_arch_handle_exception(args->trapnr, args->signr, args->err, "c","",regs);
         return NOTIFY_STOP;
     } else if (cmd == DIE_PAGE_FAULT || user_mode(regs) ||
                cmd == DIE_NMI_IPI || (cmd == DIE_DEBUG &&

The real source of the GPF problem was a defect in gdb. If you execute a "next" operation that needs to single step out of the current frame, and it results in a continue inside KGDB, the next breakpoint that is hit will not have the EIP decremented by 1 properly. gdb will continue single stepping because it is still looking for the end of the frame. Ultimately you will get some random fault. In my case it was a GPF, because the instruction addresses were off by 1 and the resulting "random code" was accessing the machine in a bad way.

The following is not really the right way to fix gdb, but the patch shows the problem going away and that it is in fact gdb's fault. Another way to fix this in gdb would be to calculate the next instruction when single stepping, compare it with where the target actually stopped, and then check the breakpoint list.

No Signed-off-by here because this is an example...

Index: gdb-6.6/gdb/infrun.c
===
--- gdb-6.6.orig/gdb/infrun.c
+++ gdb-6.6/gdb/infrun.c
@@ -1222,8 +1222,7 @@ adjust_pc_after_break (struct execution_
          single-stepping in this case.  */
       if (ptid_equal (ecs->ptid, inferior_ptid) && currently_stepping (ecs))
         {
-          if (prev_pc == breakpoint_pc
-              && software_breakpoint_inserted_here_p (breakpoint_pc))
+          if (software_breakpoint_inserted_here_p (breakpoint_pc))
             /* Hardware single-stepped a software breakpoint (as
                occures when the inferior is resumed with PC pointing at
                not-yet-hit software breakpoint).  Since the

-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
Kgdb-bugreport mailing list
Kgdb-bugreport@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport
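The off-by-one described in this message can be sketched in a few lines of user-space C. This is only an illustration under stated assumptions, not gdb's code: `breakpoint_at()`, `adjust_pc()` and the `breakpoints` table are hypothetical names, and the real `adjust_pc_after_break()` handles more cases. The sketch assumes x86, where the one-byte int3 instruction leaves the reported PC one byte past the breakpoint address, and it models the `prev_pc` test that the example patch removes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86: int3 is one byte, so a trap reports PC = breakpoint address + 1. */
#define DECR_PC_AFTER_BREAK 1u

/* Hypothetical breakpoint table; gdb keeps a real breakpoint list instead. */
static const uint32_t breakpoints[] = { 0xc0101000u, 0xc0102040u };

static bool breakpoint_at(uint32_t addr)
{
    for (size_t i = 0; i < sizeof breakpoints / sizeof breakpoints[0]; i++)
        if (breakpoints[i] == addr)
            return true;
    return false;
}

/*
 * Return the PC the debugger should report for a stop.
 * strict_prev_pc_check = true models the original condition
 * (prev_pc == breakpoint_pc && a breakpoint is inserted there);
 * false models the example patch, which drops the prev_pc test.
 */
static uint32_t adjust_pc(uint32_t stop_pc, uint32_t prev_pc,
                          bool stepping, bool strict_prev_pc_check)
{
    assert(stop_pc >= DECR_PC_AFTER_BREAK);
    uint32_t breakpoint_pc = stop_pc - DECR_PC_AFTER_BREAK;

    if (!breakpoint_at(breakpoint_pc))
        return stop_pc;        /* not a breakpoint stop, nothing to adjust */
    if (stepping && strict_prev_pc_check && prev_pc != breakpoint_pc)
        return stop_pc;        /* adjustment skipped: PC stays off by 1 */
    return breakpoint_pc;      /* back the PC up onto the breakpoint */
}
```

In this model, a stop at `0xc0101001` while gdb believes it is still stepping from some unrelated `prev_pc` is left unadjusted under the strict check, which is the off-by-one state described above. With the `prev_pc` test dropped, the same stop is reported at `0xc0101000`.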
Re: [Kgdb-bugreport] Getting signal 0 while running kgdb 2.4 on qemu
On 5/16/07, Amit S. Kale <[EMAIL PROTECTED]> wrote:
> Did you get around this problem yet?

After trying the latest version of kgdb with 2.6.17, the problem is gone.

Thanks,
Neo

>
> Can you post the output of gdb and kgdb communication for more info? To get
> the output run gdb command "set debug remote 1" as soon as you start gdb.
> After this gdb will print log of the communication that takes place between
> the two.
> -Amit
>
> On Thursday 26 April 2007 12:35, Neo Jia wrote:
> > hi,
> >
> > Both the host and target platform are Linux IA32. I am trying to debug
> > Linux kernel 2.6.15.5 with the patch files from
> > http://kgdb.linsyssoft.com/getting.htm. But I got the following
> > message in my gdb console and the program terminates.
> >
> > Program terminated with signal 0, Signal 0.
> > The program no longer exists.
> > (gdb) handle 0 nostop print pass
> > Only signals 1-15 are valid as numeric signals.
> > Use "info signals" for a list of symbolic signals.
> >
> > Is it a bug or a wrong configuration? I also post to qemu dev list to
> > see if it is a problem on their side or not.
> >
> > Thanks,
> > Neo
>

--
I would remember that if researchers were not ambitious
probably today we haven't the technology we are using!
Re: [Kgdb-bugreport] [PATCH] Fix singlestep exception kernel->user x86_64 and ia32
> -----Original Message-----
> From: Amit S. Kale [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, May 16, 2007 4:19 AM
> To: Wessel, Jason
> Cc: kgdb-bugreport@lists.sourceforge.net; Sergei Shtylyov; Tom Rini
> Subject: Re: [PATCH] Fix singlestep exception kernel->user
> x86_64 and ia32
>
> Jason,
>
> This check
> atomic_read(&cpu_doing_single_step) != -1 may result in a
> loss of debug events on other cpus
>
> Changing it to
> atomic_read(&cpu_doing_single_step) == raw_smp_processor_id()
> corrects that problem
>
> I see that you've checked in this change. Would you mind
> waiting for 24 hours after posting a patch? Thanks.
> -Amit
>

Thanks for the input, Amit. I'll make that fix immediately and add a bit of time to the review period. I had done an early commit here to start a new branch on my side.

I am evaluating the changes in the 2.6.22-rc1 kernel. At the moment, I have found a new corner case in the 2.6.22-rc1 port where a GPF trap results in a segmentation fault. I am going to see if I can sort it out or duplicate the problem in the 2.6.21 branch.

Thanks,
Jason.
Re: [Kgdb-bugreport] Getting signal 0 while running kgdb 2.4 on qemu
Did you get around this problem yet?

Can you post the output of gdb and kgdb communication for more info? To get the output run gdb command "set debug remote 1" as soon as you start gdb. After this gdb will print log of the communication that takes place between the two.
-Amit

On Thursday 26 April 2007 12:35, Neo Jia wrote:
> hi,
>
> Both the host and target platform are Linux IA32. I am trying to debug
> Linux kernel 2.6.15.5 with the patch files from
> http://kgdb.linsyssoft.com/getting.htm. But I got the following
> message in my gdb console and the program terminates.
>
> Program terminated with signal 0, Signal 0.
> The program no longer exists.
> (gdb) handle 0 nostop print pass
> Only signals 1-15 are valid as numeric signals.
> Use "info signals" for a list of symbolic signals.
>
> Is it a bug or a wrong configuration? I also post to qemu dev list to
> see if it is a problem on their side or not.
>
> Thanks,
> Neo
Re: [Kgdb-bugreport] [PATCH] Fix singlestep exception kernel->user x86_64 and ia32
Jason,

This check
atomic_read(&cpu_doing_single_step) != -1 may result in a loss of debug events on other cpus

Changing it to
atomic_read(&cpu_doing_single_step) == raw_smp_processor_id()
corrects that problem

I see that you've checked in this change. Would you mind waiting for 24 hours after posting a patch? Thanks.
-Amit

On Wednesday 16 May 2007 00:36, Jason Wessel wrote:
> If there are no objections, I'd like to apply this patch as it fixes a
> critical gap when using source stepping that calls a low level
> singlestep that steps the boundary from the kernel to the user space.
> It is particularly bad when this corner case kills "init" as you get an
> unexpected reboot.
>
> ---
>
> This patch is to fix another corner case where kgdb can pass a single
> step trap to user space which was intended for the kernel. Now a
> source level "next" which single steps over an iret instruction will
> cause KGDB to continue and print an error to the console, vs the
> unpredictable consequence of sending a trap to unsuspecting code in
> the user space (normally resulting in process termination).
>
> Signed-off-by: Jason Wessel <[EMAIL PROTECTED]>
>
> ---
> arch/i386/kernel/kgdb.c | 31 +++
> arch/x86_64/kernel/kgdb.c | 17 -
> 2 files changed, 31 insertions(+), 17 deletions(-)
>
> Index: linux-2.6.21-standard/arch/i386/kernel/kgdb.c
> ===
> --- linux-2.6.21-standard.orig/arch/i386/kernel/kgdb.c
> +++ linux-2.6.21-standard/arch/i386/kernel/kgdb.c
> @@ -222,7 +222,7 @@ int kgdb_arch_handle_exception(int e_vec
>      if (remcom_in_buffer[0] == 's') {
>          linux_regs->eflags |= TF_MASK;
>          debugger_step = 1;
> -        atomic_set(&cpu_doing_single_step,smp_processor_id());
> +        atomic_set(&cpu_doing_single_step,raw_smp_processor_id());
>      }
>
>      asm volatile ("movl %%db6, %0\n":"=r" (dr6));
> @@ -255,22 +255,29 @@ static int kgdb_notify(struct notifier_b
>
>      /* Bad memory access? */
>      if (cmd == DIE_PAGE_FAULT_NO_CONTEXT && atomic_read(&debugger_active)
> -        && kgdb_may_fault) {
> +        && kgdb_may_fault) {
>          kgdb_fault_longjmp(kgdb_fault_jmp_regs);
>          return NOTIFY_STOP;
>      } else if (cmd == DIE_PAGE_FAULT)
>          /* A normal page fault, ignore. */
>          return NOTIFY_DONE;
> -    else if ((cmd == DIE_NMI || cmd == DIE_NMI_IPI ||
> -        cmd == DIE_NMIWATCHDOG) && atomic_read(&debugger_active)) {
> -        /* CPU roundup */
> -        kgdb_nmihook(smp_processor_id(), regs);
> -        return NOTIFY_STOP;
> -    } else if (cmd == DIE_NMI_IPI || cmd == DIE_NMI || user_mode(regs) ||
> -        (cmd == DIE_DEBUG && atomic_read(&debugger_active)))
> -        /* Normal watchdog event or userspace debugging, or spurious
> -         * debug exception, ignore. */
> -        return NOTIFY_DONE;
> +    else if ((cmd == DIE_NMI || cmd == DIE_NMI_IPI ||
> +        cmd == DIE_NMIWATCHDOG) && atomic_read(&debugger_active)) {
> +        /* CPU roundup */
> +        kgdb_nmihook(raw_smp_processor_id(), regs);
> +        return NOTIFY_STOP;
> +    } else if (cmd == DIE_DEBUG && atomic_read(&cpu_doing_single_step) != -1
> +        && user_mode(regs)) {
> +        /* single step exception from kernel space to user space so
> +         * eat the exception and continue the process
> +         */
> +        printk(KERN_ERR "KGDB: trap/step from kernel to user space, resuming...\n");
> +        return NOTIFY_STOP;
> +    } else if (cmd == DIE_NMI_IPI || cmd == DIE_NMI || user_mode(regs) ||
> +        (cmd == DIE_DEBUG && atomic_read(&debugger_active)))
> +        /* Normal watchdog event or userspace debugging, or spurious
> +         * debug exception, ignore. */
> +        return NOTIFY_DONE;
>
>      kgdb_handle_exception(args->trapnr, args->signr, args->err, regs);
>
> Index: linux-2.6.21-standard/arch/x86_64/kernel/kgdb.c
> ===
> --- linux-2.6.21-standard.orig/arch/x86_64/kernel/kgdb.c
> +++ linux-2.6.21-standard/arch/x86_64/kernel/kgdb.c
> @@ -183,7 +183,7 @@ int kgdb_arch_handle_exception(int e_vec
>          debugger_step = 1;
>          if (kgdb_contthread)
>              atomic_set(&cpu_doing_single_step,
> -                smp_processor_id());
> +                raw_smp_processor_id());
>
>      }
>
> @@ -237,7 +237,7 @@ void kgdb_shadowinfo(struct pt_regs *reg
>      static char intr_desc[] = "Stack at interrupt entrypoint";
>      static char exc_desc[] = "Stack at exception entrypoint";
>      struct pt_regs *stregs;
> -    int cpu = smp_processor_id();
> +    int cpu = raw_smp_processor_id();
>
>      if ((stregs = in_interrupt_stack(regs->rsp, cpu)))
>          kgdb_mem2hex(intr_desc, buffer, strlen(intr_desc));
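The difference between the two checks discussed in this message can be shown with a small user-space sketch. It is a model only, with several assumptions: the function names and the `stepping_cpu` parameter are hypothetical stand-ins for `atomic_read(&cpu_doing_single_step)`, `this_cpu` stands in for `raw_smp_processor_id()`, and the rest of the `kgdb_notify()` ladder is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Value of cpu_doing_single_step when no CPU is being stepped by kgdb. */
#define NO_CPU_STEPPING (-1)

/*
 * Check as posted in the patch: a user-mode debug trap is swallowed on
 * ANY CPU as long as some CPU is single stepping under kgdb.
 */
static bool eats_trap_as_posted(int stepping_cpu, int this_cpu,
                                bool is_debug_trap, bool in_user_mode)
{
    (void)this_cpu; /* the posted check never looks at the current CPU */
    return is_debug_trap && stepping_cpu != NO_CPU_STEPPING && in_user_mode;
}

/*
 * Check as suggested in the review: only the CPU that kgdb is actually
 * stepping swallows the trap; other CPUs fall through to the later tests.
 */
static bool eats_trap_as_suggested(int stepping_cpu, int this_cpu,
                                   bool is_debug_trap, bool in_user_mode)
{
    assert(this_cpu >= 0); /* a real CPU id is never negative */
    return is_debug_trap && stepping_cpu == this_cpu && in_user_mode;
}
```

With CPU 0 being stepped, a user-mode debug trap that arrives on CPU 1 is swallowed by the posted check but not by the suggested one. That is the lost debug event on another CPU that the review points out.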