On Thu, May 20, 2010 at 11:10:03PM +1000, Paul Mackerras wrote:
> On Thu, May 20, 2010 at 09:36:03AM +0530, K.Prasad wrote:
> (Had this mail composed along with the patchset... but mail server
> issues caused delay in sending this...)
Hi Paul,

While we continue to discuss some of the design decisions, I thought
I'd ready a patchset to capture the changes we agreed upon (lest we
miss them). I have sent a new version of the patchset here:
linuxppc-dev message-id: 20100524040137.ga20...@in.ibm.com. Please
see more responses below.

> > > Right. However, the thread is running the signal handler without the
> > > DABR being set, which is unfortunate.
> > >
> >
> > In order to keep the breakpoint active during signal handling, some
> > PowerPC-specific signal handling code, say do_signal_pending() in
> > arch/powerpc/kernel/signal.c, must be tapped to check for/restore
> > the breakpoint for the process (if it exists).
>
> What I would suggest is something slightly different: anything that
> causes the thread to change where it's executing -- signal delivery,
> modification of NIP with ptrace -- should cancel the single-step and
> reinstall the breakpoint in the DABR. In other words we just forget
> that we hit the breakpoint, and rely on hitting it again if we ever
> get back to that instruction. I think that is by far the most
> reliable approach.
>
> That means that the hw-breakpoint code needs to export a function
> called, say, thread_change_pc(), which is called whenever we're
> changing a thread's userspace PC (NIP) value. If the hw-breakpoint
> code has put that thread into single-step mode, we cancel the
> single-step and if the thread is current, set DABR.
>

I have made changes to the signal-handling code along the suggested
lines (as seen here: linuxppc-dev message-id:
20100524040342.ge20...@in.ibm.com) wherein the debug registers are
populated before signal delivery and cleaned up during signal return.
However, the handling of nested interrupts, where a second exception
is taken inside the signal handler, is still flimsy, and the system
would experience two hw-breakpoint exceptions.
To overcome this, we would need a flag in 'struct thread_struct', or
perhaps in 'struct arch_hw_breakpoint', to indicate a breakpoint
previously taken in signal-handling context. Given that the
repercussions of a double-hit are not dangerous, and being unsure of
how an addition to thread_struct might be received, I've skipped those
changes for now.

> > I'm afraid this is more complexity than we want to handle in the
> > first patchset. I agree that this will create a 'blind-spot' of code
> > which cannot be monitored using breakpoints and may limit debugging
> > efforts (especially for memory corruption); however, suspecting that
> > signal handlers (especially those that don't return to the original
> > instruction) would be rare, I feel that this could be a 'feature'
> > that can be brought in later. What do you think?
>
> I think the thread_change_pc() approach is quite a bit simpler, and I
> think it's better to get this right at the outset rather than have it
> cause bugs later on, when we've all forgotten all the curly
> details. :)

Yes, the details are mostly captured in the latest patchset. Had to
make some 'bold' changes to overcome the issues though :-)

> > > Imagine this scenario: we get the DABR match, set MSR_SE and return to
> > > the task. In the meantime another higher-priority task has become
> > > runnable and our need_resched flag is set, so we call schedule() on
> > > the way back out to usermode. The other task runs and then blocks and
> > > our task gets scheduled again. As part of the context switch,
> > > arch_install_hw_breakpoint() will get called and will set DABR. So
> > > we'll return to usermode with both DABR and MSR_SE set.
> > >
> >
> > I didn't foresee such a possibility.
> > I think this can be handled by putting a check in
> > arch_install_hw_breakpoint() as shown below:
> >
> > int arch_install_hw_breakpoint(struct perf_event *bp)
> > {
> > 	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
> > 	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
> >
> > 	*slot = bp;
> > 	if (!current->thread.last_hit_ubp)
> > 		set_dabr(info->address | info->type | DABR_TRANSLATION);
> > 	return 0;
> > }
>
> Yes, basically, but don't we need to handle per-cpu breakpoints as
> well? That is, we only want the extra check if this breakpoint is a
> per-task breakpoint. Or am I not seeing enough context here?
>

Until version XVIII of the patchset, the antiquated notion of user-
vs kernel-space breakpoints still existed to some extent. Through the
changes made here (reference: linuxppc-dev message-id:
20100524040418.gf20...@in.ibm.com), per-task (whose pid > 0) and
per-cpu breakpoints are suitably identified. Now, per-task breakpoints
can be one of the following:
- User-space breakpoints bound to a process's 'struct task_struct'.
- Kernel-space breakpoints bound to a process's 'struct task_struct'.
- Kernel-thread breakpoints registered through
  register_user_hw_breakpoint() and still identified through 'struct
  task_struct' after cpu-migration.

The above types of breakpoints may be restricted to a given cpu, yet
they are identifiable through the 'task_struct'. I am unable to think
of a case where a cpu-bound breakpoint request (essentially a
kernel-space address request) without a process context (i.e. no
identifiable 'task_struct') could be scheduled and migrated across
CPUs. Moving to version XX, the emulation of instructions for
non-per-task breakpoints provides no scope for pre-emption at all.
Hence the above check should suffice in either case.

> Also, I have just posted extensions to emulate_single_step() to handle
> loads and stores.
> I think this should be enough to handle DABR
> interrupts that occur in kernel mode, so we should never need to
> actually single-step in that case -- if emulate_step fails we should
> just print a warning, uninstall the breakpoint, and continue. That
> way we don't have to worry about all the interrupts and other
> eventualities that could occur while single-stepping in the kernel.
>

It was only while beginning to respond to this section of the mail
that I realised I had failed to implement the suggestion to unregister
the breakpoint if emulate_single_step() failed; hence the quick
successive release of version XX of the patchset :-)

It is true that if emulate_single_step() were powerful enough, it
would simplify the design to a large extent. However, during my tests
(with the extensions to emulate_single_step() applied), I found that
the startup test for ftrace (over ksym_tracer) failed because the
breakpoint got unregistered (as a consequence of an
emulate_single_step() failure). I'm not sure if I missed something
here... pasting the relevant disassembly of the code below.

0xc000000000138a2c <trace_selftest_startup_ksym+124>:	mr	r29,r3
0xc000000000138a30 <trace_selftest_startup_ksym+128>:	blt	cr7,0xc000000000138aac <trace_selftest_startup_ksym+252>
0xc000000000138a34 <trace_selftest_startup_ksym+132>:	lwz	r0,0(r28)	<===== FAILED INSTRUCTION
0xc000000000138a38 <trace_selftest_startup_ksym+136>:	cmpwi	cr7,r0,0
0xc000000000138a3c <trace_selftest_startup_ksym+140>:	bne	cr7,0xc000000000138a48 <trace_selftest_startup_ksym+152>

> For DABR interrupts that occur in user mode, I think the approach of
> single-stepping together with calls to thread_change_pc() in the
> signal delivery and ptrace code should handle all the cases, at least
> for per-task breakpoints. Per-cpu breakpoints that get hit in user
> mode will be a bit trickier, but I assume we can restrict per-cpu
> breakpoints to kernel addresses for now. If you want help adding the
> thread_change_pc() calls to signal delivery and ptrace, let me know.
The newer patchsets should take care of such combinations, i.e. a
kernel-space address for a per-task/per-cpu breakpoint, and a
user-space breakpoint which is per-cpu. The implementations found in
versions XIX and XX handle the constraints in their own ways (although
version XX is much simpler if emulate_single_step() mostly works). In
the absence of any adverse comments, and given a powerful
emulate_single_step() which works for most instructions, I'd prefer to
push version XX of the patchset for upstream integration.

Thanking you again for your help/review comments. Let me know your
thoughts on the newer patchset.

Thanks,
K.Prasad

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev