Re: debug registers and fork
Andi,

On Mon, Mar 05, 2007 at 06:25:16PM +0100, Andi Kleen wrote:
> On Tuesday 27 February 2007 00:51, Stephane Eranian wrote:
> >
> > I have come across an issue with a monitoring using the
> > hardware debug registers on ia64/i386/x86-64.
> >
> > It seems that the way debug registers are inherited across fork
> > differs between ia-64 and i386/x86-64. On ia-64, the debug registers
> > are NEVER inherited in the child. The copy_thread() routine clears
> > the necessary thread flags to avoid reloading the debug registers in
> > the child.
> >
> > Now, on x86-64, it appears that the TIF_DEBUG flag is inherited via
> > setup_thread_stack(). By virtue of dup_task_struct() the debug registers
> > get copied into the child task on fork. So the child has active breakpoints,
> > unless I am mistaken somewhere.
> >
> > Given the way the ptrace() interface works, I would tend to
> > think that the ia-64 way is the correct one. Any comment?
>
> IA64 is probably correct, but changing this might break existing programs.

This could only break programs that use the hardware debug registers and
follow the target across clone.

> Would that be worth the change? What advantage would you have from it.
>
It would be nice to have uniform behavior across architectures on this.
This is purely an OS policy decision. As I said, there is enough support in
ptrace (and probably in utrace) to catch the clone event and set up the
breakpoints in the new task if requested.

The reason I detected this is that the pfmon tool was entering an infinite
loop on SIGTRAP, searching for breakpoints it had not set on a child
process. The pfmon tool was designed on IA-64, so it did not assume that
breakpoints were systematically inherited; quite the contrary, it provides
an option to explicitly inherit them.

> > Furthermore, on i386/x86-64, when switching out from a task with TIF_DEBUG
> > enabled to another which does not, it seems we do not clear the debug
> > registers (at least dr7) so they become inactive.
>
> You mean they leak? Perhaps they should be cleared.
>
Well, a ptrace expert pointed out to me that there is a controlled leak to
the next thread. That leak is terminated in traps.c by noticing that you get
a breakpoint trap in a task which has debugreg[7]=0. In that case, the
kernel simply clears dr7 to shut down any subsequent breakpoint traps. That
kind of lazy stop saves having to clear dr7 in __switch_to() when switching
from a task using the debug registers to one which does not. I still have
to convince myself that this works in all cases, especially if that task
suddenly starts using the debug registers and programs the other debug
registers BEFORE dr7 (assuming it was zero).

--
-Stephane
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: debug registers and fork
On Tuesday 27 February 2007 00:51, Stephane Eranian wrote:
> Hello,
>
> I have come across an issue with a monitoring using the
> hardware debug registers on ia64/i386/x86-64.
>
> It seems that the way debug registers are inherited across fork
> differs between ia-64 and i386/x86-64. On ia-64, the debug registers
> are NEVER inherited in the child. The copy_thread() routine clears
> the necessary thread flags to avoid reloading the debug registers in
> the child.
>
> Now, on x86-64, it appears that the TIF_DEBUG flag is inherited via
> setup_thread_stack(). By virtue of dup_task_struct() the debug registers
> get copied into the child task on fork. So the child has active breakpoints,
> unless I am mistaken somewhere.
>
> Given the way the ptrace() interface works, I would tend to
> think that the ia-64 way is the correct one. Any comment?

IA64 is probably correct, but changing this might break existing programs.
Would that be worth the change? What advantage would you have from it?

> Furthermore, on i386/x86-64, when switching out from a task with TIF_DEBUG
> enabled to another which does not, it seems we do not clear the debug
> registers (at least dr7) so they become inactive.

You mean they leak? Perhaps they should be cleared.

-Andi
Re: debug registers and fork
Alan,

On Wed, Feb 28, 2007 at 07:01:17PM -0500, Alan Stern wrote:
> On Wed, 28 Feb 2007, Roland McGrath wrote:
>
> > It is true that debug registers are inherited by fork and clone.
> > I am 99% sure that this was never specifically intended, but it
> > has been this way for a long time (since 2.4 at least). It's an
> > implicit consequence of the do_fork implementation style, which
> > does a blind copy of the whole task_struct and then explicitly
> > reinitializes some individual fields. I suppose this has some
> > benefit or other, but it is very prone to new pieces of state
> > getting implicitly copied without the person adding that new state
> > ever consciously deciding what its inheritance semantics should be.
> >
> > Alan Stern is working on a revamp of the x86 debug register
> > support. This is a fine opportunity to clean this area up and
> > decide positively what the semantics ought to be.
>
> Absolutely. Right now I just have a placeholder function with a note
> about checking for CLONE_PTRACE. The cleanest solution, far and away,
> would be to have the child process inherit no breakpoints and no debug
> register values.
>
I agree, and that is how we have it on IA-64. With debugging, there is
always another process involved and, no matter what, I think it needs to be
aware of the new child. I don't think automatic inheritance is good. It
should always be triggered by the controlling process (e.g., the debugger).
There is enough support in ptrace to catch the fork/vfork/pthread_create
and decide what to do. This is how I have coded perfmon, so that hardware
performance counters are never automatically inherited.

--
-Stephane
Re: debug registers and fork
On Wed, 28 Feb 2007, Roland McGrath wrote:

> It is true that debug registers are inherited by fork and clone.
> I am 99% sure that this was never specifically intended, but it
> has been this way for a long time (since 2.4 at least). It's an
> implicit consequence of the do_fork implementation style, which
> does a blind copy of the whole task_struct and then explicitly
> reinitializes some individual fields. I suppose this has some
> benefit or other, but it is very prone to new pieces of state
> getting implicitly copied without the person adding that new state
> ever consciously deciding what its inheritance semantics should be.
>
> Alan Stern is working on a revamp of the x86 debug register
> support. This is a fine opportunity to clean this area up and
> decide positively what the semantics ought to be.

Absolutely. Right now I just have a placeholder function with a note
about checking for CLONE_PTRACE. The cleanest solution, far and away,
would be to have the child process inherit no breakpoints and no debug
register values.

Alan Stern
Re: debug registers and fork
It is true that debug registers are inherited by fork and clone.
I am 99% sure that this was never specifically intended, but it
has been this way for a long time (since 2.4 at least). It's an
implicit consequence of the do_fork implementation style, which
does a blind copy of the whole task_struct and then explicitly
reinitializes some individual fields. I suppose this has some
benefit or other, but it is very prone to new pieces of state
getting implicitly copied without the person adding that new state
ever consciously deciding what its inheritance semantics should be.

Alan Stern is working on a revamp of the x86 debug register
support. This is a fine opportunity to clean this area up and
decide positively what the semantics ought to be. When his stuff
gets ported to other machines, that will be a natural way to make
the analogous stuff coherent and sensible on all machines that have
debug-feature CPU state.

AFAIK, gdb expects this behavior but not in the positive sense.
Rather, it finds the kernel's semantics here unhelpful, and has to
work around them. If it has watchpoints on a thread that might
fork, it has to catch the child just to clear the debug registers
even if it never really wanted to be tracing that child. Otherwise,
the fork/clone child that was never ptrace'd at all (and its
children!) might get a spurious SIGTRAP later and dump core for no
apparent reason; at least exec does clear the debug registers
(flush_thread).

Since the debugger interface is the only way to set the debug
registers, this kernel behavior seems rather insane on the face of
it. OTOH, there is always the argument to leave existing behavior
as it is for compatibility's sake. (I won't be shocked to find
some loony application that uses ptrace on its own threads to set
debug registers with the expectation of running a SIGTRAP handler;
such things have been seen out there, though we no longer allow
exactly that with NPTL threads.)

I'm pretty sure gdb won't mind if the inheritance goes away, though
we should check with gdb people to be sure before changing any
semantics. Personally, I don't care whether the semantics of fork
when the debug registers were previously set by ptrace change.
Existing applications already have to cope with the lossage to work
now, and won't be able to go without those workarounds later anyway
if they want to support older kernels.

With Alan's stuff, particular facilities cooperate coherently on
maintaining this thread state, and inheritance semantics for each
particular use will be specified explicitly how that use wants it.
Eventually I think all "raw" use of the debug registers (as by the
current ptrace interfaces) will be obsolete anyway.

It is true that %dr7 is not cleared when switching to a task where
it's logically 0, but that is intentional and not a problem AFAIK.
The trap handler (arch/{i386,x86_64}/kernel/traps.c:do_debug) first
checks if %dr7 is logically 0 in the current task, and if so it
swallows the trap and clears %dr7 in hardware. This also has been
this way for a very long time. I assume that whenever it was first
implemented, someone found reason to think that clearing %dr7 was
more costly overall than the possibility of a spurious trap
(relatively quite unlikely compared to 100% of context switches).
(I have no idea what the overhead is on current or older hardware.)
I have no reason to think there is anything wrong with how this
behaves.

Thanks,
Roland
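For readers following the %dr7 discussion, the lazy-clear scheme described
above can be modelled in a few lines of user-space C. This is only a sketch
under stated assumptions: toy_cpu, toy_thread, toy_switch_to() and
toy_do_debug() are hypothetical names invented for illustration, not the
kernel's __switch_to() or do_debug(), and the model ignores dr0-dr3 and dr6
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: the one hardware %dr7 shared by all tasks on a CPU. */
struct toy_cpu {
    unsigned long dr7;
};

/* Hypothetical model: a task's logical (saved) %dr7 value. */
struct toy_thread {
    unsigned long debugreg7;
};

/*
 * Context switch in the lazy scheme: load %dr7 only when the incoming
 * task uses the debug registers. Otherwise the previous task's value is
 * deliberately left in hardware (the "controlled leak").
 */
static void toy_switch_to(struct toy_cpu *cpu, const struct toy_thread *next)
{
    assert(cpu != NULL && next != NULL);
    if (next->debugreg7 != 0)
        cpu->dr7 = next->debugreg7;
}

/*
 * Debug trap handler: returns true when the trap belongs to the current
 * task and should be reported (e.g. as SIGTRAP). If the current task's
 * logical %dr7 is 0, the trap is swallowed and the hardware register is
 * cleared so that no further spurious traps occur.
 */
static bool toy_do_debug(struct toy_cpu *cpu, const struct toy_thread *current)
{
    assert(cpu != NULL && current != NULL);
    if (current->debugreg7 == 0) {
        cpu->dr7 = 0;
        return false;
    }
    return true;
}
```

In this model, switching from a traced task to an untraced one leaves
cpu->dr7 non-zero until the first stray trap reaches toy_do_debug(), which
is the cost/benefit trade-off discussed above: no write on every context
switch, at the price of at most one swallowed trap.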
Re: debug registers and fork
> On Mon, 26 Feb 2007 15:51:54 -0800 Stephane Eranian <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have come across an issue with a monitoring using the
> hardware debug registers on ia64/i386/x86-64.
>
> It seems that the way debug registers are inherited across fork
> differs between ia-64 and i386/x86-64. On ia-64, the debug registers
> are NEVER inherited in the child. The copy_thread() routine clears
> the necessary thread flags to avoid reloading the debug registers in
> the child.
>
> Now, on x86-64, it appears that the TIF_DEBUG flag is inherited via
> setup_thread_stack(). By virtue of dup_task_struct() the debug registers
> get copied into the child task on fork. So the child has active breakpoints,
> unless I am mistaken somewhere.
>
> Given the way the ptrace() interface works, I would tend to
> think that the ia-64 way is the correct one. Any comment?
>
> Furthermore, on i386/x86-64, when switching out from a task with TIF_DEBUG
> enabled to another which does not, it seems we do not clear the debug
> registers (at least dr7) so they become inactive.
>

Let's cc Roland - he's totally rewritten ptrace and probably knows this
stuff.
debug registers and fork
Hello,

I have come across an issue with a monitoring tool using the
hardware debug registers on ia64/i386/x86-64.

It seems that the way debug registers are inherited across fork
differs between ia-64 and i386/x86-64. On ia-64, the debug registers
are NEVER inherited by the child. The copy_thread() routine clears
the necessary thread flags to avoid reloading the debug registers in
the child.

Now, on x86-64, it appears that the TIF_DEBUG flag is inherited via
setup_thread_stack(). By virtue of dup_task_struct() the debug registers
get copied into the child task on fork. So the child has active breakpoints,
unless I am mistaken somewhere.

Given the way the ptrace() interface works, I would tend to
think that the ia-64 way is the correct one. Any comment?

Furthermore, on i386/x86-64, when switching out from a task with TIF_DEBUG
enabled to another which does not have it, it seems we do not clear the
debug registers (at least dr7) to make them inactive.

--
-Stephane
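The difference between the two fork policies contrasted above can be
illustrated with a small user-space model. This is a sketch, not kernel
code: toy_task, toy_fork() and toy_has_active_breakpoints() are hypothetical
names made up for the example, and the struct only stands in for the few
pieces of per-task state mentioned in the message (the TIF_DEBUG flag and
the saved debug registers). The only hardware fact assumed is that the low
eight bits of dr7 are the local/global enable bits for the four breakpoints.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for per-task state; NOT the kernel's task_struct. */
struct toy_task {
    int pid;
    bool tif_debug;             /* models the TIF_DEBUG thread flag */
    unsigned long debugreg[8];  /* models the saved dr0..dr7 values */
};

enum toy_fork_policy {
    TOY_INHERIT,  /* blind copy, as described for i386/x86-64 */
    TOY_CLEAR     /* debug state cleared in the child, as described for ia-64 */
};

/* Copy the parent wholesale, then explicitly reinitialize selected fields. */
static struct toy_task toy_fork(const struct toy_task *parent, int child_pid,
                                enum toy_fork_policy policy)
{
    struct toy_task child;

    assert(parent != NULL);
    child = *parent;        /* the "blind copy" step */
    child.pid = child_pid;  /* a field that is always reinitialized */

    if (policy == TOY_CLEAR) {
        /* The child starts with no debug state at all. */
        child.tif_debug = false;
        memset(child.debugreg, 0, sizeof child.debugreg);
    }
    return child;
}

/* A task has armed breakpoints if the flag is set and any dr7 enable bit is on. */
static bool toy_has_active_breakpoints(const struct toy_task *t)
{
    return t->tif_debug && (t->debugreg[7] & 0xffUL) != 0;
}
```

With TOY_INHERIT, a child forked from a parent that has a breakpoint armed
reports active breakpoints it never asked for; with TOY_CLEAR it does not,
and a debugger that wants inheritance has to re-arm them itself.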