Re: ptrace attach in multi-threaded processes

2016-07-16 Thread Konstantin Belousov
On Fri, Jul 15, 2016 at 11:01:59AM -0700, Mark Johnston wrote:
> Thanks, this seems to give the desired behaviour in the single-threaded
> case. I'll write a test case for the multi-threaded case next.
> 
> Am I correct in thinking that r302179 could be reverted if your change
> is committed?
I suspect that it is not.

Suppose that we have a single-threaded process which only thread
is right on the syscall exit path when the debugger is attached,
and debugger requested PTRACE_SCX stops. Then the debuggee reaches
the ptracestop(SIGTRAP) stop point before cursig(9) is ever called.
So despite the patch, first reported signal is SIGTRAP and not the
attaching STOP.  If debugger detaches right after that, the process
should still be left in stopped state.

You may object that gcore(1) does not request SCX, but my point is that
even with the patch, first reported signal could be other than the
STOP.  I suspect that some other ptracestop() call might cause that
behaviour either now or in future even with the default event mask.

So I would left the r302179 in place.

Below is the patch with reverted next_xthread() bits. I reverted them
because Peter found the changes to require much more work to not cause
regressions.

diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
index 1da4b99..9e1a494 100644
--- a/sys/kern/kern_sig.c
+++ b/sys/kern/kern_sig.c
@@ -2726,7 +2726,20 @@ issignal(struct thread *td)
SIG_STOPSIGMASK(sigpending);
if (SIGISEMPTY(sigpending)) /* no signal to send */
return (0);
-   sig = sig_ffs(&sigpending);
+   if ((p->p_flag & (P_TRACED | P_PPTRACE)) == P_TRACED &&
+   (p->p_flag2 & P2_PTRACE_FSTP) != 0 &&
+   SIGISMEMBER(sigpending, SIGSTOP)) {
+   /*
+* If debugger just attached, always consume
+* SIGSTOP from ptrace(PT_ATTACH) first, to
+* execute the debugger attach ritual in
+* order.
+*/
+   sig = SIGSTOP;
+   p->p_flag2 &= ~P2_PTRACE_FSTP;
+   } else {
+   sig = sig_ffs(&sigpending);
+   }
 
if (p->p_stops & S_SIG) {
mtx_unlock(&ps->ps_mtx);
@@ -2743,7 +2756,7 @@ issignal(struct thread *td)
sigqueue_delete(&p->p_sigqueue, sig);
continue;
}
-   if (p->p_flag & P_TRACED && (p->p_flag & P_PPTRACE) == 0) {
+   if ((p->p_flag & (P_TRACED | P_PPTRACE)) == P_TRACED) {
/*
 * If traced, always stop.
 * Remove old signal from queue before the stop.
@@ -2846,6 +2859,8 @@ issignal(struct thread *td)
mtx_unlock(&ps->ps_mtx);
WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK,
&p->p_mtx.lock_object, "Catching SIGSTOP");
+   sigqueue_delete(&td->td_sigqueue, sig);
+   sigqueue_delete(&p->p_sigqueue, sig);
p->p_flag |= P_STOPPED_SIG;
p->p_xsig = sig;
PROC_SLOCK(p);
@@ -2853,7 +2868,7 @@ issignal(struct thread *td)
thread_suspend_switch(td, p);
PROC_SUNLOCK(p);
mtx_lock(&ps->ps_mtx);
-   break;
+   goto next;
} else if (prop & SA_IGNORE) {
/*
 * Except for SIGCONT, shouldn't get here.
@@ -2884,6 +2899,7 @@ issignal(struct thread *td)
}
sigqueue_delete(&td->td_sigqueue, sig); /* take the signal! */
sigqueue_delete(&p->p_sigqueue, sig);
+next:;
}
/* NOTREACHED */
 }
diff --git a/sys/kern/sys_process.c b/sys/kern/sys_process.c
index f1477ce..86e7c52 100644
--- a/sys/kern/sys_process.c
+++ b/sys/kern/sys_process.c
@@ -900,6 +900,7 @@ kern_ptrace(struct thread *td, int req, pid_t pid, void 
*addr, int data)
case PT_TRACE_ME:
/* set my trace flag and "owner" so it can read/write me */
p->p_flag |= P_TRACED;
+   p->p_flag2 |= P2_PTRACE_FSTP;
p->p_ptevents = PTRACE_DEFAULT;
if (p->p_flag & P_PPWAIT)
p->p_flag |= P_PPTRACE;
@@ -919,6 +920,7 @@ kern_ptrace(struct thread *td, int req, pid_t pid, void 
*addr, int data)
 * on a "detach".
 */
p->p_flag |= P_TRACED;
+   p->p_flag2 |= P2_PTRACE_FSTP;
p->p_ptevents = PTRACE_DEFAULT;
p->p_oppid = p->p_p

Re: ptrace attach in multi-threaded processes

2016-07-15 Thread Mark Johnston
On Fri, Jul 15, 2016 at 10:27:20AM +0300, Konstantin Belousov wrote:
> On Thu, Jul 14, 2016 at 11:16:05AM -0700, Mark Johnston wrote:
> > Please see the program here:
> > https://people.freebsd.org/~markj/ptrace_stop.c
> > 
> > It cheats a bit: it uses SIGSTOP to stop the child before sending a
> > SIGHUP to it. However, this is just for convenience; note that PT_ATTACH
> > will result in a call to thread_unsuspend() on the child, so PT_ATTACH's
> > SIGSTOP will be delivered to a running process. When ptrace attaches,
> > the child stops and WSTOPSIG(status) == SIGHUP. When ptrace detaches,
> > the child is left stopped.
> No, it is not for convenience, it relies on another bug to get the effect,
> see below.

I see. I should have noted that the result can be reproduced without the
first SIGSTOP, just not reliably. That is, I still occasionally get the
following output when the kill(SIGSTOP) and subsequent waitpid() call
are removed:

stopping signal is 1
waiting on child...
child is stopped after detach (sig 17)

> 
> As I understand you intent, you prefer to get SIGSTOP from the first
> waitpid(2) call after successful PT_ATTACH, am I right ?

Hm, I don't care very much about that. I was just addressing your claim
that the "debugger interface guarantees that SIGSTOP is noted." 

> At least for
> single-threaded case, this can be achieved with a flag indicating that
> we a doing first cursig(9) action after the attach, and preferring
> SIGSTOP over any other queued signal.  The new flag P2_PTRACE_FSTP
> does just that.  For mt case, I believe that some enchancements to
> my proc_next_xthread() would fix that.

This seems like a sound approach to me. It provides the guarantee I
referenced above, and ensures that the SIGSTOP from PT_ATTACH is
delivered before PT_DETACH.

> 
> But when debugging the code, I found that it still does not work reliably
> for your test.  The reason is that issignal() consumes a queued stop signal
> after the thread_suspend_switch().  It allows the attach to occur, but then
> sigqueue_delete() calls ('take the signal!') eat the signal for attach. It
> seems that we should consume stops before going to stop state.  An open
> question is how much this hurts when another (non-debugging) SIGSTOP is
> queued while in stopped state.
> 
> Please try this.

Thanks, this seems to give the desired behaviour in the single-threaded
case. I'll write a test case for the multi-threaded case next.

Am I correct in thinking that r302179 could be reverted if your change
is committed?
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-15 Thread Konstantin Belousov
On Thu, Jul 14, 2016 at 11:16:05AM -0700, Mark Johnston wrote:
> Please see the program here:
> https://people.freebsd.org/~markj/ptrace_stop.c
> 
> It cheats a bit: it uses SIGSTOP to stop the child before sending a
> SIGHUP to it. However, this is just for convenience; note that PT_ATTACH
> will result in a call to thread_unsuspend() on the child, so PT_ATTACH's
> SIGSTOP will be delivered to a running process. When ptrace attaches,
> the child stops and WSTOPSIG(status) == SIGHUP. When ptrace detaches,
> the child is left stopped.
No, it is not for convenience, it relies on another bug to get the effect,
see below.

As I understand you intent, you prefer to get SIGSTOP from the first
waitpid(2) call after successful PT_ATTACH, am I right ?  At least for
single-threaded case, this can be achieved with a flag indicating that
we a doing first cursig(9) action after the attach, and preferring
SIGSTOP over any other queued signal.  The new flag P2_PTRACE_FSTP
does just that.  For mt case, I believe that some enchancements to
my proc_next_xthread() would fix that.

But when debugging the code, I found that it still does not work reliably
for your test.  The reason is that issignal() consumes a queued stop signal
after the thread_suspend_switch().  It allows the attach to occur, but then
sigqueue_delete() calls ('take the signal!') eat the signal for attach. It
seems that we should consume stops before going to stop state.  An open
question is how much this hurts when another (non-debugging) SIGSTOP is
queued while in stopped state.

Please try this.

diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
index 2a5e6de..36ed15f 100644
--- a/sys/kern/kern_sig.c
+++ b/sys/kern/kern_sig.c
@@ -2525,22 +2525,21 @@ ptracestop(struct thread *td, int sig)
PROC_SUNLOCK(p);
return (sig);
}
-   /*
-* Just make wait() to work, the last stopped thread
-* will win.
-*/
-   p->p_xsig = sig;
-   p->p_xthread = td;
-   p->p_flag |= (P_STOPPED_SIG|P_STOPPED_TRACE);
-   sig_suspend_threads(td, p, 0);
-   if ((td->td_dbgflags & TDB_STOPATFORK) != 0) {
-   td->td_dbgflags &= ~TDB_STOPATFORK;
-   cv_broadcast(&p->p_dbgwait);
+   if (p->p_xthread == NULL)
+   p->p_xthread = td;
+   if (p->p_xthread == td) {
+   p->p_xsig = sig;
+   p->p_flag |= P_STOPPED_SIG | P_STOPPED_TRACE;
+   sig_suspend_threads(td, p, 0);
+   if ((td->td_dbgflags & TDB_STOPATFORK) != 0) {
+   td->td_dbgflags &= ~TDB_STOPATFORK;
+   cv_broadcast(&p->p_dbgwait);
+   }
}
 stopme:
thread_suspend_switch(td, p);
if (p->p_xthread == td)
-   p->p_xthread = NULL;
+   proc_next_xthread(p);
if (!(p->p_flag & P_TRACED))
break;
if (td->td_dbgflags & TDB_SUSPEND) {
@@ -2725,7 +2724,20 @@ issignal(struct thread *td)
SIG_STOPSIGMASK(sigpending);
if (SIGISEMPTY(sigpending)) /* no signal to send */
return (0);
-   sig = sig_ffs(&sigpending);
+   if ((p->p_flag & (P_TRACED | P_PPTRACE)) == P_TRACED &&
+   (p->p_flag2 & P2_PTRACE_FSTP) != 0 &&
+   SIGISMEMBER(sigpending, SIGSTOP)) {
+   /*
+* If debugger just attached, always consume
+* SIGSTOP from ptrace(PT_ATTACH) first, to
+* execute the debugger attach ritual in
+* order.
+*/
+   sig = SIGSTOP;
+   p->p_flag2 &= ~P2_PTRACE_FSTP;
+   } else {
+   sig = sig_ffs(&sigpending);
+   }
 
if (p->p_stops & S_SIG) {
mtx_unlock(&ps->ps_mtx);
@@ -2742,7 +2754,7 @@ issignal(struct thread *td)
sigqueue_delete(&p->p_sigqueue, sig);
continue;
}
-   if (p->p_flag & P_TRACED && (p->p_flag & P_PPTRACE) == 0) {
+   if ((p->p_flag & (P_TRACED | P_PPTRACE)) == P_TRACED) {
/*
 * If traced, always stop.
 * Remove old signal from queue before the stop.
@@ -2845,6 +2857,8 @@ issignal(struct thread *td)
mtx_unlock(&ps->ps_mtx);
WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK,
&p->p_mtx.lock_object, "Catching SIGSTOP");
+   sigqueue_delete

Re: ptrace attach in multi-threaded processes

2016-07-14 Thread Mark Johnston
On Thu, Jul 14, 2016 at 08:25:37AM +0300, Konstantin Belousov wrote:
> On Wed, Jul 13, 2016 at 01:01:39PM -0700, Mark Johnston wrote:
> > On Wed, Jul 13, 2016 at 10:19:47PM +0300, Konstantin Belousov wrote:
> > > Hmm, I think no, we can not make such change. Issue is, debugger
> > > interface guarantees (at least for single-threaded programs it is
> > > done correctly) that SIGSTOP is noted. In my opinion, it would be the
> > > incompatible API change.
> > 
> > But this guarantee is not honoured in the single-threaded case where
> > PT_ATTACH sends SIGSTOP after another signal is already pending. This
> > other signal will stop the process in ptracestop(), so SIGSTOP will not
> > be reported until after a PT_CONTINUE or PT_DETACH, which seems to
> > violate the interface as you described it. Am I missing some reason
> > that this cannot occur? If not, I'll write a test case for the
> > single-threaded case first.
> 
> Please give me some initial test case, I am fine with single-threaded case.
> I do not think that the mt test would be much different ?

Please see the program here:
https://people.freebsd.org/~markj/ptrace_stop.c

It cheats a bit: it uses SIGSTOP to stop the child before sending a
SIGHUP to it. However, this is just for convenience; note that PT_ATTACH
will result in a call to thread_unsuspend() on the child, so PT_ATTACH's
SIGSTOP will be delivered to a running process. When ptrace attaches,
the child stops and WSTOPSIG(status) == SIGHUP. When ptrace detaches,
the child is left stopped.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-13 Thread Konstantin Belousov
On Wed, Jul 13, 2016 at 01:01:39PM -0700, Mark Johnston wrote:
> On Wed, Jul 13, 2016 at 10:19:47PM +0300, Konstantin Belousov wrote:
> > On Wed, Jul 13, 2016 at 09:42:47AM -0700, Mark Johnston wrote:
> > > I'm having trouble determining if the diff changes any userland-visible
> > > behaviour. It seems that the only potential problem with the current
> > > p_xthread handling is in stopevent(), since a thread calling stopevent()
> > > from postsig() may clear p_xthread after it was set by another thread in
> > > ptracestop(). But I also don't understand why we call stopevent(S_SIG)
> > > from both issignal() and postsig() - this would appear to stop the
> > > thread twice for the same signal.
> > You mean that the patch would not fix your issue ? Quite possible, it
> > might require some more code to 'move the torch' to next xthread, so to
> > say. When you write the test case, I will spend efforts on the working
> > patch.
> 
> I don't think this addresses my issue of the process remaining stopped
> after the PT_DETACH, but see below.
Patch tries to add some coordination to ptracestop(), I do not object
to the statement that what was done is not enough.

> 
> > 
> > That said, I do not think that we should change anything about stopevent(),
> > since this is code which is on life support.  If we cannot remove procfs
> > debugging interface, let not change it at least in incompatible ways.
> > 
> > > 
> > > With respect to the desired direction, do you agree that the SIGSTOP
> > > from PT_ATTACH should effectively be ignored if a different signal stops
> > > the process first? As I said in a previous post, it seems that the
> > > SA_STOP property of PT_ATTACH's SIGSTOP is not used in the common case,
> > > since ptracestop() will stop the process if any signal is received, and
> > > the PT_DETACH operation will typically overwrite the SIGSTOP with 0 in
> > > td_xsig.
> > Hmm, I think no, we can not make such change. Issue is, debugger
> > interface guarantees (at least for single-threaded programs it is
> > done correctly) that SIGSTOP is noted. In my opinion, it would be the
> > incompatible API change.
> 
> But this guarantee is not honoured in the single-threaded case where
> PT_ATTACH sends SIGSTOP after another signal is already pending. This
> other signal will stop the process in ptracestop(), so SIGSTOP will not
> be reported until after a PT_CONTINUE or PT_DETACH, which seems to
> violate the interface as you described it. Am I missing some reason
> that this cannot occur? If not, I'll write a test case for the
> single-threaded case first.

Please give me some initial test case, I am fine with single-threaded case.
I do not think that the mt test would be much different ?
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-13 Thread Mark Johnston
On Wed, Jul 13, 2016 at 10:19:47PM +0300, Konstantin Belousov wrote:
> On Wed, Jul 13, 2016 at 09:42:47AM -0700, Mark Johnston wrote:
> > I'm having trouble determining if the diff changes any userland-visible
> > behaviour. It seems that the only potential problem with the current
> > p_xthread handling is in stopevent(), since a thread calling stopevent()
> > from postsig() may clear p_xthread after it was set by another thread in
> > ptracestop(). But I also don't understand why we call stopevent(S_SIG)
> > from both issignal() and postsig() - this would appear to stop the
> > thread twice for the same signal.
> You mean that the patch would not fix your issue ? Quite possible, it
> might require some more code to 'move the torch' to next xthread, so to
> say. When you write the test case, I will spend efforts on the working
> patch.

I don't think this addresses my issue of the process remaining stopped
after the PT_DETACH, but see below.

> 
> That said, I do not think that we should change anything about stopevent(),
> since this is code which is on life support.  If we cannot remove procfs
> debugging interface, let not change it at least in incompatible ways.
> 
> > 
> > With respect to the desired direction, do you agree that the SIGSTOP
> > from PT_ATTACH should effectively be ignored if a different signal stops
> > the process first? As I said in a previous post, it seems that the
> > SA_STOP property of PT_ATTACH's SIGSTOP is not used in the common case,
> > since ptracestop() will stop the process if any signal is received, and
> > the PT_DETACH operation will typically overwrite the SIGSTOP with 0 in
> > td_xsig.
> Hmm, I think no, we can not make such change. Issue is, debugger
> interface guarantees (at least for single-threaded programs it is
> done correctly) that SIGSTOP is noted. In my opinion, it would be the
> incompatible API change.

But this guarantee is not honoured in the single-threaded case where
PT_ATTACH sends SIGSTOP after another signal is already pending. This
other signal will stop the process in ptracestop(), so SIGSTOP will not
be reported until after a PT_CONTINUE or PT_DETACH, which seems to
violate the interface as you described it. Am I missing some reason
that this cannot occur? If not, I'll write a test case for the
single-threaded case first.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-13 Thread Konstantin Belousov
On Wed, Jul 13, 2016 at 09:42:47AM -0700, Mark Johnston wrote:
> On Wed, Jul 13, 2016 at 07:54:39AM +0300, Konstantin Belousov wrote:
> > I finally see.  Might be something like the patch below is a step in
> > the desired direction.  Idea is in the proc_next_xthread(): p_xthread
> > should be set to the next thread with a pending signal.  Do you have a
> > test case that demonstrates the issue ?
> 
> Not yet, I'll work on one. I've only seen this occur once in an Isilon
> test cluster.
> 
> The diff makes sense to me, thanks. I'd find the code easier to read if
> proc_next_xthread() was a pure function that returned the flagged
> thread instead of setting p_xthread.
I did coded it this way,
struct thread *proc_next_xthread(struct proc *p);
initially, but I did not liked that all uses are
p->p_xthread = proc_next_xthread(p);

> 
> I'm having trouble determining if the diff changes any userland-visible
> behaviour. It seems that the only potential problem with the current
> p_xthread handling is in stopevent(), since a thread calling stopevent()
> from postsig() may clear p_xthread after it was set by another thread in
> ptracestop(). But I also don't understand why we call stopevent(S_SIG)
> from both issignal() and postsig() - this would appear to stop the
> thread twice for the same signal.
You mean that the patch would not fix your issue ? Quite possible, it
might require some more code to 'move the torch' to next xthread, so to
say. When you write the test case, I will spend efforts on the working
patch.

That said, I do not think that we should change anything about stopevent(),
since this is code which is on life support.  If we cannot remove procfs
debugging interface, let not change it at least in incompatible ways.

> 
> With respect to the desired direction, do you agree that the SIGSTOP
> from PT_ATTACH should effectively be ignored if a different signal stops
> the process first? As I said in a previous post, it seems that the
> SA_STOP property of PT_ATTACH's SIGSTOP is not used in the common case,
> since ptracestop() will stop the process if any signal is received, and
> the PT_DETACH operation will typically overwrite the SIGSTOP with 0 in
> td_xsig.
Hmm, I think no, we can not make such change. Issue is, debugger
interface guarantees (at least for single-threaded programs it is
done correctly) that SIGSTOP is noted. In my opinion, it would be the
incompatible API change.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-13 Thread Mark Johnston
On Wed, Jul 13, 2016 at 07:54:39AM +0300, Konstantin Belousov wrote:
> I finally see.  Might be something like the patch below is a step in
> the desired direction.  Idea is in the proc_next_xthread(): p_xthread
> should be set to the next thread with a pending signal.  Do you have a
> test case that demonstrates the issue ?

Not yet, I'll work on one. I've only seen this occur once in an Isilon
test cluster.

The diff makes sense to me, thanks. I'd find the code easier to read if
proc_next_xthread() was a pure function that returned the flagged
thread instead of setting p_xthread.

I'm having trouble determining if the diff changes any userland-visible
behaviour. It seems that the only potential problem with the current
p_xthread handling is in stopevent(), since a thread calling stopevent()
from postsig() may clear p_xthread after it was set by another thread in
ptracestop(). But I also don't understand why we call stopevent(S_SIG)
from both issignal() and postsig() - this would appear to stop the
thread twice for the same signal.

With respect to the desired direction, do you agree that the SIGSTOP
from PT_ATTACH should effectively be ignored if a different signal stops
the process first? As I said in a previous post, it seems that the
SA_STOP property of PT_ATTACH's SIGSTOP is not used in the common case,
since ptracestop() will stop the process if any signal is received, and
the PT_DETACH operation will typically overwrite the SIGSTOP with 0 in
td_xsig.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-12 Thread Konstantin Belousov
On Tue, Jul 12, 2016 at 09:02:10PM -0700, Mark Johnston wrote:
> Hm, the only SIGSTOP in my scenario is the one generated by PT_ATTACH.
> The problem occurs when this SIGSTOP races with *any* other signal that's
> being delivered to the process and for which the process has registered
> a handler. For instance, SIGHUP after a log rotation.
> 
> If a real SIGSTOP races with PT_ATTACH, then I would indeed expect to
> find the process in a stopped state after the detach. Does this make
> more sense?

I finally see.  Might be something like the patch below is a step in
the desired direction.  Idea is in the proc_next_xthread(): p_xthread
should be set to the next thread with a pending signal.  Do you have a
test case that demonstrates the issue ?

diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
index 2a5e6de..1106f3a 100644
--- a/sys/kern/kern_sig.c
+++ b/sys/kern/kern_sig.c
@@ -2525,22 +2525,21 @@ ptracestop(struct thread *td, int sig)
PROC_SUNLOCK(p);
return (sig);
}
-   /*
-* Just make wait() to work, the last stopped thread
-* will win.
-*/
-   p->p_xsig = sig;
-   p->p_xthread = td;
-   p->p_flag |= (P_STOPPED_SIG|P_STOPPED_TRACE);
-   sig_suspend_threads(td, p, 0);
-   if ((td->td_dbgflags & TDB_STOPATFORK) != 0) {
-   td->td_dbgflags &= ~TDB_STOPATFORK;
-   cv_broadcast(&p->p_dbgwait);
+   if (p->p_xthread == NULL)
+   p->p_xthread = td;
+   if (p->p_xthread == td) {
+   p->p_xsig = sig;
+   p->p_flag |= (P_STOPPED_SIG|P_STOPPED_TRACE);
+   sig_suspend_threads(td, p, 0);
+   if ((td->td_dbgflags & TDB_STOPATFORK) != 0) {
+   td->td_dbgflags &= ~TDB_STOPATFORK;
+   cv_broadcast(&p->p_dbgwait);
+   }
}
 stopme:
thread_suspend_switch(td, p);
if (p->p_xthread == td)
-   p->p_xthread = NULL;
+   proc_next_xthread(p);
if (!(p->p_flag & P_TRACED))
break;
if (td->td_dbgflags & TDB_SUSPEND) {
diff --git a/sys/kern/sys_process.c b/sys/kern/sys_process.c
index a6037e3..3d9950b 100644
--- a/sys/kern/sys_process.c
+++ b/sys/kern/sys_process.c
@@ -1057,7 +1057,7 @@ kern_ptrace(struct thread *td, int req, pid_t pid, void 
*addr, int data)
proctree_locked = 0;
}
p->p_xsig = data;
-   p->p_xthread = NULL;
+   proc_next_xthread(p);
if ((p->p_flag & (P_STOPPED_SIG | P_STOPPED_TRACE)) != 0) {
/* deliver or queue signal */
td2->td_dbgflags &= ~TDB_XSIG;
@@ -1065,7 +1065,8 @@ kern_ptrace(struct thread *td, int req, pid_t pid, void 
*addr, int data)
 
if (req == PT_DETACH) {
FOREACH_THREAD_IN_PROC(p, td3)
-   td3->td_dbgflags &= ~TDB_SUSPEND; 
+   td3->td_dbgflags &= ~(TDB_SUSPEND |
+   TDB_XSIG);
}
/*
 * unsuspend all threads, to not let a thread run,
@@ -1376,9 +1377,24 @@ stopevent(struct proc *p, unsigned int event, unsigned 
int val)
do {
if (event != S_EXIT)
p->p_xsig = val;
-   p->p_xthread = NULL;
+   proc_next_xthread(p);
p->p_stype = event; /* Which event caused the stop? */
wakeup(&p->p_stype);/* Wake up any PIOCWAIT'ing procs */
msleep(&p->p_step, &p->p_mtx, PWAIT, "stopevent", 0);
} while (p->p_step);
 }
+
+void
+proc_next_xthread(struct proc *p)
+{
+   struct thread *td;
+
+   PROC_LOCK_ASSERT(p, MA_OWNED);
+   FOREACH_THREAD_IN_PROC(p, td) {
+   if (td == p->p_xthread)
+   continue;
+   if ((td->td_dbgflags & TDB_XSIG) != 0)
+   break;
+   }
+   p->p_xthread = td;
+}
diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index f533db6..a3132d9 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -999,6 +999,7 @@ int proc_getenvv(struct thread *td, struct proc *p, struct 
sbuf *sb);
 void   procinit(void);
 void   proc_linkup0(struct proc *p, struct thread *td);
 void   proc_linkup(struct proc *p, struct thread *td);
+void   proc_next_xthread(struct proc *p);
 struct proc *proc_realparent(struct proc *child);
 void   proc_reap(struct thread *td, struct proc *p, int *status, int options);
 void   proc_reparent(struct proc *child, struct proc *newparent);
_

Re: ptrace attach in multi-threaded processes

2016-07-12 Thread Mark Johnston
On Wed, Jul 13, 2016 at 06:30:36AM +0300, Konstantin Belousov wrote:
> On Tue, Jul 12, 2016 at 11:24:14AM -0700, Mark Johnston wrote:
> > On Tue, Jul 12, 2016 at 08:51:50PM +0300, Konstantin Belousov wrote:
> > > On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> > > > On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> > > > I suppose it is not strictly incorrect. I find it surprising that a
> > > > PT_ATTACH followed by a PT_DETACH may leave the process in a different
> > > > state than it was in before the attach. This means that it is not
> > > > possible to gcore a process without potentially leaving it stopped, for
> > > > instance. This result may occur in a single-threaded process
> > > > as well, since a signal may already be queued when the PT_ATTACH handler
> > > > sends SIGSTOP.
> > > I still miss somethine.  Isn't this an expected outcome from sending a
> > > signal with STOP action ?
> > 
> > It is. But I also expect a PT_DETACH operation to resume a stopped
> > process, assuming that a second SIGSTOP was not posted while the
> > process was suspended.
> But as far as the situation was discussed, it seems that real SIGSTOP raced
> with PT_ATTACH. And the offered interpretation that SIGSTOP was delivered
> 'a bit later' than PT_ATTACH would fit into the description.

Hm, the only SIGSTOP in my scenario is the one generated by PT_ATTACH.
The problem occurs when this SIGSTOP races with *any* other signal that's
being delivered to the process and for which the process has registered
a handler. For instance, SIGHUP after a log rotation.

If a real SIGSTOP races with PT_ATTACH, then I would indeed expect to
find the process in a stopped state after the detach. Does this make
more sense?
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-12 Thread Konstantin Belousov
On Tue, Jul 12, 2016 at 11:24:14AM -0700, Mark Johnston wrote:
> On Tue, Jul 12, 2016 at 08:51:50PM +0300, Konstantin Belousov wrote:
> > On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> > > On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> > > I suppose it is not strictly incorrect. I find it surprising that a
> > > PT_ATTACH followed by a PT_DETACH may leave the process in a different
> > > state than it was in before the attach. This means that it is not
> > > possible to gcore a process without potentially leaving it stopped, for
> > > instance. This result may occur in a single-threaded process
> > > as well, since a signal may already be queued when the PT_ATTACH handler
> > > sends SIGSTOP.
> > I still miss somethine.  Isn't this an expected outcome from sending a
> > signal with STOP action ?
> 
> It is. But I also expect a PT_DETACH operation to resume a stopped
> process, assuming that a second SIGSTOP was not posted while the
> process was suspended.
But as far as the situation was discussed, it seems that real SIGSTOP raced
with PT_ATTACH. And the offered interpretation that SIGSTOP was delivered
'a bit later' than PT_ATTACH would fit into the description.

> 
> > 
> > > Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
> > > represented a bug in ptracestop().
> > It could, I did not made any statements that deny the bug:
> 
> To be clear, the root of my issue comes from the following: the SIGSTOP
> from PT_ATTACH may be handled concurrently with a second signal
> delivered to a second thread in the same process. Then, the resulting
> behaviour depends on the order in which the recipient threads suspend in
> ptracestop(). If the thread that received SIGSTOP suspends last, its
> td_xsig will be overwritten with the userland-provided value in the
> PT_DETACH handler. If it suspends first, its td_xsig will be preserved,
> and upon PT_DETACH the process will be suspended again in issignal().
> 
> I'm not sure if this is considered a bug. ptracestop() is handling all
> signals (including the SIGSTOP generated by the PT_ATTACH handler) in a
> consistent way, but this results in inconsistent behaviour from the
> perspective of a ptrace(2) consumer.

Still I do not understand what is inconsistent.

Let look at it from the other side (before, we discussed the implementation
in kernel).  Is this happens in gcore(1) ?   If yes, gcore interaction
with ptrace(2) looks like this:
ptrace(PT_ATTACH, g_pid);
waitpid(g_pid, &g_status, 0);
...
if (sig == SIGSTOP)
sig = 0;
ptrace(PT_DETACH, g_pid, 1, sig);
It sounds as if it is desirable for you to modify gcore(1) to consume
all signals, or at least, all STOP signals before PT_DETACH.  I do not
understand why do you want it, but that would probably give you the
behaviour you want:
ptrace(PT_ATTACH, g_pid);
waitpid(g_pid, &g_status, 0);
...
/* still consume implicit SIGSTOP from attach */
if (sig == SIGSTOP)
sig = 0;
do {
error = waitpid(g_pid, &g_status, WNOHANG | WSTOPPED);
} while (error == 0);   
ptrace(PT_DETACH, g_pid, 1, sig);
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-12 Thread Mark Johnston
On Tue, Jul 12, 2016 at 08:51:50PM +0300, Konstantin Belousov wrote:
> On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> > On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> > I suppose it is not strictly incorrect. I find it surprising that a
> > PT_ATTACH followed by a PT_DETACH may leave the process in a different
> > state than it was in before the attach. This means that it is not
> > possible to gcore a process without potentially leaving it stopped, for
> > instance. This result may occur in a single-threaded process
> > as well, since a signal may already be queued when the PT_ATTACH handler
> > sends SIGSTOP.
> I still miss somethine.  Isn't this an expected outcome from sending a
> signal with STOP action ?

It is. But I also expect a PT_DETACH operation to resume a stopped
process, assuming that a second SIGSTOP was not posted while the
process was suspended.

> 
> > Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
> > represented a bug in ptracestop().
> It could, I did not made any statements that deny the bug:

To be clear, the root of my issue comes from the following: the SIGSTOP
from PT_ATTACH may be handled concurrently with a second signal
delivered to a second thread in the same process. Then, the resulting
behaviour depends on the order in which the recipient threads suspend in
ptracestop(). If the thread that received SIGSTOP suspends last, its
td_xsig will be overwritten with the userland-provided value in the
PT_DETACH handler. If it suspends first, its td_xsig will be preserved,
and upon PT_DETACH the process will be suspended again in issignal().

I'm not sure if this is considered a bug. ptracestop() is handling all
signals (including the SIGSTOP generated by the PT_ATTACH handler) in a
consistent way, but this results in inconsistent behaviour from the
perspective of a ptrace(2) consumer.

> 
> > > > Moreover, in my scenario I see a thread with TDB_XSIG set even after
> > > > ptrace(PT_DETACH) was called (P_TRACED is cleared).
> > > This is interesting, we indeed do not clear the flag consistently.
> > > But again, the only consequence seems to be a possible invalid reporting
> > > of events.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-12 Thread Konstantin Belousov
On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> I suppose it is not strictly incorrect. I find it surprising that a
> PT_ATTACH followed by a PT_DETACH may leave the process in a different
> state than it was in before the attach. This means that it is not
> possible to gcore a process without potentially leaving it stopped, for
> instance. This result may occur in a single-threaded process
> as well, since a signal may already be queued when the PT_ATTACH handler
> sends SIGSTOP.
I still miss somethine.  Isn't this an expected outcome from sending a
signal with STOP action ?

> Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
> represented a bug in ptracestop().
It could, I did not made any statements that deny the bug:

> > > Moreover, in my scenario I see a thread with TDB_XSIG set even after
> > > ptrace(PT_DETACH) was called (P_TRACED is cleared).
> > This is interesting, we indeed do not clear the flag consistently.
> > But again, the only consequence seems to be a possible invalid reporting
> > of events.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-12 Thread Mark Johnston
On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> On Mon, Jul 11, 2016 at 06:19:38PM -0700, Mark Johnston wrote:
> > Hi,
> > 
> > It seems to be possible for ptrace(PT_ATTACH) to race with the delivery
> > of a signal to the same process. ptrace(PT_ATTACH) sets P_TRACED and
> > sends SIGSTOP to a thread in the target process. Consider the case where
> > a signal is delivered to a second thread, and both threads are executing
> > ast() concurrently. The two threads will both call issignal() and from
> > there call ptracestop() because P_TRACED is set, though they will be
> > serialized by the proc lock. If the thread receiving SIGSTOP wins the
> > race, it will suspend first and set p->p_xthread. The second thread will
> > also suspend in ptracestop(), overwriting the p_xthread field set by the
> > first thread. Later, ptrace(PT_DETACH) will unsuspend the threads, but
> > it will set td->td_xsig only in the second thread. This means that the
> > first thread will return SIGSTOP from ptracestop() and subsequently
> > suspend the process, which seems rather incorrect.
> Why ?  In particular, why delivering STOP after attach, in the described
> situation, is perceived as incorrect ?  Parallel STOPs, one from attach,
> and other from kill(2), must result in two stops.

I suppose it is not strictly incorrect. I find it surprising that a
PT_ATTACH followed by a PT_DETACH may leave the process in a different
state than it was in before the attach. This means that it is not
possible to gcore a process without potentially leaving it stopped, for
instance. This result may occur in a single-threaded process
as well, since a signal may already be queued when the PT_ATTACH handler
sends SIGSTOP.

To me it just seems a bit strange that ptrace's mechanism for stopping
the target - sending SIGSTOP - interacts this way with ptrace's handling
of signals - ptracestop()). Specifically, PT_ATTACH does not rely on the
SA_STOP property of SIGSTOP to stop the process, but rather on the
special signal handling in ptracestop().

> 
> The bit about overwriting p_xsig/p_xthread indeed initially sound worrysome,
> but probably not too much.  The only consequence of reassigning p_xthread
> is the selection of the 'lead' thread in sys_process.c, it seems.
> 
> > 
> > The above is just a theory to explain an unexpectedly-stopped
> > multi-threaded process that I've observed. Is there some mechanism I'm
> > missing that prevents multiple threads from suspending in ptracestop()
> > at the same time? If not, then I think that's the root of the problem,
> > since p_xthread is pretty clearly not meant to be overwritten this way.
> Again, why ?
> 
> Note the comment 
>* Just make wait() to work, the last stopped thread
>  * will win.
> which seems to point to the situation.

Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
represented a bug in ptracestop().

> 
> > Moreover, in my scenario I see a thread with TDB_XSIG set even after
> > ptrace(PT_DETACH) was called (P_TRACED is cleared).
> This is interesting, we indeed do not clear the flag consistently.
> But again, the only consequence seems to be a possible invalid reporting
> of events.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ptrace attach in multi-threaded processes

2016-07-12 Thread Konstantin Belousov
On Mon, Jul 11, 2016 at 06:19:38PM -0700, Mark Johnston wrote:
> Hi,
> 
> It seems to be possible for ptrace(PT_ATTACH) to race with the delivery
> of a signal to the same process. ptrace(PT_ATTACH) sets P_TRACED and
> sends SIGSTOP to a thread in the target process. Consider the case where
> a signal is delivered to a second thread, and both threads are executing
> ast() concurrently. The two threads will both call issignal() and from
> there call ptracestop() because P_TRACED is set, though they will be
> serialized by the proc lock. If the thread receiving SIGSTOP wins the
> race, it will suspend first and set p->p_xthread. The second thread will
> also suspend in ptracestop(), overwriting the p_xthread field set by the
> first thread. Later, ptrace(PT_DETACH) will unsuspend the threads, but
> it will set td->td_xsig only in the second thread. This means that the
> first thread will return SIGSTOP from ptracestop() and subsequently
> suspend the process, which seems rather incorrect.
Why ?  In particular, why delivering STOP after attach, in the described
situation, is perceived as incorrect ?  Parallel STOPs, one from attach,
and other from kill(2), must result in two stops.

The bit about overwriting p_xsig/p_xthread indeed initially sound worrysome,
but probably not too much.  The only consequence of reassigning p_xthread
is the selection of the 'lead' thread in sys_process.c, it seems.

> 
> The above is just a theory to explain an unexpectedly-stopped
> multi-threaded process that I've observed. Is there some mechanism I'm
> missing that prevents multiple threads from suspending in ptracestop()
> at the same time? If not, then I think that's the root of the problem,
> since p_xthread is pretty clearly not meant to be overwritten this way.
Again, why ?

Note the comment 
 * Just make wait() to work, the last stopped thread
 * will win.
which seems to point to the situation.

> Moreover, in my scenario I see a thread with TDB_XSIG set even after
> ptrace(PT_DETACH) was called (P_TRACED is cleared).
This is interesting, we indeed do not clear the flag consistently.
But again, the only consequence seems to be a possible invalid reporting
of events.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"