On Wed, Jun 19, 2013 at 03:51:44PM +0200, Denys Vlasenko wrote:
> On 06/19/2013 02:35 PM, Dmitry V. Levin wrote:
> > On Wed, Jun 19, 2013 at 01:11:56PM +0200, Denys Vlasenko wrote:
> >> Dmitry, I am a bit worried about the flow in this function still.
> >> Let's take a look:
> >>
> >>                 error = ptrace(PTRACE_DETACH, tcp->pid, 0, 0);
> >>                 if (error == 0) {
> >>                         /* On a clear day, you can see forever. */
> >>                 }
> >>                 else if (errno != ESRCH) {
> >>                         /* Shouldn't happen. */
> >>                         perror_msg("detach: ptrace(PTRACE_DETACH, ...)");
> >>                 }
> >>                 else
> >>                 /* ESRCH: process is either not stopped or doesn't exist. */
> >>                 if (my_tkill(tcp->pid, 0) < 0) {
> >>                         if (errno != ESRCH)
> >>                                 /* Shouldn't happen. */
> >>                                 perror_msg("detach: checking sanity");
> >>                         /* else: process doesn't exist. */
> >> ^^^^^^^^^^^^^^^^^
> >> Well, it may not exist already, but was it *waited for*?
> >> IOW: we may still need to enter waitpid loop.
> >> This may rarely trigger - say, we do "strace -p PROCESS",
> >> and process exits just as we ^C the strace,
> >> and we may end up here.
> >> OTOH, not-waited-for child reparents to init when we exit,
> >> so... do we ever detach() NOT on strace exit, where dead
> >> children are a problem? I see one location:
> >>   if (event == PTRACE_EVENT_EXEC) {
> >>       if (detach_on_execve && !skip_one_b_execve)
> >>               detach(tcp); /* do "-b execve" thingy */
> >> Maybe in the name of correctness we should wait for the process
> >> if we see ESRCH? Possibly with WNOHANG for paranoid reasons.
> > 
> > In case of "-b execve", the tracee is in syscall-stop state already, so
> 
> To nitpick, it is in PTRACE_EVENT_EXEC stop...
> 
> ...or rather, we only know that it *was* in PTRACE_EVENT_EXEC.
> 
> It may no longer be true if it was suddenly nuked by SIGKILL
> a microsecond later while we are calling detach() on it.

Or an untraced thread called execve() at this moment.

> Then DETACH fails with ESRCH, tkill(0) fails with ESRCH (I guess...),
> and with current code we do nothing, leaving a zombie.
> 
> Actually, that may be a good thing, we want *its parent*
> to consume its exit status. But that parent can be *us* if we act
> on our own child.

Yes, exactly.  If TCB_STRACE_CHILD bit is set, then strace is the parent
and therefore is expected to wait for it.

> > PTRACE_DETACH should succeed and there should be no need to wait (and if
> > PTRACE_DETACH failed, then the tracee is no more so strace is expected
> > to wait for it).
> 
> My point is, we *don't* wait if both DETACH and probing tkill(0)
> fail with ESRCH. This might be wrong in some situations.

I suppose in that case, if TCB_STRACE_CHILD bit is set, strace should
waitpid the tracee, expecting ECHILD or WIFSIGNALED status.
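
A rough sketch of that last step, to make it concrete.  This is not
strace's code: reap_after_esrch() and its is_our_child parameter (standing
in for a TCB_STRACE_CHILD check) are made-up names, and WNOHANG is there
only for the paranoid reasons mentioned earlier in the thread.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of strace.  Meant to be called after
 * both PTRACE_DETACH and the tkill(pid, 0) probe failed with ESRCH.
 * is_our_child stands in for a TCB_STRACE_CHILD test.
 *
 * Returns 1 if an exit status was collected into *status,
 *         0 if there was nothing to collect (not our child,
 *           already reaped, or no status available yet),
 *        -1 on an unexpected waitpid error.
 */
int
reap_after_esrch(pid_t pid, bool is_our_child, int *status)
{
	pid_t rc;

	if (!is_our_child)
		return 0;	/* the real parent consumes the exit status */

	/* WNOHANG: never block here, even if the guess about ESRCH is wrong. */
	do {
		rc = waitpid(pid, status, WNOHANG);
	} while (rc < 0 && errno == EINTR);

	if (rc == pid)
		return 1;	/* zombie reaped; caller may check WIFSIGNALED */
	if (rc == 0 || errno == ECHILD)
		return 0;	/* still running, or nothing left to wait for */
	return -1;		/* shouldn't happen */
}
```

A caller would presumably look at WIFSIGNALED(status) when this returns 1
and treat a 0 return (the ECHILD case in particular) as "nothing to do".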


-- 
ldv

_______________________________________________
Strace-devel mailing list
Strace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/strace-devel
