On Wed, Jun 19, 2013 at 03:51:44PM +0200, Denys Vlasenko wrote:
> On 06/19/2013 02:35 PM, Dmitry V. Levin wrote:
> > On Wed, Jun 19, 2013 at 01:11:56PM +0200, Denys Vlasenko wrote:
> >> Dmitry, I am a bit worried about the flow in this function still.
> >> Let's take a look:
> >>
> >>     error = ptrace(PTRACE_DETACH, tcp->pid, 0, 0);
> >>     if (error == 0) {
> >>         /* On a clear day, you can see forever. */
> >>     }
> >>     else if (errno != ESRCH) {
> >>         /* Shouldn't happen. */
> >>         perror_msg("detach: ptrace(PTRACE_DETACH, ...)");
> >>     }
> >>     else
> >>     /* ESRCH: process is either not stopped or doesn't exist. */
> >>     if (my_tkill(tcp->pid, 0) < 0) {
> >>         if (errno != ESRCH)
> >>             /* Shouldn't happen. */
> >>             perror_msg("detach: checking sanity");
> >>         /* else: process doesn't exist. */
> >>                  ^^^^^^^^^^^^^^^^^^^^^^
> >> Well, it may not exist already, but was it *waited for*?
> >> IOW: we may still need to enter waitpid loop.
> >> This may rarely trigger - say, we do "strace -p PROCESS",
> >> and process exits just as we ^C the strace,
> >> and we may end up here.
> >> OTOH, not-waited-for child reparents to init when we exit,
> >> so... do we ever detach() NOT on strace exit, where dead
> >> children are a problem? I see one location:
> >>     if (event == PTRACE_EVENT_EXEC) {
> >>         if (detach_on_execve && !skip_one_b_execve)
> >>             detach(tcp); /* do "-b execve" thingy */
> >> Maybe in the name of correctness we should wait for the process
> >> if we see ESRCH? Possibly with WNOHANG for paranoid reasons.
> >
> > In case of "-b execve", the tracee is in syscall-stop state already, so
>
> To nitpick, it is in PTRACE_EVENT_EXEC stop...
>
> ...or rather, we only know that it *was* in PTRACE_EVENT_EXEC.
>
> It may no longer be true if it was suddenly nuked by SIGKILL
> a microsecond later while we are calling detach() on it.

Or an untraced thread called execve() at this moment.

> Then DETACH fails with ESRCH, tkill(0) fails with ESRCH (I guess...),
> and with current code we do nothing, leaving a zombie.
>
> Actually, that may be a good thing, we want *its parent*
> to consume its exit status. But that parent can be *us* if we act
> on our own child.

Yes, exactly.  If TCB_STRACE_CHILD bit is set, then strace is the parent
and therefore is expected to wait for it.

> > PTRACE_DETACH should succeed and there should be no need to wait (and if
> > PTRACE_DETACH failed, then the tracee is no more so strace is expected
> > to wait for it).
>
> My point is, we *don't* wait if both DETACH and probing tkill(0)
> fail with ESRCH. This might be wrong in some situations.

I suppose in that case, if TCB_STRACE_CHILD bit is set, strace should
waitpid the tracee, expecting ECHILD or WIFSIGNALED status.

-- 
ldv
