Re: strace lockup when tracing exec in go

2016-09-22 Thread Mike Galbraith
On Thu, 2016-09-22 at 11:53 +0200, Michal Hocko wrote:
> On Thu 22-09-16 11:40:09, Mike Galbraith wrote:

> > This patch doesn't help, nor does the previous patch... but with both
> > applied, all is well.  All you have to do now is figure out why :)
> 
> Ohh, I should be more explicit, this needs the mm_access part as well.
> Sorry for not being clear enough. So the full change is

Ah.  That was gonna happen after lunch, but since you already know it
works, I can get back to un-b0rking one of my trees.

-Mike

--
___
Strace-devel mailing list
Strace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/strace-devel


Re: strace lockup when tracing exec in go

2016-09-22 Thread Mike Galbraith
On Wed, 2016-09-21 at 17:29 +0200, Michal Hocko wrote:
> [I am CCing strace mailing list because even if this turns out to be a
>  kernel bug strace seems to be doing something unexpected - more on that
>  below]
> 
> Hi,
> Aleksa has reported the following lockup when stracing the following go
> program
> 
> % cat exec.go
> package main
> 
> import (
> "os"
> "syscall"
> )
> 
> func main() {
> syscall.Exec("/bin/echo", []string{"/bin/echo", "Hello"}, os.Environ())
> }
> $ go version
> go version go1.6.3 linux/amd64
> $ go build -o exec exec.go
> 
> $ strace -f ./exec
> [snip]
> [pid 10349] select(0, NULL, NULL, NULL, {0, 100} 
> [pid 10346] <... select resumed> )  = 0 (Timeout)
> [pid 10346] select(0, NULL, NULL, NULL, {0, 20} 
> [pid 10345] execve("/bin/echo", ["/bin/echo", "Hello"], [/* 95 vars */] 
> [pid 10346] <... select resumed> )  = 0 (Timeout)
> [pid 10349] <... select resumed> )  = 0 (Timeout)
> 
> execve will never finish unless the strace process get killed with
> SIGKILL. The following was debugged on 3.12 kernel and the current
> kernel seems to not trigger the issue as easily but I believe the same
> problem is there as well.
> 
> The further investigation shown that the tracer (strace) is stuck
> waiting for cred_guard_mutex
> [<>] mm_access+0x22/0xa0
> [<>] process_vm_rw_core.isra.1+0x112/0x6c0
> [<>] process_vm_rw+0xab/0x110
> [<>] SyS_process_vm_readv+0x15/0x20
> [<>] system_call_fastpath+0x16/0x1b
> [<>] 0x7f186f031c3a
> [<>] 0x
> 
> while the traced process (3 threads) are trying to perform the exec.
> That means that 2 threads are dead (in fact zombies) waiting in their
> final schedule.
> Call Trace:
>  [] do_exit+0x6f7/0xa70
>  [] do_group_exit+0x39/0xa0
>  [] get_signal_to_deliver+0x1e8/0x5c0
>  [] do_signal+0x42/0x670
>  [] do_notify_resume+0x78/0xc0
>  [] int_signal+0x12/0x17
>  [<7f3a33f3ffb9>] 0x7f3a33f3ffb8
> 
> and one is
> Call Trace:
>  [] flush_old_exec+0xdf/0x890
>  [] load_elf_binary+0x307/0xda0
>  [] search_binary_handler+0xae/0x1f0
>  [] do_execve_common.isra.26+0x64e/0x810
>  [] SyS_execve+0x31/0x50
>  [] stub_execve+0x69/0xa0
>  [<7f3a33f16527>] 0x7f3a33f16526
> 
> waiting for notify_count to drop down to 0
> while (sig->notify_count) {
> __set_current_state(TASK_KILLABLE);
> spin_unlock_irq(lock);
> schedule(); <
> if (unlikely(__fatal_signal_pending(tsk)))
> goto killed;
> spin_lock_irq(lock);
> }
> 
> this however doesn't happen because both threads which are dead
> are zombies waiting to be reaped by the parent and to call their
> release_task->__exit_signal. The tracer is blocked on the lock held by
> exec (in prepare_bprm_creds). This is the case in the current kernel as
> well AFAICS so the same should be possible as well. So is this a bug or
> something somewhere else makes sure that this will not happen in newer
> kernels?

master.today...

crash> bt 5138
PID: 5138   TASK: 8803fb4e4c80  CPU: 6   COMMAND: "strace"
 #0 [88038e323b68] __schedule at 81624660
 #1 [88038e323bb0] schedule at 81624b35
 #2 [88038e323bc8] schedule_preempt_disabled at 81624e0e
 #3 [88038e323bd8] __mutex_lock_killable_slowpath at 81625f9d
 #4 [88038e323c30] mutex_lock_killable at 8162605a
 #5 [88038e323c48] mm_access at 8105bbf7
 #6 [88038e323c70] process_vm_rw_core at 811823d3
 #7 [88038e323dc0] process_vm_rw at 8118287d
 #8 [88038e323f38] sys_process_vm_readv at 811829b9
 #9 [88038e323f50] entry_SYSCALL_64_fastpath at 81628736
RIP: 7faee6b2ac3a  RSP: 7ffdc95b62b8  RFLAGS: 0246
RAX: ffda  RBX: 019e7030  RCX: 7faee6b2ac3a
RDX: 0001  RSI: 7ffdc95b62d0  RDI: 1418
RBP: 1414   R8: 0001   R9: 
R10: 7ffdc95b62e0  R11: 0246  R12: 7faee73e66c0
R13:   R14:   R15: 
ORIG_RAX: 0136  CS: 0033  SS: 002b
crash> bt 5140
PID: 5140   TASK: 8803e703b300  CPU: 4   COMMAND: "exec"
 #0 [88038e0b3ce8] __schedule at 81624660
 #1 [88038e0b3d30] schedule at 81624b35
 #2 [88038e0b3d48] flush_old_exec at 811b1f26
 #3 [88038e0b3d88] load_elf_binary at 811fd70f
 #4 [88038e0b3e60] search_binary_handler at 811b13ac
 #5 [88038e0b3e98] do_execveat_common at 811b2f8f
 #6 [88038e0b3f00] sys_execve at 811b33ba
 #7 [88038e0b3f28] do_syscall_64 at 8100194e
RIP: 0045c1ef  RSP: 00c820043e00  RFLAGS: 0246
RAX: ffda  RBX: 00c82000e6c0  RCX: 0045c1ef
RDX: 

Re: strace lockup when tracing exec in go

2016-09-22 Thread Mike Galbraith
On Thu, 2016-09-22 at 10:36 +0200, Michal Hocko wrote:
> On Thu 22-09-16 10:01:26, Michal Hocko wrote:
> > On Thu 22-09-16 06:15:02, Mike Galbraith wrote:
> > [...]
> > > master.today...
> > 
> > Thanks for trying to reproduce this. My tiny laptop (2 cores, 2 threads
> > per core) cannot reproduce even in 10 minutes or so. I've tried to use
> > the same machine I was testing with 3.12 kernel (2 sockets, 8 cores per
> > soc. and 2 threas per core) and it hit almost instantly. I have tried 
> > mutex_lock_killable -> interruptible and it didn't help as I've
> > expected. So the current kernel doesn't do any magic to prevent from the
> > issue as well.
> > 
> > So I've stared into do_notify_parent some more and the following was
> > just very confusing
> > 
> > > > if (!tsk->ptrace && sig == SIGCHLD &&
> > > > (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
> > > >  (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
> > > > > > /*
> > > > > >  * We are exiting and our parent doesn't care.  POSIX.1
> > > > > >  * defines special semantics for setting SIGCHLD to SIG_IGN
> > > > > >  * or setting the SA_NOCLDWAIT flag: we should be reaped
> > > > > >  * automatically and not left for our parent's wait4 call.
> > > > > >  * Rather than having the parent do it as a magic kind of
> > > > > >  * signal handler, we just set this to tell do_exit that we
> > > > > >  * can be cleaned up without becoming a zombie.  Note that
> > > > > >  * we still call __wake_up_parent in this case, because a
> > > > > >  * blocked sys_wait4 might now return -ECHILD.
> > > > > >  *
> > > > > >  * Whether we send SIGCHLD or not for SA_NOCLDWAIT
> > > > > >  * is implementation-defined: we do (if you don't want
> > > > > >  * it, just use SIG_IGN instead).
> > > > > >  */
> > > > > > autoreap = true;
> > > > > > if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
> > > > > > > > sig = 0;
> > > > }
> > 
> > it tries to prevent from what I am seeing in a way. If the SIGCHLD is
> > ignored then it just does autoreap and everything is fine. But this
> > doesn't seem to be the case here. In fact we are not sending the signal
> > because sig_task_ignored is true resp. sig_handler_ignored which can
> > fail even for handler == SIG_DFL && sig_kernel_ignore() and SIGCHLD
> > seems to be in SIG_KERNEL_IGNORE_MASK. So I've tried
> 
> Dohh, I've missed !tsk->ptrace check there. So we are not even going
> that via that path. So the sig_handler_ignored cannot possible help.
> I was just too lucky and didn't hit the lockup with the patch.
> 
> So what else might be wrong here? sig_ignored seems to be quite
> confusing
> 
>   > /*
>   >  * Tracers may want to know about even ignored signals.
>   >  */
>   > return !t->ptrace;
> 
> t is the tracer here but it shouldn't have t->ptrace because the child
> is not stopped. So do we need something like the following? Not tested
> yet
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 1840c7f4e3c2..bd236ce4a29c 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -91,6 +91,10 @@ static int sig_ignored(struct task_struct *t, int sig, 
> bool force)
>  >> if (!sig_task_ignored(t, sig, force))
>  >>   > return 0;
>  
> +>> /* Do not ignore signals sent from child to the parent */
> +>> if (current->ptrace && current->parent == t)
> +>>   > return 0;
> +
>  >> /*
>  >>  * Tracers may want to know about even ignored signals.
>  >>  */

This patch doesn't help, nor does the previous patch... but with both
applied, all is well.  All you have to do now is figure out why :)

-Mike


--
___
Strace-devel mailing list
Strace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/strace-devel