Re: Deadlock involving truss -f, pdfork() and wait4()

2019-09-13 Thread Ryan Stone
As Conrad has pointed out, it's an explicit PID.  The test completes
successfully when not run under truss -f.

On Fri, Sep 13, 2019 at 2:37 PM Mark Johnston  wrote:
>
> On Fri, Sep 13, 2019 at 02:12:56PM -0400, Ryan Stone wrote:
> > This gets me a little further but now the wait4 call by the parent
> > never reaps the child and instead blocks forever:
>
> Does it perform a wildcarded wait(), or does it explicitly specify the
> PID of the child?  By design, the former will not return children
> created by pdfork().
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Deadlock involving truss -f, pdfork() and wait4()

2019-09-13 Thread Conrad Meyer
On Fri, Sep 13, 2019 at 11:37 AM Mark Johnston  wrote:
>
> On Fri, Sep 13, 2019 at 02:12:56PM -0400, Ryan Stone wrote:
> > This gets me a little further but now the wait4 call by the parent
> > never reaps the child and instead blocks forever:
>
> Does it perform a wildcarded wait(), or does it explicitly specify the
> PID of the child?  By design, the former will not return children
> created by pdfork().

Explicit PID:

https://people.freebsd.org/~rstone/pdfork.c

Best,
Conrad
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Deadlock involving truss -f, pdfork() and wait4()

2019-09-13 Thread Mark Johnston
On Fri, Sep 13, 2019 at 02:12:56PM -0400, Ryan Stone wrote:
> This gets me a little further but now the wait4 call by the parent
> never reaps the child and instead blocks forever:

Does it perform a wildcarded wait(), or does it explicitly specify the
PID of the child?  By design, the former will not return children
created by pdfork().
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Deadlock involving truss -f, pdfork() and wait4()

2019-09-13 Thread Ryan Stone
This gets me a little further but now the wait4 call by the parent
never reaps the child and instead blocks forever:

# truss -f ./pdfork -p
 706: mmap(0x0,131072,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34361
970688 (0x800221000)
 706: issetugid()   = 0 (0x0)
 706: open("/etc/libmap.conf",O_RDONLY|O_CLOEXEC,010440020) = 3 (0x3)
 706: fstat(3,{ mode=-rw-r--r-- ,inode=241058,size=47,blksize=32768 }) = 0 (0x0
)
 706: read(3,"# $FreeBSD$\nincludedir /usr/loc"...,47) = 47 (0x2f)
 706: close(3)  = 0 (0x0)
 706: open("/usr/local/etc/libmap.d",O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC,
0165) ERR#2 'No such file or directory'
 706: open("/var/run/ld-elf.so.hints",O_RDONLY|O_CLOEXEC,010002355) = 3 (0x3)
 706: read(3,"Ehnt\^A\0\0\0\M^@\0\0\0A\0\0\0\0"...,128) = 128 (0x80)
 706: fstat(3,{ mode=-r--r--r-- ,inode=72,size=193,blksize=4096 }) = 0 (0x0)
 706: pread(3,"/lib:/usr/lib:/usr/lib/compat:/u"...,65,0x80) = 65 (0x41)
 706: close(3)  = 0 (0x0)
 706: open("/lib/libc.so.7",O_RDONLY|O_CLOEXEC|O_VERIFY,057400) = 3 (0x3)
 706: fstat(3,{ mode=-r--r--r-- ,inode=402754,size=1915744,blksize=32768 }) = 0
(0x0)
 706: mmap(0x0,4096,PROT_READ,MAP_PRIVATE|MAP_PREFAULT_READ,3,0x0) = 3436210176
0 (0x800241000)
 706: mmap(0x0,4169728,PROT_NONE,MAP_GUARD,-1,0x0) = 34362105856 (0x800242000)
 706: mmap(0x800242000,524288,PROT_READ,MAP_PRIVATE|MAP_FIXED|MAP_NOCORE|MAP_PR
EFAULT_READ,3,0x0) = 34362105856 (0x800242000)
 706: mmap(0x8002c2000,1327104,PROT_READ|PROT_EXEC,MAP_PRIVATE|MAP_FIXED|MAP_NO
CORE|MAP_PREFAULT_READ,3,0x8) = 34362630144 (0x8002c2000)
 706: mmap(0x800406000,61440,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_FIXED|MAP_PRE
FAULT_READ,3,0x1c4000) = 34363957248 (0x800406000)
 706: mmap(0x800415000,2256896,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_FIXED|MAP_A
NON,-1,0x0) = 34364018688 (0x800415000)
 706: munmap(0x800241000,4096)  = 0 (0x0)
 706: close(3)  = 0 (0x0)
 706: mprotect(0x80040c000,36864,PROT_READ) = 0 (0x0)
 706: sysarch(AMD64_SET_FSBASE,0x7fffe110)  = 0 (0x0)
 706: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIG
TERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFS
Z|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
 706: mprotect(0x80040c000,36864,PROT_READ|PROT_WRITE) = 0 (0x0)
 706: sigprocmask(SIG_SETMASK,{ },0x0)  = 0 (0x0)
 706: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIG
TERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFS
Z|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
 706: mprotect(0x80040c000,36864,PROT_READ) = 0 (0x0)
 706: sigprocmask(SIG_SETMASK,{ },0x0)  = 0 (0x0)
 706: readlink("/etc/malloc.conf",0x7fffd830,1024) ERR#2 'No such file or d
irectory'
 706: issetugid()   = 0 (0x0)
 706: __sysctl(0x7fffd7e0,0x2,0x7fffd7dc,0x7fffd7d0,0x0,0x0) = 0 (0
x0)
 706: mmap(0x0,2097152,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(21
),-1,0x0) = 34368126976 (0x80080)
 706: cap_getmode({ 0 })= 0 (0x0)
 706: open("/dev/hpet0",O_RDONLY,00)= 3 (0x3)
 706: mmap(0x0,4096,PROT_READ,MAP_SHARED,3,0x0) = 34362101760 (0x800241000)
 706: close(3)  = 0 (0x0)
 706: mmap(0x0,32768,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),
-1,0x0) = 34366275584 (0x80063c000)
 706: mmap(0x0,4194304,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(21
),-1,0x0) = 34370224128 (0x800a0)
 706: mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-
1,0x0) = 34366308352 (0x800644000)
 706: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIG
TERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFS
Z|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
 706: sigprocmask(SIG_SETMASK,{ },0x0)  = 0 (0x0)
 706: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIG
TERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFS
Z|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
 706: mprotect(0x203000,4096,PROT_READ) = 0 (0x0)
 706: sigprocmask(SIG_SETMASK,{ },0x0)  = 0 (0x0)
 706: pdfork(0x7fffeba8,0x0)= 708 (0x2c4)
 708: 
 708: nanosleep({ 1.0 })= 0 (0x0)
 708: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIG
TERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFS
Z|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ }) = 0 (0x0)
 708: sigprocmask(SIG_SETMASK,{ },0x0)  = 0 (0x0)
 708: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIG

Re: Deadlock involving truss -f, pdfork() and wait4()

2019-09-13 Thread Mariusz Zaborski
Hello Ryan,

Can you verify is this patch fix your issue:
https://reviews.freebsd.org/D20362

Thanks,
Mariusz

On Thu, 12 Sep 2019 at 21:37, Ryan Stone  wrote:
>
> I've hit an issue with a simple use of pdfork().  I have a process
> that calls pdfork() and the parent immediately does a wait4() on the
> child pid.  This works fine under normal conditions, but if the parent
> is run under truss -f, the three processes deadlock.  If I switch out
> pdfork() for fork(), the deadlock does not occur.
>
> This C file demonstrates the issue:
>
> https://people.freebsd.org/~rstone/pdfork.c
>
> If I run "truss -f ./pdfork", which uses fork(), it completes within a
> second.  If I run "truss -f ./pdfork -p", which uses pdfork(), the
> processes deadlock.  If I run "./pdfork -p" without truss, it
> completes normally.
>
> procstat reports the following kernel stacks:
>
> 27572 102043 truss   -   mi_switch+0xe2
> sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
> kern_wait6+0x695 sys_wait6+0x9f amd64_syscall+0x36e
> fast_syscall_common+0x101
> 27573 102469 pdfork  -   mi_switch+0xe2
> sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
> kern_wait6+0x695 sys_wait4+0x78 amd64_syscall+0x36e
> fast_syscall_common+0x101
> 27574 102053 pdfork  -   mi_switch+0xe2
> thread_suspend_switch+0xd4 ptracestop+0x13b fork_return+0x14e
> fork_exit+0x83 fork_trampoline+0xe
>
> As near as I can tell, truss is blocked waiting for ptrace events, the
> parent process is blocked in wait4, and the child process is perhaps
> waiting for its parent to exit the kernel so it can send the ptrace
> event?
>
> I really don't see anything obvious in the pdfork() code path that
> would cause this to happen when fork() doesn't have the problem.  It
> may be that pdfork() just changes the timing enough to expose a latent
> bug.
>
> I'm seeing this on a recentish current (r351363).
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Deadlock involving truss -f, pdfork() and wait4()

2019-09-12 Thread Ryan Stone
I've hit an issue with a simple use of pdfork().  I have a process
that calls pdfork() and the parent immediately does a wait4() on the
child pid.  This works fine under normal conditions, but if the parent
is run under truss -f, the three processes deadlock.  If I switch out
pdfork() for fork(), the deadlock does not occur.

This C file demonstrates the issue:

https://people.freebsd.org/~rstone/pdfork.c

If I run "truss -f ./pdfork", which uses fork(), it completes within a
second.  If I run "truss -f ./pdfork -p", which uses pdfork(), the
processes deadlock.  If I run "./pdfork -p" without truss, it
completes normally.

procstat reports the following kernel stacks:

27572 102043 truss   -   mi_switch+0xe2
sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
kern_wait6+0x695 sys_wait6+0x9f amd64_syscall+0x36e
fast_syscall_common+0x101
27573 102469 pdfork  -   mi_switch+0xe2
sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
kern_wait6+0x695 sys_wait4+0x78 amd64_syscall+0x36e
fast_syscall_common+0x101
27574 102053 pdfork  -   mi_switch+0xe2
thread_suspend_switch+0xd4 ptracestop+0x13b fork_return+0x14e
fork_exit+0x83 fork_trampoline+0xe

As near as I can tell, truss is blocked waiting for ptrace events, the
parent process is blocked in wait4, and the child process is perhaps
waiting for its parent to exit the kernel so it can send the ptrace
event?

I really don't see anything obvious in the pdfork() code path that
would cause this to happen when fork() doesn't have the problem.  It
may be that pdfork() just changes the timing enough to expose a latent
bug.

I'm seeing this on a recentish current (r351363).
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"