On Tue, Feb 13, 2018 at 05:14:38PM +0900, Ryota Ozaki wrote:
> panic: kernel diagnostic assertion "(psref->psref_cpu == curcpu())"
> failed: file "/(hidden)/sys/kern/subr_psref.c", line 317 passive
> reference transferred from CPU 0 to CPU 3
>
> I first thought that something went wrong in an ioctl handler
> for example curlwp_bindx was called doubly and LP_BOUND was unset
> then the LWP was migrated to another CPU. However, this kind of
> assumptions was denied by KASSERTs in psref_release. So I doubted
> of LP_BOUND and found there is a race condition on LWP migrations.
>
> curlwp_bind sets LP_BOUND to l_pflags of curlwp and that prevents
> curlwp from migrating to another CPU until curlwp_bindx is called.

The bug you found (and I trimmed) looks like the culprit, but there is
an extra problem which probably happens to not manifest itself in terms
of code generation: the bind/unbind inlines lack compiler barriers. See
KPREEMPT_* inlines for comparison. The diff is definitely trivial:

diff --git a/sys/sys/lwp.h b/sys/sys/lwp.h
index 47d162271f9c..f18b76b984e4 100644
--- a/sys/sys/lwp.h
+++ b/sys/sys/lwp.h
@@ -536,6 +536,7 @@ curlwp_bind(void)
 
 	bound = curlwp->l_pflag & LP_BOUND;
 	curlwp->l_pflag |= LP_BOUND;
+	__insn_barrier();
 
 	return bound;
 }
@@ -545,6 +546,7 @@ curlwp_bindx(int bound)
 {
 
 	KASSERT(curlwp->l_pflag & LP_BOUND);
+	__insn_barrier();
 	curlwp->l_pflag ^= bound ^ LP_BOUND;
 }
 

-- 
Mateusz Guzik <mjguzik gmail.com>
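
For anyone reading without the tree at hand, here is a minimal userland
sketch of the barrier placement in the diff above. It is only a model
under stated assumptions: insn_barrier() is a stand-in for the kernel's
__insn_barrier() written as an empty GCC/Clang asm with a "memory"
clobber, the LP_BOUND value is made up, and struct lwp_model,
model_bind() and model_bindx() are hypothetical names, not kernel code.
The point is that the compiler may not move memory accesses across the
barrier, so work done while bound stays between the flag being set and
the flag being restored.

```c
#include <assert.h>

/*
 * Stand-in for the kernel's __insn_barrier() (assumption: GCC/Clang).
 * It emits no instruction; it only stops the compiler from reordering
 * or caching memory accesses across this point.
 */
#define insn_barrier()	__asm__ __volatile__("" ::: "memory")

#define LP_BOUND	0x1	/* hypothetical value, for illustration only */

/* Hypothetical model of the one LWP field involved. */
struct lwp_model {
	int l_pflag;
};

static struct lwp_model model_lwp;

/* Mark the model LWP as bound; return whether it was bound already. */
static inline int
model_bind(void)
{
	int bound;

	bound = model_lwp.l_pflag & LP_BOUND;
	model_lwp.l_pflag |= LP_BOUND;
	/* Later accesses must not be hoisted above the flag store. */
	insn_barrier();

	return bound;
}

/* Restore the state saved by the matching model_bind() call. */
static inline void
model_bindx(int bound)
{

	assert(model_lwp.l_pflag & LP_BOUND);
	/* Earlier accesses must not be sunk below the flag update. */
	insn_barrier();
	/* Clears LP_BOUND only if it was clear before the matching bind. */
	model_lwp.l_pflag ^= bound ^ LP_BOUND;
}
```

Nesting works as in the kernel inlines: an inner bind/bindx pair leaves
the flag set, and only the outermost bindx clears it.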