On Tue, Feb 13, 2018 at 05:14:38PM +0900, Ryota Ozaki wrote:
> panic: kernel diagnostic assertion "(psref->psref_cpu == curcpu())"
> failed: file "/(hidden)/sys/kern/subr_psref.c", line 317 passive
> reference transferred from CPU 0 to CPU 3
> I first thought that something went wrong in an ioctl handler
> for example curlwp_bindx was called doubly and LP_BOUND was unset
> then the LWP was migrated to another CPU. However, this kind of
> assumptions was denied by KASSERTs in psref_release. So I doubted
> of LP_BOUND and found there is a race condition on LWP migrations.
> curlwp_bind sets LP_BOUND to l_pflags of curlwp and that prevents
> curlwp from migrating to another CPU until curlwp_bindx is called.
The bug you found (and I trimmed) looks like the culprit, but there is
an extra problem which probably happens to not manifest itself in terms
of code generation: the bind/unbind inlines lack compiler barriers. See
KPREEMPT_* inlines for comparison. The diff is definitely trivial:
diff --git a/sys/sys/lwp.h b/sys/sys/lwp.h
index 47d162271f9c..f18b76b984e4 100644
@@ -536,6 +536,7 @@ curlwp_bind(void)
bound = curlwp->l_pflag & LP_BOUND;
curlwp->l_pflag |= LP_BOUND;
@@ -545,6 +546,7 @@ curlwp_bindx(int bound)
KASSERT(curlwp->l_pflag & LP_BOUND);
curlwp->l_pflag ^= bound ^ LP_BOUND;
Mateusz Guzik <mjguzik gmail.com>