Re: Race condition between an LWP migration and curlwp_bind

2018-02-15 Thread Ryota Ozaki
On Thu, Feb 15, 2018 at 5:30 AM, Mateusz Guzik  wrote:
> On Tue, Feb 13, 2018 at 05:14:38PM +0900, Ryota Ozaki wrote:
>>   panic: kernel diagnostic assertion "(psref->psref_cpu == curcpu())"
>> failed: file "/(hidden)/sys/kern/subr_psref.c", line 317 passive
>> reference transferred from CPU 0 to CPU 3
>>
>> I first thought that something went wrong in an ioctl handler
>> for example curlwp_bindx was called doubly and LP_BOUND was unset
>> then the LWP was migrated to another CPU. However, this kind of
>> assumptions was denied by KASSERTs in psref_release. So I doubted
>> of LP_BOUND and found there is a race condition on LWP migrations.
>>
>> curlwp_bind sets LP_BOUND to l_pflags of curlwp and that prevents
>> curlwp from migrating to another CPU until curlwp_bindx is called.
>
> The bug you found (and I trimmed) looks like the culprit, but there is
> an extra problem which probably happens to not manifest itself in terms
> of code generation: the bind/unbind inlines lack compiler barriers. See
> KPREEMPT_* inlines for comparison. The diff is definitely trivial:
>
> diff --git a/sys/sys/lwp.h b/sys/sys/lwp.h
> index 47d162271f9c..f18b76b984e4 100644
> --- a/sys/sys/lwp.h
> +++ b/sys/sys/lwp.h
> @@ -536,6 +536,7 @@ curlwp_bind(void)
>
> bound = curlwp->l_pflag & LP_BOUND;
> curlwp->l_pflag |= LP_BOUND;
> +   __insn_barrier();
>
> return bound;
>  }
> @@ -545,6 +546,7 @@ curlwp_bindx(int bound)
>  {
>
> KASSERT(curlwp->l_pflag & LP_BOUND);
> +   __insn_barrier();
> curlwp->l_pflag ^= bound ^ LP_BOUND;
>  }
>
> --
> Mateusz Guzik 

Right, the barriers are basically required as you say.

Note that for the current usage of curlwp_bind, i.e., preventing LWP
migrations between psref_acquire and psref_release, the barriers are
not must, IIUC. Because psref_acquire and psref_release are functions
and no reordering probably happen between psref_* and curlwp_*. Even
if a compiler optimization tries to reorder, curlwp_* won't cross
psref_* because psref_* uses percpu that uses kpreempt APIs that call
__insn_barrier.

Anyway I'll add the barriers.

Thanks,
  ozaki-r


Re: Race condition between an LWP migration and curlwp_bind

2018-02-14 Thread Mateusz Guzik
On Tue, Feb 13, 2018 at 05:14:38PM +0900, Ryota Ozaki wrote:
>   panic: kernel diagnostic assertion "(psref->psref_cpu == curcpu())"
> failed: file "/(hidden)/sys/kern/subr_psref.c", line 317 passive
> reference transferred from CPU 0 to CPU 3
>
> I first thought that something went wrong in an ioctl handler
> for example curlwp_bindx was called doubly and LP_BOUND was unset
> then the LWP was migrated to another CPU. However, this kind of
> assumptions was denied by KASSERTs in psref_release. So I doubted
> of LP_BOUND and found there is a race condition on LWP migrations.
>
> curlwp_bind sets LP_BOUND to l_pflags of curlwp and that prevents
> curlwp from migrating to another CPU until curlwp_bindx is called.

The bug you found (and I trimmed) looks like the culprit, but there is
an extra problem which probably happens to not manifest itself in terms
of code generation: the bind/unbind inlines lack compiler barriers. See
KPREEMPT_* inlines for comparison. The diff is definitely trivial:

diff --git a/sys/sys/lwp.h b/sys/sys/lwp.h
index 47d162271f9c..f18b76b984e4 100644
--- a/sys/sys/lwp.h
+++ b/sys/sys/lwp.h
@@ -536,6 +536,7 @@ curlwp_bind(void)

bound = curlwp->l_pflag & LP_BOUND;
curlwp->l_pflag |= LP_BOUND;
+   __insn_barrier();

return bound;
 }
@@ -545,6 +546,7 @@ curlwp_bindx(int bound)
 {

KASSERT(curlwp->l_pflag & LP_BOUND);
+   __insn_barrier();
curlwp->l_pflag ^= bound ^ LP_BOUND;
 }

-- 
Mateusz Guzik 


Re: Race condition between an LWP migration and curlwp_bind

2018-02-13 Thread Ryota Ozaki
On Wed, Feb 14, 2018 at 8:05 AM, Christos Zoulas  wrote:
> In article 
> ,
> Ryota Ozaki   wrote:
>>
>>Is the fix appropriate?
>
> Looks right to me, but it is above my pay grade :-)

Thanks, me too :)

I hope that postponing a migration is not a big deal.

  ozaki-r


Re: Race condition between an LWP migration and curlwp_bind

2018-02-13 Thread Christos Zoulas
In article 
,
Ryota Ozaki   wrote:
>
>Is the fix appropriate?

Looks right to me, but it is above my pay grade :-)

christos



Race condition between an LWP migration and curlwp_bind

2018-02-13 Thread Ryota Ozaki
Hi,

I have been encountering the following panic infrequently
during torture tests.

  panic: kernel diagnostic assertion "(psref->psref_cpu == curcpu())"
failed: file "/(hidden)/sys/kern/subr_psref.c", line 317 passive
reference transferred from CPU 0 to CPU 3
  fatal breakpoint trap in supervisor mode
  trap type 1 code 0 rip 0x8021ce28 cs 0x8 rflags 0x206 cr2
0x7f7ff7b0e020 ilevel 0 rsp 0x80003335fb60
  curlwp 0xe40025abc480 pid 15571.1 lowest kstack 0x80003335c2c0
  Stopped in pid 15571.1 (tcpdump) at netbsd:breakpoint+0x10: leave
  db{3}> bt
  breakpoint() at netbsd:breakpoint+0x10
  vpanic() at netbsd:vpanic+0x145
  kern_assert() at netbsd:kern_assert+0x4d
  psref_release() at netbsd:psref_release+0x25c
  doifioctl() at netbsd:doifioctl+0x8ac
  sys_ioctl() at netbsd:sys_ioctl+0x106
  syscall() at netbsd:syscall+0x1f2
  --- syscall (number 54) ---
  7f7ff6b1756a:
  db{3}>

The panic indicates that an LWP that holds a psref is unexpectedly
migrated to another CPU. However, the migration should prevented
by curlwp_bind in doifioctl.

I first thought that something went wrong in an ioctl handler
for example curlwp_bindx was called doubly and LP_BOUND was unset
then the LWP was migrated to another CPU. However, this kind of
assumptions was denied by KASSERTs in psref_release. So I doubted
of LP_BOUND and found there is a race condition on LWP migrations.

curlwp_bind sets LP_BOUND to l_pflags of curlwp and that prevents
curlwp from migrating to another CPU until curlwp_bindx is called.
Meanwhile, there are several ways that an LWP is migrated to
another CPU and in any cases the scheduler postpones a migration
if a target LWP is running. For example the scheduler
periodically explores CPU-hogging LWPs and schedule them to migrate
(see sched_lwp_stats). At that point the scheduler checks LP_BOUND
flag and if it's set to a LWP, the scheduler doesn't schedule the LWP.
A scheduled LWP is migrated when it is leaving a running CPU (*),
i.e., mi_switch. And mi_switch does *NOT* check LP_BOUND flag.

(*) To be exact, sched_lwp_stats sets a CPU to be migrated to
a target LWP and mi_switch does schedule the LWP to be migrated
if a CPU is set to the LWP. Then sched_idle actually migrates
the LWP.

So here is a race condition:
- [CPU#A] An LWP is dispatched and running
- [CPU#B] The scheduler schedules the LWP to be migrated
- [CPU#A] The LWP calls curlwp_bind and sets LP_BOUND
- [CPU#A] The LWP is going to sleep and call mi_switch
- [CPU#A] The LWP is migrated to another CPU regardless
  of LP_BOUND

So this is a fix:
  http://www.netbsd.org/~ozaki-r/fix-LP_BOUND.diff

It checks the LP_BOUND flag of curlwp in mi_switch and
skips a migration if it's set.

Is the fix appropriate?

Thanks,
  ozaki-r