Re: [PATCH -next] fork: silence a false postive warning in __mmdrop

2020-09-08 Thread peterz
On Tue, Sep 08, 2020 at 12:50:44PM -0400, Qian Cai wrote:
> > No, you're talking nonsense. We must not free @mm when
> > 'current->active_mm == mm', never.
> 
> Yes, you are right. It still trigger this below on powerpc with today's
> linux-next by fuzzing for a while (saw a few times on recent linux-next before
> as well but so far mostly reproducible on powerpc here). Any idea?

If you can reliably reproduce this, the next thing is to trace mm_count
and figure out where it goes side-ways. I suppose we're looking for an
'extra' decrement.

Mark tried this for a while but gave up because he couldn't reliably
reproduce.


Re: [PATCH -next] fork: silence a false postive warning in __mmdrop

2020-09-08 Thread Qian Cai
On Wed, Jul 22, 2020 at 03:44:06PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 22, 2020 at 09:19:00AM -0400, Qian Cai wrote:
> > On Wed, Jul 22, 2020 at 12:06:37PM +0200, pet...@infradead.org wrote:
> > > On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> > > > The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> > > > offline CPUs") delayed,
> > > > 
> > > > idle->active_mm = _mm;
> > > > 
> > > > into finish_cpu() instead of idle_task_exit() which results in a false
> > > > positive warning that was originally designed in the commit 3eda69c92d47
> > > > ("kernel/fork.c: detect early free of a live mm").
> > > > 
> > > >  WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> > > >  __mmdrop+0x230/0x2c0
> > > >  do_exit+0x424/0xfa0
> > > >  Call Trace:
> > > >  do_exit+0x424/0xfa0
> > > >  do_group_exit+0x64/0xd0
> > > >  sys_exit_group+0x24/0x30
> > > >  system_call_exception+0x108/0x1d0
> > > >  system_call_common+0xf0/0x278
> > > 
> > > Please explain; because afaict this is a use-after-free.
> > > 
> > > The thing is __mmdrop() is going to actually free the mm, so then what
> > > is finish_cpu()'s mmdrop() going to do?
> > > 
> > > ->active_mm() should have a refcount on the mm.
> > 
> > Well, the refcount issue you mentioned then happens all before bf2c59fce407 
> > was
> > introduced as well, but then it looks harmless because mmdrop() in 
> > finish_cpu()
> > will do,
> > 
> > if (unlikely(atomic_dec_and_test(>mm_count)))
> > __mmdrop(mm);
> 
> That's not harmless, that's a use-after-free. Those can cause memory
> corruption bugs and the like at best. Who knows what's at the location
> of mm->mm_count after we've already freed it.
> 
> > where that atomic_dec_and_test() see the negative refcount and will not 
> > involve
> > __mmdrop() again. It is not clear to me that once the CPU is offline if it
> > needs to care about its idle thread mm_count at all. Even if this refcount
> > issue is finally addressed, it could hit this warning in finish_cpu() 
> > without
> > this patch.
> > 
> > On the other hand, if you look at the commit 3eda69c92d47, it is clearly 
> > that
> > the assumption of,
> > 
> >WARN_ON_ONCE(mm == current->active_mm);
> > 
> > is totally gone due to bf2c59fce407. Thus, the patch is to fix that 
> > discrepancy
> > first and then I'll look at that the imbalance mmdrop()/mmgrab() elsewhere.
> 
> No, you're talking nonsense. We must not free @mm when
> 'current->active_mm == mm', never.

Yes, you are right. It still trigger this below on powerpc with today's
linux-next by fuzzing for a while (saw a few times on recent linux-next before
as well but so far mostly reproducible on powerpc here). Any idea?

[12802.547809][T191552] BUG mm_struct (Tainted: G   O ): Poison 
overwritten
[12802.547824][T191552] 
-
[12802.547824][T191552] 
[12802.547843][T191552] Disabling lock debugging due to kernel taint
[12802.547867][T191552] INFO: 0x0e2a54ec-0x0e2a54ec 
@offset=96464. First byte 0x6a instead of 0x6b
[12802.547889][T191552] INFO: Allocated in dup_mm+0x48/0x6d0 age=955 cpu=108 
pid=191552
[12802.547915][T191552] __slab_alloc+0xa4/0xf0
[12802.547937][T191552] kmem_cache_alloc+0x314/0x4a0
[12802.547959][T191552] dup_mm+0x48/0x6d0
dup_mm at kernel/fork.c:1344
[12802.547978][T191552] copy_process+0x11bc/0x19a0
[12802.548010][T191552] kernel_clone+0x120/0xb80
[12802.548031][T191552] __do_sys_clone+0x88/0xd0
[12802.548055][T191552] system_call_exception+0xf8/0x1d0
[12802.548083][T191552] system_call_common+0xe8/0x218
[12802.548093][T191552] INFO: Freed in __mmdrop+0x144/0x250 age=942 cpu=69 
pid=882503
[12802.548140][T191552] kmem_cache_free+0x47c/0x500
[12802.548161][T191552] __mmdrop+0x144/0x250
__mmdrop at kernel/fork.c:685
[12802.548170][T191552] do_exit+0x3f4/0xed0
[12802.548212][T191552] do_group_exit+0x5c/0xd0
[12802.548244][T191552] sys_exit_group+0x1c/0x20
[12802.548277][T191552] system_call_exception+0xf8/0x1d0
[12802.548309][T191552] system_call_common+0xe8/0x218
[12802.548342][T191552] INFO: Slab 0x48df84af objects=64 used=64 
fp=0x flags=0x87fff810200
[12802.548379][T191552] INFO: Object 0x583c5ba3 @offset=96384 
fp=0x681f5d04
[12802.548379][T191552] 
[12802.548419][T191552] Redzone 4a1ea01e: bb bb bb bb bb bb bb bb bb bb 
bb bb bb bb bb bb  
[12802.548445][T191552] Redzone 37d12952: bb bb bb bb bb bb bb bb bb bb 
bb bb bb bb bb bb  
[12802.548471][T191552] Redzone 8124eae0: bb bb bb bb bb bb bb bb bb bb 
bb bb bb bb bb bb  
[12802.548511][T191552] Redzone 9b782382: bb bb bb bb bb bb bb bb bb bb 
bb bb bb bb bb bb  
[12802.548559][T191552] Redzone 05c781f2: bb bb bb bb bb 

Re: [PATCH -next] fork: silence a false postive warning in __mmdrop

2020-07-22 Thread Peter Zijlstra
On Wed, Jul 22, 2020 at 09:19:00AM -0400, Qian Cai wrote:
> On Wed, Jul 22, 2020 at 12:06:37PM +0200, pet...@infradead.org wrote:
> > On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> > > The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> > > offline CPUs") delayed,
> > > 
> > > idle->active_mm = _mm;
> > > 
> > > into finish_cpu() instead of idle_task_exit() which results in a false
> > > positive warning that was originally designed in the commit 3eda69c92d47
> > > ("kernel/fork.c: detect early free of a live mm").
> > > 
> > >  WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> > >  __mmdrop+0x230/0x2c0
> > >  do_exit+0x424/0xfa0
> > >  Call Trace:
> > >  do_exit+0x424/0xfa0
> > >  do_group_exit+0x64/0xd0
> > >  sys_exit_group+0x24/0x30
> > >  system_call_exception+0x108/0x1d0
> > >  system_call_common+0xf0/0x278
> > 
> > Please explain; because afaict this is a use-after-free.
> > 
> > The thing is __mmdrop() is going to actually free the mm, so then what
> > is finish_cpu()'s mmdrop() going to do?
> > 
> > ->active_mm() should have a refcount on the mm.
> 
> Well, the refcount issue you mentioned then happens all before bf2c59fce407 
> was
> introduced as well, but then it looks harmless because mmdrop() in 
> finish_cpu()
> will do,
> 
>   if (unlikely(atomic_dec_and_test(>mm_count)))
>   __mmdrop(mm);

That's not harmless, that's a use-after-free. Those can cause memory
corruption bugs and the like at best. Who knows what's at the location
of mm->mm_count after we've already freed it.

> where that atomic_dec_and_test() see the negative refcount and will not 
> involve
> __mmdrop() again. It is not clear to me that once the CPU is offline if it
> needs to care about its idle thread mm_count at all. Even if this refcount
> issue is finally addressed, it could hit this warning in finish_cpu() without
> this patch.
> 
> On the other hand, if you look at the commit 3eda69c92d47, it is clearly that
> the assumption of,
> 
>WARN_ON_ONCE(mm == current->active_mm);
> 
> is totally gone due to bf2c59fce407. Thus, the patch is to fix that 
> discrepancy
> first and then I'll look at that the imbalance mmdrop()/mmgrab() elsewhere.

No, you're talking nonsense. We must not free @mm when
'current->active_mm == mm', never.


Re: [PATCH -next] fork: silence a false postive warning in __mmdrop

2020-07-22 Thread Qian Cai
On Wed, Jul 22, 2020 at 12:06:37PM +0200, pet...@infradead.org wrote:
> On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> > The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> > offline CPUs") delayed,
> > 
> > idle->active_mm = _mm;
> > 
> > into finish_cpu() instead of idle_task_exit() which results in a false
> > positive warning that was originally designed in the commit 3eda69c92d47
> > ("kernel/fork.c: detect early free of a live mm").
> > 
> >  WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> >  __mmdrop+0x230/0x2c0
> >  do_exit+0x424/0xfa0
> >  Call Trace:
> >  do_exit+0x424/0xfa0
> >  do_group_exit+0x64/0xd0
> >  sys_exit_group+0x24/0x30
> >  system_call_exception+0x108/0x1d0
> >  system_call_common+0xf0/0x278
> 
> Please explain; because afaict this is a use-after-free.
> 
> The thing is __mmdrop() is going to actually free the mm, so then what
> is finish_cpu()'s mmdrop() going to do?
> 
> ->active_mm() should have a refcount on the mm.

Well, the refcount issue you mentioned then happens all before bf2c59fce407 was
introduced as well, but then it looks harmless because mmdrop() in finish_cpu()
will do,

if (unlikely(atomic_dec_and_test(>mm_count)))
__mmdrop(mm);

where that atomic_dec_and_test() see the negative refcount and will not involve
__mmdrop() again. It is not clear to me that once the CPU is offline if it
needs to care about its idle thread mm_count at all. Even if this refcount
issue is finally addressed, it could hit this warning in finish_cpu() without
this patch.

On the other hand, if you look at the commit 3eda69c92d47, it is clearly that
the assumption of,

   WARN_ON_ONCE(mm == current->active_mm);

is totally gone due to bf2c59fce407. Thus, the patch is to fix that discrepancy
first and then I'll look at that the imbalance mmdrop()/mmgrab() elsewhere.

> 
> > Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
> > Signed-off-by: Qian Cai 
> > ---
> >  kernel/fork.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 142b23645d82..5334efd2a680 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
> >  {
> > BUG_ON(mm == _mm);
> > WARN_ON_ONCE(mm == current->mm);
> > -   WARN_ON_ONCE(mm == current->active_mm);
> > mm_free_pgd(mm);
> > destroy_context(mm);
> > mmu_notifier_subscriptions_destroy(mm);
> > -- 
> > 2.21.0 (Apple Git-122.2)
> > 


Re: [PATCH -next] fork: silence a false postive warning in __mmdrop

2020-07-22 Thread peterz
On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> offline CPUs") delayed,
> 
> idle->active_mm = _mm;
> 
> into finish_cpu() instead of idle_task_exit() which results in a false
> positive warning that was originally designed in the commit 3eda69c92d47
> ("kernel/fork.c: detect early free of a live mm").
> 
>  WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
>  __mmdrop+0x230/0x2c0
>  do_exit+0x424/0xfa0
>  Call Trace:
>  do_exit+0x424/0xfa0
>  do_group_exit+0x64/0xd0
>  sys_exit_group+0x24/0x30
>  system_call_exception+0x108/0x1d0
>  system_call_common+0xf0/0x278

Please explain; because afaict this is a use-after-free.

The thing is __mmdrop() is going to actually free the mm, so then what
is finish_cpu()'s mmdrop() going to do?

->active_mm() should have a refcount on the mm.

> Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
> Signed-off-by: Qian Cai 
> ---
>  kernel/fork.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 142b23645d82..5334efd2a680 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
>  {
>   BUG_ON(mm == _mm);
>   WARN_ON_ONCE(mm == current->mm);
> - WARN_ON_ONCE(mm == current->active_mm);
>   mm_free_pgd(mm);
>   destroy_context(mm);
>   mmu_notifier_subscriptions_destroy(mm);
> -- 
> 2.21.0 (Apple Git-122.2)
> 


Re: [PATCH -next] fork: silence a false postive warning in __mmdrop

2020-07-21 Thread Qian Cai
On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> offline CPUs") delayed,
> 
> idle->active_mm = _mm;
> 
> into finish_cpu() instead of idle_task_exit() which results in a false
> positive warning that was originally designed in the commit 3eda69c92d47
> ("kernel/fork.c: detect early free of a live mm").
> 
>  WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
>  __mmdrop+0x230/0x2c0
>  do_exit+0x424/0xfa0
>  Call Trace:
>  do_exit+0x424/0xfa0
>  do_group_exit+0x64/0xd0
>  sys_exit_group+0x24/0x30
>  system_call_exception+0x108/0x1d0
>  system_call_common+0xf0/0x278
> 
> Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
> Signed-off-by: Qian Cai 

Peter, Andrew, can you take a look at this trivial patch? The offensive commit
is now in the mainline, and it is quite easy to trigger this warning by a
non-root user.

# git clone https://gitlab.com/cailca/linux-mm
# cd linux-mm; make
# ./random -x 0-100 -f (it will switch to use a non-root user.)

[14933.435035][T1176230] [ cut here ]
[14933.435062][T1176230] WARNING: CPU: 25 PID: 1176230 at kernel/fork.c:679 
__mmdrop+0x1e0/0x270
[14933.435086][T1176230] Modules linked in: vfio_pci vfio_virqfd 
vfio_iommu_spapr_tce vfio vfio_spapr_eeh loop kvm_hv kvm ip_tables x_tables 
sd_mod tg3 firmware_class ahci libahci libphy libata dm_mirror dm_region_hash 
dm_log dm_mod
[14933.435176][T1176230] CPU: 25 PID: 1176230 Comm: trinity-c25 Not tainted 
5.8.0-rc6-next-20200721 #5
[14933.435200][T1176230] NIP:  c00c4560 LR: c00d4fb4 CTR: 

[14933.435223][T1176230] REGS: c0001520f910 TRAP: 0700   Not tainted  
(5.8.0-rc6-next-20200721)
[14933.435255][T1176230] MSR:  90029033   CR: 
24002822  XER: 2004
[14933.435294][T1176230] CFAR: c00c43f4 IRQMASK: 0 
[14933.435294][T1176230] GPR00: c00d4fb4 c0001520fba0 
c5911500 c00f6e227080 
[14933.435294][T1176230] GPR04: 0001 f026 
c007aa2a17e0 0001 
[14933.435294][T1176230] GPR08:  c00f6e227080 
0001 bfff 
[14933.435294][T1176230] GPR12: 2000 c00eb780 
10037be0 109ec8e0 
[14933.435294][T1176230] GPR16: 109eccb0 10037ff0 
0001 7fff8d00 
[14933.435294][T1176230] GPR20: 100380d0 100380a8 
1003a1d0 10037f10 
[14933.435294][T1176230] GPR24: 7fff8df100e0 ed54649a 
c00f6e227128 c0001520fcc8 
[14933.435294][T1176230] GPR28: c00fe7540890 c00f6e227080 
c00fe7540080 c00f6e227080 
[14933.435470][T1176230] NIP [c00c4560] __mmdrop+0x1e0/0x270
[14933.435493][T1176230] LR [c00d4fb4] do_exit+0x424/0xee0
[14933.435514][T1176230] Call Trace:
[14933.435524][T1176230] [c0001520fba0] [c00c000164c0] 
0xc00c000164c0 (unreliable)
[14933.435550][T1176230] [c0001520fc60] [c00d4fb4] 
do_exit+0x424/0xee0
[14933.435575][T1176230] [c0001520fd60] [c00d5b2c] 
do_group_exit+0x5c/0xd0
[14933.435600][T1176230] [c0001520fda0] [c00d5bbc] 
sys_exit_group+0x1c/0x20
[14933.435626][T1176230] [c0001520fdc0] [c002c978] 
system_call_exception+0xf8/0x180
[14933.435654][T1176230] [c0001520fe20] [c000c9e8] 
system_call_common+0xe8/0x214
[14933.435686][T1176230] Instruction dump:
[14933.435706][T1176230] 480c35e9 6000 4b68 6000 7c832378 3880 
482a8d81 6000 
[14933.435750][T1176230] 4bfffee8 6000 6000 6000 <0fe0> 
4bfffe94 6000 6000 
[14933.435795][T1176230] irq event stamp: 370444
[14933.435817][T1176230] hardirqs last  enabled at (370443): 
[] __slab_free+0x2e4/0x5e0
[14933.435868][T1176230] hardirqs last disabled at (370444): 
[] program_check_common_virt+0x2bc/0x310
[14933.435913][T1176230] softirqs last  enabled at (352366): 
[] __do_softirq+0x650/0x920
[14933.435948][T1176230] softirqs last disabled at (352359): 
[] irq_exit+0x17c/0x1b0
[14933.435971][T1176230] ---[ end trace 5823e9bcf4dee099 ]---

> ---
>  kernel/fork.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 142b23645d82..5334efd2a680 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
>  {
>   BUG_ON(mm == _mm);
>   WARN_ON_ONCE(mm == current->mm);
> - WARN_ON_ONCE(mm == current->active_mm);
>   mm_free_pgd(mm);
>   destroy_context(mm);
>   mmu_notifier_subscriptions_destroy(mm);
> -- 
> 2.21.0 (Apple Git-122.2)
> 


[PATCH -next] fork: silence a false postive warning in __mmdrop

2020-06-04 Thread Qian Cai
The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
offline CPUs") delayed,

idle->active_mm = _mm;

into finish_cpu() instead of idle_task_exit() which results in a false
positive warning that was originally designed in the commit 3eda69c92d47
("kernel/fork.c: detect early free of a live mm").

 WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
 __mmdrop+0x230/0x2c0
 do_exit+0x424/0xfa0
 Call Trace:
 do_exit+0x424/0xfa0
 do_group_exit+0x64/0xd0
 sys_exit_group+0x24/0x30
 system_call_exception+0x108/0x1d0
 system_call_common+0xf0/0x278

Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
Signed-off-by: Qian Cai 
---
 kernel/fork.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..5334efd2a680 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
 {
BUG_ON(mm == _mm);
WARN_ON_ONCE(mm == current->mm);
-   WARN_ON_ONCE(mm == current->active_mm);
mm_free_pgd(mm);
destroy_context(mm);
mmu_notifier_subscriptions_destroy(mm);
-- 
2.21.0 (Apple Git-122.2)