Re: [PATCH 0/2] (Was: BUG_ON in rcu_sync_func triggered)

2016-10-13 Thread Oleg Nesterov
On 09/26, Oleg Nesterov wrote:
>
> Hello,
>
> The patches do not depend on each other.
Yes,
> 1/2 is the trivial fix, imo -stable material. The bug is very old it seems,
> but today this race (leading to unbalanced unlock) manifests itself via
> mysterious BUG_ON's in rcu/sync.c.
Yes. Al,

[PATCH 0/2] (Was: BUG_ON in rcu_sync_func triggered)

2016-09-26 Thread Oleg Nesterov
Hello, The patches do not depend on each other. 1/2 is the trivial fix, imo -stable material. The bug is very old it seems, but today this race (leading to unbalanced unlock) manifests itself via mysterious BUG_ON's in rcu/sync.c. 2/2 is old, I forgot to send it before. It was already reviewed

Re: BUG_ON in rcu_sync_func triggered

2016-09-23 Thread Oleg Nesterov
On 09/23, Nikolay Borisov wrote:
>
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -1344,7 +1344,9 @@ int thaw_super(struct super_block *sb)
> >         int error;
> >
> >         down_write(&sb->s_umount);
> > -       if (sb->s_writers.frozen == SB_UNFROZEN) {
> > +       if (sb->s_writers.frozen !=

Re: BUG_ON in rcu_sync_func triggered

2016-09-23 Thread Nikolay Borisov
On Wed, Sep 14, 2016 at 3:58 PM, Oleg Nesterov wrote:
> On 09/14, Nikolay Borisov wrote:
>>
>> [ 557.006656] [] dump_stack+0x6b/0xa0
>> [ 557.012737] [] warn_slowpath_common+0x95/0xe0
>> [ 557.019781] [] warn_slowpath_null+0x1a/0x20
>> [ 557.026645] [] rcu_sync_enter+0x148/0x1a0
>> [

Re: BUG_ON in rcu_sync_func triggered

2016-09-14 Thread Oleg Nesterov
On 09/14, Nikolay Borisov wrote:
>
> [ 557.006656] [] dump_stack+0x6b/0xa0
> [ 557.012737] [] warn_slowpath_common+0x95/0xe0
> [ 557.019781] [] warn_slowpath_null+0x1a/0x20
> [ 557.026645] [] rcu_sync_enter+0x148/0x1a0
> [ 557.033309] [] percpu_down_write+0x1e/0xf0
> [ 557.040074] [] ?

Re: BUG_ON in rcu_sync_func triggered

2016-09-14 Thread Nikolay Borisov
On 09/13/2016 06:20 PM, Oleg Nesterov wrote:
> On 09/13, Nikolay Borisov wrote:
>>
>> On 09/13/2016 05:35 PM, Nikolay Borisov wrote:
>>>
>>> On 09/13/2016 04:43 PM, Oleg Nesterov wrote:
>>>> On 09/13, Oleg Nesterov wrote:
>>>>>
>>>>> OK... perhaps the unbalanced up_write... I'll try to look at

Re: BUG_ON in rcu_sync_func triggered

2016-09-13 Thread Nikolay Borisov
On 09/13/2016 05:38 PM, Nikolay Borisov wrote:
>
>
> On 09/13/2016 05:35 PM, Nikolay Borisov wrote:
>>
>>
>> On 09/13/2016 04:43 PM, Oleg Nesterov wrote:
>>> On 09/13, Oleg Nesterov wrote:
>>>> OK... perhaps the unbalanced up_write... I'll try to look at freeze/thaw code,
>>>
>>>

Re: BUG_ON in rcu_sync_func triggered

2016-09-13 Thread Oleg Nesterov
On 09/13, Nikolay Borisov wrote:
>
> On 09/13/2016 05:35 PM, Nikolay Borisov wrote:
> >
> > On 09/13/2016 04:43 PM, Oleg Nesterov wrote:
> >> On 09/13, Oleg Nesterov wrote:
> >>>
> >>> OK... perhaps the unbalanced up_write... I'll try to look at freeze/thaw
> >>> code,
> >>
> >> Heh, yes, it

Re: BUG_ON in rcu_sync_func triggered

2016-09-13 Thread Nikolay Borisov
On 09/13/2016 04:43 PM, Oleg Nesterov wrote:
> On 09/13, Oleg Nesterov wrote:
>>
>> OK... perhaps the unbalanced up_write... I'll try to look at freeze/thaw
>> code,
>
> Heh, yes, it looks racy or I am totally confused.
>
>> could test the debugging patch below meanwhile?
>
> Yes please.

Re: BUG_ON in rcu_sync_func triggered

2016-09-13 Thread Nikolay Borisov
On 09/13/2016 05:35 PM, Nikolay Borisov wrote:
>
>
> On 09/13/2016 04:43 PM, Oleg Nesterov wrote:
>> On 09/13, Oleg Nesterov wrote:
>>>
>>> OK... perhaps the unbalanced up_write... I'll try to look at freeze/thaw
>>> code,
>>
>> Heh, yes, it looks racy or I am totally confused.
>>
>>> could

Re: BUG_ON in rcu_sync_func triggered

2016-09-13 Thread Oleg Nesterov
On 09/13, Oleg Nesterov wrote:
>
> OK... perhaps the unbalanced up_write... I'll try to look at freeze/thaw
> code,
Heh, yes, it looks racy or I am totally confused.
> could test the debugging patch below meanwhile?
Yes please. I'll send you another patch (hopefully fix) later, but it would be

Re: BUG_ON in rcu_sync_func triggered

2016-09-13 Thread Oleg Nesterov
On 09/13, Nikolay Borisov wrote:
>
>
> I just re-run the test with kernel 4.4.14 and PROVE_RCU and DEBUG_RCU_OBJECTS
> enabled and here is what I got:
Thanks again!
Damn. This reminds me that I forgot to send the patch which reworks rcu/sync.c. Will do this week. But we need to investigate

Re: BUG_ON in rcu_sync_func triggered

2016-09-13 Thread Nikolay Borisov
On 09/12/2016 04:01 PM, Oleg Nesterov wrote:
> Hi Nikolay,
>
[SNIP..]
>>
>>
>> The bug on in question is this: BUG_ON(rsp->gp_state != GP_PASSED);
>>
>> Have you seen something like that before - the kernel is fairly old 4.4.2,
>
> No... thanks, I'll try to look tomorrow.
I just re-run the

Re: BUG_ON in rcu_sync_func triggered

2016-09-12 Thread Oleg Nesterov
Hi Nikolay,
On 09/12, Nikolay Borisov wrote:
>
> [ 2213.610208] [ cut here ]
> [ 2213.614243] kernel BUG at kernel/rcu/sync.c:152!
> [ 2213.618270] invalid opcode: [#1] SMP
> [ 2213.696629] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.4.2-clouder2 #1
> [ 2213.702891]
