Re: suspicious RCU usage (netlink/rhashtable)
On Tue, Dec 22, 2015 at 04:50:20PM -0500, David Miller wrote: > > > > Simple fix is below. Though, I don't understand the history of the > > > > multiple locks in this structure to be sure it's correct. I'll send > > > > it as a formal patch. Please reject if it's not the right approach. > > > > > > > > diff --git a/lib/rhashtable.c b/lib/rhashtable.c > > > > index 1c149e9..cc80870 100644 > > > > --- a/lib/rhashtable.c > > > > +++ b/lib/rhashtable.c > > > > @@ -516,7 +516,8 @@ int rhashtable_walk_init(struct rhashtable *ht, > > > > struct rhashtable_iter *iter) > > > > return -ENOMEM; > > > > > > > > spin_lock(&ht->lock); > > > > - iter->walker->tbl = rht_dereference(ht->tbl, ht); > > > > + iter->walker->tbl = > > > > + rcu_dereference_protected(ht->tbl, > > lockdep_is_held(&ht->lock)); > > > > list_add(&iter->walker->list, &iter->walker->tbl->walkers); > > > > spin_unlock(&ht->lock); > > > > > > How can this be the "fix"? That's exactly what's in the tree. > > > > I should have made clear, this is Linus' tree I'm hitting this on, > > which matches what Craig posted. > > Ok, so this should be fixed in my 'net' tree and I'll send that to Linus > soon. Great, thanks Dave. Sorry for the fire-alarm :) Dave -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
From: Dave Jones Date: Tue, 22 Dec 2015 16:47:34 -0500 > On Tue, Dec 22, 2015 at 04:42:25PM -0500, David Miller wrote: > > From: Craig Gallek > > Date: Tue, 22 Dec 2015 16:38:32 -0500 > > > > > On Tue, Dec 22, 2015 at 4:28 PM, David Miller > wrote: > > >> From: Craig Gallek > > >> Date: Tue, 22 Dec 2015 15:51:19 -0500 > > >> > > >>> I was actually just looking at this as well (though a slightly > > >>> different stack). The issue is with: c6ff5268293e rhashtable: Fix > > >>> walker list corruption > > >>> > > >>> It changed the lock acquired in rhashtable_walk_init to use the new > > >>> spinlock, but the rht_dereference macro expects the mutex. I was > > >>> still trying to track down which repository this change came in > > >>> through, though... > > >> > > >> Both cam via my networking tree. > > > Simple fix is below. Though, I don't understand the history of the > > > multiple locks in this structure to be sure it's correct. I'll send > > > it as a formal patch. Please reject if it's not the right approach. > > > > > > diff --git a/lib/rhashtable.c b/lib/rhashtable.c > > > index 1c149e9..cc80870 100644 > > > --- a/lib/rhashtable.c > > > +++ b/lib/rhashtable.c > > > @@ -516,7 +516,8 @@ int rhashtable_walk_init(struct rhashtable *ht, > > > struct rhashtable_iter *iter) > > > return -ENOMEM; > > > > > > spin_lock(&ht->lock); > > > - iter->walker->tbl = rht_dereference(ht->tbl, ht); > > > + iter->walker->tbl = > > > + rcu_dereference_protected(ht->tbl, > lockdep_is_held(&ht->lock)); > > > list_add(&iter->walker->list, &iter->walker->tbl->walkers); > > > spin_unlock(&ht->lock); > > > > How can this be the "fix"? That's exactly what's in the tree. > > I should have made clear, this is Linus' tree I'm hitting this on, > which matches what Craig posted. Ok, so this should be fixed in my 'net' tree and I'll send that to Linus soon. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
On Tue, Dec 22, 2015 at 04:42:25PM -0500, David Miller wrote: > From: Craig Gallek > Date: Tue, 22 Dec 2015 16:38:32 -0500 > > > On Tue, Dec 22, 2015 at 4:28 PM, David Miller wrote: > >> From: Craig Gallek > >> Date: Tue, 22 Dec 2015 15:51:19 -0500 > >> > >>> I was actually just looking at this as well (though a slightly > >>> different stack). The issue is with: c6ff5268293e rhashtable: Fix > >>> walker list corruption > >>> > >>> It changed the lock acquired in rhashtable_walk_init to use the new > >>> spinlock, but the rht_dereference macro expects the mutex. I was > >>> still trying to track down which repository this change came in > >>> through, though... > >> > >> Both cam via my networking tree. > > Simple fix is below. Though, I don't understand the history of the > > multiple locks in this structure to be sure it's correct. I'll send > > it as a formal patch. Please reject if it's not the right approach. > > > > diff --git a/lib/rhashtable.c b/lib/rhashtable.c > > index 1c149e9..cc80870 100644 > > --- a/lib/rhashtable.c > > +++ b/lib/rhashtable.c > > @@ -516,7 +516,8 @@ int rhashtable_walk_init(struct rhashtable *ht, > > struct rhashtable_iter *iter) > > return -ENOMEM; > > > > spin_lock(&ht->lock); > > - iter->walker->tbl = rht_dereference(ht->tbl, ht); > > + iter->walker->tbl = > > + rcu_dereference_protected(ht->tbl, > > lockdep_is_held(&ht->lock)); > > list_add(&iter->walker->list, &iter->walker->tbl->walkers); > > spin_unlock(&ht->lock); > > How can this be the "fix"? That's exactly what's in the tree. I should have made clear, this is Linus' tree I'm hitting this on, which matches what Craig posted. Dave -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
On Tue, Dec 22, 2015 at 4:42 PM, David Miller wrote: > From: Craig Gallek > Date: Tue, 22 Dec 2015 16:38:32 -0500 > >> On Tue, Dec 22, 2015 at 4:28 PM, David Miller wrote: >>> From: Craig Gallek >>> Date: Tue, 22 Dec 2015 15:51:19 -0500 >>> I was actually just looking at this as well (though a slightly different stack). The issue is with: c6ff5268293e rhashtable: Fix walker list corruption It changed the lock acquired in rhashtable_walk_init to use the new spinlock, but the rht_dereference macro expects the mutex. I was still trying to track down which repository this change came in through, though... >>> >>> Both cam via my networking tree. >> Simple fix is below. Though, I don't understand the history of the >> multiple locks in this structure to be sure it's correct. I'll send >> it as a formal patch. Please reject if it's not the right approach. >> >> diff --git a/lib/rhashtable.c b/lib/rhashtable.c >> index 1c149e9..cc80870 100644 >> --- a/lib/rhashtable.c >> +++ b/lib/rhashtable.c >> @@ -516,7 +516,8 @@ int rhashtable_walk_init(struct rhashtable *ht, >> struct rhashtable_iter *iter) >> return -ENOMEM; >> >> spin_lock(&ht->lock); >> - iter->walker->tbl = rht_dereference(ht->tbl, ht); >> + iter->walker->tbl = >> + rcu_dereference_protected(ht->tbl, >> lockdep_is_held(&ht->lock)); >> list_add(&iter->walker->list, &iter->walker->tbl->walkers); >> spin_unlock(&ht->lock); > > How can this be the "fix"? That's exactly what's in the tree. Ah, you're right, this fix was submitted to next in 179ccc0a7364 but hasn't made it into net-next yet. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
From: Craig Gallek Date: Tue, 22 Dec 2015 16:38:32 -0500 > On Tue, Dec 22, 2015 at 4:28 PM, David Miller wrote: >> From: Craig Gallek >> Date: Tue, 22 Dec 2015 15:51:19 -0500 >> >>> I was actually just looking at this as well (though a slightly >>> different stack). The issue is with: c6ff5268293e rhashtable: Fix >>> walker list corruption >>> >>> It changed the lock acquired in rhashtable_walk_init to use the new >>> spinlock, but the rht_dereference macro expects the mutex. I was >>> still trying to track down which repository this change came in >>> through, though... >> >> Both cam via my networking tree. > Simple fix is below. Though, I don't understand the history of the > multiple locks in this structure to be sure it's correct. I'll send > it as a formal patch. Please reject if it's not the right approach. > > diff --git a/lib/rhashtable.c b/lib/rhashtable.c > index 1c149e9..cc80870 100644 > --- a/lib/rhashtable.c > +++ b/lib/rhashtable.c > @@ -516,7 +516,8 @@ int rhashtable_walk_init(struct rhashtable *ht, > struct rhashtable_iter *iter) > return -ENOMEM; > > spin_lock(&ht->lock); > - iter->walker->tbl = rht_dereference(ht->tbl, ht); > + iter->walker->tbl = > + rcu_dereference_protected(ht->tbl, > lockdep_is_held(&ht->lock)); > list_add(&iter->walker->list, &iter->walker->tbl->walkers); > spin_unlock(&ht->lock); How can this be the "fix"? That's exactly what's in the tree. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
On Tue, Dec 22, 2015 at 4:28 PM, David Miller wrote: > From: Craig Gallek > Date: Tue, 22 Dec 2015 15:51:19 -0500 > >> I was actually just looking at this as well (though a slightly >> different stack). The issue is with: c6ff5268293e rhashtable: Fix >> walker list corruption >> >> It changed the lock acquired in rhashtable_walk_init to use the new >> spinlock, but the rht_dereference macro expects the mutex. I was >> still trying to track down which repository this change came in >> through, though... > > Both cam via my networking tree. Simple fix is below. Though, I don't understand the history of the multiple locks in this structure to be sure it's correct. I'll send it as a formal patch. Please reject if it's not the right approach. diff --git a/lib/rhashtable.c b/lib/rhashtable.c index 1c149e9..cc80870 100644 --- a/lib/rhashtable.c +++ b/lib/rhashtable.c @@ -516,7 +516,8 @@ int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter) return -ENOMEM; spin_lock(&ht->lock); - iter->walker->tbl = rht_dereference(ht->tbl, ht); + iter->walker->tbl = + rcu_dereference_protected(ht->tbl, lockdep_is_held(&ht->lock)); list_add(&iter->walker->list, &iter->walker->tbl->walkers); spin_unlock(&ht->lock); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
From: Craig Gallek Date: Tue, 22 Dec 2015 15:51:19 -0500 > I was actually just looking at this as well (though a slightly > different stack). The issue is with: c6ff5268293e rhashtable: Fix > walker list corruption > > It changed the lock acquired in rhashtable_walk_init to use the new > spinlock, but the rht_dereference macro expects the mutex. I was > still trying to track down which repository this change came in > through, though... Both cam via my networking tree. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
On Tue, Dec 22, 2015 at 3:45 PM, Dave Jones wrote: > === > [ INFO: suspicious RCU usage. ] > 4.4.0-rc6-think+ #1 Not tainted > --- > lib/rhashtable.c:522 suspicious rcu_dereference_protected() usage! > > other info that might help us debug this: > > > rcu_scheduler_active = 1, debug_locks = 0 > 2 locks held by trinity-c1/3652: > #0: (&p->lock){+.+.+.}, at: [] seq_read+0xd7/0x900 > #1: (&(&ht->lock)->rlock){+.+...}, at: [] > rhashtable_walk_init+0x9d/0x170 > > stack backtrace: > CPU: 0 PID: 3652 Comm: trinity-c1 Not tainted 4.4.0-rc6-think+ #1 > 9af6ac60 3fc014d4 8800cff779e0 9a548da1 > 880459b8b700 8800cff77a10 9a131068 8800cdd32c48 > 880464af8000 8800cdd32c58 880464af8160 8800cff77a48 > Call Trace: > [] dump_stack+0x4e/0x7d > [] lockdep_rcu_suspicious+0xf8/0x110 > [] rhashtable_walk_init+0x163/0x170 > [] netlink_walk_start+0x49/0x90 > [] netlink_seq_start+0x40/0x90 > [] seq_read+0x1bf/0x900 > [] ? seq_lseek+0x1b0/0x1b0 > [] ? __might_fault+0xe0/0xf0 > [] ? __might_fault+0x87/0xf0 > [] ? rw_copy_check_uvector+0x139/0x170 > [] proc_reg_read+0x7f/0xc0 > [] do_loop_readv_writev+0xe0/0x110 > [] ? proc_reg_write+0xc0/0xc0 > [] do_readv_writev+0x38b/0x3c0 > [] ? proc_reg_write+0xc0/0xc0 > [] ? vfs_write+0x260/0x260 > [] ? __lock_is_held+0x25/0xd0 > [] ? mark_held_locks+0x23/0xc0 > [] ? context_tracking_exit.part.5+0x2a/0x50 > [] ? trace_hardirqs_on_caller+0x186/0x280 > [] ? trace_hardirqs_on+0xd/0x10 > [] vfs_readv+0x56/0x70 > [] SyS_preadv+0x15d/0x180 > [] ? SyS_writev+0x1a0/0x1a0 > [] ? trace_hardirqs_on_thunk+0x17/0x19 > [] entry_SYSCALL_64_fastpath+0x12/0x6b I was actually just looking at this as well (though a slightly different stack). The issue is with: c6ff5268293e rhashtable: Fix walker list corruption It changed the lock acquired in rhashtable_walk_init to use the new spinlock, but the rht_dereference macro expects the mutex. I was still trying to track down which repository this change came in through, though... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: suspicious RCU usage (netlink/rhashtable)
From: Dave Jones Date: Tue, 22 Dec 2015 15:45:39 -0500 > === > [ INFO: suspicious RCU usage. ] > 4.4.0-rc6-think+ #1 Not tainted > --- > lib/rhashtable.c:522 suspicious rcu_dereference_protected() usage! > > other info that might help us debug this: > > > rcu_scheduler_active = 1, debug_locks = 0 > 2 locks held by trinity-c1/3652: > #0: (&p->lock){+.+.+.}, at: [] seq_read+0xd7/0x900 > #1: (&(&ht->lock)->rlock){+.+...}, at: [] > rhashtable_walk_init+0x9d/0x170 I'm so confused, the code reads: spin_lock(&ht->lock); iter->walker->tbl = rcu_dereference_protected(ht->tbl, lockdep_is_held(&ht->lock)); ?!?!?! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
suspicious RCU usage (netlink/rhashtable)
=== [ INFO: suspicious RCU usage. ] 4.4.0-rc6-think+ #1 Not tainted --- lib/rhashtable.c:522 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by trinity-c1/3652: #0: (&p->lock){+.+.+.}, at: [] seq_read+0xd7/0x900 #1: (&(&ht->lock)->rlock){+.+...}, at: [] rhashtable_walk_init+0x9d/0x170 stack backtrace: CPU: 0 PID: 3652 Comm: trinity-c1 Not tainted 4.4.0-rc6-think+ #1 9af6ac60 3fc014d4 8800cff779e0 9a548da1 880459b8b700 8800cff77a10 9a131068 8800cdd32c48 880464af8000 8800cdd32c58 880464af8160 8800cff77a48 Call Trace: [] dump_stack+0x4e/0x7d [] lockdep_rcu_suspicious+0xf8/0x110 [] rhashtable_walk_init+0x163/0x170 [] netlink_walk_start+0x49/0x90 [] netlink_seq_start+0x40/0x90 [] seq_read+0x1bf/0x900 [] ? seq_lseek+0x1b0/0x1b0 [] ? __might_fault+0xe0/0xf0 [] ? __might_fault+0x87/0xf0 [] ? rw_copy_check_uvector+0x139/0x170 [] proc_reg_read+0x7f/0xc0 [] do_loop_readv_writev+0xe0/0x110 [] ? proc_reg_write+0xc0/0xc0 [] do_readv_writev+0x38b/0x3c0 [] ? proc_reg_write+0xc0/0xc0 [] ? vfs_write+0x260/0x260 [] ? __lock_is_held+0x25/0xd0 [] ? mark_held_locks+0x23/0xc0 [] ? context_tracking_exit.part.5+0x2a/0x50 [] ? trace_hardirqs_on_caller+0x186/0x280 [] ? trace_hardirqs_on+0xd/0x10 [] vfs_readv+0x56/0x70 [] SyS_preadv+0x15d/0x180 [] ? SyS_writev+0x1a0/0x1a0 [] ? trace_hardirqs_on_thunk+0x17/0x19 [] entry_SYSCALL_64_fastpath+0x12/0x6b -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html