Re: [PATCH 4/4] PPPoE: race between interface going down and release()
From: Florian Zumbiehl <[EMAIL PROTECTED]> Date: Sat, 21 Apr 2007 01:44:05 +0200 > > This patch doesn't apply becuase in the actual pppoe.c code: > > [...] > > > So I'm having trouble figuring out what tree you generated > > that patch against :-) Perhaps there was an earlier patch > > I missed or something. > > > > But I won't second guess and leave it to you to let me know > > what I should actually apply. > > That one was against my first try of a fix. So, either you apply that > one first and then remove it again by applying that patch ;-) - or just > ignore those four patches completely and apply > <[EMAIL PROTECTED]> ff instead. I figured out what happened and put in the 4-patch set later in my backlog, thanks! - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] PPPoE: race between interface going down and release()
Hi, > This patch doesn't apply becuase in the actual pppoe.c code: [...] > So I'm having trouble figuring out what tree you generated > that patch against :-) Perhaps there was an earlier patch > I missed or something. > > But I won't second guess and leave it to you to let me know > what I should actually apply. That one was against my first try of a fix. So, either you apply that one first and then remove it again by applying that patch ;-) - or just ignore those four patches completely and apply <[EMAIL PROTECTED]> ff instead. Florian - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] PPPoE: race between interface going down and release()
From: Michal Ostrowski <[EMAIL PROTECTED]> Date: Sun, 11 Mar 2007 21:36:56 -0500 > Attached below is my take on how to address this problem. > This addresses any concerns you may have had about checking > po->pppoe_dev==NULL, > because accesses to this field are now synchronized with pppoe_hash_lock. > > Once we can settle on a fix for this, I'll deal with the SID==0 issue > (trying to do that now would just cause patch conflicts). This patch doesn't apply becuase in the actual pppoe.c code: > - read_lock_bh(&pppoe_hash_lock); > + write_lock_bh(&pppoe_hash_lock); > for (hash = 0; hash < PPPOE_HASH_SIZE; hash++) { > struct pppox_sock *po = item_hash_table[hash]; > > while (po != NULL) { > - if (po->pppoe_dev == dev) { > - struct sock *sk = sk_pppox(po); > - > - sock_hold(sk); > - > - /* We hold a reference to SK, now drop the > - * hash table lock so that we may attempt > - * to lock the socket (which can sleep). > - */ That code doesn't look like that. At the end, instead there is: sock_hold(sk); po->pppoe_dev = NULL; and that goes back all the way to 2.6.12 and beyond before we moved over to GIT. So I'm having trouble figuring out what tree you generated that patch against :-) Perhaps there was an earlier patch I missed or something. But I won't second guess and leave it to you to let me know what I should actually apply. Thanks. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] PPPoE: race between interface going down and release()
Hi, > Attached below is my take on how to address this problem. > This addresses any concerns you may have had about checking > po->pppoe_dev==NULL, > because accesses to this field are now synchronized with pppoe_hash_lock. That indeed looks like a much cleaner solution, so I'd certainly prefer it over mine. Testing so far didn't turn up any problems, either. > Once we can settle on a fix for this, I'll deal with the SID==0 issue > (trying to do that now would just cause patch conflicts). Well, yeah, this stuff probably is more important than support for SID 0 :-) Florian - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] PPPoE: race between interface going down and release()
Attached below is my take on how to address this problem. This addresses any concerns you may have had about checking po->pppoe_dev==NULL, because accesses to this field are now synchronized with pppoe_hash_lock. Once we can settle on a fix for this, I'll deal with the SID==0 issue (trying to do that now would just cause patch conflicts). -- Michal Ostrowski <[EMAIL PROTECTED]> [PATCH] PPPOE Fix device tear-down notification. pppoe_flush_dev() kicks all sockets bound to a device that is going down. In doing so, locks must be taken in the right order consistently (sock lock, followed by the pppoe_hash_lock). However, the scan process is based on us holding the sock lock. So, when something is found in the scan we must release the lock we're holding and grab the sock lock. This patch fixes race conditions between this code and pppoe_release(), both of which perform similar functions but would naturally prefer to grab locks in opposing orders. Both code paths are now going after these locks in a consistent manner. pppoe_hash_lock protects the contents of the "pppox_sock" objects that reside inside the hash. Thus, NULL'ing out the pppoe_dev field should be done under the protection of this lock. Signed-off-by: Michal Ostrowski <[EMAIL PROTECTED]> --- drivers/net/pppoe.c | 93 +++--- 1 files changed, 50 insertions(+), 43 deletions(-) diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c index e097688..0961bf9 100644 --- a/drivers/net/pppoe.c +++ b/drivers/net/pppoe.c @@ -241,56 +241,53 @@ static inline struct pppox_sock *delete_item(unsigned long sid, char *addr, int static void pppoe_flush_dev(struct net_device *dev) { int hash; - BUG_ON(dev == NULL); - read_lock_bh(&pppoe_hash_lock); + write_lock_bh(&pppoe_hash_lock); for (hash = 0; hash < PPPOE_HASH_SIZE; hash++) { struct pppox_sock *po = item_hash_table[hash]; while (po != NULL) { - if (po->pppoe_dev == dev) { - struct sock *sk = sk_pppox(po); - - sock_hold(sk); - - /* We hold a reference to SK, now drop the -* hash table lock so that we may attempt -* to lock the socket (which can sleep). -*/ - read_unlock_bh(&pppoe_hash_lock); - - lock_sock(sk); - - if(po->pppoe_dev==dev){ - dev_put(dev); - po->pppoe_dev = NULL; - if (sk->sk_state & - (PPPOX_CONNECTED | PPPOX_BOUND)) { - pppox_unbind_sock(sk); - sk->sk_state = PPPOX_ZOMBIE; - sk->sk_state_change(sk); - } - } - - release_sock(sk); - - sock_put(sk); - - read_lock_bh(&pppoe_hash_lock); - - /* Now restart from the beginning of this -* hash chain. We always NULL out pppoe_dev -* so we are guaranteed to make forward -* progress. -*/ - po = item_hash_table[hash]; + struct sock *sk = sk_pppox(po); + if (po->pppoe_dev != dev) { + po = po->next; continue; } - po = po->next; + po->pppoe_dev = NULL; + dev_put(dev); + + + /* We always grab the socket lock, followed by the +* pppoe_hash_lock, in that order. Since we should +* hold the sock lock while doing any unbinding, +* we need to release the lock we're holding. +* Hold a reference to the sock so it doesn't disappear +* as we're jumping between locks. +*/ + + sock_hold(sk); + + write_unlock_bh(&pppoe_hash_lock); + lock_sock(sk); + + if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) { + pppox_unbind_sock(sk); + sk->sk_state = PPPOX_ZOMBIE; + sk->sk_state_change(sk); + } + + release_sock(sk); +
Re: [PATCH 4/4] PPPoE: race between interface going down and release()
I need some more time to think about this one; there are some problem I'm seeing here but I can't look at it right now. I'll send out my version of a patch for this tomorrow and we can discuss more. Regarding the previous three patches, they seem fine after a first pass. However, I'd like to ask you to please use a "Signed-off-by" statement. That why I can ack it and push it upstream without hesitation. -- Michal Ostrowski <[EMAIL PROTECTED]> On Sun, 2007-03-11 at 05:41 +0100, Florian Zumbiehl wrote: > Hi, > > below you find the last patch for now. It (hopefully) fixes a race > between a socket being release()d and the interface it's using going > down. As pppoe_release() didn't lock the socket, and pppoe_flush_dev() > did the locking in the wrong place, pppoe_flush_dev() could set > po->pppoe_dev to NULL, which would then cause pppoe_release() to not > dev_put() the interface, but to still mark the socket as DEAD, > which in turn would cause pppoe_flush_dev() to not dev_put() the > interface, effectively leaking one reference to the device, thus making it > impossible to remove (ignoring the possibility of overflowing the reference > counter by repeated use of this race ;-). > > The thing I'm not quite sure about is whether the "outer" > > |if (po->pppoe_dev == dev) { > > is actually reliable this way on SMP systems, as far as cache consistency > is concerned. I left it that way for now, as any alternative locking > strategies that would lock the socket before doing this comparison seemed > to be pretty complicated to implement because of the need to drop the > hash table lock before trying to acquire the socket lock, so I'd rather > be sure that this actually is a problem before I try to solve it ;-) > > Florian > > --- > diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c > index 1aeac2c..f5abfff 100644 > --- a/drivers/net/pppoe.c > +++ b/drivers/net/pppoe.c > @@ -253,7 +253,6 @@ static void pppoe_flush_dev(struct net_device *dev) > struct sock *sk = sk_pppox(po); > > sock_hold(sk); > - po->pppoe_dev = NULL; > > /* We hold a reference to SK, now drop the >* hash table lock so that we may attempt > @@ -263,12 +262,15 @@ static void pppoe_flush_dev(struct net_device *dev) > > lock_sock(sk); > > - if (sk->sk_state & > - (PPPOX_CONNECTED | PPPOX_BOUND)) { > - pppox_unbind_sock(sk); > + if(po->pppoe_dev==dev){ > dev_put(dev); > - sk->sk_state = PPPOX_ZOMBIE; > - sk->sk_state_change(sk); > + po->pppoe_dev = NULL; > + if (sk->sk_state & > + (PPPOX_CONNECTED | PPPOX_BOUND)) { > + pppox_unbind_sock(sk); > + sk->sk_state = PPPOX_ZOMBIE; > + sk->sk_state_change(sk); > + } > } > > release_sock(sk); > @@ -504,8 +506,11 @@ static int pppoe_release(struct socket *sock) > if (!sk) > return 0; > > - if (sock_flag(sk, SOCK_DEAD)) > + lock_sock(sk); > + if (sock_flag(sk, SOCK_DEAD)){ > + release_sock(sk); > return -EBADF; > + } > > pppox_unbind_sock(sk); > > @@ -526,6 +531,7 @@ static int pppoe_release(struct socket *sock) > sock->sk = NULL; > > skb_queue_purge(&sk->sk_receive_queue); > + release_sock(sk); > sock_put(sk); > > return 0; > - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/4] PPPoE: race between interface going down and release()
Hi, below you find the last patch for now. It (hopefully) fixes a race between a socket being release()d and the interface it's using going down. As pppoe_release() didn't lock the socket, and pppoe_flush_dev() did the locking in the wrong place, pppoe_flush_dev() could set po->pppoe_dev to NULL, which would then cause pppoe_release() to not dev_put() the interface, but to still mark the socket as DEAD, which in turn would cause pppoe_flush_dev() to not dev_put() the interface, effectively leaking one reference to the device, thus making it impossible to remove (ignoring the possibility of overflowing the reference counter by repeated use of this race ;-). The thing I'm not quite sure about is whether the "outer" |if (po->pppoe_dev == dev) { is actually reliable this way on SMP systems, as far as cache consistency is concerned. I left it that way for now, as any alternative locking strategies that would lock the socket before doing this comparison seemed to be pretty complicated to implement because of the need to drop the hash table lock before trying to acquire the socket lock, so I'd rather be sure that this actually is a problem before I try to solve it ;-) Florian --- diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c index 1aeac2c..f5abfff 100644 --- a/drivers/net/pppoe.c +++ b/drivers/net/pppoe.c @@ -253,7 +253,6 @@ static void pppoe_flush_dev(struct net_device *dev) struct sock *sk = sk_pppox(po); sock_hold(sk); - po->pppoe_dev = NULL; /* We hold a reference to SK, now drop the * hash table lock so that we may attempt @@ -263,12 +262,15 @@ static void pppoe_flush_dev(struct net_device *dev) lock_sock(sk); - if (sk->sk_state & - (PPPOX_CONNECTED | PPPOX_BOUND)) { - pppox_unbind_sock(sk); + if(po->pppoe_dev==dev){ dev_put(dev); - sk->sk_state = PPPOX_ZOMBIE; - sk->sk_state_change(sk); + po->pppoe_dev = NULL; + if (sk->sk_state & + (PPPOX_CONNECTED | PPPOX_BOUND)) { + pppox_unbind_sock(sk); + sk->sk_state = PPPOX_ZOMBIE; + sk->sk_state_change(sk); + } } release_sock(sk); @@ -504,8 +506,11 @@ static int pppoe_release(struct socket *sock) if (!sk) return 0; - if (sock_flag(sk, SOCK_DEAD)) + lock_sock(sk); + if (sock_flag(sk, SOCK_DEAD)){ + release_sock(sk); return -EBADF; + } pppox_unbind_sock(sk); @@ -526,6 +531,7 @@ static int pppoe_release(struct socket *sock) sock->sk = NULL; skb_queue_purge(&sk->sk_receive_queue); + release_sock(sk); sock_put(sk); return 0; - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html