Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 12:02 PM, Hannes Frederic Sowa wrote: > > If fixes tag is well researched, it won't point to the addition of > ASSERT_RTNL() but your patch would help to discover a bug somewhere else > in the stack. > > I think for this patch a fixes-tag is hard to find because it is hard to > find because it dates back to the beginning of the git history IMHO. > As I replied to Eric, this warning is probably caused by the following commit: commit c15b1ccadb323ea50023e8f1cca2954129a62b51 Author: Hannes Frederic Sowa Date: Thu Mar 27 18:28:07 2014 +0100 ipv6: move DAD and addrconf_verify processing to workqueue HOWEVER, you can definitely argue that the code without your ASSERT_RTNL() was already broken, it's a little hard to tell without digging more, that might date back to the beginning of git as you said. For safety, I think we can simply assume it's that commit to be fixed so that we don't fix older kernels until someone really reports a bug. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On 09/02/2014 02:15 PM, Cong Wang wrote: > On Tue, Sep 2, 2014 at 11:11 AM, Eric Dumazet wrote: >> On Tue, 2014-09-02 at 11:04 -0700, Cong Wang wrote: >>> On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa >> I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. >>> >>> `git grep` should show you we only have one caller. Apparently we don't >>> care about any out-of-tree module. >> >> Point is : you did not check if some stable versions had more callers. >> >> Its very nice you checked current version, but it is not enough for a >> stable candidate. > > That is what we do when backporting patches, I can do that if David asks > me to backport it, but you know for netdev that is David's work. > > (I am not saying I don't want to help him, I just want to point out the fact. > I am very pleased to help David for stable backports as long as he asks) Instead of helping after the fact, why not arrange the patches so that it's not such a big issue. Leave the _rcu variant alone. Add an _rtnl variant of the function and use that in the patch. Have a follow-on patch that removes the _rcu variant all by itself. This way backports become easier, and if anyone wants the _rcu variant back, all they have to do is revert a very simple commit. -vlad -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Di, 2014-09-02 at 11:40 -0700, Cong Wang wrote: > On Tue, Sep 2, 2014 at 11:18 AM, Hannes Frederic Sowa > wrote: > > Those ASSERT_RTNLs were misplaced and only caught the callers mostly > > from addrconf.c. I don't mind getting reports from stable kernel users > > and fixing those, too (or help fixing those). ASSERT_RTNL is not > > dangerous. > > > > We had a long history in not correctly using rtnl lock in ipv6/multicast > > code and those wrongfully placed ASSERT_RTNLs were my bad when I fixed > > the duplicate address detection handling. > > > > If enough multicast addresses are subscribed to an interface we might > > again get those splats because enabling promisc mode on an interface > > will also check for rtnl lock. > > > > Sure, I never doubt adding ASSERT_RTNL() is helpful, I just still think > this should be for net-next, or at least a separated patch. I don't want > my patch to be blamed in others' "Fixes:". :) Come on, that's why we have community review. Nobody blames anyone because of added regressions. It's more a fault of the community then, and it works out fairly good I think! Even others are keen on fixing your bugs sometimes. ;) If fixes tag is well researched, it won't point to the addition of ASSERT_RTNL() but your patch would help to discover a bug somewhere else in the stack. I think for this patch a fixes-tag is hard to find because it is hard to find because it dates back to the beginning of the git history IMHO. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 11:18 AM, Hannes Frederic Sowa wrote: > Those ASSERT_RTNLs were misplaced and only caught the callers mostly > from addrconf.c. I don't mind getting reports from stable kernel users > and fixing those, too (or help fixing those). ASSERT_RTNL is not > dangerous. > > We had a long history in not correctly using rtnl lock in ipv6/multicast > code and those wrongfully placed ASSERT_RTNLs were my bad when I fixed > the duplicate address detection handling. > > If enough multicast addresses are subscribed to an interface we might > again get those splats because enabling promisc mode on an interface > will also check for rtnl lock. > Sure, I never doubt adding ASSERT_RTNL() is helpful, I just still think this should be for net-next, or at least a separated patch. I don't want my patch to be blamed in others' "Fixes:". :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 11:21 AM, Eric Dumazet wrote: > On Tue, 2014-09-02 at 11:15 -0700, Cong Wang wrote: > >> That is what we do when backporting patches, I can do that if David asks >> me to backport it, but you know for netdev that is David's work. >> >> (I am not saying I don't want to help him, I just want to point out the fact. >> I am very pleased to help David for stable backports as long as he asks) > > Problem is : your patch submission do not identify bug origin. > > You claim you want to help, but you do not provide the basic thing that > _really_ helps. > > The proper way to identify bug origin is to add in the headers one > line : > > Fixes: 12-digit-SHA1 ("patch title") > Since when "Fixes:" tag becomes mandatory for a stable patch? At least netdev-FAQ is not updated. ;-) I 100% agree "Fixes:" is helpful when backporting patches, but it is not mandatory currently. For this patch, I was too lazy to dig the history, it looks like this is caused by the following commit: commit c15b1ccadb323ea50023e8f1cca2954129a62b51 Author: Hannes Frederic Sowa Date: Thu Mar 27 18:28:07 2014 +0100 ipv6: move DAD and addrconf_verify processing to workqueue Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, 2014-09-02 at 11:15 -0700, Cong Wang wrote: > That is what we do when backporting patches, I can do that if David asks > me to backport it, but you know for netdev that is David's work. > > (I am not saying I don't want to help him, I just want to point out the fact. > I am very pleased to help David for stable backports as long as he asks) Problem is : your patch submission do not identify bug origin. You claim you want to help, but you do not provide the basic thing that _really_ helps. The proper way to identify bug origin is to add in the headers one line : Fixes: 12-digit-SHA1 ("patch title") -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014, at 20:04, Cong Wang wrote: > On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa > wrote: > > Hi Cong, > > > > On Tue, Sep 2, 2014, at 18:50, Cong Wang wrote: > >> On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa > >> wrote: > >> > > >> > Also rtnl_lock and rcu_read_lock compose in that order, so we don't need > >> > to change dev_get_by_flags, but as this is the only user it sure is > >> > possible. RCU locked version is just easier composeable, so I wouldn't > >> > touch that if needed in future, just also take rcu lock as before. > >> > >> There is no point to keep RCU read lock if we have rtnl lock, > >> I don't know why you don't want to change dev_get_by_flags(), > >> it is pretty easy to do since it only has one caller. > > > > I definitely don't have a problem cleaning this up in net-next. I wanted > > a minimal patch for stable because I didn't check history where and when > > additional users of dev_get_by_flags_rcu were removed. > > `git grep` should show you we only have one caller. Apparently we don't > care about any out-of-tree module. Sure, I don't care about out-of-tree modules either. I just wanted to make it easier to backport. Current patch is almost headache free to backport. > >> Even if you really need RCU in future, you are always welcome > >> to bring it back when you do, sorry we should never be blocked by > >> code NOT merged yet. > >> > >> > > >> > Also we should move ASSERT_RTNL checks from addrconf_join_solict to > >> > ipv6_dev_mc_inc/dec. > >> > > >> > >> Make it another patch. > > > > It is just one logical change, moving ASSERT_RTNLs to places where they > > better catch invalid callstacks. > > > > Conflicts with what you claimed above. :) Those ASSERT_RTNLs were misplaced and only caught the callers mostly from addrconf.c. I don't mind getting reports from stable kernel users and fixing those, too (or help fixing those). ASSERT_RTNL is not dangerous. We had a long history in not correctly using rtnl lock in ipv6/multicast code and those wrongfully placed ASSERT_RTNLs were my bad when I fixed the duplicate address detection handling. If enough multicast addresses are subscribed to an interface we might again get those splats because enabling promisc mode on an interface will also check for rtnl lock. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 11:11 AM, Eric Dumazet wrote: > On Tue, 2014-09-02 at 11:04 -0700, Cong Wang wrote: >> On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa > >> > I definitely don't have a problem cleaning this up in net-next. I wanted >> > a minimal patch for stable because I didn't check history where and when >> > additional users of dev_get_by_flags_rcu were removed. >> >> `git grep` should show you we only have one caller. Apparently we don't >> care about any out-of-tree module. > > Point is : you did not check if some stable versions had more callers. > > Its very nice you checked current version, but it is not enough for a > stable candidate. That is what we do when backporting patches, I can do that if David asks me to backport it, but you know for netdev that is David's work. (I am not saying I don't want to help him, I just want to point out the fact. I am very pleased to help David for stable backports as long as he asks) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, 2014-09-02 at 11:04 -0700, Cong Wang wrote: > On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa > > I definitely don't have a problem cleaning this up in net-next. I wanted > > a minimal patch for stable because I didn't check history where and when > > additional users of dev_get_by_flags_rcu were removed. > > `git grep` should show you we only have one caller. Apparently we don't > care about any out-of-tree module. Point is : you did not check if some stable versions had more callers. Its very nice you checked current version, but it is not enough for a stable candidate. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa wrote: > Hi Cong, > > On Tue, Sep 2, 2014, at 18:50, Cong Wang wrote: >> On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa >> wrote: >> > >> > Also rtnl_lock and rcu_read_lock compose in that order, so we don't need >> > to change dev_get_by_flags, but as this is the only user it sure is >> > possible. RCU locked version is just easier composeable, so I wouldn't >> > touch that if needed in future, just also take rcu lock as before. >> >> There is no point to keep RCU read lock if we have rtnl lock, >> I don't know why you don't want to change dev_get_by_flags(), >> it is pretty easy to do since it only has one caller. > > I definitely don't have a problem cleaning this up in net-next. I wanted > a minimal patch for stable because I didn't check history where and when > additional users of dev_get_by_flags_rcu were removed. `git grep` should show you we only have one caller. Apparently we don't care about any out-of-tree module. > >> Even if you really need RCU in future, you are always welcome >> to bring it back when you do, sorry we should never be blocked by >> code NOT merged yet. >> >> > >> > Also we should move ASSERT_RTNL checks from addrconf_join_solict to >> > ipv6_dev_mc_inc/dec. >> > >> >> Make it another patch. > > It is just one logical change, moving ASSERT_RTNLs to places where they > better catch invalid callstacks. > Conflicts with what you claimed above. :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi Cong, On Tue, Sep 2, 2014, at 18:50, Cong Wang wrote: > On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa > wrote: > > > > Also rtnl_lock and rcu_read_lock compose in that order, so we don't need > > to change dev_get_by_flags, but as this is the only user it sure is > > possible. RCU locked version is just easier composeable, so I wouldn't > > touch that if needed in future, just also take rcu lock as before. > > There is no point to keep RCU read lock if we have rtnl lock, > I don't know why you don't want to change dev_get_by_flags(), > it is pretty easy to do since it only has one caller. I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. > Even if you really need RCU in future, you are always welcome > to bring it back when you do, sorry we should never be blocked by > code NOT merged yet. > > > > > Also we should move ASSERT_RTNL checks from addrconf_join_solict to > > ipv6_dev_mc_inc/dec. > > > > Make it another patch. It is just one logical change, moving ASSERT_RTNLs to places where they better catch invalid callstacks. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa wrote: > > Also rtnl_lock and rcu_read_lock compose in that order, so we don't need > to change dev_get_by_flags, but as this is the only user it sure is > possible. RCU locked version is just easier composeable, so I wouldn't > touch that if needed in future, just also take rcu lock as before. There is no point to keep RCU read lock if we have rtnl lock, I don't know why you don't want to change dev_get_by_flags(), it is pretty easy to do since it only has one caller. Even if you really need RCU in future, you are always welcome to bring it back when you do, sorry we should never be blocked by code NOT merged yet. > > Also we should move ASSERT_RTNL checks from addrconf_join_solict to > ipv6_dev_mc_inc/dec. > Make it another patch. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. There is no point to keep RCU read lock if we have rtnl lock, I don't know why you don't want to change dev_get_by_flags(), it is pretty easy to do since it only has one caller. Even if you really need RCU in future, you are always welcome to bring it back when you do, sorry we should never be blocked by code NOT merged yet. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Make it another patch. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi Cong, On Tue, Sep 2, 2014, at 18:50, Cong Wang wrote: On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. There is no point to keep RCU read lock if we have rtnl lock, I don't know why you don't want to change dev_get_by_flags(), it is pretty easy to do since it only has one caller. I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. Even if you really need RCU in future, you are always welcome to bring it back when you do, sorry we should never be blocked by code NOT merged yet. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Make it another patch. It is just one logical change, moving ASSERT_RTNLs to places where they better catch invalid callstacks. Bye, Hannes -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Hi Cong, On Tue, Sep 2, 2014, at 18:50, Cong Wang wrote: On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. There is no point to keep RCU read lock if we have rtnl lock, I don't know why you don't want to change dev_get_by_flags(), it is pretty easy to do since it only has one caller. I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. `git grep` should show you we only have one caller. Apparently we don't care about any out-of-tree module. Even if you really need RCU in future, you are always welcome to bring it back when you do, sorry we should never be blocked by code NOT merged yet. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Make it another patch. It is just one logical change, moving ASSERT_RTNLs to places where they better catch invalid callstacks. Conflicts with what you claimed above. :) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, 2014-09-02 at 11:04 -0700, Cong Wang wrote: On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. `git grep` should show you we only have one caller. Apparently we don't care about any out-of-tree module. Point is : you did not check if some stable versions had more callers. Its very nice you checked current version, but it is not enough for a stable candidate. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 11:11 AM, Eric Dumazet eric.duma...@gmail.com wrote: On Tue, 2014-09-02 at 11:04 -0700, Cong Wang wrote: On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. `git grep` should show you we only have one caller. Apparently we don't care about any out-of-tree module. Point is : you did not check if some stable versions had more callers. Its very nice you checked current version, but it is not enough for a stable candidate. That is what we do when backporting patches, I can do that if David asks me to backport it, but you know for netdev that is David's work. (I am not saying I don't want to help him, I just want to point out the fact. I am very pleased to help David for stable backports as long as he asks) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014, at 20:04, Cong Wang wrote: On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Hi Cong, On Tue, Sep 2, 2014, at 18:50, Cong Wang wrote: On Fri, Aug 29, 2014 at 6:51 PM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. There is no point to keep RCU read lock if we have rtnl lock, I don't know why you don't want to change dev_get_by_flags(), it is pretty easy to do since it only has one caller. I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. `git grep` should show you we only have one caller. Apparently we don't care about any out-of-tree module. Sure, I don't care about out-of-tree modules either. I just wanted to make it easier to backport. Current patch is almost headache free to backport. Even if you really need RCU in future, you are always welcome to bring it back when you do, sorry we should never be blocked by code NOT merged yet. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Make it another patch. It is just one logical change, moving ASSERT_RTNLs to places where they better catch invalid callstacks. Conflicts with what you claimed above. :) Those ASSERT_RTNLs were misplaced and only caught the callers mostly from addrconf.c. I don't mind getting reports from stable kernel users and fixing those, too (or help fixing those). ASSERT_RTNL is not dangerous. We had a long history in not correctly using rtnl lock in ipv6/multicast code and those wrongfully placed ASSERT_RTNLs were my bad when I fixed the duplicate address detection handling. If enough multicast addresses are subscribed to an interface we might again get those splats because enabling promisc mode on an interface will also check for rtnl lock. Bye, Hannes -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, 2014-09-02 at 11:15 -0700, Cong Wang wrote: That is what we do when backporting patches, I can do that if David asks me to backport it, but you know for netdev that is David's work. (I am not saying I don't want to help him, I just want to point out the fact. I am very pleased to help David for stable backports as long as he asks) Problem is : your patch submission do not identify bug origin. You claim you want to help, but you do not provide the basic thing that _really_ helps. The proper way to identify bug origin is to add in the headers one line : Fixes: 12-digit-SHA1 (patch title) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 11:21 AM, Eric Dumazet eric.duma...@gmail.com wrote: On Tue, 2014-09-02 at 11:15 -0700, Cong Wang wrote: That is what we do when backporting patches, I can do that if David asks me to backport it, but you know for netdev that is David's work. (I am not saying I don't want to help him, I just want to point out the fact. I am very pleased to help David for stable backports as long as he asks) Problem is : your patch submission do not identify bug origin. You claim you want to help, but you do not provide the basic thing that _really_ helps. The proper way to identify bug origin is to add in the headers one line : Fixes: 12-digit-SHA1 (patch title) Since when Fixes: tag becomes mandatory for a stable patch? At least netdev-FAQ is not updated. ;-) I 100% agree Fixes: is helpful when backporting patches, but it is not mandatory currently. For this patch, I was too lazy to dig the history, it looks like this is caused by the following commit: commit c15b1ccadb323ea50023e8f1cca2954129a62b51 Author: Hannes Frederic Sowa han...@stressinduktion.org Date: Thu Mar 27 18:28:07 2014 +0100 ipv6: move DAD and addrconf_verify processing to workqueue Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 11:18 AM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Those ASSERT_RTNLs were misplaced and only caught the callers mostly from addrconf.c. I don't mind getting reports from stable kernel users and fixing those, too (or help fixing those). ASSERT_RTNL is not dangerous. We had a long history in not correctly using rtnl lock in ipv6/multicast code and those wrongfully placed ASSERT_RTNLs were my bad when I fixed the duplicate address detection handling. If enough multicast addresses are subscribed to an interface we might again get those splats because enabling promisc mode on an interface will also check for rtnl lock. Sure, I never doubt adding ASSERT_RTNL() is helpful, I just still think this should be for net-next, or at least a separated patch. I don't want my patch to be blamed in others' Fixes:. :) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Di, 2014-09-02 at 11:40 -0700, Cong Wang wrote: On Tue, Sep 2, 2014 at 11:18 AM, Hannes Frederic Sowa han...@stressinduktion.org wrote: Those ASSERT_RTNLs were misplaced and only caught the callers mostly from addrconf.c. I don't mind getting reports from stable kernel users and fixing those, too (or help fixing those). ASSERT_RTNL is not dangerous. We had a long history in not correctly using rtnl lock in ipv6/multicast code and those wrongfully placed ASSERT_RTNLs were my bad when I fixed the duplicate address detection handling. If enough multicast addresses are subscribed to an interface we might again get those splats because enabling promisc mode on an interface will also check for rtnl lock. Sure, I never doubt adding ASSERT_RTNL() is helpful, I just still think this should be for net-next, or at least a separated patch. I don't want my patch to be blamed in others' Fixes:. :) Come on, that's why we have community review. Nobody blames anyone because of added regressions. It's more a fault of the community then, and it works out fairly good I think! Even others are keen on fixing your bugs sometimes. ;) If fixes tag is well researched, it won't point to the addition of ASSERT_RTNL() but your patch would help to discover a bug somewhere else in the stack. I think for this patch a fixes-tag is hard to find because it is hard to find because it dates back to the beginning of the git history IMHO. Bye, Hannes -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On 09/02/2014 02:15 PM, Cong Wang wrote: On Tue, Sep 2, 2014 at 11:11 AM, Eric Dumazet eric.duma...@gmail.com wrote: On Tue, 2014-09-02 at 11:04 -0700, Cong Wang wrote: On Tue, Sep 2, 2014 at 10:58 AM, Hannes Frederic Sowa I definitely don't have a problem cleaning this up in net-next. I wanted a minimal patch for stable because I didn't check history where and when additional users of dev_get_by_flags_rcu were removed. `git grep` should show you we only have one caller. Apparently we don't care about any out-of-tree module. Point is : you did not check if some stable versions had more callers. Its very nice you checked current version, but it is not enough for a stable candidate. That is what we do when backporting patches, I can do that if David asks me to backport it, but you know for netdev that is David's work. (I am not saying I don't want to help him, I just want to point out the fact. I am very pleased to help David for stable backports as long as he asks) Instead of helping after the fact, why not arrange the patches so that it's not such a big issue. Leave the _rcu variant alone. Add an _rtnl variant of the function and use that in the patch. Have a follow-on patch that removes the _rcu variant all by itself. This way backports become easier, and if anyone wants the _rcu variant back, all they have to do is revert a very simple commit. -vlad -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Tue, Sep 2, 2014 at 12:02 PM, Hannes Frederic Sowa han...@stressinduktion.org wrote: If fixes tag is well researched, it won't point to the addition of ASSERT_RTNL() but your patch would help to discover a bug somewhere else in the stack. I think for this patch a fixes-tag is hard to find because it is hard to find because it dates back to the beginning of the git history IMHO. As I replied to Eric, this warning is probably caused by the following commit: commit c15b1ccadb323ea50023e8f1cca2954129a62b51 Author: Hannes Frederic Sowa han...@stressinduktion.org Date: Thu Mar 27 18:28:07 2014 +0100 ipv6: move DAD and addrconf_verify processing to workqueue HOWEVER, you can definitely argue that the code without your ASSERT_RTNL() was already broken, it's a little hard to tell without digging more, that might date back to the beginning of git as you said. For safety, I think we can simply assume it's that commit to be fixed so that we don't fix older kernels until someone really reports a bug. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi, On Sa, 2014-08-30 at 12:58 +0200, Sabrina Dubroca wrote: > 2014-08-30, 03:51:29 +0200, Hannes Frederic Sowa wrote: > > Hi Sabrina, > > > > [...] > > > > Sorry, just had time to look at this. > > > > The reason is not to have list corruption but that the calls down to > > ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists > > are locked by addr_list_lock and that's why I think we never saw any > > problems with that, but drivers expect rtnl locked for those calls. > > > > But this problem also affects multicast join, so patch seems incomplete > > to me (and for that matter ssm multicast join, too). > > > > Also rtnl_lock and rcu_read_lock compose in that order, so we don't need > > to change dev_get_by_flags, but as this is the only user it sure is > > possible. RCU locked version is just easier composeable, so I wouldn't > > touch that if needed in future, just also take rcu lock as before. > > > > So just adding rtnl_lock add appropriate places seems to be ok to me, > > but still need to review parts of the ssm code. > > > > Also we should move ASSERT_RTNL checks from addrconf_join_solict to > > ipv6_dev_mc_inc/dec. > > > > Thanks, > > Hannes > > Thanks for explaining. > > I had a look at what you suggested. > > > So, for anycast, on top of the previous patch, we'd have: > > --- > diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c > index 210183244689..61dd3046b804 100644 > --- a/net/ipv6/anycast.c > +++ b/net/ipv6/anycast.c > static void aca_put(struct ifacaddr6 *ac) > @@ -233,6 +235,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct > in6_addr *addr) > struct rt6_info *rt; > int err; > > + ASSERT_RTNL(); > + > idev = in6_dev_get(dev); > > if (idev == NULL) > @@ -302,6 +306,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const > struct in6_addr *addr) > { > struct ifacaddr6 *aca, *prev_aca; > > + ASSERT_RTNL(); > + > write_lock_bh(>lock); > prev_aca = NULL; > for (aca = idev->ac_list; aca; aca = aca->aca_next) { > @@ -336,6 +342,8 @@ static int ipv6_dev_ac_dec(struct net_device *dev, const > struct in6_addr *addr) > { > struct inet6_dev *idev = __in6_dev_get(dev); > > + ASSERT_RTNL(); > + > if (idev == NULL) > return -ENODEV; > return __ipv6_dev_ac_dec(idev, addr); ASSERT_RTNL() still performs a runtime check. While those are not really fast paths, I still think it is better to keep them to a minimum and place them only at places where we know all code which needs to be guarded passes by: I would suggest to move ASSERT_RTNL to ipv6_dev_mc_inc and __ipv6_dev_mc_dec and even remove the checks from addrconf_join_solict and addrconf_leave_solict. Does that cover all code paths and makes sense? > --- > > > And for multicast: > - locking order in the patch below: rtnl -> rcu -> ipv6_sk_mc_lock > - ipv6_sock_mc_join: maybe move all the _unlock()'s together at the end of > the function > - do we need to modify rcu_dereference_protected in > ipv6_sock_mc_drop/ipv6_sock_mc_close > - I had a look at the other codepaths that call ipv6_dev_mc_inc/dec >- ipv6_mc_destroy_dev, dev_forward_change, ipv6_add_dev, > addrconf_join_solict -- all take rtnl or already have an > ASSERT_RTNL() >- pndisc_destructor, called from pneigh_ifdown/pneigh_delete >- pndisc_constructor, called from pneigh_lookup -- pneigh_lookup > has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and > ndisc_recv_na >- (hope I didn't miss any callers) > > As far as I could see, apart maybe from pndisc_constructor, it seems > okay, but I'd like to hear your comments. The rest of the patch looks good. Can you or Cong post a final patch with the adapted ac_join/drop changes? Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi, On Sa, 2014-08-30 at 12:58 +0200, Sabrina Dubroca wrote: 2014-08-30, 03:51:29 +0200, Hannes Frederic Sowa wrote: Hi Sabrina, [...] Sorry, just had time to look at this. The reason is not to have list corruption but that the calls down to ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists are locked by addr_list_lock and that's why I think we never saw any problems with that, but drivers expect rtnl locked for those calls. But this problem also affects multicast join, so patch seems incomplete to me (and for that matter ssm multicast join, too). Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. So just adding rtnl_lock add appropriate places seems to be ok to me, but still need to review parts of the ssm code. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Thanks, Hannes Thanks for explaining. I had a look at what you suggested. So, for anycast, on top of the previous patch, we'd have: --- diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 210183244689..61dd3046b804 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c static void aca_put(struct ifacaddr6 *ac) @@ -233,6 +235,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct in6_addr *addr) struct rt6_info *rt; int err; + ASSERT_RTNL(); + idev = in6_dev_get(dev); if (idev == NULL) @@ -302,6 +306,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr) { struct ifacaddr6 *aca, *prev_aca; + ASSERT_RTNL(); + write_lock_bh(idev-lock); prev_aca = NULL; for (aca = idev-ac_list; aca; aca = aca-aca_next) { @@ -336,6 +342,8 @@ static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr) { struct inet6_dev *idev = __in6_dev_get(dev); + ASSERT_RTNL(); + if (idev == NULL) return -ENODEV; return __ipv6_dev_ac_dec(idev, addr); ASSERT_RTNL() still performs a runtime check. While those are not really fast paths, I still think it is better to keep them to a minimum and place them only at places where we know all code which needs to be guarded passes by: I would suggest to move ASSERT_RTNL to ipv6_dev_mc_inc and __ipv6_dev_mc_dec and even remove the checks from addrconf_join_solict and addrconf_leave_solict. Does that cover all code paths and makes sense? --- And for multicast: - locking order in the patch below: rtnl - rcu - ipv6_sk_mc_lock - ipv6_sock_mc_join: maybe move all the _unlock()'s together at the end of the function - do we need to modify rcu_dereference_protected in ipv6_sock_mc_drop/ipv6_sock_mc_close - I had a look at the other codepaths that call ipv6_dev_mc_inc/dec - ipv6_mc_destroy_dev, dev_forward_change, ipv6_add_dev, addrconf_join_solict -- all take rtnl or already have an ASSERT_RTNL() - pndisc_destructor, called from pneigh_ifdown/pneigh_delete - pndisc_constructor, called from pneigh_lookup -- pneigh_lookup has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and ndisc_recv_na - (hope I didn't miss any callers) As far as I could see, apart maybe from pndisc_constructor, it seems okay, but I'd like to hear your comments. The rest of the patch looks good. Can you or Cong post a final patch with the adapted ac_join/drop changes? Thanks, Hannes -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
2014-08-30, 12:58:21 +0200, Sabrina Dubroca wrote: >- pndisc_constructor, called from pneigh_lookup -- pneigh_lookup > has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and > ndisc_recv_na Ah, these have creat = 0, so it's fine. I missed that earlier. Sorry for the noise. -- Sabrina -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hello, 2014-08-30, 03:51:29 +0200, Hannes Frederic Sowa wrote: > Hi Sabrina, > > [...] > > Sorry, just had time to look at this. > > The reason is not to have list corruption but that the calls down to > ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists > are locked by addr_list_lock and that's why I think we never saw any > problems with that, but drivers expect rtnl locked for those calls. > > But this problem also affects multicast join, so patch seems incomplete > to me (and for that matter ssm multicast join, too). > > Also rtnl_lock and rcu_read_lock compose in that order, so we don't need > to change dev_get_by_flags, but as this is the only user it sure is > possible. RCU locked version is just easier composeable, so I wouldn't > touch that if needed in future, just also take rcu lock as before. > > So just adding rtnl_lock add appropriate places seems to be ok to me, > but still need to review parts of the ssm code. > > Also we should move ASSERT_RTNL checks from addrconf_join_solict to > ipv6_dev_mc_inc/dec. > > Thanks, > Hannes Thanks for explaining. I had a look at what you suggested. So, for anycast, on top of the previous patch, we'd have: --- diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 210183244689..61dd3046b804 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c static void aca_put(struct ifacaddr6 *ac) @@ -233,6 +235,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct in6_addr *addr) struct rt6_info *rt; int err; + ASSERT_RTNL(); + idev = in6_dev_get(dev); if (idev == NULL) @@ -302,6 +306,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr) { struct ifacaddr6 *aca, *prev_aca; + ASSERT_RTNL(); + write_lock_bh(>lock); prev_aca = NULL; for (aca = idev->ac_list; aca; aca = aca->aca_next) { @@ -336,6 +342,8 @@ static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr) { struct inet6_dev *idev = __in6_dev_get(dev); + ASSERT_RTNL(); + if (idev == NULL) return -ENODEV; return __ipv6_dev_ac_dec(idev, addr); --- And for multicast: - locking order in the patch below: rtnl -> rcu -> ipv6_sk_mc_lock - ipv6_sock_mc_join: maybe move all the _unlock()'s together at the end of the function - do we need to modify rcu_dereference_protected in ipv6_sock_mc_drop/ipv6_sock_mc_close - I had a look at the other codepaths that call ipv6_dev_mc_inc/dec - ipv6_mc_destroy_dev, dev_forward_change, ipv6_add_dev, addrconf_join_solict -- all take rtnl or already have an ASSERT_RTNL() - pndisc_destructor, called from pneigh_ifdown/pneigh_delete - pndisc_constructor, called from pneigh_lookup -- pneigh_lookup has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and ndisc_recv_na - (hope I didn't miss any callers) As far as I could see, apart maybe from pndisc_constructor, it seems okay, but I'd like to hear your comments. Current modifications: --- diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 70881795da96..d73ac1ef65f2 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -172,6 +172,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) mc_lst->next = NULL; mc_lst->addr = *addr; + rtnl_lock(); rcu_read_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -185,6 +186,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) if (dev == NULL) { rcu_read_unlock(); + rtnl_unlock(); sock_kfree_s(sk, mc_lst, sizeof(*mc_lst)); return -ENODEV; } @@ -202,6 +204,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) if (err) { rcu_read_unlock(); + rtnl_unlock(); sock_kfree_s(sk, mc_lst, sizeof(*mc_lst)); return err; } @@ -212,6 +215,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) spin_unlock(_sk_mc_lock); rcu_read_unlock(); + rtnl_unlock(); return 0; } @@ -229,6 +233,7 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) if (!ipv6_addr_is_multicast(addr)) return -EINVAL; + rtnl_lock(); spin_lock(_sk_mc_lock); for (lnk = >ipv6_mc_list; (mc_lst = rcu_dereference_protected(*lnk, @@ -252,12 +257,15 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) } else (void) ip6_mc_leave_src(sk, mc_lst, NULL); rcu_read_unlock(); + rtnl_unlock(); + atomic_sub(sizeof(*mc_lst), >sk_omem_alloc);
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
2014-08-29, 15:54:48 -0700, Cong Wang wrote: > [...] > > You are absolutely right here. > > Can I have your Signed-off-by and Tested-by before sending the patch > formally? > > Thanks! Sure: Signed-off-by: Sabrina Dubroca Tested-by: Sabrina Dubroca Thanks, -- Sabrina -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
2014-08-29, 15:54:48 -0700, Cong Wang wrote: [...] You are absolutely right here. Can I have your Signed-off-by and Tested-by before sending the patch formally? Thanks! Sure: Signed-off-by: Sabrina Dubroca s...@queasysnail.net Tested-by: Sabrina Dubroca s...@queasysnail.net Thanks, -- Sabrina -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hello, 2014-08-30, 03:51:29 +0200, Hannes Frederic Sowa wrote: Hi Sabrina, [...] Sorry, just had time to look at this. The reason is not to have list corruption but that the calls down to ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists are locked by addr_list_lock and that's why I think we never saw any problems with that, but drivers expect rtnl locked for those calls. But this problem also affects multicast join, so patch seems incomplete to me (and for that matter ssm multicast join, too). Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. So just adding rtnl_lock add appropriate places seems to be ok to me, but still need to review parts of the ssm code. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Thanks, Hannes Thanks for explaining. I had a look at what you suggested. So, for anycast, on top of the previous patch, we'd have: --- diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 210183244689..61dd3046b804 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c static void aca_put(struct ifacaddr6 *ac) @@ -233,6 +235,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct in6_addr *addr) struct rt6_info *rt; int err; + ASSERT_RTNL(); + idev = in6_dev_get(dev); if (idev == NULL) @@ -302,6 +306,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr) { struct ifacaddr6 *aca, *prev_aca; + ASSERT_RTNL(); + write_lock_bh(idev-lock); prev_aca = NULL; for (aca = idev-ac_list; aca; aca = aca-aca_next) { @@ -336,6 +342,8 @@ static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr) { struct inet6_dev *idev = __in6_dev_get(dev); + ASSERT_RTNL(); + if (idev == NULL) return -ENODEV; return __ipv6_dev_ac_dec(idev, addr); --- And for multicast: - locking order in the patch below: rtnl - rcu - ipv6_sk_mc_lock - ipv6_sock_mc_join: maybe move all the _unlock()'s together at the end of the function - do we need to modify rcu_dereference_protected in ipv6_sock_mc_drop/ipv6_sock_mc_close - I had a look at the other codepaths that call ipv6_dev_mc_inc/dec - ipv6_mc_destroy_dev, dev_forward_change, ipv6_add_dev, addrconf_join_solict -- all take rtnl or already have an ASSERT_RTNL() - pndisc_destructor, called from pneigh_ifdown/pneigh_delete - pndisc_constructor, called from pneigh_lookup -- pneigh_lookup has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and ndisc_recv_na - (hope I didn't miss any callers) As far as I could see, apart maybe from pndisc_constructor, it seems okay, but I'd like to hear your comments. Current modifications: --- diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 70881795da96..d73ac1ef65f2 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -172,6 +172,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) mc_lst-next = NULL; mc_lst-addr = *addr; + rtnl_lock(); rcu_read_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -185,6 +186,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) if (dev == NULL) { rcu_read_unlock(); + rtnl_unlock(); sock_kfree_s(sk, mc_lst, sizeof(*mc_lst)); return -ENODEV; } @@ -202,6 +204,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) if (err) { rcu_read_unlock(); + rtnl_unlock(); sock_kfree_s(sk, mc_lst, sizeof(*mc_lst)); return err; } @@ -212,6 +215,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) spin_unlock(ipv6_sk_mc_lock); rcu_read_unlock(); + rtnl_unlock(); return 0; } @@ -229,6 +233,7 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) if (!ipv6_addr_is_multicast(addr)) return -EINVAL; + rtnl_lock(); spin_lock(ipv6_sk_mc_lock); for (lnk = np-ipv6_mc_list; (mc_lst = rcu_dereference_protected(*lnk, @@ -252,12 +257,15 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) } else (void) ip6_mc_leave_src(sk, mc_lst, NULL); rcu_read_unlock(); + rtnl_unlock(); + atomic_sub(sizeof(*mc_lst), sk-sk_omem_alloc); kfree_rcu(mc_lst,
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
2014-08-30, 12:58:21 +0200, Sabrina Dubroca wrote: - pndisc_constructor, called from pneigh_lookup -- pneigh_lookup has ASSERT_RTNL(), but pneigh_lookup is called from ip6_forward and ndisc_recv_na Ah, these have creat = 0, so it's fine. I missed that earlier. Sorry for the noise. -- Sabrina -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi Sabrina, On Fr, 2014-08-29 at 21:53 +0200, Sabrina Dubroca wrote: > 2014-08-29, 11:14:48 -0700, Cong Wang wrote: > > On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala wrote: > > > [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) > > > [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted > > > 3.17.0-rc2+ #30 > > > [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > > > [ 77.299789] 88003d76a618 880026133c50 8238ba79 > > > 880037c84520 > > > [ 77.300829] 880026133c90 820bd52b > > > 82d86c40 > > > [ 77.301869] f76fd1e1 8800382d8000 > > > 8800382d8220 > > > [ 77.302906] Call Trace: > > > [ 77.303246] [] dump_stack+0x4d/0x66 > > > [ 77.303928] [] addrconf_join_solict+0x4b/0xb0 > > > [ 77.304731] [] ipv6_dev_ac_inc+0x2bb/0x330 > > > [ 77.305498] [] ? ac6_seq_start+0x260/0x260 > > > [ 77.306257] [] ipv6_sock_ac_join+0x26e/0x360 > > > [ 77.307046] [] ? ipv6_sock_ac_join+0x99/0x360 > > > [ 77.307798] [] do_ipv6_setsockopt.isra.5+0xa70/0xf20 > > > > > > I think we should just use rtnl_lock() instead of rcu_read_lock() there, > > it is not a hot path worth optimization. > > > > Please try the attached patch. > > note: it doesn't build as it is now, it needs: > > -EXPORT_SYMBOL(dev_get_by_flags_rcu); > +EXPORT_SYMBOL(dev_get_by_flags); > > > I just tried your patch with a basic test program (open > socket/join/leave/close and open socket/join/close). > > I think you need to modify ipv6_sock_ac_close as well, or you can still > trigger the assertion when closing the socket without leaving first. > > Modified patch attached. Sorry, just had time to look at this. The reason is not to have list corruption but that the calls down to ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists are locked by addr_list_lock and that's why I think we never saw any problems with that, but drivers expect rtnl locked for those calls. But this problem also affects multicast join, so patch seems incomplete to me (and for that matter ssm multicast join, too). Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. So just adding rtnl_lock add appropriate places seems to be ok to me, but still need to review parts of the ssm code. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Fri, Aug 29, 2014 at 12:53 PM, Sabrina Dubroca wrote: > 2014-08-29, 11:14:48 -0700, Cong Wang wrote: >> On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala wrote: >> > [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) >> > [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ >> > #30 >> > [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 >> > [ 77.299789] 88003d76a618 880026133c50 8238ba79 >> > 880037c84520 >> > [ 77.300829] 880026133c90 820bd52b >> > 82d86c40 >> > [ 77.301869] f76fd1e1 8800382d8000 >> > 8800382d8220 >> > [ 77.302906] Call Trace: >> > [ 77.303246] [] dump_stack+0x4d/0x66 >> > [ 77.303928] [] addrconf_join_solict+0x4b/0xb0 >> > [ 77.304731] [] ipv6_dev_ac_inc+0x2bb/0x330 >> > [ 77.305498] [] ? ac6_seq_start+0x260/0x260 >> > [ 77.306257] [] ipv6_sock_ac_join+0x26e/0x360 >> > [ 77.307046] [] ? ipv6_sock_ac_join+0x99/0x360 >> > [ 77.307798] [] do_ipv6_setsockopt.isra.5+0xa70/0xf20 >> >> >> I think we should just use rtnl_lock() instead of rcu_read_lock() there, >> it is not a hot path worth optimization. >> >> Please try the attached patch. > > note: it doesn't build as it is now, it needs: > > -EXPORT_SYMBOL(dev_get_by_flags_rcu); > +EXPORT_SYMBOL(dev_get_by_flags); > > > I just tried your patch with a basic test program (open > socket/join/leave/close and open socket/join/close). > > I think you need to modify ipv6_sock_ac_close as well, or you can still > trigger the assertion when closing the socket without leaving first. You are absolutely right here. Can I have your Signed-off-by and Tested-by before sending the patch formally? Thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
2014-08-29, 11:14:48 -0700, Cong Wang wrote: > On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala wrote: > > [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) > > [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ > > #30 > > [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > > [ 77.299789] 88003d76a618 880026133c50 8238ba79 > > 880037c84520 > > [ 77.300829] 880026133c90 820bd52b > > 82d86c40 > > [ 77.301869] f76fd1e1 8800382d8000 > > 8800382d8220 > > [ 77.302906] Call Trace: > > [ 77.303246] [] dump_stack+0x4d/0x66 > > [ 77.303928] [] addrconf_join_solict+0x4b/0xb0 > > [ 77.304731] [] ipv6_dev_ac_inc+0x2bb/0x330 > > [ 77.305498] [] ? ac6_seq_start+0x260/0x260 > > [ 77.306257] [] ipv6_sock_ac_join+0x26e/0x360 > > [ 77.307046] [] ? ipv6_sock_ac_join+0x99/0x360 > > [ 77.307798] [] do_ipv6_setsockopt.isra.5+0xa70/0xf20 > > > I think we should just use rtnl_lock() instead of rcu_read_lock() there, > it is not a hot path worth optimization. > > Please try the attached patch. note: it doesn't build as it is now, it needs: -EXPORT_SYMBOL(dev_get_by_flags_rcu); +EXPORT_SYMBOL(dev_get_by_flags); I just tried your patch with a basic test program (open socket/join/leave/close and open socket/join/close). I think you need to modify ipv6_sock_ac_close as well, or you can still trigger the assertion when closing the socket without leaving first. Modified patch attached. -- Sabrina diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 429801370d0c..1ae0e745b1b1 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt); void dev_add_offload(struct packet_offload *po); void dev_remove_offload(struct packet_offload *po); -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags, - unsigned short mask); +struct net_device *dev_get_by_flags(struct net *net, unsigned short flags, + unsigned short mask); struct net_device *dev_get_by_name(struct net *net, const char *name); struct net_device *dev_get_by_name_rcu(struct net *net, const char *name); struct net_device *__dev_get_by_name(struct net *net, const char *name); diff --git a/net/core/dev.c b/net/core/dev.c index 443b814db05b..8fede6ef4a39 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, unsigned short type) EXPORT_SYMBOL(dev_getfirstbyhwtype); /** - * dev_get_by_flags_rcu - find any device with given flags + * dev_get_by_flags - find any device with given flags * @net: the applicable net namespace * @if_flags: IFF_* values * @mask: bitmask of bits in if_flags to check * * Search for any interface with the given flags. Returns NULL if a device * is not found or a pointer to the device. Must be called inside - * rcu_read_lock(), and result refcount is unchanged. + * rtnl_lock(), and result refcount is unchanged. */ -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags, +struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags, unsigned short mask) { struct net_device *dev, *ret; + ASSERT_RTNL(); ret = NULL; - for_each_netdev_rcu(net, dev) { + for_each_netdev(net, dev) { if (((dev->flags ^ if_flags) & mask) == 0) { ret = dev; break; @@ -921,7 +922,7 @@ struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags } return ret; } -EXPORT_SYMBOL(dev_get_by_flags_rcu); +EXPORT_SYMBOL(dev_get_by_flags); /** * dev_valid_name - check if name is okay for network device diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 210183244689..6de5caa26ea4 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -77,7 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) pac->acl_next = NULL; pac->acl_addr = *addr; - rcu_read_lock(); + rtnl_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -90,11 +90,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) goto error; } else { /* router, no matching interface: just pick one */ - dev = dev_get_by_flags_rcu(net, IFF_UP, + dev = dev_get_by_flags(net, IFF_UP,
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala wrote: > [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) > [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 > [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [ 77.299789] 88003d76a618 880026133c50 8238ba79 > 880037c84520 > [ 77.300829] 880026133c90 820bd52b > 82d86c40 > [ 77.301869] f76fd1e1 8800382d8000 > 8800382d8220 > [ 77.302906] Call Trace: > [ 77.303246] [] dump_stack+0x4d/0x66 > [ 77.303928] [] addrconf_join_solict+0x4b/0xb0 > [ 77.304731] [] ipv6_dev_ac_inc+0x2bb/0x330 > [ 77.305498] [] ? ac6_seq_start+0x260/0x260 > [ 77.306257] [] ipv6_sock_ac_join+0x26e/0x360 > [ 77.307046] [] ? ipv6_sock_ac_join+0x99/0x360 > [ 77.307798] [] do_ipv6_setsockopt.isra.5+0xa70/0xf20 I think we should just use rtnl_lock() instead of rcu_read_lock() there, it is not a hot path worth optimization. Please try the attached patch. commit 31d83db0b417f705cbb31b2159603b8b53b81ab6 Author: Cong Wang Date: Fri Aug 29 11:02:15 2014 -0700 ipv6: fix rtnl lock assertion in ipv6_sock_ac_join() Signed-off-by: Cong Wang diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 4298013..1ae0e74 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt); void dev_add_offload(struct packet_offload *po); void dev_remove_offload(struct packet_offload *po); -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags, - unsigned short mask); +struct net_device *dev_get_by_flags(struct net *net, unsigned short flags, + unsigned short mask); struct net_device *dev_get_by_name(struct net *net, const char *name); struct net_device *dev_get_by_name_rcu(struct net *net, const char *name); struct net_device *__dev_get_by_name(struct net *net, const char *name); diff --git a/net/core/dev.c b/net/core/dev.c index 26d296c..73cdb03 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, unsigned short type) EXPORT_SYMBOL(dev_getfirstbyhwtype); /** - * dev_get_by_flags_rcu - find any device with given flags + * dev_get_by_flags - find any device with given flags * @net: the applicable net namespace * @if_flags: IFF_* values * @mask: bitmask of bits in if_flags to check * * Search for any interface with the given flags. Returns NULL if a device * is not found or a pointer to the device. Must be called inside - * rcu_read_lock(), and result refcount is unchanged. + * rtnl_lock(), and result refcount is unchanged. */ -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags, +struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags, unsigned short mask) { struct net_device *dev, *ret; + ASSERT_RTNL(); ret = NULL; - for_each_netdev_rcu(net, dev) { + for_each_netdev(net, dev) { if (((dev->flags ^ if_flags) & mask) == 0) { ret = dev; break; diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 2101832..c523c1a 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -77,7 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) pac->acl_next = NULL; pac->acl_addr = *addr; - rcu_read_lock(); + rtnl_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -90,11 +90,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) goto error; } else { /* router, no matching interface: just pick one */ - dev = dev_get_by_flags_rcu(net, IFF_UP, + dev = dev_get_by_flags(net, IFF_UP, IFF_UP | IFF_LOOPBACK); } } else - dev = dev_get_by_index_rcu(net, ifindex); + dev = __dev_get_by_index(net, ifindex); if (dev == NULL) { err = -ENODEV; @@ -136,7 +136,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) } error: - rcu_read_unlock(); + rtnl_unlock(); if (pac) sock_kfree_s(sk, pac, sizeof(*pac)); return err; @@ -171,13 +171,15 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) spin_unlock_bh(_sk_ac_lock); - rcu_read_lock(); - dev = dev_get_by_index_rcu(net, pac->
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On 08/29/2014 11:26 AM, Tommi Rantala wrote: > Hi, > > Was fuzzing Linus v3.17-rc2-89-g59753a8 with Trinity as the root user > in qemu, when I hit the following assertion failures. > > Tommi > > > [init] Started watchdog process, PID is 4841 > [main] Main thread is alive. > [ 77.229699] sctp: [Deprecated]: trinity-main (pid 4842) Use of int > in max_burst socket option deprecated. > [ 77.229699] Use struct sctp_assoc_value instead > [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) > [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 > [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [ 77.299789] 88003d76a618 880026133c50 8238ba79 > 880037c84520 > [ 77.300829] 880026133c90 820bd52b > 82d86c40 > [ 77.301869] f76fd1e1 8800382d8000 > 8800382d8220 > [ 77.302906] Call Trace: > [ 77.303246] [] dump_stack+0x4d/0x66 > [ 77.303928] [] addrconf_join_solict+0x4b/0xb0 > [ 77.304731] [] ipv6_dev_ac_inc+0x2bb/0x330 > [ 77.305498] [] ? ac6_seq_start+0x260/0x260 > [ 77.306257] [] ipv6_sock_ac_join+0x26e/0x360 > [ 77.307046] [] ? ipv6_sock_ac_join+0x99/0x360 > [ 77.307798] [] do_ipv6_setsockopt.isra.5+0xa70/0xf20 > [ 77.308570] [] ? sched_clock_local+0x1d/0x80 > [ 77.309260] [] ? kvm_clock_read+0x27/0x40 > [ 77.309915] [] ? sched_clock+0x9/0x10 > [ 77.310537] [] ? sock_has_perm+0x168/0x1e0 > [ 77.311204] [] ? sched_clock_cpu+0xa8/0xf0 > [ 77.311866] [] ? local_clock+0x1b/0x30 > [ 77.312501] [] ? lock_release_holdtime+0x1d/0x170 > [ 77.313241] [] ? sock_has_perm+0x180/0x1e0 > [ 77.313905] [] ? > selinux_msg_queue_alloc_security+0xa0/0xa0 > [ 77.314746] [] ipv6_setsockopt+0x53/0xb0 > [ 77.315397] [] udpv6_setsockopt+0x25/0x30 > [ 77.316058] [] sock_common_setsockopt+0xf/0x20 > [ 77.316764] [] SyS_setsockopt+0x8e/0xd0 > [ 77.317406] [] system_call_fastpath+0x16/0x1b > [main] 375 sockets created based on info from socket cachefile. > [main] Generating file descriptors > [main] Added 129 filenames from /dev > [main] Added 44048 filenames from /proc > [main] Added 18192 filenames from /sys > [main] Enabled 9 fd providers. > [watchdog] Watchdog is alive. (pid:4841) > [child3:4846] finit_module (313) returned ENOSYS, marking as inactive. > [child1:4844] kcmp (312) returned ENOSYS, marking as inactive. > [child2:4845] uselib (134) returned ENOSYS, marking as inactive. > [child1:4844] nfsservctl (180) returned ENOSYS, marking as inactive. > [child2:4845] delete_module (129:[32BIT]) returned ENOSYS, marking as > inactive. > [child2:4845] init_module (175) returned ENOSYS, marking as inactive. > [ 84.126609] trinity-c7: vm86 mode not supported on 64 bit kernel > [child7:4850] vm86 (166:[32BIT]) returned ENOSYS, marking as inactive. > [main] Bailing main loop because ctrl-c. > [ 84.345840] RTNL: assertion failed at net/ipv6/addrconf.c (1712) > [ 84.346615] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 > [ 84.347426] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [ 84.348102] 88003d76a618 880026133d10 8238ba79 > 8800382d8000 > [ 84.349018] 880026133d50 820bd5db 81141555 > 8800382d8220 > [ 84.349935] 8800382d8000 f76fd1e1 88003d76a618 > 8800382d8000 > [ 84.350848] Call Trace: > [ 84.351149] [] dump_stack+0x4d/0x66 > [ 84.351751] [] addrconf_leave_solict+0x4b/0xb0 > [ 84.352574] [] ? __local_bh_enable_ip+0xa5/0xf0 > [ 84.353315] [] __ipv6_dev_ac_dec+0xc3/0x140 > [ 84.354019] [] ipv6_dev_ac_dec+0x98/0xb0 > [ 84.354687] [] ipv6_sock_ac_close+0x10d/0x1a0 > [ 84.355410] [] ? ipv6_sock_ac_close+0x2e/0x1a0 > [ 84.356147] [] inet6_release+0x23/0x40 > [ 84.356789] [] sock_release+0x14/0x80 > [ 84.357410] [] sock_close+0xd/0x20 > [ 84.358042] [] __fput+0x111/0x1e0 > [ 84.358622] [] fput+0x9/0x10 > [ 84.359196] [] task_work_run+0x9e/0xd0 > [ 84.359825] [] do_exit+0x456/0xb30 > [ 84.360419] [] ? retint_swapgs+0x13/0x1b > [ 84.361075] [] do_group_exit+0x84/0xd0 > [ 84.361705] [] SyS_exit_group+0xf/0x10 > [ 84.362338] [] system_call_fastpath+0x16/0x1b > [watchdog] [4841] Watchdog exiting because ctrl-c. > [init] Ran 775 syscalls. Successes: 179 Failures: 596 > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Yep, looks like ipv6_dev_ac_inc() and __ipv6_dev_ac_dec() are called without RNTL in the socket option p
RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi, Was fuzzing Linus v3.17-rc2-89-g59753a8 with Trinity as the root user in qemu, when I hit the following assertion failures. Tommi [init] Started watchdog process, PID is 4841 [main] Main thread is alive. [ 77.229699] sctp: [Deprecated]: trinity-main (pid 4842) Use of int in max_burst socket option deprecated. [ 77.229699] Use struct sctp_assoc_value instead [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 77.299789] 88003d76a618 880026133c50 8238ba79 880037c84520 [ 77.300829] 880026133c90 820bd52b 82d86c40 [ 77.301869] f76fd1e1 8800382d8000 8800382d8220 [ 77.302906] Call Trace: [ 77.303246] [] dump_stack+0x4d/0x66 [ 77.303928] [] addrconf_join_solict+0x4b/0xb0 [ 77.304731] [] ipv6_dev_ac_inc+0x2bb/0x330 [ 77.305498] [] ? ac6_seq_start+0x260/0x260 [ 77.306257] [] ipv6_sock_ac_join+0x26e/0x360 [ 77.307046] [] ? ipv6_sock_ac_join+0x99/0x360 [ 77.307798] [] do_ipv6_setsockopt.isra.5+0xa70/0xf20 [ 77.308570] [] ? sched_clock_local+0x1d/0x80 [ 77.309260] [] ? kvm_clock_read+0x27/0x40 [ 77.309915] [] ? sched_clock+0x9/0x10 [ 77.310537] [] ? sock_has_perm+0x168/0x1e0 [ 77.311204] [] ? sched_clock_cpu+0xa8/0xf0 [ 77.311866] [] ? local_clock+0x1b/0x30 [ 77.312501] [] ? lock_release_holdtime+0x1d/0x170 [ 77.313241] [] ? sock_has_perm+0x180/0x1e0 [ 77.313905] [] ? selinux_msg_queue_alloc_security+0xa0/0xa0 [ 77.314746] [] ipv6_setsockopt+0x53/0xb0 [ 77.315397] [] udpv6_setsockopt+0x25/0x30 [ 77.316058] [] sock_common_setsockopt+0xf/0x20 [ 77.316764] [] SyS_setsockopt+0x8e/0xd0 [ 77.317406] [] system_call_fastpath+0x16/0x1b [main] 375 sockets created based on info from socket cachefile. [main] Generating file descriptors [main] Added 129 filenames from /dev [main] Added 44048 filenames from /proc [main] Added 18192 filenames from /sys [main] Enabled 9 fd providers. [watchdog] Watchdog is alive. (pid:4841) [child3:4846] finit_module (313) returned ENOSYS, marking as inactive. [child1:4844] kcmp (312) returned ENOSYS, marking as inactive. [child2:4845] uselib (134) returned ENOSYS, marking as inactive. [child1:4844] nfsservctl (180) returned ENOSYS, marking as inactive. [child2:4845] delete_module (129:[32BIT]) returned ENOSYS, marking as inactive. [child2:4845] init_module (175) returned ENOSYS, marking as inactive. [ 84.126609] trinity-c7: vm86 mode not supported on 64 bit kernel [child7:4850] vm86 (166:[32BIT]) returned ENOSYS, marking as inactive. [main] Bailing main loop because ctrl-c. [ 84.345840] RTNL: assertion failed at net/ipv6/addrconf.c (1712) [ 84.346615] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 84.347426] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 84.348102] 88003d76a618 880026133d10 8238ba79 8800382d8000 [ 84.349018] 880026133d50 820bd5db 81141555 8800382d8220 [ 84.349935] 8800382d8000 f76fd1e1 88003d76a618 8800382d8000 [ 84.350848] Call Trace: [ 84.351149] [] dump_stack+0x4d/0x66 [ 84.351751] [] addrconf_leave_solict+0x4b/0xb0 [ 84.352574] [] ? __local_bh_enable_ip+0xa5/0xf0 [ 84.353315] [] __ipv6_dev_ac_dec+0xc3/0x140 [ 84.354019] [] ipv6_dev_ac_dec+0x98/0xb0 [ 84.354687] [] ipv6_sock_ac_close+0x10d/0x1a0 [ 84.355410] [] ? ipv6_sock_ac_close+0x2e/0x1a0 [ 84.356147] [] inet6_release+0x23/0x40 [ 84.356789] [] sock_release+0x14/0x80 [ 84.357410] [] sock_close+0xd/0x20 [ 84.358042] [] __fput+0x111/0x1e0 [ 84.358622] [] fput+0x9/0x10 [ 84.359196] [] task_work_run+0x9e/0xd0 [ 84.359825] [] do_exit+0x456/0xb30 [ 84.360419] [] ? retint_swapgs+0x13/0x1b [ 84.361075] [] do_group_exit+0x84/0xd0 [ 84.361705] [] SyS_exit_group+0xf/0x10 [ 84.362338] [] system_call_fastpath+0x16/0x1b [watchdog] [4841] Watchdog exiting because ctrl-c. [init] Ran 775 syscalls. Successes: 179 Failures: 596 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi, Was fuzzing Linus v3.17-rc2-89-g59753a8 with Trinity as the root user in qemu, when I hit the following assertion failures. Tommi [init] Started watchdog process, PID is 4841 [main] Main thread is alive. [ 77.229699] sctp: [Deprecated]: trinity-main (pid 4842) Use of int in max_burst socket option deprecated. [ 77.229699] Use struct sctp_assoc_value instead [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 77.299789] 88003d76a618 880026133c50 8238ba79 880037c84520 [ 77.300829] 880026133c90 820bd52b 82d86c40 [ 77.301869] f76fd1e1 8800382d8000 8800382d8220 [ 77.302906] Call Trace: [ 77.303246] [8238ba79] dump_stack+0x4d/0x66 [ 77.303928] [820bd52b] addrconf_join_solict+0x4b/0xb0 [ 77.304731] [820b031b] ipv6_dev_ac_inc+0x2bb/0x330 [ 77.305498] [820b0060] ? ac6_seq_start+0x260/0x260 [ 77.306257] [820b05fe] ipv6_sock_ac_join+0x26e/0x360 [ 77.307046] [820b0429] ? ipv6_sock_ac_join+0x99/0x360 [ 77.307798] [820cdd60] do_ipv6_setsockopt.isra.5+0xa70/0xf20 [ 77.308570] [8117097d] ? sched_clock_local+0x1d/0x80 [ 77.309260] [810a8a27] ? kvm_clock_read+0x27/0x40 [ 77.309915] [810736d9] ? sched_clock+0x9/0x10 [ 77.310537] [815afff8] ? sock_has_perm+0x168/0x1e0 [ 77.311204] [81170bb8] ? sched_clock_cpu+0xa8/0xf0 [ 77.311866] [81170d1b] ? local_clock+0x1b/0x30 [ 77.312501] [811872cd] ? lock_release_holdtime+0x1d/0x170 [ 77.313241] [815b0010] ? sock_has_perm+0x180/0x1e0 [ 77.313905] [815afe90] ? selinux_msg_queue_alloc_security+0xa0/0xa0 [ 77.314746] [820ce263] ipv6_setsockopt+0x53/0xb0 [ 77.315397] [820d3135] udpv6_setsockopt+0x25/0x30 [ 77.316058] [81f9930f] sock_common_setsockopt+0xf/0x20 [ 77.316764] [81f9305e] SyS_setsockopt+0x8e/0xd0 [ 77.317406] [823a47e9] system_call_fastpath+0x16/0x1b [main] 375 sockets created based on info from socket cachefile. [main] Generating file descriptors [main] Added 129 filenames from /dev [main] Added 44048 filenames from /proc [main] Added 18192 filenames from /sys [main] Enabled 9 fd providers. [watchdog] Watchdog is alive. (pid:4841) [child3:4846] finit_module (313) returned ENOSYS, marking as inactive. [child1:4844] kcmp (312) returned ENOSYS, marking as inactive. [child2:4845] uselib (134) returned ENOSYS, marking as inactive. [child1:4844] nfsservctl (180) returned ENOSYS, marking as inactive. [child2:4845] delete_module (129:[32BIT]) returned ENOSYS, marking as inactive. [child2:4845] init_module (175) returned ENOSYS, marking as inactive. [ 84.126609] trinity-c7: vm86 mode not supported on 64 bit kernel [child7:4850] vm86 (166:[32BIT]) returned ENOSYS, marking as inactive. [main] Bailing main loop because ctrl-c. [ 84.345840] RTNL: assertion failed at net/ipv6/addrconf.c (1712) [ 84.346615] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 84.347426] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 84.348102] 88003d76a618 880026133d10 8238ba79 8800382d8000 [ 84.349018] 880026133d50 820bd5db 81141555 8800382d8220 [ 84.349935] 8800382d8000 f76fd1e1 88003d76a618 8800382d8000 [ 84.350848] Call Trace: [ 84.351149] [8238ba79] dump_stack+0x4d/0x66 [ 84.351751] [820bd5db] addrconf_leave_solict+0x4b/0xb0 [ 84.352574] [81141555] ? __local_bh_enable_ip+0xa5/0xf0 [ 84.353315] [820b07b3] __ipv6_dev_ac_dec+0xc3/0x140 [ 84.354019] [820b08c8] ipv6_dev_ac_dec+0x98/0xb0 [ 84.354687] [820b0bcd] ipv6_sock_ac_close+0x10d/0x1a0 [ 84.355410] [820b0aee] ? ipv6_sock_ac_close+0x2e/0x1a0 [ 84.356147] [820ae9d3] inet6_release+0x23/0x40 [ 84.356789] [81f91834] sock_release+0x14/0x80 [ 84.357410] [81f918ad] sock_close+0xd/0x20 [ 84.358042] [8127fa91] __fput+0x111/0x1e0 [ 84.358622] [8127fba9] fput+0x9/0x10 [ 84.359196] [8115e3ee] task_work_run+0x9e/0xd0 [ 84.359825] [8113f4b6] do_exit+0x456/0xb30 [ 84.360419] [823a541c] ? retint_swapgs+0x13/0x1b [ 84.361075] [8113fc54] do_group_exit+0x84/0xd0 [ 84.361705] [8113fcaf] SyS_exit_group+0xf/0x10 [ 84.362338] [823a47e9] system_call_fastpath+0x16/0x1b [watchdog] [4841] Watchdog exiting because ctrl-c. [init] Ran 775 syscalls. Successes: 179 Failures: 596 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On 08/29/2014 11:26 AM, Tommi Rantala wrote: Hi, Was fuzzing Linus v3.17-rc2-89-g59753a8 with Trinity as the root user in qemu, when I hit the following assertion failures. Tommi [init] Started watchdog process, PID is 4841 [main] Main thread is alive. [ 77.229699] sctp: [Deprecated]: trinity-main (pid 4842) Use of int in max_burst socket option deprecated. [ 77.229699] Use struct sctp_assoc_value instead [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 77.299789] 88003d76a618 880026133c50 8238ba79 880037c84520 [ 77.300829] 880026133c90 820bd52b 82d86c40 [ 77.301869] f76fd1e1 8800382d8000 8800382d8220 [ 77.302906] Call Trace: [ 77.303246] [8238ba79] dump_stack+0x4d/0x66 [ 77.303928] [820bd52b] addrconf_join_solict+0x4b/0xb0 [ 77.304731] [820b031b] ipv6_dev_ac_inc+0x2bb/0x330 [ 77.305498] [820b0060] ? ac6_seq_start+0x260/0x260 [ 77.306257] [820b05fe] ipv6_sock_ac_join+0x26e/0x360 [ 77.307046] [820b0429] ? ipv6_sock_ac_join+0x99/0x360 [ 77.307798] [820cdd60] do_ipv6_setsockopt.isra.5+0xa70/0xf20 [ 77.308570] [8117097d] ? sched_clock_local+0x1d/0x80 [ 77.309260] [810a8a27] ? kvm_clock_read+0x27/0x40 [ 77.309915] [810736d9] ? sched_clock+0x9/0x10 [ 77.310537] [815afff8] ? sock_has_perm+0x168/0x1e0 [ 77.311204] [81170bb8] ? sched_clock_cpu+0xa8/0xf0 [ 77.311866] [81170d1b] ? local_clock+0x1b/0x30 [ 77.312501] [811872cd] ? lock_release_holdtime+0x1d/0x170 [ 77.313241] [815b0010] ? sock_has_perm+0x180/0x1e0 [ 77.313905] [815afe90] ? selinux_msg_queue_alloc_security+0xa0/0xa0 [ 77.314746] [820ce263] ipv6_setsockopt+0x53/0xb0 [ 77.315397] [820d3135] udpv6_setsockopt+0x25/0x30 [ 77.316058] [81f9930f] sock_common_setsockopt+0xf/0x20 [ 77.316764] [81f9305e] SyS_setsockopt+0x8e/0xd0 [ 77.317406] [823a47e9] system_call_fastpath+0x16/0x1b [main] 375 sockets created based on info from socket cachefile. [main] Generating file descriptors [main] Added 129 filenames from /dev [main] Added 44048 filenames from /proc [main] Added 18192 filenames from /sys [main] Enabled 9 fd providers. [watchdog] Watchdog is alive. (pid:4841) [child3:4846] finit_module (313) returned ENOSYS, marking as inactive. [child1:4844] kcmp (312) returned ENOSYS, marking as inactive. [child2:4845] uselib (134) returned ENOSYS, marking as inactive. [child1:4844] nfsservctl (180) returned ENOSYS, marking as inactive. [child2:4845] delete_module (129:[32BIT]) returned ENOSYS, marking as inactive. [child2:4845] init_module (175) returned ENOSYS, marking as inactive. [ 84.126609] trinity-c7: vm86 mode not supported on 64 bit kernel [child7:4850] vm86 (166:[32BIT]) returned ENOSYS, marking as inactive. [main] Bailing main loop because ctrl-c. [ 84.345840] RTNL: assertion failed at net/ipv6/addrconf.c (1712) [ 84.346615] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 84.347426] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 84.348102] 88003d76a618 880026133d10 8238ba79 8800382d8000 [ 84.349018] 880026133d50 820bd5db 81141555 8800382d8220 [ 84.349935] 8800382d8000 f76fd1e1 88003d76a618 8800382d8000 [ 84.350848] Call Trace: [ 84.351149] [8238ba79] dump_stack+0x4d/0x66 [ 84.351751] [820bd5db] addrconf_leave_solict+0x4b/0xb0 [ 84.352574] [81141555] ? __local_bh_enable_ip+0xa5/0xf0 [ 84.353315] [820b07b3] __ipv6_dev_ac_dec+0xc3/0x140 [ 84.354019] [820b08c8] ipv6_dev_ac_dec+0x98/0xb0 [ 84.354687] [820b0bcd] ipv6_sock_ac_close+0x10d/0x1a0 [ 84.355410] [820b0aee] ? ipv6_sock_ac_close+0x2e/0x1a0 [ 84.356147] [820ae9d3] inet6_release+0x23/0x40 [ 84.356789] [81f91834] sock_release+0x14/0x80 [ 84.357410] [81f918ad] sock_close+0xd/0x20 [ 84.358042] [8127fa91] __fput+0x111/0x1e0 [ 84.358622] [8127fba9] fput+0x9/0x10 [ 84.359196] [8115e3ee] task_work_run+0x9e/0xd0 [ 84.359825] [8113f4b6] do_exit+0x456/0xb30 [ 84.360419] [823a541c] ? retint_swapgs+0x13/0x1b [ 84.361075] [8113fc54] do_group_exit+0x84/0xd0 [ 84.361705] [8113fcaf] SyS_exit_group+0xf/0x10 [ 84.362338] [823a47e9] system_call_fastpath+0x16/0x1b [watchdog] [4841] Watchdog exiting because ctrl-c. [init] Ran 775 syscalls. Successes: 179 Failures: 596 -- To unsubscribe from this list: send the line unsubscribe netdev
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala tt.rant...@gmail.com wrote: [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 77.299789] 88003d76a618 880026133c50 8238ba79 880037c84520 [ 77.300829] 880026133c90 820bd52b 82d86c40 [ 77.301869] f76fd1e1 8800382d8000 8800382d8220 [ 77.302906] Call Trace: [ 77.303246] [8238ba79] dump_stack+0x4d/0x66 [ 77.303928] [820bd52b] addrconf_join_solict+0x4b/0xb0 [ 77.304731] [820b031b] ipv6_dev_ac_inc+0x2bb/0x330 [ 77.305498] [820b0060] ? ac6_seq_start+0x260/0x260 [ 77.306257] [820b05fe] ipv6_sock_ac_join+0x26e/0x360 [ 77.307046] [820b0429] ? ipv6_sock_ac_join+0x99/0x360 [ 77.307798] [820cdd60] do_ipv6_setsockopt.isra.5+0xa70/0xf20 I think we should just use rtnl_lock() instead of rcu_read_lock() there, it is not a hot path worth optimization. Please try the attached patch. commit 31d83db0b417f705cbb31b2159603b8b53b81ab6 Author: Cong Wang xiyou.wangc...@gmail.com Date: Fri Aug 29 11:02:15 2014 -0700 ipv6: fix rtnl lock assertion in ipv6_sock_ac_join() Signed-off-by: Cong Wang xiyou.wangc...@gmail.com diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 4298013..1ae0e74 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt); void dev_add_offload(struct packet_offload *po); void dev_remove_offload(struct packet_offload *po); -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags, - unsigned short mask); +struct net_device *dev_get_by_flags(struct net *net, unsigned short flags, + unsigned short mask); struct net_device *dev_get_by_name(struct net *net, const char *name); struct net_device *dev_get_by_name_rcu(struct net *net, const char *name); struct net_device *__dev_get_by_name(struct net *net, const char *name); diff --git a/net/core/dev.c b/net/core/dev.c index 26d296c..73cdb03 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, unsigned short type) EXPORT_SYMBOL(dev_getfirstbyhwtype); /** - * dev_get_by_flags_rcu - find any device with given flags + * dev_get_by_flags - find any device with given flags * @net: the applicable net namespace * @if_flags: IFF_* values * @mask: bitmask of bits in if_flags to check * * Search for any interface with the given flags. Returns NULL if a device * is not found or a pointer to the device. Must be called inside - * rcu_read_lock(), and result refcount is unchanged. + * rtnl_lock(), and result refcount is unchanged. */ -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags, +struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags, unsigned short mask) { struct net_device *dev, *ret; + ASSERT_RTNL(); ret = NULL; - for_each_netdev_rcu(net, dev) { + for_each_netdev(net, dev) { if (((dev-flags ^ if_flags) mask) == 0) { ret = dev; break; diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 2101832..c523c1a 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -77,7 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) pac-acl_next = NULL; pac-acl_addr = *addr; - rcu_read_lock(); + rtnl_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -90,11 +90,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) goto error; } else { /* router, no matching interface: just pick one */ - dev = dev_get_by_flags_rcu(net, IFF_UP, + dev = dev_get_by_flags(net, IFF_UP, IFF_UP | IFF_LOOPBACK); } } else - dev = dev_get_by_index_rcu(net, ifindex); + dev = __dev_get_by_index(net, ifindex); if (dev == NULL) { err = -ENODEV; @@ -136,7 +136,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) } error: - rcu_read_unlock(); + rtnl_unlock(); if (pac) sock_kfree_s(sk, pac, sizeof(*pac)); return err; @@ -171,13 +171,15 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const struct in6_addr *addr
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
2014-08-29, 11:14:48 -0700, Cong Wang wrote: On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala tt.rant...@gmail.com wrote: [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 77.299789] 88003d76a618 880026133c50 8238ba79 880037c84520 [ 77.300829] 880026133c90 820bd52b 82d86c40 [ 77.301869] f76fd1e1 8800382d8000 8800382d8220 [ 77.302906] Call Trace: [ 77.303246] [8238ba79] dump_stack+0x4d/0x66 [ 77.303928] [820bd52b] addrconf_join_solict+0x4b/0xb0 [ 77.304731] [820b031b] ipv6_dev_ac_inc+0x2bb/0x330 [ 77.305498] [820b0060] ? ac6_seq_start+0x260/0x260 [ 77.306257] [820b05fe] ipv6_sock_ac_join+0x26e/0x360 [ 77.307046] [820b0429] ? ipv6_sock_ac_join+0x99/0x360 [ 77.307798] [820cdd60] do_ipv6_setsockopt.isra.5+0xa70/0xf20 I think we should just use rtnl_lock() instead of rcu_read_lock() there, it is not a hot path worth optimization. Please try the attached patch. note: it doesn't build as it is now, it needs: -EXPORT_SYMBOL(dev_get_by_flags_rcu); +EXPORT_SYMBOL(dev_get_by_flags); I just tried your patch with a basic test program (open socket/join/leave/close and open socket/join/close). I think you need to modify ipv6_sock_ac_close as well, or you can still trigger the assertion when closing the socket without leaving first. Modified patch attached. -- Sabrina diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 429801370d0c..1ae0e745b1b1 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2077,8 +2077,8 @@ void __dev_remove_pack(struct packet_type *pt); void dev_add_offload(struct packet_offload *po); void dev_remove_offload(struct packet_offload *po); -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short flags, - unsigned short mask); +struct net_device *dev_get_by_flags(struct net *net, unsigned short flags, + unsigned short mask); struct net_device *dev_get_by_name(struct net *net, const char *name); struct net_device *dev_get_by_name_rcu(struct net *net, const char *name); struct net_device *__dev_get_by_name(struct net *net, const char *name); diff --git a/net/core/dev.c b/net/core/dev.c index 443b814db05b..8fede6ef4a39 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -897,23 +897,24 @@ struct net_device *dev_getfirstbyhwtype(struct net *net, unsigned short type) EXPORT_SYMBOL(dev_getfirstbyhwtype); /** - * dev_get_by_flags_rcu - find any device with given flags + * dev_get_by_flags - find any device with given flags * @net: the applicable net namespace * @if_flags: IFF_* values * @mask: bitmask of bits in if_flags to check * * Search for any interface with the given flags. Returns NULL if a device * is not found or a pointer to the device. Must be called inside - * rcu_read_lock(), and result refcount is unchanged. + * rtnl_lock(), and result refcount is unchanged. */ -struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags, +struct net_device *dev_get_by_flags(struct net *net, unsigned short if_flags, unsigned short mask) { struct net_device *dev, *ret; + ASSERT_RTNL(); ret = NULL; - for_each_netdev_rcu(net, dev) { + for_each_netdev(net, dev) { if (((dev-flags ^ if_flags) mask) == 0) { ret = dev; break; @@ -921,7 +922,7 @@ struct net_device *dev_get_by_flags_rcu(struct net *net, unsigned short if_flags } return ret; } -EXPORT_SYMBOL(dev_get_by_flags_rcu); +EXPORT_SYMBOL(dev_get_by_flags); /** * dev_valid_name - check if name is okay for network device diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 210183244689..6de5caa26ea4 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -77,7 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) pac-acl_next = NULL; pac-acl_addr = *addr; - rcu_read_lock(); + rtnl_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -90,11 +90,11 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) goto error; } else { /* router, no matching interface: just pick one */ - dev = dev_get_by_flags_rcu(net, IFF_UP, + dev = dev_get_by_flags(net, IFF_UP, IFF_UP | IFF_LOOPBACK); } } else
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
On Fri, Aug 29, 2014 at 12:53 PM, Sabrina Dubroca s...@queasysnail.net wrote: 2014-08-29, 11:14:48 -0700, Cong Wang wrote: On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala tt.rant...@gmail.com wrote: [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 77.299789] 88003d76a618 880026133c50 8238ba79 880037c84520 [ 77.300829] 880026133c90 820bd52b 82d86c40 [ 77.301869] f76fd1e1 8800382d8000 8800382d8220 [ 77.302906] Call Trace: [ 77.303246] [8238ba79] dump_stack+0x4d/0x66 [ 77.303928] [820bd52b] addrconf_join_solict+0x4b/0xb0 [ 77.304731] [820b031b] ipv6_dev_ac_inc+0x2bb/0x330 [ 77.305498] [820b0060] ? ac6_seq_start+0x260/0x260 [ 77.306257] [820b05fe] ipv6_sock_ac_join+0x26e/0x360 [ 77.307046] [820b0429] ? ipv6_sock_ac_join+0x99/0x360 [ 77.307798] [820cdd60] do_ipv6_setsockopt.isra.5+0xa70/0xf20 I think we should just use rtnl_lock() instead of rcu_read_lock() there, it is not a hot path worth optimization. Please try the attached patch. note: it doesn't build as it is now, it needs: -EXPORT_SYMBOL(dev_get_by_flags_rcu); +EXPORT_SYMBOL(dev_get_by_flags); I just tried your patch with a basic test program (open socket/join/leave/close and open socket/join/close). I think you need to modify ipv6_sock_ac_close as well, or you can still trigger the assertion when closing the socket without leaving first. You are absolutely right here. Can I have your Signed-off-by and Tested-by before sending the patch formally? Thanks! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RTNL: assertion failed at net/ipv6/addrconf.c (1699)
Hi Sabrina, On Fr, 2014-08-29 at 21:53 +0200, Sabrina Dubroca wrote: 2014-08-29, 11:14:48 -0700, Cong Wang wrote: On Fri, Aug 29, 2014 at 8:26 AM, Tommi Rantala tt.rant...@gmail.com wrote: [ 77.297196] RTNL: assertion failed at net/ipv6/addrconf.c (1699) [ 77.298080] CPU: 0 PID: 4842 Comm: trinity-main Not tainted 3.17.0-rc2+ #30 [ 77.299039] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 77.299789] 88003d76a618 880026133c50 8238ba79 880037c84520 [ 77.300829] 880026133c90 820bd52b 82d86c40 [ 77.301869] f76fd1e1 8800382d8000 8800382d8220 [ 77.302906] Call Trace: [ 77.303246] [8238ba79] dump_stack+0x4d/0x66 [ 77.303928] [820bd52b] addrconf_join_solict+0x4b/0xb0 [ 77.304731] [820b031b] ipv6_dev_ac_inc+0x2bb/0x330 [ 77.305498] [820b0060] ? ac6_seq_start+0x260/0x260 [ 77.306257] [820b05fe] ipv6_sock_ac_join+0x26e/0x360 [ 77.307046] [820b0429] ? ipv6_sock_ac_join+0x99/0x360 [ 77.307798] [820cdd60] do_ipv6_setsockopt.isra.5+0xa70/0xf20 I think we should just use rtnl_lock() instead of rcu_read_lock() there, it is not a hot path worth optimization. Please try the attached patch. note: it doesn't build as it is now, it needs: -EXPORT_SYMBOL(dev_get_by_flags_rcu); +EXPORT_SYMBOL(dev_get_by_flags); I just tried your patch with a basic test program (open socket/join/leave/close and open socket/join/close). I think you need to modify ipv6_sock_ac_close as well, or you can still trigger the assertion when closing the socket without leaving first. Modified patch attached. Sorry, just had time to look at this. The reason is not to have list corruption but that the calls down to ndo_set_rx_mode expect rtnl to be locked by the drivers. Filter lists are locked by addr_list_lock and that's why I think we never saw any problems with that, but drivers expect rtnl locked for those calls. But this problem also affects multicast join, so patch seems incomplete to me (and for that matter ssm multicast join, too). Also rtnl_lock and rcu_read_lock compose in that order, so we don't need to change dev_get_by_flags, but as this is the only user it sure is possible. RCU locked version is just easier composeable, so I wouldn't touch that if needed in future, just also take rcu lock as before. So just adding rtnl_lock add appropriate places seems to be ok to me, but still need to review parts of the ssm code. Also we should move ASSERT_RTNL checks from addrconf_join_solict to ipv6_dev_mc_inc/dec. Thanks, Hannes -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/