Re: [RFC] netdev sysfs failure handling
On Tue, 09 May 2006 15:43:22 -0700 (PDT) "David S. Miller" <[EMAIL PROTECTED]> wrote: > From: Stephen Hemminger <[EMAIL PROTECTED]> > Date: Tue, 9 May 2006 14:40:49 -0700 > > > Agreed, especially since rtnl is now a real mutex. The case, that > > I was worried about: > > rtnl_lock() > > spin_lock_irq(&mylock); > > x = register_netdevice(); > > ... > > > > Doesn't show up in any current code, even for the pseudo devices > > and funny virtualized interfaces. > > Right, therefore I think we should put something like your patch in > there now perhaps. > > The case where we really needed the todo list is unregister, so that > we can safely wait for all references to the net device to go away. > > I still wonder about those mentioned hotplug races wrt. linkwatch > in the comment above netdev_run_todo(). > > Linkwatch is such a nuissance because it combines asynchronous link > state change processing with keventd and RTNL locking. It sleeps > waiting for __LINK_STATE_SCHED to clear with the RTNL held (via > dev_deactivate()). But then again dev_close() code paths do this > too, so the dev_deactivate() bit should be OK. > > Linkwatch, after doing the dev_activate(), emits a NETDEV_CHANGE > notifier on netdev_chain and also sends out an RTM_NETLINK > message. This is for the case where IFF_UP is set. > > Until we release the RTNL semaphore, during netdev register, nobody > can go in an inspect the state of a net device. So doing the sysfs > node creation in register_netdevice() should be OK as far as I can > tell. > > Can anyone find a problem with this? Also, by getting the netdevice fully in sysfs under RTNL, we are safe from races with the hotplug uevent that occurs. Right now, it might be possible on SMP for the hotplug to happen after register_netdevice, but before the device shows up in sysfs. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] netdev sysfs failure handling
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Tue, 9 May 2006 14:40:49 -0700 > Agreed, especially since rtnl is now a real mutex. The case, that > I was worried about: > rtnl_lock() > spin_lock_irq(&mylock); > x = register_netdevice(); > ... > > Doesn't show up in any current code, even for the pseudo devices > and funny virtualized interfaces. Right, therefore I think we should put something like your patch in there now perhaps. The case where we really needed the todo list is unregister, so that we can safely wait for all references to the net device to go away. I still wonder about those mentioned hotplug races wrt. linkwatch in the comment above netdev_run_todo(). Linkwatch is such a nuissance because it combines asynchronous link state change processing with keventd and RTNL locking. It sleeps waiting for __LINK_STATE_SCHED to clear with the RTNL held (via dev_deactivate()). But then again dev_close() code paths do this too, so the dev_deactivate() bit should be OK. Linkwatch, after doing the dev_activate(), emits a NETDEV_CHANGE notifier on netdev_chain and also sends out an RTM_NETLINK message. This is for the case where IFF_UP is set. Until we release the RTNL semaphore, during netdev register, nobody can go in an inspect the state of a net device. So doing the sysfs node creation in register_netdevice() should be OK as far as I can tell. Can anyone find a problem with this? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] netdev sysfs failure handling
On Tue, 09 May 2006 14:05:01 -0700 (PDT) "David S. Miller" <[EMAIL PROTECTED]> wrote: > From: Stephen Hemminger <[EMAIL PROTECTED]> > Date: Tue, 9 May 2006 12:01:07 -0700 > > > Something like this would handle errors better, but introduce possible > > problems for drivers that call register_netdevice with irq's disabled. > > There was some comment about racing with linkwatch, but don't see how > > that could happen during creation. > > > > For 2.6.18? > > I've been thinking about this a bit more. > > How can anyone be using this with IRQ's disabled if we have > an ASSERT_RTNL() there? Agreed, especially since rtnl is now a real mutex. The case, that I was worried about: rtnl_lock() spin_lock_irq(&mylock); x = register_netdevice(); ... Doesn't show up in any current code, even for the pseudo devices and funny virtualized interfaces. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] netdev sysfs failure handling
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Tue, 9 May 2006 12:01:07 -0700 > Something like this would handle errors better, but introduce possible > problems for drivers that call register_netdevice with irq's disabled. > There was some comment about racing with linkwatch, but don't see how > that could happen during creation. > > For 2.6.18? I've been thinking about this a bit more. How can anyone be using this with IRQ's disabled if we have an ASSERT_RTNL() there? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] netdev sysfs failure handling
Something like this would handle errors better, but introduce possible problems for drivers that call register_netdevice with irq's disabled. There was some comment about racing with linkwatch, but don't see how that could happen during creation. For 2.6.18? --- bridge.orig/include/linux/netdevice.h 2006-05-09 11:17:08.0 -0700 +++ bridge/include/linux/netdevice.h2006-05-09 11:18:52.0 -0700 @@ -433,8 +433,7 @@ /* register/unregister state machine */ enum { NETREG_UNINITIALIZED=0, - NETREG_REGISTERING, /* called register_netdevice */ - NETREG_REGISTERED, /* completed register todo */ + NETREG_REGISTERED, /* completed register_netdevice */ NETREG_UNREGISTERING,/* called unregister_netdevice */ NETREG_UNREGISTERED, /* completed unregister todo */ NETREG_RELEASED, /* called free_netdev */ --- bridge.orig/net/core/dev.c 2006-05-09 11:17:09.0 -0700 +++ bridge/net/core/dev.c 2006-05-09 11:37:18.0 -0700 @@ -2777,6 +2777,8 @@ BUG_ON(dev_boot_phase); ASSERT_RTNL(); + might_sleep(); + /* When net_device's are persistent, this will be fatal. */ BUG_ON(dev->reg_state != NETREG_UNINITIALIZED); @@ -2863,6 +2865,11 @@ if (!dev->rebuild_header) dev->rebuild_header = default_rebuild_header; + ret = netdev_register_sysfs(dev); + if (ret) + goto out_err; + dev->reg_state = NETREG_REGISTERED; + /* * Default initial state at registry is that the * device is present. @@ -2878,14 +2885,11 @@ hlist_add_head(&dev->name_hlist, head); hlist_add_head(&dev->index_hlist, dev_index_hash(dev->ifindex)); dev_hold(dev); - dev->reg_state = NETREG_REGISTERING; write_unlock_bh(&dev_base_lock); /* Notify protocols, that a new device appeared. */ blocking_notifier_call_chain(&netdev_chain, NETDEV_REGISTER, dev); - /* Finish registration after unlock */ - net_set_todo(dev); ret = 0; out: @@ -3008,7 +3012,7 @@ * * We are invoked by rtnl_unlock() after it drops the semaphore. * This allows us to deal with problems: - * 1) We can create/delete sysfs objects which invoke hotplug + * 1) We can delete sysfs objects which invoke hotplug *without deadlocking with linkwatch via keventd. * 2) Since we run with the RTNL semaphore not held, we can sleep *safely in order to wait for the netdev refcnt to drop to zero. @@ -3017,8 +3021,6 @@ void netdev_run_todo(void) { struct list_head list = LIST_HEAD_INIT(list); - int err; - /* Need to guard against multiple cpu's getting out of order. */ mutex_lock(&net_todo_run_mutex); @@ -3041,40 +3043,29 @@ = list_entry(list.next, struct net_device, todo_list); list_del(&dev->todo_list); - switch(dev->reg_state) { - case NETREG_REGISTERING: - err = netdev_register_sysfs(dev); - if (err) - printk(KERN_ERR "%s: failed sysfs registration (%d)\n", - dev->name, err); - dev->reg_state = NETREG_REGISTERED; - break; - - case NETREG_UNREGISTERING: - netdev_unregister_sysfs(dev); - dev->reg_state = NETREG_UNREGISTERED; - - netdev_wait_allrefs(dev); - - /* paranoia */ - BUG_ON(atomic_read(&dev->refcnt)); - BUG_TRAP(!dev->ip_ptr); - BUG_TRAP(!dev->ip6_ptr); - BUG_TRAP(!dev->dn_ptr); - - - /* It must be the very last action, -* after this 'dev' may point to freed up memory. -*/ - if (dev->destructor) - dev->destructor(dev); - break; - - default: + if (unlikely(dev->reg_state != NETREG_UNREGISTERING)) { printk(KERN_ERR "network todo '%s' but state %d\n", dev->name, dev->reg_state); - break; + dump_stack(); + continue; } + + netdev_unregister_sysfs(dev); + dev->reg_state = NETREG_UNREGISTERED; + + netdev_wait_allrefs(dev); + + /* paranoia */ + BUG_ON(atomic_read(&dev->refcnt)); + BUG_TRAP(!dev->ip_ptr); + BUG_TRAP(!dev->ip6_ptr); + BUG_TRAP(!dev->dn_ptr); + + /* It must be the very last action, +* after this 'dev' may point to freed up
Re: [RFC] netdev sysfs failure handling
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Fri, 21 Apr 2006 13:42:05 -0700 > In case of sysfs failure, don't let device be brought up. > It can be cleared by unregister_netdevice so module can be unloaded > normally. > > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> I'm not so sure about this, a hot plug event could clear that bit too. The problem I think is that here we've structured things such that we can't handle the error properly and pass it back to the register_netdevice() caller because we do the sysfs registry call in the rtnl_unlock() todo list execution. Next, even if you prevent the device from being brought up, people can still assign IP addresses and do other stuff to the device so it still sort of behaves as if it is there. It would therefore be the best if we can do this stuff inside of register_netdevice(), then handle and propagate any errors correctly. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html