Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread Stephen Hemminger
On Tue, 09 May 2006 15:43:22 -0700 (PDT)
"David S. Miller" <[EMAIL PROTECTED]> wrote:

> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Tue, 9 May 2006 14:40:49 -0700
> 
> > Agreed, especially since rtnl is now a real mutex.  The case, that
> > I was worried about:
> > rtnl_lock()
> > spin_lock_irq(&mylock);
> > x = register_netdevice();
> > ...
> > 
> > Doesn't show up in any current code, even for the pseudo devices
> > and funny virtualized interfaces.
> 
> Right, therefore I think we should put something like your patch in
> there now perhaps.
> 
> The case where we really needed the todo list is unregister, so that
> we can safely wait for all references to the net device to go away.
> 
> I still wonder about those mentioned hotplug races wrt. linkwatch
> in the comment above netdev_run_todo().
> 
> Linkwatch is such a nuissance because it combines asynchronous link
> state change processing with keventd and RTNL locking.  It sleeps
> waiting for __LINK_STATE_SCHED to clear with the RTNL held (via
> dev_deactivate()).  But then again dev_close() code paths do this
> too, so the dev_deactivate() bit should be OK.
> 
> Linkwatch, after doing the dev_activate(), emits a NETDEV_CHANGE
> notifier on netdev_chain and also sends out an RTM_NETLINK
> message.  This is for the case where IFF_UP is set.
> 
> Until we release the RTNL semaphore, during netdev register, nobody
> can go in an inspect the state of a net device.  So doing the sysfs
> node creation in register_netdevice() should be OK as far as I can
> tell.
> 
> Can anyone find a problem with this?

Also, by getting the netdevice fully in sysfs under RTNL,
we are safe from races with the hotplug uevent that occurs.

Right now, it might be possible on SMP for the hotplug to happen
after register_netdevice, but before the device shows up
in sysfs.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread David S. Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Tue, 9 May 2006 14:40:49 -0700

> Agreed, especially since rtnl is now a real mutex.  The case, that
> I was worried about:
>   rtnl_lock()
>   spin_lock_irq(&mylock);
>   x = register_netdevice();
> ...
> 
> Doesn't show up in any current code, even for the pseudo devices
> and funny virtualized interfaces.

Right, therefore I think we should put something like your patch in
there now perhaps.

The case where we really needed the todo list is unregister, so that
we can safely wait for all references to the net device to go away.

I still wonder about those mentioned hotplug races wrt. linkwatch
in the comment above netdev_run_todo().

Linkwatch is such a nuissance because it combines asynchronous link
state change processing with keventd and RTNL locking.  It sleeps
waiting for __LINK_STATE_SCHED to clear with the RTNL held (via
dev_deactivate()).  But then again dev_close() code paths do this
too, so the dev_deactivate() bit should be OK.

Linkwatch, after doing the dev_activate(), emits a NETDEV_CHANGE
notifier on netdev_chain and also sends out an RTM_NETLINK
message.  This is for the case where IFF_UP is set.

Until we release the RTNL semaphore, during netdev register, nobody
can go in an inspect the state of a net device.  So doing the sysfs
node creation in register_netdevice() should be OK as far as I can
tell.

Can anyone find a problem with this?
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread Stephen Hemminger
On Tue, 09 May 2006 14:05:01 -0700 (PDT)
"David S. Miller" <[EMAIL PROTECTED]> wrote:

> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Tue, 9 May 2006 12:01:07 -0700
> 
> > Something like this would handle errors better, but introduce possible
> > problems for drivers that call register_netdevice with irq's disabled.
> > There was some comment about racing with linkwatch, but don't see how
> > that could happen during creation.  
> > 
> > For 2.6.18?
> 
> I've been thinking about this a bit more.
> 
> How can anyone be using this with IRQ's disabled if we have
> an ASSERT_RTNL() there?

Agreed, especially since rtnl is now a real mutex.  The case, that
I was worried about:
rtnl_lock()
spin_lock_irq(&mylock);
x = register_netdevice();
...

Doesn't show up in any current code, even for the pseudo devices
and funny virtualized interfaces.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread David S. Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Tue, 9 May 2006 12:01:07 -0700

> Something like this would handle errors better, but introduce possible
> problems for drivers that call register_netdevice with irq's disabled.
> There was some comment about racing with linkwatch, but don't see how
> that could happen during creation.  
> 
> For 2.6.18?

I've been thinking about this a bit more.

How can anyone be using this with IRQ's disabled if we have
an ASSERT_RTNL() there?
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread Stephen Hemminger
Something like this would handle errors better, but introduce possible
problems for drivers that call register_netdevice with irq's disabled.
There was some comment about racing with linkwatch, but don't see how
that could happen during creation.  

For 2.6.18?

--- bridge.orig/include/linux/netdevice.h   2006-05-09 11:17:08.0 
-0700
+++ bridge/include/linux/netdevice.h2006-05-09 11:18:52.0 -0700
@@ -433,8 +433,7 @@
 
/* register/unregister state machine */
enum { NETREG_UNINITIALIZED=0,
-  NETREG_REGISTERING,  /* called register_netdevice */
-  NETREG_REGISTERED,   /* completed register todo */
+  NETREG_REGISTERED,   /* completed register_netdevice */
   NETREG_UNREGISTERING,/* called unregister_netdevice */
   NETREG_UNREGISTERED, /* completed unregister todo */
   NETREG_RELEASED, /* called free_netdev */
--- bridge.orig/net/core/dev.c  2006-05-09 11:17:09.0 -0700
+++ bridge/net/core/dev.c   2006-05-09 11:37:18.0 -0700
@@ -2777,6 +2777,8 @@
BUG_ON(dev_boot_phase);
ASSERT_RTNL();
 
+   might_sleep();
+
/* When net_device's are persistent, this will be fatal. */
BUG_ON(dev->reg_state != NETREG_UNINITIALIZED);
 
@@ -2863,6 +2865,11 @@
if (!dev->rebuild_header)
dev->rebuild_header = default_rebuild_header;
 
+   ret = netdev_register_sysfs(dev);
+   if (ret)
+   goto out_err;
+   dev->reg_state = NETREG_REGISTERED;
+
/*
 *  Default initial state at registry is that the
 *  device is present.
@@ -2878,14 +2885,11 @@
hlist_add_head(&dev->name_hlist, head);
hlist_add_head(&dev->index_hlist, dev_index_hash(dev->ifindex));
dev_hold(dev);
-   dev->reg_state = NETREG_REGISTERING;
write_unlock_bh(&dev_base_lock);
 
/* Notify protocols, that a new device appeared. */
blocking_notifier_call_chain(&netdev_chain, NETDEV_REGISTER, dev);
 
-   /* Finish registration after unlock */
-   net_set_todo(dev);
ret = 0;
 
 out:
@@ -3008,7 +3012,7 @@
  *
  * We are invoked by rtnl_unlock() after it drops the semaphore.
  * This allows us to deal with problems:
- * 1) We can create/delete sysfs objects which invoke hotplug
+ * 1) We can delete sysfs objects which invoke hotplug
  *without deadlocking with linkwatch via keventd.
  * 2) Since we run with the RTNL semaphore not held, we can sleep
  *safely in order to wait for the netdev refcnt to drop to zero.
@@ -3017,8 +3021,6 @@
 void netdev_run_todo(void)
 {
struct list_head list = LIST_HEAD_INIT(list);
-   int err;
-
 
/* Need to guard against multiple cpu's getting out of order. */
mutex_lock(&net_todo_run_mutex);
@@ -3041,40 +3043,29 @@
= list_entry(list.next, struct net_device, todo_list);
list_del(&dev->todo_list);
 
-   switch(dev->reg_state) {
-   case NETREG_REGISTERING:
-   err = netdev_register_sysfs(dev);
-   if (err)
-   printk(KERN_ERR "%s: failed sysfs registration 
(%d)\n",
-  dev->name, err);
-   dev->reg_state = NETREG_REGISTERED;
-   break;
-
-   case NETREG_UNREGISTERING:
-   netdev_unregister_sysfs(dev);
-   dev->reg_state = NETREG_UNREGISTERED;
-
-   netdev_wait_allrefs(dev);
-
-   /* paranoia */
-   BUG_ON(atomic_read(&dev->refcnt));
-   BUG_TRAP(!dev->ip_ptr);
-   BUG_TRAP(!dev->ip6_ptr);
-   BUG_TRAP(!dev->dn_ptr);
-
-
-   /* It must be the very last action, 
-* after this 'dev' may point to freed up memory.
-*/
-   if (dev->destructor)
-   dev->destructor(dev);
-   break;
-
-   default:
+   if (unlikely(dev->reg_state != NETREG_UNREGISTERING)) {
printk(KERN_ERR "network todo '%s' but state %d\n",
   dev->name, dev->reg_state);
-   break;
+   dump_stack();
+   continue;
}
+
+   netdev_unregister_sysfs(dev);
+   dev->reg_state = NETREG_UNREGISTERED;
+
+   netdev_wait_allrefs(dev);
+
+   /* paranoia */
+   BUG_ON(atomic_read(&dev->refcnt));
+   BUG_TRAP(!dev->ip_ptr);
+   BUG_TRAP(!dev->ip6_ptr);
+   BUG_TRAP(!dev->dn_ptr);
+
+   /* It must be the very last action,
+* after this 'dev' may point to freed up

Re: [RFC] netdev sysfs failure handling

2006-05-06 Thread David S. Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 21 Apr 2006 13:42:05 -0700

> In case of sysfs failure, don't let device be brought up.
> It can be cleared by unregister_netdevice so module can be unloaded
> normally.
> 
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

I'm not so sure about this, a hot plug event could clear that bit too.

The problem I think is that here we've structured things such that we
can't handle the error properly and pass it back to the
register_netdevice() caller because we do the sysfs registry call in
the rtnl_unlock() todo list execution.

Next, even if you prevent the device from being brought up, people
can still assign IP addresses and do other stuff to the device so
it still sort of behaves as if it is there.

It would therefore be the best if we can do this stuff inside of
register_netdevice(), then handle and propagate any errors correctly.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html