Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-05 Thread Stephen Hemminger
I hope soon to add suspend/resume to the network device class
and remove driver specific suspend/resume from lots of devices.

The class suspend routine would just be:
pci_save_state
dev->stop

resume is
pci_restore_state
dev->open

for many devices that is all they need.
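
A minimal userspace sketch of that sequence, for illustration only. The
struct fake_netdev type, its fields and the class_suspend()/class_resume()
names are hypothetical stand-ins, not kernel API, and remembering whether
the interface was up before suspend is an added assumption:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PCI network device; not a kernel structure. */
struct fake_netdev {
	bool running;         /* interface is currently up */
	bool was_running;     /* remembered across suspend (an assumption) */
	bool pci_state_saved; /* models pci_save_state()/pci_restore_state() */
	int (*open)(struct fake_netdev *dev);
	int (*stop)(struct fake_netdev *dev);
};

static int fake_open(struct fake_netdev *dev)
{
	dev->running = true;
	return 0;
}

static int fake_stop(struct fake_netdev *dev)
{
	dev->running = false;
	return 0;
}

/* Class-level suspend: save PCI state, then stop the device if it is up. */
static int class_suspend(struct fake_netdev *dev)
{
	assert(dev->open && dev->stop);
	dev->pci_state_saved = true;
	dev->was_running = dev->running;
	return dev->running ? dev->stop(dev) : 0;
}

/* Class-level resume: restore PCI state, then reopen if it was up before. */
static int class_resume(struct fake_netdev *dev)
{
	if (!dev->pci_state_saved)
		return -1;	/* nothing was saved, refuse to resume */
	dev->pci_state_saved = false;
	return dev->was_running ? dev->open(dev) : 0;
}
```

A driver that needs more than this (reloading firmware, say) would still
have to supply its own hooks.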

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [2/4] [IPV4]: Convert IPv4 devconf to an array

2007-06-05 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Sat, 2 Jun 2007 20:02:52 +1000

> @@ -64,20 +64,26 @@
>  #include <net/rtnetlink.h>
>
>  struct ipv4_devconf ipv4_devconf = {
> -	.accept_redirects = 1,
> -	.send_redirects =  1,
> -	.secure_redirects = 1,
> -	.shared_media =	  1,
> +	.data = {
> +		[NET_IPV4_CONF_ACCEPT_REDIRECTS] = 1,
> +		[NET_IPV4_CONF_SEND_REDIRECTS] = 1,
> +		[NET_IPV4_CONF_SECURE_REDIRECTS] = 1,
> +		[NET_IPV4_CONF_SHARED_MEDIA] = 1,
> +	},
>  };

These array indexes are off by one.

This is the danger in using this x-1 indexing style.
Such a mistake is way too easy to make.
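
To make the hazard concrete, here is a small userspace sketch. The CONF_*
ids and struct devconf are hypothetical, chosen only to mirror the 1-based
numbering and the "id - 1" storage discussed above. An initializer that
forgets the "- 1" compiles cleanly but lands in the neighbouring slot:

```c
#include <assert.h>

/* Hypothetical 1-based attribute ids, in the style of the sysctl numbers. */
enum {
	CONF_FORWARDING = 1,
	CONF_MC_FORWARDING,
	CONF_ACCEPT_REDIRECTS,
	CONF_SEND_REDIRECTS,
	__CONF_MAX
};

struct devconf {
	int data[__CONF_MAX - 1];	/* slot 0 holds attribute id 1 */
};

/* Accessor using the x-1 convention: id N lives in data[N - 1]. */
static int devconf_get(const struct devconf *cnf, int id)
{
	assert(id >= 1 && id < __CONF_MAX);
	return cnf->data[id - 1];
}

/* Buggy: indexed by the raw id, so the value lands one slot too high. */
static const struct devconf buggy = {
	.data = {
		[CONF_ACCEPT_REDIRECTS] = 1,
	},
};

/* Fixed: the initializer applies the same "- 1" as the accessor. */
static const struct devconf fixed = {
	.data = {
		[CONF_ACCEPT_REDIRECTS - 1] = 1,
	},
};
```

With the buggy table, reading ACCEPT_REDIRECTS yields 0 and SEND_REDIRECTS
yields 1, and nothing warns about it at build time.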


[PATCH 1/4] [IPV4]: Only panic if inetdev_init fails for loopback

2007-06-05 Thread Herbert Xu
[IPV4]: Only panic if inetdev_init fails for loopback

When I made the inetdev_init call work on all devices I incorrectly
left in the panic call as well.  It is obviously undesirable to
panic on an allocation failure for a normal network device.  This
patch moves the panic call under the loopback if clause.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/ipv4/devinet.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1057,9 +1057,10 @@ static int inetdev_event(struct notifier
 		if (!in_dev) {
 			if (event == NETDEV_REGISTER) {
 				in_dev = inetdev_init(dev);
-				if (!in_dev)
-					panic("devinet: Failed to create loopback\n");
 				if (dev == loopback_dev) {
+					if (!in_dev)
+						panic("devinet: "
+						      "Failed to create loopback\n");
 					in_dev->cnf.no_xfrm = 1;
 					in_dev->cnf.no_policy = 1;
 				}


[PATCH 2/4] [IPV4]: Convert IPv4 devconf to an array

2007-06-05 Thread Herbert Xu
[IPV4]: Convert IPv4 devconf to an array

This patch converts the ipv4_devconf config members (everything except
sysctl) to an array.  This allows easier manipulation which will be
needed later on to provide better management of default config values.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 drivers/infiniband/hw/amso1100/c2.c |2 
 include/linux/inetdevice.h  |   94 +++-
 net/ipv4/arp.c  |   11 -
 net/ipv4/devinet.c  |  264 ++--
 net/ipv4/igmp.c |   18 +-
 net/ipv4/ipmr.c |   12 -
 net/ipv4/proc.c |2 
 net/ipv4/route.c|   14 -
 net/ipv4/sysctl_net_ipv4.c  |6 
 9 files changed, 163 insertions(+), 260 deletions(-)

diff --git a/drivers/infiniband/hw/amso1100/c2.c b/drivers/infiniband/hw/amso1100/c2.c
--- a/drivers/infiniband/hw/amso1100/c2.c
+++ b/drivers/infiniband/hw/amso1100/c2.c
@@ -672,7 +672,7 @@ static int c2_up(struct net_device *netd
 	 * rdma interface.
 	 */
 	in_dev = in_dev_get(netdev);
-	in_dev->cnf.arp_ignore = 1;
+	IN_DEV_CONF_SET(in_dev, ARP_IGNORE, 1);
 	in_dev_put(in_dev);
 
 	return 0;
diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -10,28 +10,8 @@
 
 struct ipv4_devconf
 {
-   int accept_redirects;
-   int send_redirects;
-   int secure_redirects;
-   int shared_media;
-   int accept_source_route;
-   int rp_filter;
-   int proxy_arp;
-   int bootp_relay;
-   int log_martians;
-   int forwarding;
-   int mc_forwarding;
-   int tag;
-   int arp_filter;
-   int arp_announce;
-   int arp_ignore;
-   int arp_accept;
-   int medium_id;
-   int no_xfrm;
-   int no_policy;
-   int force_igmp_version;
-   int promote_secondaries;
void*sysctl;
+   int data[__NET_IPV4_CONF_MAX - 1];
 };
 
 extern struct ipv4_devconf ipv4_devconf;
@@ -60,30 +40,64 @@ struct in_device
struct rcu_head rcu_head;
 };
 
-#define IN_DEV_FORWARD(in_dev)		((in_dev)->cnf.forwarding)
-#define IN_DEV_MFORWARD(in_dev)		(ipv4_devconf.mc_forwarding && (in_dev)->cnf.mc_forwarding)
-#define IN_DEV_RPFILTER(in_dev)		(ipv4_devconf.rp_filter && (in_dev)->cnf.rp_filter)
-#define IN_DEV_SOURCE_ROUTE(in_dev)	(ipv4_devconf.accept_source_route && (in_dev)->cnf.accept_source_route)
-#define IN_DEV_BOOTP_RELAY(in_dev)	(ipv4_devconf.bootp_relay && (in_dev)->cnf.bootp_relay)
-
-#define IN_DEV_LOG_MARTIANS(in_dev)	(ipv4_devconf.log_martians || (in_dev)->cnf.log_martians)
-#define IN_DEV_PROXY_ARP(in_dev)	(ipv4_devconf.proxy_arp || (in_dev)->cnf.proxy_arp)
-#define IN_DEV_SHARED_MEDIA(in_dev)	(ipv4_devconf.shared_media || (in_dev)->cnf.shared_media)
-#define IN_DEV_TX_REDIRECTS(in_dev)	(ipv4_devconf.send_redirects || (in_dev)->cnf.send_redirects)
-#define IN_DEV_SEC_REDIRECTS(in_dev)	(ipv4_devconf.secure_redirects || (in_dev)->cnf.secure_redirects)
-#define IN_DEV_IDTAG(in_dev)		((in_dev)->cnf.tag)
-#define IN_DEV_MEDIUM_ID(in_dev)	((in_dev)->cnf.medium_id)
-#define IN_DEV_PROMOTE_SECONDARIES(in_dev)	(ipv4_devconf.promote_secondaries || (in_dev)->cnf.promote_secondaries)
+#define IPV4_DEVCONF(cnf, attr) ((cnf).data[NET_IPV4_CONF_ ## attr - 1])
+#define IPV4_DEVCONF_ALL(attr) IPV4_DEVCONF(ipv4_devconf, attr)
+
+static inline int ipv4_devconf_get(struct in_device *in_dev, int index)
+{
+	index--;
+	return in_dev->cnf.data[index];
+}
+
+static inline void ipv4_devconf_set(struct in_device *in_dev, int index,
+				    int val)
+{
+	index--;
+	in_dev->cnf.data[index] = val;
+}
+
+#define IN_DEV_CONF_GET(in_dev, attr) \
+	ipv4_devconf_get((in_dev), NET_IPV4_CONF_ ## attr)
+#define IN_DEV_CONF_SET(in_dev, attr, val) \
+	ipv4_devconf_set((in_dev), NET_IPV4_CONF_ ## attr, (val))
+
+#define IN_DEV_ANDCONF(in_dev, attr) \
+	(IPV4_DEVCONF_ALL(attr) && IN_DEV_CONF_GET((in_dev), attr))
+#define IN_DEV_ORCONF(in_dev, attr) \
+	(IPV4_DEVCONF_ALL(attr) || IN_DEV_CONF_GET((in_dev), attr))
+#define IN_DEV_MAXCONF(in_dev, attr) \
+	(max(IPV4_DEVCONF_ALL(attr), IN_DEV_CONF_GET((in_dev), attr)))
+
+#define IN_DEV_FORWARD(in_dev)		IN_DEV_CONF_GET((in_dev), FORWARDING)
+#define IN_DEV_MFORWARD(in_dev)		(IPV4_DEVCONF_ALL(MC_FORWARDING) && \
+					 IPV4_DEVCONF((in_dev)->cnf, \
+						      MC_FORWARDING))
+#define IN_DEV_RPFILTER(in_dev)		IN_DEV_ANDCONF((in_dev), RP_FILTER)
+#define IN_DEV_SOURCE_ROUTE(in_dev)	IN_DEV_ANDCONF((in_dev), \
+  

[PATCH 4/4] [IPV4]: Restore old behaviour of default config values

2007-06-05 Thread Herbert Xu
[IPV4]: Restore old behaviour of default config values

Previously inet devices were only constructed when addresses are added
(or rarely in ipmr).  Therefore the default config values they get are
the ones at the time of these operations.

Now that we're creating inet devices earlier, this changes the behaviour
of default config values in an incompatible way (see bug #8519).

This patch creates a compromise by setting the default values at the
same point as before but only for those that have not been explicitly
set by the user since the inet device's creation.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/linux/inetdevice.h |6 +-
 net/ipv4/devinet.c |   19 ---
 net/ipv4/ipmr.c|   15 +++
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -59,6 +59,11 @@ static inline void ipv4_devconf_set(stru
 	in_dev->cnf.data[index] = val;
 }
 
+static inline void ipv4_devconf_setall(struct in_device *in_dev)
+{
+	bitmap_fill(in_dev->cnf.state, __NET_IPV4_CONF_MAX - 1);
+}
+
 #define IN_DEV_CONF_GET(in_dev, attr) \
ipv4_devconf_get((in_dev), NET_IPV4_CONF_ ## attr)
 #define IN_DEV_CONF_SET(in_dev, attr, val) \
@@ -125,7 +130,6 @@ extern struct net_device *ip_dev_find(_
 extern int		inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b);
 extern int		devinet_ioctl(unsigned int cmd, void __user *);
 extern void		devinet_init(void);
-extern struct in_device	*inetdev_init(struct net_device *dev);
 extern struct in_device	*inetdev_by_index(int);
 extern __be32		inet_select_addr(const struct net_device *dev, __be32 dst, int scope);
 extern __be32		inet_confirm_addr(const struct net_device *dev, __be32 dst, __be32 local, int scope);
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -147,7 +147,7 @@ void in_dev_finish_destroy(struct in_dev
}
 }
 
-struct in_device *inetdev_init(struct net_device *dev)
+static struct in_device *inetdev_init(struct net_device *dev)
 {
struct in_device *in_dev;
 
@@ -405,12 +405,10 @@ static int inet_set_ifa(struct net_devic
 	ASSERT_RTNL();
 
 	if (!in_dev) {
-		in_dev = inetdev_init(dev);
-		if (!in_dev) {
-			inet_free_ifa(ifa);
-			return -ENOBUFS;
-		}
+		inet_free_ifa(ifa);
+		return -ENOBUFS;
 	}
+	ipv4_devconf_setall(in_dev);
 	if (ifa->ifa_dev != in_dev) {
 		BUG_TRAP(!ifa->ifa_dev);
 		in_dev_hold(in_dev);
@@ -520,13 +518,12 @@ static struct in_ifaddr *rtm_to_ifaddr(s
 
in_dev = __in_dev_get_rtnl(dev);
if (in_dev == NULL) {
-   in_dev = inetdev_init(dev);
-   if (in_dev == NULL) {
-   err = -ENOBUFS;
-   goto errout;
-   }
+   err = -ENOBUFS;
+   goto errout;
}
 
+   ipv4_devconf_setall(in_dev);
+
ifa = inet_alloc_ifa();
if (ifa == NULL) {
/*
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -152,9 +152,11 @@ struct net_device *ipmr_new_tunnel(struc
 		dev->flags |= IFF_MULTICAST;
 
 		in_dev = __in_dev_get_rtnl(dev);
-		if (in_dev == NULL && (in_dev = inetdev_init(dev)) == NULL)
+		if (in_dev == NULL)
 			goto failure;
-		IN_DEV_CONF_SET(in_dev, RP_FILTER, 0);
+
+		ipv4_devconf_setall(in_dev);
+		IPV4_DEVCONF(in_dev->cnf, RP_FILTER) = 0;
 
 		if (dev_open(dev))
 			goto failure;
@@ -218,10 +220,15 @@ static struct net_device *ipmr_reg_vif(v
 	}
 	dev->iflink = 0;
 
-	if ((in_dev = inetdev_init(dev)) == NULL)
+	rcu_read_lock();
+	if ((in_dev = __in_dev_get_rcu(dev)) == NULL) {
+		rcu_read_unlock();
 		goto failure;
+	}
 
-	IN_DEV_CONF_SET(in_dev, RP_FILTER, 0);
+	ipv4_devconf_setall(in_dev);
+	IPV4_DEVCONF(in_dev->cnf, RP_FILTER) = 0;
+	rcu_read_unlock();
 
 	if (dev_open(dev))
 		goto failure;


[PATCH 3/4] [IPV4]: Add default config support after inetdev_init

2007-06-05 Thread Herbert Xu
[IPV4]: Add default config support after inetdev_init

Previously once inetdev_init has been called on a device any changes made
to ipv4_devconf_dflt would have no effect on that device's configuration.

This creates a problem since we have moved the point where inetdev_init
is called from when an address is added to where the device is registered.

This patch is the first half of a set that tries to mimic the old behaviour
while still calling inetdev_init.

It propagates any changes to ipv4_devconf_dflt to those devices that have
not had the corresponding attribute set.

The next patch will forcibly set all values at the point where inetdev_init
was previously called.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/linux/inetdevice.h |3 +
 net/ipv4/devinet.c |  133 +
 2 files changed, 101 insertions(+), 35 deletions(-)

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -3,6 +3,7 @@
 
 #ifdef __KERNEL__
 
+#include <linux/bitmap.h>
 #include <linux/if.h>
 #include <linux/netdevice.h>
 #include <linux/rcupdate.h>
@@ -12,6 +13,7 @@ struct ipv4_devconf
 {
 	void	*sysctl;
 	int	data[__NET_IPV4_CONF_MAX - 1];
+	DECLARE_BITMAP(state, __NET_IPV4_CONF_MAX - 1);
 };
 
 extern struct ipv4_devconf ipv4_devconf;
@@ -53,6 +55,7 @@ static inline void ipv4_devconf_set(stru
 				    int val)
 {
 	index--;
+	set_bit(index, in_dev->cnf.state);
 	in_dev->cnf.data[index] = val;
 }
 
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1244,6 +1244,91 @@ errout:
 
 #ifdef CONFIG_SYSCTL
 
+static void devinet_copy_dflt_conf(int i)
+{
+	struct net_device *dev;
+
+	read_lock(&dev_base_lock);
+	for_each_netdev(dev) {
+		struct in_device *in_dev;
+		rcu_read_lock();
+		in_dev = __in_dev_get_rcu(dev);
+		if (in_dev && !test_bit(i, in_dev->cnf.state))
+			in_dev->cnf.data[i] = ipv4_devconf_dflt.data[i];
+		rcu_read_unlock();
+	}
+	read_unlock(&dev_base_lock);
+}
+
+static int devinet_conf_proc(ctl_table *ctl, int write,
+			     struct file* filp, void __user *buffer,
+			     size_t *lenp, loff_t *ppos)
+{
+	int ret = proc_dointvec(ctl, write, filp, buffer, lenp, ppos);
+
+	if (write) {
+		struct ipv4_devconf *cnf = ctl->extra1;
+		int i = (int *)ctl->data - cnf->data;
+
+		set_bit(i, cnf->state);
+
+		if (cnf == &ipv4_devconf_dflt)
+			devinet_copy_dflt_conf(i);
+	}
+
+	return ret;
+}
+
+static int devinet_conf_sysctl(ctl_table *table, int __user *name, int nlen,
+			       void __user *oldval, size_t __user *oldlenp,
+			       void __user *newval, size_t newlen)
+{
+	struct ipv4_devconf *cnf;
+	int *valp = table->data;
+	int new;
+	int i;
+
+	if (!newval || !newlen)
+		return 0;
+
+	if (newlen != sizeof(int))
+		return -EINVAL;
+
+	if (get_user(new, (int __user *)newval))
+		return -EFAULT;
+
+	if (new == *valp)
+		return 0;
+
+	if (oldval && oldlenp) {
+		size_t len;
+
+		if (get_user(len, oldlenp))
+			return -EFAULT;
+
+		if (len) {
+			if (len > table->maxlen)
+				len = table->maxlen;
+			if (copy_to_user(oldval, valp, len))
+				return -EFAULT;
+			if (put_user(len, oldlenp))
+				return -EFAULT;
+		}
+	}
+
+	*valp = new;
+
+	cnf = table->extra1;
+	i = (int *)table->data - cnf->data;
+
+	set_bit(i, cnf->state);
+
+	if (cnf == &ipv4_devconf_dflt)
+		devinet_copy_dflt_conf(i);
+
+	return 1;
+}
+
 void inet_forward_change(void)
 {
struct net_device *dev;
@@ -1302,40 +1387,13 @@ int ipv4_doint_and_flush_strategy(ctl_ta
 				  void __user *oldval, size_t __user *oldlenp,
 				  void __user *newval, size_t newlen)
 {
-	int *valp = table->data;
-	int new;
-
-	if (!newval || !newlen)
-		return 0;
-
-	if (newlen != sizeof(int))
-		return -EINVAL;
+	int ret = devinet_conf_sysctl(table, name, nlen, oldval, oldlenp,
+				      newval, newlen);
 
-	if (get_user(new, (int __user *)newval))
-		return -EFAULT;
-
-	if (new == *valp)
-		return 0;
-
-	if (oldval && oldlenp) {
-		size_t len;
-
-		if (get_user(len, oldlenp))
-

Re: [PATCH 1/4] [IPV4]: Only panic if inetdev_init fails for loopback

2007-06-05 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 05 Jun 2007 16:31:01 +1000

> [IPV4]: Only panic if inetdev_init fails for loopback
>
> When I made the inetdev_init call work on all devices I incorrectly
> left in the panic call as well.  It is obviously undesirable to
> panic on an allocation failure for a normal network device.  This
> patch moves the panic call under the loopback if clause.
>
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Applied.


Re: [PATCH 2/4] [IPV4]: Convert IPv4 devconf to an array

2007-06-05 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 05 Jun 2007 16:31:02 +1000

> [IPV4]: Convert IPv4 devconf to an array
>
> This patch converts the ipv4_devconf config members (everything except
> sysctl) to an array.  This allows easier manipulation which will be
> needed later on to provide better management of default config values.
>
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Applied, thanks.
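
For readers following along, a minimal userspace sketch of what the array
form makes possible: one get/set pair plus generic AND/OR/MAX combinators
over the global and per-device values, instead of one hand-written macro
per field. The CONF_* ids, struct devconf and the helper names below are
hypothetical and only model the patch's macros; they are not the kernel
definitions.

```c
#include <assert.h>

/* Hypothetical 1-based attribute ids, stored at data[id - 1]. */
enum {
	CONF_FORWARDING = 1,
	CONF_RP_FILTER,
	CONF_LOG_MARTIANS,
	CONF_ARP_ANNOUNCE,
	__CONF_MAX
};

struct devconf {
	int data[__CONF_MAX - 1];
};

static struct devconf conf_all;	/* models the global "all" settings */

static int devconf_get(const struct devconf *cnf, int id)
{
	assert(id >= 1 && id < __CONF_MAX);
	return cnf->data[id - 1];
}

static void devconf_set(struct devconf *cnf, int id, int val)
{
	assert(id >= 1 && id < __CONF_MAX);
	cnf->data[id - 1] = val;
}

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* Token pasting turns a short attribute name into its id. */
#define CONF_GET(cnf, attr)	 devconf_get((cnf), CONF_ ## attr)
#define CONF_SET(cnf, attr, val) devconf_set((cnf), CONF_ ## attr, (val))

/* Combine the global value with the per-device one. */
#define ANDCONF(cnf, attr) (CONF_GET(&conf_all, attr) && CONF_GET((cnf), attr))
#define ORCONF(cnf, attr)  (CONF_GET(&conf_all, attr) || CONF_GET((cnf), attr))
#define MAXCONF(cnf, attr) max_int(CONF_GET(&conf_all, attr), CONF_GET((cnf), attr))
```

Adding a new attribute then only needs a new enum entry and a one-line
macro picking the right combinator.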


Re: [PATCH 3/4] [IPV4]: Add default config support after inetdev_init

2007-06-05 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 05 Jun 2007 16:31:03 +1000

> [IPV4]: Add default config support after inetdev_init
>
> Previously once inetdev_init has been called on a device any changes made
> to ipv4_devconf_dflt would have no effect on that device's configuration.
>
> This creates a problem since we have moved the point where inetdev_init
> is called from when an address is added to where the device is registered.
>
> This patch is the first half of a set that tries to mimic the old behaviour
> while still calling inetdev_init.
>
> It propagates any changes to ipv4_devconf_dflt to those devices that have
> not had the corresponding attribute set.
>
> The next patch will forcibly set all values at the point where inetdev_init
> was previously called.
>
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Looks good, applied.
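
The propagation rule in the quoted description can be modelled in a few
lines of userspace C. This is a sketch under assumptions: a plain unsigned
long stands in for the kernel bitmap, a fixed array stands in for the
device list, and all names (conf_dflt, conf_dev, dev_conf_set,
dflt_conf_set) are hypothetical.

```c
#include <assert.h>

#define NCONF 4		/* number of attributes in this toy model */
#define NDEVS 2		/* number of devices in this toy model */

struct devconf {
	int data[NCONF];
	unsigned long state;	/* bit i set: data[i] was set explicitly */
};

static struct devconf conf_dflt;
static struct devconf conf_dev[NDEVS];

/* Explicit per-device write: record the value and mark it as user-set. */
static void dev_conf_set(struct devconf *cnf, int i, int val)
{
	assert(i >= 0 && i < NCONF);
	cnf->state |= 1UL << i;
	cnf->data[i] = val;
}

/* Write to the defaults: copy into every device that never set attribute i. */
static void dflt_conf_set(int i, int val)
{
	int d;

	assert(i >= 0 && i < NCONF);
	conf_dflt.data[i] = val;
	for (d = 0; d < NDEVS; d++)
		if (!(conf_dev[d].state & (1UL << i)))
			conf_dev[d].data[i] = val;
}
```

A device that had attribute 1 set by hand keeps its value when the default
changes, while an untouched device picks up the new default.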


Re: [PATCH 4/4] [IPV4]: Restore old behaviour of default config values

2007-06-05 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 05 Jun 2007 16:31:04 +1000

> [IPV4]: Restore old behaviour of default config values
>
> Previously inet devices were only constructed when addresses are added
> (or rarely in ipmr).  Therefore the default config values they get are
> the ones at the time of these operations.
>
> Now that we're creating inet devices earlier, this changes the behaviour
> of default config values in an incompatible way (see bug #8519).
>
> This patch creates a compromise by setting the default values at the
> same point as before but only for those that have not been explicitly
> set by the user since the inet device's creation.
>
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Also applied, thanks a lot.
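
A rough userspace model of the compromise, assuming the "set all" step
simply marks every attribute as explicitly set so that later changes to
the defaults stop reaching the device. The struct and the function names
devconf_setall() and devconf_apply_dflt() are hypothetical, and an
unsigned long stands in for the kernel bitmap.

```c
#include <assert.h>

#define NCONF 4		/* number of attributes in this toy model */

struct devconf {
	int data[NCONF];
	unsigned long state;	/* bit i set: data[i] no longer follows defaults */
};

/* Mark every attribute as set, freezing the device's current values. */
static void devconf_setall(struct devconf *cnf)
{
	cnf->state = (1UL << NCONF) - 1;
}

/* Apply a new default to one device unless that attribute is frozen. */
static void devconf_apply_dflt(struct devconf *cnf, int i, int val)
{
	assert(i >= 0 && i < NCONF);
	if (!(cnf->state & (1UL << i)))
		cnf->data[i] = val;
}
```

Until the first address is added the device tracks the defaults; from the
setall call onwards it keeps whatever values it had at that moment, which
is the point in time the old code sampled them.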


Re: [PATCH] TCP : use LIMIT_NETDEBUG in tcp_retransmit_timer()

2007-06-05 Thread David Miller
From: Eric Dumazet [EMAIL PROTECTED]
Date: Mon, 04 Jun 2007 09:13:40 +0200

> LIMIT_NETDEBUG allows the admin to disable some warning messages
> (echo 0 > /proc/sys/net/core/warnings).
>
> The "TCP: Treason uncloaked!" message can use this facility.
>
> Signed-off-by: Eric Dumazet [EMAIL PROTECTED]

That's pretty difficult to trigger remotely, and the last time
we saw a legitimate case of this it was actually a bug in the
Linux TCP stack which we've long since fixed.

I would rather not let people easily disable this, since it
finds bugs and I can't see someone seriously exploiting this
to spam your logs.

I suppose you disagree :)


Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED]
Date: Mon, 04 Jun 2007 11:45:32 +0200

> > A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> > send()+close() on the peer, causing recv() to return zero, even though
> > the sent data should be received.
> >
> > This happens if the send() and the close() is performed between
> > skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
> >
> > process A  skb_dequeue() returns NULL, there's no data in the socket queue
> > process B  new data is inserted onto the queue by unix_stream_sendmsg()
> > process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> > process A  sk->sk_shutdown is checked, unix_release_sock() returns zero
>
> This is only part of the story.  It turns out, there are other races
> involving the garbage collector, that can throw away perfectly good
> packets with AF_UNIX sockets in them.
>
> The problems arise when a socket goes from installed to in-flight or
> vice versa during garbage collection.  Since gc is done with a
> spinlock held, this only shows up on SMP.
>
> The following patch fixes it for me, but it's possibly the wrong
> approach.
>
> Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]

I haven't seen a repost of the first patch, which is necessary because
that first patch doesn't apply to the current tree.  Please don't
ignore Arnaldo's feedback like that, or else I'll ignore you just the
same. :-)



Re: [PATCH] rfkill: Make rfkill-name const

2007-06-05 Thread David Miller
From: Ivo van Doorn [EMAIL PROTECTED]
Date: Fri, 1 Jun 2007 20:20:43 +0200

> The rfkill name can be made const safely,
> this makes the compiler happy when drivers make
> it point to some const string used elsewhere.
>
> Signed-off-by: Ivo van Doorn [EMAIL PROTECTED]

Patch applied, thanks.


Re: [PATCH] since the definition of dst_discard_in and dst_discard_out are the same,

2007-06-05 Thread David Miller
From: Denis Cheng [EMAIL PROTECTED]
Date: Fri, 01 Jun 2007 19:08:51 -0700 (PDT)

> they should be merged into one
>
> Signed-off-by: Denis Cheng [EMAIL PROTECTED]

Patch applied.


Re: [PATCH] TCP : use LIMIT_NETDEBUG in tcp_retransmit_timer()

2007-06-05 Thread Eric Dumazet

David Miller wrote:

> From: Eric Dumazet [EMAIL PROTECTED]
> Date: Mon, 04 Jun 2007 09:13:40 +0200
>
> > LIMIT_NETDEBUG allows the admin to disable some warning messages
> > (echo 0 > /proc/sys/net/core/warnings).
> >
> > The "TCP: Treason uncloaked!" message can use this facility.
> >
> > Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
>
> That's pretty difficult to trigger remotely, and the last time
> we saw a legitimate case of this it was actually a bug in the
> Linux TCP stack which we've long since fixed.
>
> I would rather not let people easily disable this, since it
> finds bugs and I can't see someone seriously exploiting this
> to spam your logs.
>
> I suppose you disagree :)

Well, I don't know...

Should I try to contact 222.39.111.164 and kindly ask him not to trigger this
annoying message?

Is this remote peer running an old Linux, or is it a bug in the current
linux-2.6.22-rc4 kernel that my server runs?


TCP: Treason uncloaked! Peer 222.39.111.164:3336/443 shrinks window 
768445619:768445688. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:1557/443 shrinks window 
979986676:979986745. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2819/443 shrinks window 
903113077:903113146. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2303/443 shrinks window 
919019510:919019579. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2689/443 shrinks window 
986306349:986306418. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2811/443 shrinks window 
1125084057:1125084126. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2394/443 shrinks window 
1141647532:1141647601. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2928/443 shrinks window 
1278301359:1278301428. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3034/443 shrinks window 
1408574195:1408574264. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2942/443 shrinks window 
1453238832:1453238901. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3384/443 shrinks window 
1883648213:1883648282. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3069/443 shrinks window 
1913770905:1913770974. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2852/443 shrinks window 
1956256538:1956256607. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3485/443 shrinks window 
2034019830:2034019899. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3104/443 shrinks window 
2054229702:2054229771. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2897/443 shrinks window 
2070392361:2070392430. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3485/443 shrinks window 
2034019830:2034019899. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3626/443 shrinks window 
2167403488:2167403557. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3104/443 shrinks window 
2054229702:2054229771. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2897/443 shrinks window 
2070392361:2070392430. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3626/443 shrinks window 
2167403488:2167403557. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2991/443 shrinks window 
2283637853:2283637922. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3128/443 shrinks window 
2283549171:2283549240. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2991/443 shrinks window 
2283637853:2283637922. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3128/443 shrinks window 
2283549171:2283549240. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:2991/443 shrinks window 
2283637853:2283637922. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3128/443 shrinks window 
2283549171:2283549240. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:4075/443 shrinks window 
2788112983:2788113052. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:4075/443 shrinks window 
2788112983:2788113052. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3786/443 shrinks window 
2905215394:2905215463. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:4075/443 shrinks window 
2788112983:2788113052. Repaired.
TCP: Treason uncloaked! Peer 222.39.111.164:3786/443 shrinks window 
2905215394:2905215463. Repaired.





Re: [PATCH] TCP : use LIMIT_NETDEBUG in tcp_retransmit_timer()

2007-06-05 Thread David Miller
From: Eric Dumazet [EMAIL PROTECTED]
Date: Tue, 05 Jun 2007 09:07:26 +0200

> Should I try to contact 222.39.111.164 and kindly ask him not to trigger this
> annoying message ?
>
> Is this remote peer running an old linux, or is it a bug in the current
> linux-2.6.22-rc4 kernel that my server runs ?

Grrr, yeah it's their end.  And because it's in China with no reverse
DNS your chances of getting them to fix this are between slim and
none.

I'll add your patch, thanks a lot.
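
For context, a userspace sketch of the idea behind an admin-controlled
debug message. It models only the on/off switch described in the patch
text; any rate limiting the kernel may also apply is left out. The
function limit_netdebug() and the messages_emitted counter are
hypothetical, and net_msg_warn here is just a local variable standing in
for the /proc/sys/net/core/warnings setting.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Models the switch behind /proc/sys/net/core/warnings (1 = print). */
static int net_msg_warn = 1;
static int messages_emitted;	/* counter added only for illustration */

/* Print a debug message only while the admin has warnings enabled. */
static int limit_netdebug(const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);
	if (!net_msg_warn)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	messages_emitted++;
	return n;
}
```

With the switch at 0 the call returns without formatting anything, so a
misbehaving peer can no longer fill the log.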


Re: r8169: hard freezes on TX

2007-06-05 Thread Rolf Eike Beer
You wrote:
> Rolf Eike Beer [EMAIL PROTECTED] :
> [...]
>
> > I just had another freeze using your patches. After 512kB over smb it was
> > dead.
>
> In-kernel smb/cifs ?

Copying to a partition mounted via smb:// protocol in konqueror which uses 
kio_smb (userspace io slave).

Eike




Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread Miklos Szeredi
> > > A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> > > send()+close() on the peer, causing recv() to return zero, even though
> > > the sent data should be received.
> > >
> > > This happens if the send() and the close() is performed between
> > > skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
> > >
> > > process A  skb_dequeue() returns NULL, there's no data in the socket queue
> > > process B  new data is inserted onto the queue by unix_stream_sendmsg()
> > > process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> > > process A  sk->sk_shutdown is checked, unix_release_sock() returns zero
> >
> > This is only part of the story.  It turns out, there are other races
> > involving the garbage collector, that can throw away perfectly good
> > packets with AF_UNIX sockets in them.
> >
> > The problems arise when a socket goes from installed to in-flight or
> > vice versa during garbage collection.  Since gc is done with a
> > spinlock held, this only shows up on SMP.
> >
> > The following patch fixes it for me, but it's possibly the wrong
> > approach.
> >
> > Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
>
> I haven't seen a repost of the first patch, which is necessary because
> that first patch doesn't apply to the current tree.  Please don't
> ignore Arnaldo's feedback like that, or else I'll ignore you just the
> same. :-)

I just want to win the "who's laziest?" league.  It would take me
about 5 minutes to get the netdev tree and test compile the change.
Of which 5 seconds would be actually updating the patch.  I
thought it was OK to pass that 5 seconds worth of hard work to you in
order to save the rest ;)

Anyway here's the updated (but not compile tested) patch.

Thanks,
Miklos

From: Miklos Szeredi [EMAIL PROTECTED]

A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
send()+close() on the peer, causing recv() to return zero, even though
the sent data should be received.

This happens if the send() and the close() are performed between
skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():

process A  skb_dequeue() returns NULL, there's no data in the socket queue
process B  new data is inserted onto the queue by unix_stream_sendmsg()
process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero

I'm surprised nobody noticed this, it's not hard to trigger.  Maybe
it's just (un)luck with the timing.

It's possible to work around this bug in userspace, by retrying the
recv() once in case of a zero return value.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux-2.6.22-rc2/net/unix/af_unix.c
===
--- linux-2.6.22-rc2.orig/net/unix/af_unix.c	2007-06-02 23:45:47.0 +0200
+++ linux-2.6.22-rc2/net/unix/af_unix.c	2007-06-02 23:45:49.0 +0200
@@ -1711,20 +1711,23 @@ static int unix_stream_recvmsg(struct ki
 		int chunk;
 		struct sk_buff *skb;
 
+		unix_state_lock(sk);
 		skb = skb_dequeue(&sk->sk_receive_queue);
 		if (skb==NULL)
 		{
 			if (copied >= target)
-				break;
+				goto unlock;
 
 			/*
 			 *	POSIX 1003.1g mandates this order.
 			 */
 
 			if ((err = sock_error(sk)) != 0)
-				break;
+				goto unlock;
 			if (sk->sk_shutdown & RCV_SHUTDOWN)
-				break;
+				goto unlock;
+
+			unix_state_unlock(sk);
 			err = -EAGAIN;
 			if (!timeo)
 				break;
@@ -1738,7 +1741,11 @@ static int unix_stream_recvmsg(struct ki
 			}
 			mutex_lock(&u->readlock);
 			continue;
+ unlock:
+			unix_state_unlock(sk);
+			break;
 		}
+		unix_state_unlock(sk);
 
 		if (check_creds) {
/* Never glue messages from different writers */
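
For kernels without the fix, the userspace workaround mentioned in the
description (retry the recv() once when it returns zero) could look like
the sketch below. The helper name recv_retry_once() is made up for this
example. The retry helps because data queued before the peer's close is
still on the receive queue, so the second call can pick it up.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * recv() wrapper that retries once on a zero return, to paper over the
 * race between skb_dequeue() and the shutdown check on affected kernels.
 */
static ssize_t recv_retry_once(int fd, void *buf, size_t len, int flags)
{
	ssize_t n;

	assert(buf != NULL);
	n = recv(fd, buf, len, flags);
	if (n == 0)
		n = recv(fd, buf, len, flags);	/* second look at the queue */
	return n;
}
```

A genuine end-of-stream still shows up as zero, since the second recv()
returns zero as well.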



Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED]
Date: Tue, 05 Jun 2007 09:42:41 +0200

> I just want to win the "who's laziest?" league.  It would take me
> about 5 minutes to get the netdev tree and test compile the change.
> Of which 5 seconds would be actually updating the patch.  I
> thought it was OK to pass that 5 seconds worth of hard work to you in
> order to save the rest ;)

That tradeoff is fine, if, in return you'll do the rest of the
networking subsystem maintainership work I need to do. :-)



2.6.9 packets not received on link-local address

2007-06-05 Thread Edward Wang
A 2.6.9 machine sends some packets to its own link-local address, but
it never receives them, for both TCP and UDP. It does work in the
following cases:

  - sending to a global address instead of a link-local one;
  - or, machine A sends the packets and machine B receives them;
  - or, on 2.6.15, which *does* receive TCP packets it sends to its own
    link-local address, although UDP still fails.

This seems to be a known issue, since 2.6.15 shows some improvement, but I
can't find anything in the mailing list archives. Any pointers?

Also, there's a dummy route entry in the 2.6.* route table which uses
loopback_dev as the reference device for the link-local address. That seems
really odd. Is it related to the problem above? And why does the Linux IPv6
stack do that?


Thanks in advance.

-Ed


[PATCH][RFC] network splice receive

2007-06-05 Thread Jens Axboe
Hi,

Here's an implementation of tcp network splice receive support. It's
originally based on the patch set that Intel posted some time ago, but
has been (close to) 100% reworked.

Now, I'm not a networking guru by any stretch of the imagination, so I'd
like some input on the direction of the main patch. Is the approach
feasible? Glaring errors? Missing bits?

If you want to test it, I'd suggest downloading the latest splice tools
snapshot here:

http://brick.kernel.dk/snaps/splice-git-latest.tar.gz

Examples:

- Sending a small test message over the network:

  [EMAIL PROTECTED]:~/splice $ ./splice-fromnet  | cat
  [EMAIL PROTECTED]:~ $ echo hello | netcat host1 

  should write hello on host1. Yeah, complex stuff.

- Sending a file over the network:

  [EMAIL PROTECTED]:~/splice $ ./splice-fromnet  | ./splice out outfile
  [EMAIL PROTECTED]:~ $ cat somefile | netcat host1 

  should send somefile over the network, storing it in outfile.

Seems to work reasonably well for me, sometimes I do see small ranges
inside the output file that are not correct, but I haven't been able to
reproduce today. I think it has to do with page reuse, hence the
NET_COPY_SPLICE ifdef that you can enable to just plain copy the data
instead of referencing it.

Patches are against the #splice branch of the block repo, official url
of that is:

git://git.kernel.dk/data/git/linux-2.6-block.git/

and it's based on Linus main tree (at 2.6.22-rc4 right now). Let me know
if I should supply netdev branch patches instead.

-- 
Jens Axboe

From 592c46ea813c31c0d6b28bf543ce2f5dd884a75e Mon Sep 17 00:00:00 2001
From: Jens Axboe [EMAIL PROTECTED]
Date: Mon, 4 Jun 2007 15:06:43 +0200
Subject: [PATCH] [NET] tcp_read_sock: allow recv_actor() to return a negative error value

Signed-off-by: Jens Axboe [EMAIL PROTECTED]
---
 net/ipv4/tcp.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index cd3c7e9..450f44b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1064,7 +1064,11 @@ int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
 					break;
 			}
 			used = recv_actor(desc, skb, offset, len);
-			if (used <= len) {
+			if (used < 0) {
+				if (!copied)
+					copied = used;
+				break;
+			} else if (used <= len) {
 				seq += used;
 				copied += used;
 				offset += used;
@@ -1086,7 +1090,7 @@ int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
 	tcp_rcv_space_adjust(sk);
 
 	/* Clean up data we have read: This will do ACK frames. */
-	if (copied)
+	if (copied > 0)
 		tcp_cleanup_rbuf(sk, copied);
 	return copied;
 }
-- 
1.5.2.rc1

From 10d906a9a5a16a022d5067bee3963a0e3a03ae0c Mon Sep 17 00:00:00 2001
From: Jens Axboe [EMAIL PROTECTED]
Date: Tue, 5 Jun 2007 09:54:00 +0200
Subject: [PATCH] [NET] TCP splice receive support

Loosely based on original patches from Intel, modified to actually
be zero-copy (the original patches memcpy'ed the data).

Signed-off-by: Jens Axboe [EMAIL PROTECTED]
---
 include/linux/net.h|3 +
 include/linux/skbuff.h |5 ++
 include/net/tcp.h  |3 +
 net/core/skbuff.c  |  114 +++
 net/ipv4/af_inet.c |1 +
 net/ipv4/tcp.c |  138 
 net/socket.c   |   13 +
 7 files changed, 277 insertions(+), 0 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index efc4517..472ee12 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -19,6 +19,7 @@
 #define _LINUX_NET_H
 
 #include <linux/wait.h>
+#include <linux/splice.h>
 #include <asm/socket.h>
 
 struct poll_table_struct;
@@ -165,6 +166,8 @@ struct proto_ops {
   struct vm_area_struct * vma);
 	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
   int offset, size_t size, int flags);
+	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
+   struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 };
 
 struct net_proto_family {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index e7367c7..619dcf5 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1504,6 +1504,11 @@ extern int	   skb_store_bits(struct sk_buff *skb, int offset,
 extern __wsum	   skb_copy_and_csum_bits(const struct sk_buff *skb,
 	  int offset, u8 *to, int len,
 	  __wsum csum);
+extern int skb_splice_bits(const struct sk_buff *skb,
+		unsigned int offset,
+		struct pipe_inode_info *pipe,
+		unsigned int len,
+		unsigned int flags);
 extern void	   skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to);
 extern void	   skb_split(struct sk_buff *skb,
  struct sk_buff *skb1, const u32 len);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a8af9ae..8e86697 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -308,6 +308,9 @@ extern int			tcp_twsk_unique(struct sock *sk,
 
 extern void			tcp_twsk_destructor(struct sock 

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED]
Date: Tue, 05 Jun 2007 10:11:56 +0200

> > > I just want to win the "who's laziest?" league.  It would take me
> > > about 5 minutes to get the netdev tree and test compile the change.
> > > Of which 5 seconds would be actually updating the patch.  I
> > > thought it was OK to pass that 5 seconds worth of hard work to you in
> > > order to save the rest ;)
> >
> > That tradeoff is fine, if, in return you'll do the rest of the
> > networking subsystem maintainership work I need to do. :-)
>
> Well, I _did_ save you quite a bit of time by tracking down these
> bugs.  That 5 sec of dedication in exchange would have really made me
> feel good ;(

The only reason I can process as many patches as I can every day is
that I depend upon the end-nodes (that's you) doing most of the
time intensive work so that I can concentrate on reviewing patches
for correctness and proper implementation.

In any event, thanks for respinning the patch, it's late here so
I'll review it tomorrow.



Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread Miklos Szeredi
> > I just want to win the "who's laziest?" league.  It would take me
> > about 5 minutes to get the netdev tree and test compile the change.
> > Of which 5 seconds would be actually updating the patch.  I
> > thought it was OK to pass that 5 seconds worth of hard work to you in
> > order to save the rest ;)
>
> That tradeoff is fine, if, in return you'll do the rest of the
> networking subsystem maintainership work I need to do. :-)

Well, I _did_ save you quite a bit of time by tracking down these
bugs.  That 5 sec of dedication in exchange would have really made me
feel good ;(

Miklos


Re: [PATCH] bugfix GFP_KERNEL -> GFP_ATOMIC in spin_locked region

2007-06-05 Thread Oliver Neukum
On Tuesday, 5 June 2007 06:08, Andrew Morton wrote:
> Everything in USB appears to already be fixed, apart from the io_ti.c bug.

Yes, that's a bug. I've queued a patch.

Regards
Oliver


warnings in git-wireless

2007-06-05 Thread Andrew Morton

i386 allmodconfig isn't that hard, guys.

drivers/net/wireless/mac80211/zd1211rw/zd_mac.c:600: warning: 'fill_rt_header' defined but not used
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c: In function 'iwl_hw_tx_queue_free_tfd':
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:964: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c: In function 'iwl_hw_tx_queue_attach_buffer_to_tfd':
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2041: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2041: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2047: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2050: warning: left shift count >= width of type

With some trepidation I looked in just that header.


> #define iwl_get_bits(src, pos, len) \
> ({ \
> 	u32 __tmp = le32_to_cpu(src); \
> 	__tmp >>= pos; \
> 	__tmp &= (1UL << len) - 1; \
> 	__tmp; \
> })

Can be an inline C function.  Should be commented.

> #define iwl_set_bits(dst, pos, len, val) \
> ({ \
> 	u32 __tmp = le32_to_cpu(*dst); \
>         __tmp &= ~((1UL << (pos+len)) - (1 << pos)); \
> 	__tmp |= (val & ((1UL << len) - 1)) << pos; \
>         *dst = cpu_to_le32(__tmp); \
> })

Ditto.  Whitespace broken.
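
As a rough illustration of the inline-function suggestion, the two macros
could become something like the following. This is a sketch in portable C,
not the driver's code: the names low_mask32(), get_bits32() and
set_bits32() are invented here, plain uint32_t stands in for the kernel's
u32, and the le32_to_cpu()/cpu_to_le32() conversions are left out, so the
caller is assumed to pass host-order values.

```c
#include <stdint.h>

/* Mask with the low 'len' bits set.  len may be 0..32; the full-width
 * case is handled separately so we never shift by the type's width. */
static inline uint32_t low_mask32(unsigned int len)
{
	return len >= 32 ? UINT32_MAX : (((uint32_t)1 << len) - 1);
}

/* Extract the 'len'-bit field that starts at bit 'pos' (pos < 32). */
static inline uint32_t get_bits32(uint32_t src, unsigned int pos,
				  unsigned int len)
{
	return (src >> pos) & low_mask32(len);
}

/* Replace the 'len'-bit field at bit 'pos' of *dst with 'val' (pos < 32).
 * Bits of 'val' that do not fit into the field are discarded. */
static inline void set_bits32(uint32_t *dst, unsigned int pos,
			      unsigned int len, uint32_t val)
{
	uint32_t mask = low_mask32(len) << pos;

	*dst = (*dst & ~mask) | ((val << pos) & mask);
}
```

Building the mask from the field length and then shifting it into place
avoids the `1UL << (pos+len)` expression, which is presumably what triggers
the warnings above when pos + len reaches 32 on i386.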

> #define _IWL_SET_BITS(s, d, o, l, v) \
> 	iwl_set_bits(&s.d, o, l, v)
>
> #define IWL_SET_BITS(s, sym, v) \
> 	_IWL_SET_BITS((s), IWL_ ## sym ## _SYM, IWL_ ## sym ## _POS, IWL_ ## sym ## _LEN, (v))
>
> #define _IWL_GET_BITS(s, v, o, l) \
> 	iwl_get_bits(s.v, o, l)
>
> #define IWL_GET_BITS(s, sym) \
> 	_IWL_GET_BITS((s), IWL_ ## sym ## _SYM, IWL_ ## sym ## _POS, IWL_ ## sym ## _LEN)

Shudder.

> /*
>  * make C=2 CF=-Wall will complain if you use ARRAY_SIZE on global data
>  */
> #define GLOBAL_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

This is identical to ARRAY_SIZE.

And if there's some problem with ARRAY_SIZE then fix ARRAY_SIZE!  Don't go
off and create some private thing and leave everyone else twisting in the
wind.

> /* Debug and printf string expansion helpers for printing bitfields */
> #define BIT_FMT8 "%c%c%c%c-%c%c%c%c"
> #define BIT_FMT16 BIT_FMT8 ":" BIT_FMT8
> #define BIT_FMT32 BIT_FMT16 " " BIT_FMT16
>
> #define BITC(x,y) (((x>>y)&1)?'1':'0')
> #define BIT_ARG8(x) \
> BITC(x,7),BITC(x,6),BITC(x,5),BITC(x,4),\
> BITC(x,3),BITC(x,2),BITC(x,1),BITC(x,0)
>
> #define BIT_ARG16(x) \
> BITC(x,15),BITC(x,14),BITC(x,13),BITC(x,12),\
> BITC(x,11),BITC(x,10),BITC(x,9),BITC(x,8),\
> BIT_ARG8(x)
>
> #define BIT_ARG32(x) \
> BITC(x,31),BITC(x,30),BITC(x,29),BITC(x,28),\
> BITC(x,27),BITC(x,26),BITC(x,25),BITC(x,24),\
> BITC(x,23),BITC(x,22),BITC(x,21),BITC(x,20),\
> BITC(x,19),BITC(x,18),BITC(x,17),BITC(x,16),\
> BIT_ARG16(x)

None of the above is appropriate to a driver-private header.

> #define KELVIN_TO_CELSIUS(x) ((x)-273)

Nor is that.

> #define IEEE80211_CHAN_W_RADAR_DETECT 0x0010
>
> static inline struct ieee80211_conf *ieee80211_get_hw_conf(struct ieee80211_hw
> 							   *hw)
> {
> 	return &hw->conf;
> }
>
> static inline const struct ieee80211_hw_mode *iwl_get_hw_mode(struct iwl_priv
> 							      *priv, int mode)
> {
> 	int i;
>
> 	for (i = 0; i < 3; i++)
> 		if (priv->modes[i].mode == mode)
> 			return &priv->modes[i];
>
> 	return NULL;
> }

Far too large to inline, has five callsites.

> #define WLAN_FC_GET_TYPE(fc)    (((fc) & IEEE80211_FCTL_FTYPE))
> #define WLAN_FC_GET_STYPE(fc)   (((fc) & IEEE80211_FCTL_STYPE))
> #define WLAN_GET_SEQ_FRAG(seq)  ((seq) & 0x000f)
> #define WLAN_GET_SEQ_SEQ(seq)   ((seq) >> 4)

These don't need to be macros

> #define QOS_CONTROL_LEN 2
>
> static inline u16 *ieee80211_get_qos_ctrl(struct ieee80211_hdr *hdr)
> {
> 	int hdr_len = ieee80211_get_hdrlen(hdr->frame_control);
> 	if (hdr->frame_control & IEEE80211_STYPE_QOS_DATA)
> 		return (u16 *) ((u8 *) hdr + (hdr_len) - QOS_CONTROL_LEN);
> 	return NULL;
> }

Two callsites, too large to inline.

> #define IEEE80211_STYPE_BACK_REQ  0x0080
> #define IEEE80211_STYPE_BACK      0x0090
>
> #define ieee80211_is_back_request(fc) \
> 	((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_CTL) && \
> 	(WLAN_FC_GET_STYPE(fc) == IEEE80211_STYPE_BACK_REQ))
>
> #define ieee80211_is_probe_response(fc) \
> 	((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_MGMT) && \
> 	( WLAN_FC_GET_STYPE(fc) == IEEE80211_STYPE_PROBE_RESP ))
>
> #define ieee80211_is_probe_request(fc) \
> 	((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_MGMT) && \
> 	( WLAN_FC_GET_STYPE(fc) ==IEEE80211_STYPE_PROBE_REQ ))
>
> #define ieee80211_is_beacon(fc) \
> 	((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_MGMT) && \
> 	(

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-05 Thread Rafael J. Wysocki
On Tuesday, 5 June 2007 07:56, Stephen Hemminger wrote:
> I hope soon to add suspend/resume to the network device class
> and remove driver specific suspend/resume from lots of devices.
>
> The class suspend routine would just be:
> 	pci_save_state
> 	dev->stop
>
> resume is
> 	pci_restore_state
> 	dev->open
>
> for many devices that is all they need.

Well, that would be nice, but does it mean there's no need for the $subject
patch?

Rafael


-- 
Premature optimization is the root of all evil. - Donald Knuth


[PATCH] phylib: add RGMII-ID mode to the Marvell m88e1111 PHY to fix broken ucc_geth

2007-06-05 Thread Li Yang

From: Kim Phillips [EMAIL PROTECTED]

Support for configuring RGMII-ID (RGMII with internal delay) mode on the
88e1111 and 88e1145.  Ucc_geth on MPC8360EMDS (the main user of ucc_geth)
has been broken since it was changed to use phylib.  Adding this
internal delay fixes it.

Also renamed 88e1111s -> 88e1111 (no references to an 88e1111s part were
found), and fixed some whitespace.

Signed-off-by: Kim Phillips [EMAIL PROTECTED]
Signed-off-by: Li Yang [EMAIL PROTECTED]
---
Please push this to Linus before the 2.6.22 rc phase ends.  The regression
has caused serious breakage to the ucc_geth driver.

drivers/net/phy/marvell.c |   62 +++--
1 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 22aec5c..b87f8d2 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -54,6 +54,12 @@
 #define MII_M1111_PHY_LED_CONTROL	0x18
 #define MII_M1111_PHY_LED_DIRECT	0x4100
 #define MII_M1111_PHY_LED_COMBINE	0x411c
+#define MII_M1111_PHY_EXT_CR		0x14
+#define MII_M1111_RX_DELAY		0x80
+#define MII_M1111_TX_DELAY		0x2
+#define MII_M1111_PHY_EXT_SR		0x1b
+#define MII_M1111_HWCFG_MODE_MASK	0xf
+#define MII_M1111_HWCFG_MODE_RGMII	0xb
 
 MODULE_DESCRIPTION("Marvell PHY driver");
 MODULE_AUTHOR("Andy Fleming");
@@ -131,6 +137,45 @@ static int marvell_config_aneg(struct phy_device *phydev)
 	return err;
 }
 
+static int m88e1111_config_init(struct phy_device *phydev)
+{
+	int err;
+
+	if ((phydev->interface == PHY_INTERFACE_MODE_RGMII) ||
+	    (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)) {
+		int temp;
+
+		if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID) {
+			temp = phy_read(phydev, MII_M1111_PHY_EXT_CR);
+			if (temp < 0)
+				return temp;
+
+			temp |= (MII_M1111_RX_DELAY | MII_M1111_TX_DELAY);
+
+			err = phy_write(phydev, MII_M1111_PHY_EXT_CR, temp);
+			if (err < 0)
+				return err;
+		}
+
+		temp = phy_read(phydev, MII_M1111_PHY_EXT_SR);
+		if (temp < 0)
+			return temp;
+
+		temp &= ~(MII_M1111_HWCFG_MODE_MASK);
+		temp |= MII_M1111_HWCFG_MODE_RGMII;
+
+		err = phy_write(phydev, MII_M1111_PHY_EXT_SR, temp);
+		if (err < 0)
+			return err;
+	}
+
+	err = phy_write(phydev, MII_BMCR, BMCR_RESET);
+	if (err < 0)
+		return err;
+
+	return 0;
+}
+
 static int m88e1145_config_init(struct phy_device *phydev)
 {
 	int err;
@@ -152,7 +197,7 @@ static int m88e1145_config_init(struct phy_device *phydev)
 	if (err < 0)
 		return err;
 
-	if (phydev->interface == PHY_INTERFACE_MODE_RGMII) {
+	if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID) {
 		int temp = phy_read(phydev, MII_M1145_PHY_EXT_CR);
 		if (temp < 0)
 			return temp;
@@ -206,7 +251,7 @@ static struct phy_driver m88e1101_driver = {
 	.driver = {.owner = THIS_MODULE,},
 };
 
-static struct phy_driver m88e1111s_driver = {
+static struct phy_driver m88e1111_driver = {
 	.phy_id = 0x01410cc0,
 	.phy_id_mask = 0xfffffff0,
 	.name = "Marvell 88E1111",
@@ -216,6 +261,7 @@ static struct phy_driver m88es_driver = {
 	.read_status = genphy_read_status,
 	.ack_interrupt = marvell_ack_interrupt,
 	.config_intr = marvell_config_intr,
+	.config_init = m88e1111_config_init,
 	.driver = {.owner = THIS_MODULE,},
 };
 
@@ -241,9 +287,9 @@ static int __init marvell_init(void)
 	if (ret)
 		return ret;
 
-	ret = phy_driver_register(&m88e1111s_driver);
+	ret = phy_driver_register(&m88e1111_driver);
 	if (ret)
-		goto err1111s;
+		goto err1111;
 
 	ret = phy_driver_register(&m88e1145_driver);
 	if (ret)
@@ -251,9 +297,9 @@ static int __init marvell_init(void)
 
 	return 0;
 
-  err1145:
-	phy_driver_unregister(&m88e1111s_driver);
-  err1111s:
+err1145:
+	phy_driver_unregister(&m88e1111_driver);
+err1111:
 	phy_driver_unregister(&m88e1101_driver);
 	return ret;
 }
@@ -261,7 +307,7 @@ static int __init marvell_init(void)
 static void __exit marvell_exit(void)
 {
 	phy_driver_unregister(&m88e1101_driver);
-	phy_driver_unregister(&m88e1111s_driver);
+	phy_driver_unregister(&m88e1111_driver);
 	phy_driver_unregister(&m88e1145_driver);
 }



Re: [PATCH][RFC] network splice receive

2007-06-05 Thread Jens Axboe
On Tue, Jun 05 2007, Jens Axboe wrote:
> Seems to work reasonably well for me, sometimes I do see small ranges
> inside the output file that are not correct, but I haven't been able to
> reproduce today. I think it has to do with page reuse, hence the
> NET_COPY_SPLICE ifdef that you can enable to just plain copy the data
> instead of referencing it.

I managed to reproduce. It's segments of 68-80 bytes being corrupt in
the middle of the output, and there might be 1-3 such occurrences in the
30MB file I tested with. The first 16 bytes of the corruption are always
the same:

 1800 4ff3 937f e000 6381 7275 0008

Perhaps that hex pattern rings a bell with someone intimate with the
networking. The remaining wrong bytes don't seem to have anything in
common.

Slab poisoning doesn't change the pattern, so it's not use-after-free.


-- 
Jens Axboe



Re: [PATCH] NET: Multiqueue network device support.

2007-06-05 Thread jamal
On Mon, 2007-04-06 at 14:40 -0700, PJ Waskiewicz wrote:
> API added to support multiple hardware queues on an ethernet device.
> Round-robin scheduler added (sch_rr) to provide a no-scheduling policy
> qdisc for hardware with multiple queues.
>

From a high level i see a good start that you at least have a separate
qdisc. I dont see the need for making any subqueue semantics in the
qdisc. We already have them.
I also still dont see the need for the patching of the prio qdisc or the
subqueue control.
I am now uncertain that after all those discussions (and a lot other
private ones) whether you understood me. We are still not meeting in the
middle. 

Sorry, Peter, i don't mean to rain on your parade but i can't let this just
slide by[1]. So please give me some time and this week i will send
patches to demonstrate my view. I didn't mean to do that, but as i see it
i have no other choice.

BTW, wheres the e1000 change?

cheers,
jamal

[1] If for example you wrote a classifier or a qdisc (as in a recent
discussion I had with Patrick) i would say it is your code and your
effort and i have the choice not to use it (by virtue of there being
other alternatives). I have no such luxury but to use the changes you
make to that code path whenever i use multi tx rings.

PS:- It is polite to CC someone who has engaged you in a conversation;
in particular as i have told you before at times i dont read netdev for
days but read emails addressed to me even when busyed out or travelling.



Re: [PATCH][RFC] network splice receive

2007-06-05 Thread Jens Axboe
On Tue, Jun 05 2007, Jens Axboe wrote:
> On Tue, Jun 05 2007, Jens Axboe wrote:
> > Seems to work reasonably well for me, sometimes I do see small ranges
> > inside the output file that are not correct, but I haven't been able to
> > reproduce today. I think it has to do with page reuse, hence the
> > NET_COPY_SPLICE ifdef that you can enable to just plain copy the data
> > instead of referencing it.
>
> I managed to reproduce. It's segments of 68-80 bytes being corrupt in
> the middle of the output, and there might be 1-3 such occurrences in the
> 30MB file I tested with. The first 16 bytes of the corruption are always
> the same:
>
>  1800 4ff3 937f e000 6381 7275 0008
>
> Perhaps that hex pattern rings a bell with someone intimate with the
> networking. The remaining wrong bytes don't seem to have anything in
> common.

Ok, the source mac address is 00:18:F3:4F:7F:93 and the destination is
00:E0:81:63:75:72 which are the middle 12 bytes of the 16.
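
For anyone wanting to check that decoding: the dump appears to be printed
as 16-bit words on a little-endian machine, so each word shows its two
bytes swapped. A minimal sketch that undoes the swap follows; the helper
name words_to_bytes() and the assumption about the dump format are
illustrative and not stated anywhere in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Expand 16-bit words, as printed by a little-endian hexdump, back
 * into the byte order they had in memory: low byte first, then high.
 * Hypothetical helper for illustration only.  'out' must have room
 * for 2 * n bytes.
 */
static void words_to_bytes(const uint16_t *words, size_t n, uint8_t *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		out[2 * i]     = (uint8_t)(words[i] & 0xff);
		out[2 * i + 1] = (uint8_t)(words[i] >> 8);
	}
}
```

Feeding it the words 1800 4ff3 937f e000 6381 7275 gives the bytes
00 18 f3 4f 7f 93 00 e0 81 63 75 72, i.e. the two MAC addresses. The
trailing 0008 becomes 08 00, which is the IPv4 EtherType if this is
indeed an Ethernet header.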

Hope that helps someone clue me in as to which network part is reusing
the data. Do I need to 'pin' the sk_buff until the pipe data has been
consumed?

-- 
Jens Axboe



Re: [RFC] Failover-friendly TCP retransmission

2007-06-05 Thread noboru . obata . ar
Hi Andi,

Andi Kleen [EMAIL PROTECTED] writes:
> > Please note first that I want to address physical failures by
> > the failover-capable network devices, which are increasingly
> > becoming important as Xen-based VM systems are getting popular.
> > Reducing a single-point-of-failure (physical device) is vital on
> > such VM systems.
>
> Just you typically still have lots of other single points of failures in
> a single system, some of them quite less reliable than your typical
> NIC. But at least it gives impressive demos when pulling ethernet cables @)

Indeed :-)


> > If TCP retransmission misses the time frame between event #1 and
> > #3 in Background above (between 20 and 30sec since network
> > failure), a failure causes the system-level failover where the
> > network-device-level failover should be enough.
>
> You should probably make sure that the device ends up returning the
> right NET_XMIT_* code for such drops to TCP, in particular
> NET_XMIT_DROP. This might require slight driver interface
> changes. Also right now it only affects the congestion window, I think,
> it  might be reasonable to let it affect the timer backoff too.

Well, I don't think that would help.

What you suggest, using the NET_XMIT_* code returned from the
underlying layer, is already done in tcp_transmit_skb.

But my problem is that tcp_transmit_skb is not called during a
certain period of time.  So I'm suggesting to cap RTO value so
that tcp_transmit_skb gets called more frequently.
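
To make the idea concrete, here is a small userspace model of exponential
backoff with and without a cap. It is only a sketch: the function name,
the 1 s starting value and the 5 s cap are illustrative assumptions, and
the code is not taken from the kernel's TCP implementation.

```c
/*
 * Return the time (in ms since the first transmission) at which the
 * n-th retransmission (n >= 1) is sent, when the timeout starts at
 * rto_ms, doubles after every retry, and is clamped to cap_ms
 * (cap_ms == 0 means "no cap").
 * Illustrative model only; names and values are assumptions.
 */
static unsigned long retrans_time_ms(unsigned int n, unsigned long rto_ms,
				     unsigned long cap_ms)
{
	unsigned long t = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (cap_ms && rto_ms > cap_ms)
			rto_ms = cap_ms;	/* clamp the backed-off timeout */
		t += rto_ms;			/* timer fires, segment is resent */
		rto_ms *= 2;			/* exponential backoff */
	}
	return t;
}
```

In this model, with a 1 s initial timeout, the uncapped schedule retransmits
at 15 s and then not again until 31 s, so it skips a 20 to 30 s window
entirely. With a 5 s cap, retransmissions land at 22 s and 27 s.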

Does it make sense, Andi?

Regards,

-- 
OBATA Noboru ([EMAIL PROTECTED])


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread David Acker

Milton Miller wrote:
> On Jun 1, 2007, at 3:45 PM, David Acker wrote:
>> Ok, I took a stab at coding and testing these ideas.  Below is a patch
>> against 2.6.22-rc3.
>>
>> Let me know what you think.
>
> I think you got most of the ideas.   As Auke noted, your coding style is
> showing again.   And your mailer again munged whitespace (fixed by
> s/^<space><space>/<space>/ s/^$/<space>/).

Sorry about the coding style.  I instinctively followed what was there
instead of kernel coding convention.  I will look into how whitespace is
getting screwed up.

>> Find a buffer that is complete with rx->el not set and rx->s0 set.
>> It appears that hardware can read the rfd's el-bit, then software
>> can clear the rfd el-bit and set the rfd size, and then hardware can
>> come in and read the size.
>
> Yes, since the size is after the EL flag in the descriptor, this can
> happen since the pci read is not atomic.
>
>> I am reading the status back, although I don't think that I have to in
>> this instance.
>
> Actually, you are reading it when the rfd still has EL set.  Since the
> cpu will never encounter that case, the if condition is never satisfied.

In my tests, every time I found a completed rfd with the el-bit set, the
receiver was in the out of resources state.

> How about creating a state unknown, for when we think we should check
> the device if it's running.
> If we are in this state and then encounter a received packet without s0
> set, we can set it back to running.   We set it when we rx a packet
> with s0 set.
>
> We then move both io_status reads to the caller.

I can look into that as I clean this up.

>> I am testing a version of this code patched against 2.6.18.4 on my PXA
>> 255 based system.  I will let you all know how it goes.

The testing I did so far did well.  I will try to get some more going
tonight, hopefully on a cleaned up patch.

> I'm assuming this is why the cleanup of the receiver start to always
> start on rx_to_clean got dropped again. :-)

Yep.  I will get that in the next patch.

> Also, I would like a few sentences in the Driver Operation section IV
> Receive big comment.  Something like
>
> In order to keep updates to the RFD link field from colliding with
> hardware writes to mark packets complete, we use the feature that
> hardware will not write to a size 0 descriptor and mark the previous
> packet as end-of-list (EL).   After updating the link, we remove EL and
> only then restore the size such that hardware may use the
> previous-to-end RFD.
>
> at the end of the first paragraph, and insert "software" before "no
> locking is required" in the second.

Sounds good to me.

I will see if I can get this into a cleaned up patch today and get it out by
tomorrow.  Thanks for dealing with me...I have been around kernel code
for a while but posting official patches to Linux is new to me.

-Ack


Re: [PATCH][RFC] network splice receive

2007-06-05 Thread Evgeniy Polyakov
On Tue, Jun 05, 2007 at 10:05:43AM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote:
> Hi,

Hi Jens.

> Here's an implementation of tcp network splice receive support. It's
> originally based on the patch set that Intel posted some time ago, but
> has been (close to) 100% reworked.
>
> Now, I'm not a networking guru by any stretch of the imagination, so I'd
> like some input on the direction of the main patch. Is the approach
> feasible? Glaring errors? Missing bits?

First one: you seem to have created a new data_ready callback,
tcp_splice_data_ready(), but it is unused, and in fact it cannot be used at
all. There would be a deadlock, since sk_data_ready can be called with the
socket locked and also in bh context.

I will set this up and report back about bugs.

-- 
Evgeniy Polyakov


[PATCH] add return value check of request_irq()

2007-06-05 Thread Yoichi Yuasa
Hi,

This patch adds a return value check of request_irq() to the pcmcia net drivers.

Yoichi

Signed-off-by: Yoichi Yuasa [EMAIL PROTECTED]

diff -pruN -X generic/Documentation/dontdiff generic-orig/drivers/net/pcmcia/axnet_cs.c generic/drivers/net/pcmcia/axnet_cs.c
--- generic-orig/drivers/net/pcmcia/axnet_cs.c	2007-06-05 11:12:22.03927 +0900
+++ generic/drivers/net/pcmcia/axnet_cs.c	2007-06-05 22:41:57.017445250 +0900
@@ -521,6 +521,7 @@ static void mdio_write(kio_addr_t addr, 
 
 static int axnet_open(struct net_device *dev)
 {
+    int ret;
     axnet_dev_t *info = PRIV(dev);
     struct pcmcia_device *link = info->p_dev;
 
@@ -529,9 +530,11 @@ static int axnet_open(struct net_device 
     if (!pcmcia_dev_present(link))
 	return -ENODEV;
 
-    link->open++;
+    ret = request_irq(dev->irq, ei_irq_wrapper, IRQF_SHARED, "axnet_cs", dev);
+    if (ret)
+	    return ret;
 
-    request_irq(dev->irq, ei_irq_wrapper, IRQF_SHARED, "axnet_cs", dev);
+    link->open++;
 
     info->link_status = 0x00;
     init_timer(&info->watchdog);
diff -pruN -X generic/Documentation/dontdiff generic-orig/drivers/net/pcmcia/pcnet_cs.c generic/drivers/net/pcmcia/pcnet_cs.c
--- generic-orig/drivers/net/pcmcia/pcnet_cs.c	2007-06-05 11:12:22.043270250 +0900
+++ generic/drivers/net/pcmcia/pcnet_cs.c	2007-06-05 22:41:45.656735250 +0900
@@ -960,6 +960,7 @@ static void mii_phy_probe(struct net_dev
 
 static int pcnet_open(struct net_device *dev)
 {
+    int ret;
     pcnet_dev_t *info = PRIV(dev);
     struct pcmcia_device *link = info->p_dev;
 
@@ -968,10 +969,12 @@ static int pcnet_open(struct net_device 
     if (!pcmcia_dev_present(link))
 	return -ENODEV;
 
-    link->open++;
-
     set_misc_reg(dev);
-    request_irq(dev->irq, ei_irq_wrapper, IRQF_SHARED, dev_info, dev);
+    ret = request_irq(dev->irq, ei_irq_wrapper, IRQF_SHARED, dev_info, dev);
+    if (ret)
+	    return ret;
+
+    link->open++;
 
     info->phy_id = info->eth_phy;
     info->link_status = 0x00;


[RFC RTNETLINK 00/09]: Netlink link creation API

2007-06-05 Thread Patrick McHardy
The following patches contain the rtnetlink link creation API I promised,
as well as two simple driver conversions to use the API as an example.
I've also converted VLAN as a more complex example, but these patches
need some more work and are most likely not interesting to all the CCed
parties, so I'm sending them separately.

A few words about the API:

Drivers wishing to use the API register a struct rtnl_link_ops, which
contains a few function pointers for device setup, registration, changing
and deletion, as well as netlink attribute validation and device dumping.

All netlink communication happens within the AF_UNSPEC family. I
initially introduced new netlink families for this, but removed them
again since that would require adding new protocol families that serve
no further purpose for most drivers. Additionally we currently use
RTM.*LINK messages with ifi_family != AF_UNSPEC for information that
is related to the device, but doesn't come from the driver that created
the device itself, like bridge port state, IPv6 device configuration etc.

The device specific attributes are nested within a new attribute
IFLA_LINKINFO. I didn't use IFLA_PROTINFO since userspace can reasonably
expect to have IFLA_PROTINFO unset for AF_UNSPEC messages, and the
userspace STP daemon does that. Identification of the driver happens
by name, stored in the IFLA_INFO_NAME attribute. IFLA_INFO_DATA contains
driver specific attributes, IFLA_INFO_XSTATS driver specific statistics.
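
For illustration, the attribute layout described above could be built from
userspace roughly as follows. This is a sketch under stated assumptions,
not code from the patch set: struct link_req, put_attr() and
build_newlink() are made-up names, and the driver name is carried in
IFLA_INFO_KIND, the constant found in current kernel headers, on the
assumption that it fills the role this text calls IFLA_INFO_NAME. The
message is only constructed, not sent.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Request buffer: netlink header, link header, room for attributes. */
struct link_req {
	struct nlmsghdr  nh;
	struct ifinfomsg ifi;
	char             attrs[128];
};

/* Append one attribute at the current end of the message. */
static struct rtattr *put_attr(struct link_req *req, unsigned short type,
			       const void *data, unsigned short len)
{
	unsigned int off = NLMSG_ALIGN(req->nh.nlmsg_len);
	struct rtattr *rta;

	if (off + RTA_ALIGN(RTA_LENGTH(len)) > sizeof(*req))
		return NULL;	/* would overflow the buffer */

	rta = (struct rtattr *)((char *)req + off);
	rta->rta_type = type;
	rta->rta_len = RTA_LENGTH(len);
	if (len)
		memcpy(RTA_DATA(rta), data, len);
	req->nh.nlmsg_len = off + RTA_ALIGN(RTA_LENGTH(len));
	return rta;
}

/* Build (but do not send) an RTM_NEWLINK request for driver 'kind'. */
static int build_newlink(struct link_req *req, const char *kind)
{
	struct rtattr *info;

	memset(req, 0, sizeof(*req));
	req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifi));
	req->nh.nlmsg_type = RTM_NEWLINK;
	req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
	req->ifi.ifi_family = AF_UNSPEC;

	/* Open the IFLA_LINKINFO nest with an empty payload for now. */
	info = put_attr(req, IFLA_LINKINFO, NULL, 0);
	if (!info)
		return -1;
	/* Driver identification by name, including the trailing NUL. */
	if (!put_attr(req, IFLA_INFO_KIND, kind, strlen(kind) + 1))
		return -1;
	/* Close the nest: its length covers everything added since. */
	info->rta_len = (char *)req + req->nh.nlmsg_len - (char *)info;
	return 0;
}
```

Driver specific attributes would go into a second nest (IFLA_INFO_DATA)
inside IFLA_LINKINFO, built the same way.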

The API does *not* use the existing RTM_SETLINK message type, instead
it adds support for receiving RTM_NEWLINK within the kernel. I did this
because of three reasons: 

- RTM_SETLINK does not follow the usual rtnetlink conventions and ignores
  all netlink flags

- Other rtnetlink subsystems use the same message type for dumps and
  notifications from the kernel as for configuration from userspace,
  which usually allows recreating an object by simply setting the
  NLM_F_REQUEST flag on a message received from the kernel and sending
  it back.

- Easier for userspace to detect support for the new features
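
The "send it back" idea from the second point can be sketched like this. The
struct and constants mirror struct nlmsghdr and the NLM_F_* / RTM_NEWLINK
values as I recall them from <linux/netlink.h> and <linux/rtnetlink.h>, but
are redefined under demo names so the example builds anywhere. Clearing
NLM_F_MULTI, requesting an ACK, and resetting the sequence number and port ID
are assumptions of the sketch, not requirements stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the netlink message header. */
struct demo_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

enum {
	DEMO_NLM_F_REQUEST = 0x001,
	DEMO_NLM_F_MULTI   = 0x002,
	DEMO_NLM_F_ACK     = 0x004,
	DEMO_NLM_F_EXCL    = 0x200,
	DEMO_NLM_F_CREATE  = 0x400,
	DEMO_RTM_NEWLINK   = 16,
};

/*
 * Turn a dumped RTM_NEWLINK message into a request in place. The payload
 * (ifinfomsg plus attributes) is left untouched; only the header changes.
 */
static void demo_replay_as_request(struct demo_nlmsghdr *nlh, uint32_t seq,
				   int create)
{
	uint16_t flags = nlh->nlmsg_flags;

	assert(nlh->nlmsg_type == DEMO_RTM_NEWLINK);

	flags &= (uint16_t)~DEMO_NLM_F_MULTI;	/* dump-only flag */
	flags |= DEMO_NLM_F_REQUEST | DEMO_NLM_F_ACK;
	if (create)
		flags |= DEMO_NLM_F_CREATE | DEMO_NLM_F_EXCL;

	nlh->nlmsg_flags = flags;
	nlh->nlmsg_seq = seq;	/* sender's own sequence number */
	nlh->nlmsg_pid = 0;	/* addressed to the kernel */
}
```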

The RTM_NEWLINK message type is a superset of RTM_SETLINK: it allows
changing both driver specific and generic attributes of the device.
The set of generic device attributes that may be supplied during
device creation is limited to a few simple ones; it currently does
not support specifying the link layer address, the broadcast address
or device flags. The change operation can change all device attributes.

Not sure what else to say .. comments welcome.


 drivers/net/dummy.c   |  144 +++-
 drivers/net/ifb.c |  115 ++---
 include/linux/if_link.h   |   13 +
 include/linux/netdevice.h |3 
 include/net/fib_rules.h   |2 
 include/net/genetlink.h   |2 
 include/net/ip_fib.h  |2 
 include/net/netlink.h |   12 -
 include/net/rtnetlink.h   |   57 
 net/core/neighbour.c  |4 
 net/core/rtnetlink.c  |  451 +-
 net/decnet/dn_dev.c   |2 
 net/decnet/dn_rules.c |2 
 net/ipv4/devinet.c|2 
 net/ipv4/fib_frontend.c   |2 
 net/ipv4/fib_rules.c  |2 
 net/ipv6/addrconf.c   |2 
 net/ipv6/fib6_rules.c |2 
 net/ipv6/route.c  |2 
 net/netlabel/netlabel_cipso_v4.c  |2 
 net/netlabel/netlabel_mgmt.c  |2 
 net/netlabel/netlabel_unlabeled.c |2 
 net/netlink/attr.c|8 
 net/netlink/genetlink.c   |2 
 24 files changed, 665 insertions(+), 172 deletions(-)

Patrick McHardy (9):
  [NETLINK]: Mark netlink policies const
  [RTNETLINK]: ifindex 0 does not exist
  [RTNETLINK]: Split up rtnl_setlink
  [RTNETLINK]: Link creation API
  [DUMMY]: Use dev->stats
  [DUMMY]: Keep dummy devices on list
  [DUMMY]: Use rtnl_link API
  [IFB]: Keep ifb devices on list
  [IFB]: Use rtnl_link API


[RFC NETLINK 01/09]: Mark netlink policies const

2007-06-05 Thread Patrick McHardy
[NETLINK]: Mark netlink policies const

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 842f5a7aa8357f4e43b1ab8938c51f4d7ce7aba3
tree 701a1399958ff019ab6939fa0e5a96e5b148b643
parent c1a13ff57ab1ce52a0aae9984594dbfcfbaf68c0
author Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 14:52:51 +0200
committer Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 14:52:51 +0200

 include/net/fib_rules.h   |2 +-
 include/net/genetlink.h   |2 +-
 include/net/ip_fib.h  |2 +-
 include/net/netlink.h |   12 ++--
 net/core/neighbour.c  |4 ++--
 net/core/rtnetlink.c  |2 +-
 net/decnet/dn_dev.c   |2 +-
 net/decnet/dn_rules.c |2 +-
 net/ipv4/devinet.c|2 +-
 net/ipv4/fib_frontend.c   |2 +-
 net/ipv4/fib_rules.c  |2 +-
 net/ipv6/addrconf.c   |2 +-
 net/ipv6/fib6_rules.c |2 +-
 net/ipv6/route.c  |2 +-
 net/netlabel/netlabel_cipso_v4.c  |2 +-
 net/netlabel/netlabel_mgmt.c  |2 +-
 net/netlabel/netlabel_unlabeled.c |2 +-
 net/netlink/attr.c|8 
 net/netlink/genetlink.c   |2 +-
 19 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index ed3a887..83e41dd 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -64,7 +64,7 @@ struct fib_rules_ops
void (*flush_cache)(void);
 
int nlgroup;
-   struct nla_policy   *policy;
+   const struct nla_policy *policy;
struct list_head *rules_list;
struct module   *owner;
 };
diff --git a/include/net/genetlink.h b/include/net/genetlink.h
index adff4c8..b6eaca1 100644
--- a/include/net/genetlink.h
+++ b/include/net/genetlink.h
@@ -60,7 +60,7 @@ struct genl_ops
 {
u8  cmd;
unsigned int flags;
-   struct nla_policy   *policy;
+   const struct nla_policy *policy;
int (*doit)(struct sk_buff *skb,
   struct genl_info *info);
int (*dumpit)(struct sk_buff *skb,
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 5a4a036..69252cb 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -213,7 +213,7 @@ extern void fib_select_default(const struct flowi *flp, struct fib_result *res);
 #endif /* CONFIG_IP_MULTIPLE_TABLES */
 
 /* Exported by fib_frontend.c */
-extern struct nla_policy rtm_ipv4_policy[];
+extern const struct nla_policy rtm_ipv4_policy[];
 extern void ip_fib_init(void);
 extern int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
   struct net_device *dev, __be32 *spec_dst, u32 *itag);
diff --git a/include/net/netlink.h b/include/net/netlink.h
index 0bf325c..7b510a9 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -222,10 +222,10 @@ extern int nlmsg_notify(struct sock *sk, struct sk_buff *skb,
 gfp_t flags);
 
 extern int nla_validate(struct nlattr *head, int len, int maxtype,
-struct nla_policy *policy);
+const struct nla_policy *policy);
 extern int nla_parse(struct nlattr *tb[], int maxtype,
  struct nlattr *head, int len,
- struct nla_policy *policy);
+ const struct nla_policy *policy);
 extern struct nlattr * nla_find(struct nlattr *head, int len, int attrtype);
 extern size_t  nla_strlcpy(char *dst, const struct nlattr *nla,
size_t dstsize);
@@ -360,7 +360,7 @@ static inline struct nlmsghdr *nlmsg_next(struct nlmsghdr *nlh, int *remaining)
  */
 static inline int nlmsg_parse(struct nlmsghdr *nlh, int hdrlen,
  struct nlattr *tb[], int maxtype,
- struct nla_policy *policy)
+ const struct nla_policy *policy)
 {
if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen))
return -EINVAL;
@@ -392,7 +392,7 @@ static inline struct nlattr *nlmsg_find_attr(struct nlmsghdr *nlh,
  * @policy: validation policy
  */
 static inline int nlmsg_validate(struct nlmsghdr *nlh, int hdrlen, int maxtype,
-struct nla_policy *policy)
+const struct nla_policy *policy)
 {
if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen))
return -EINVAL;
@@ -729,7 +729,7 @@ static inline struct nlattr *nla_find_nested(struct nlattr *nla, int attrtype)
  */
 static inline int nla_parse_nested(struct nlattr *tb[], int maxtype,
   struct nlattr *nla,
-  

[RFC RTNETLINK 02/09]: ifindex 0 does not exist

2007-06-05 Thread Patrick McHardy
[RTNETLINK]: ifindex 0 does not exist

ifindex == 0 does not exist and implies we should do a lookup by name if
one was given.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 72104442408666a0ff9395a5c62a23e96a3845ef
tree d46c7ec0b401a99bfbec94ef328e9c7cf7197854
parent 842f5a7aa8357f4e43b1ab8938c51f4d7ce7aba3
author Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 23:23:14 +0200
committer Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 23:23:14 +0200

 net/core/rtnetlink.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a8a5093..02e8bf0 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -580,7 +580,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
err = -EINVAL;
ifm = nlmsg_data(nlh);
-   if (ifm->ifi_index >= 0)
+   if (ifm->ifi_index > 0)
dev = dev_get_by_index(ifm->ifi_index);
else if (tb[IFLA_IFNAME])
dev = dev_get_by_name(ifname);
@@ -672,7 +672,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 * name provided implies that a name change has been
 * requested.
 */
-   if (ifm->ifi_index >= 0 && ifname[0]) {
+   if (ifm->ifi_index > 0 && ifname[0]) {
err = dev_change_name(dev, ifname);
if (err < 0)
goto errout_dev;
@@ -740,7 +740,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
return err;
 
ifm = nlmsg_data(nlh);
-   if (ifm->ifi_index >= 0) {
+   if (ifm->ifi_index > 0) {
dev = dev_get_by_index(ifm->ifi_index);
if (dev == NULL)
return -ENODEV;


[RFC RTNETLINK 03/09]: Split up rtnl_setlink

2007-06-05 Thread Patrick McHardy
[RTNETLINK]: Split up rtnl_setlink

Split up rtnl_setlink into a function performing validation and a function
performing the actual changes. This allows sharing the modification logic
with rtnl_newlink, which is introduced by the next patch.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 40b0b8787c1057d055baa6e3d11ff6db7783c982
tree 625d2bb4401e72882e6e5907d4aded9d2dcb416e
parent 72104442408666a0ff9395a5c62a23e96a3845ef
author Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 23:28:24 +0200
committer Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 23:28:24 +0200

 net/core/rtnetlink.c |  105 +++---
 1 files changed, 57 insertions(+), 48 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 02e8bf0..25ca219 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -561,44 +561,11 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
[IFLA_LINKMODE] = { .type = NLA_U8 },
 };
 
-static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
+ struct nlattr **tb, char *ifname)
 {
-   struct ifinfomsg *ifm;
-   struct net_device *dev;
-   int err, send_addr_notify = 0, modified = 0;
-   struct nlattr *tb[IFLA_MAX+1];
-   char ifname[IFNAMSIZ];
-
-   err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
-   if (err < 0)
-   goto errout;
-
-   if (tb[IFLA_IFNAME])
-   nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
-   else
-   ifname[0] = '\0';
-
-   err = -EINVAL;
-   ifm = nlmsg_data(nlh);
-   if (ifm->ifi_index > 0)
-   dev = dev_get_by_index(ifm->ifi_index);
-   else if (tb[IFLA_IFNAME])
-   dev = dev_get_by_name(ifname);
-   else
-   goto errout;
-
-   if (dev == NULL) {
-   err = -ENODEV;
-   goto errout;
-   }
-
-   if (tb[IFLA_ADDRESS] &&
-   nla_len(tb[IFLA_ADDRESS]) < dev->addr_len)
-   goto errout_dev;
-
-   if (tb[IFLA_BROADCAST] &&
-   nla_len(tb[IFLA_BROADCAST]) < dev->addr_len)
-   goto errout_dev;
+   int modified = 0, send_addr_notify = 0;
+   int err;
 
if (tb[IFLA_MAP]) {
struct rtnl_link_ifmap *u_map;
@@ -606,12 +573,12 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
if (!dev->set_config) {
err = -EOPNOTSUPP;
-   goto errout_dev;
+   goto errout;
}
 
if (!netif_device_present(dev)) {
err = -ENODEV;
-   goto errout_dev;
+   goto errout;
}
 
u_map = nla_data(tb[IFLA_MAP]);
@@ -624,7 +591,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
err = dev->set_config(dev, &k_map);
if (err < 0)
-   goto errout_dev;
+   goto errout;
 
modified = 1;
}
@@ -635,19 +602,19 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
if (!dev->set_mac_address) {
err = -EOPNOTSUPP;
-   goto errout_dev;
+   goto errout;
}
 
if (!netif_device_present(dev)) {
err = -ENODEV;
-   goto errout_dev;
+   goto errout;
}
 
len = sizeof(sa_family_t) + dev->addr_len;
sa = kmalloc(len, GFP_KERNEL);
if (!sa) {
err = -ENOMEM;
-   goto errout_dev;
+   goto errout;
}
sa->sa_family = dev->type;
memcpy(sa->sa_data, nla_data(tb[IFLA_ADDRESS]),
@@ -655,7 +622,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
err = dev->set_mac_address(dev, sa);
kfree(sa);
if (err)
-   goto errout_dev;
+   goto errout;
send_addr_notify = 1;
modified = 1;
}
@@ -663,7 +630,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
if (tb[IFLA_MTU]) {
err = dev_set_mtu(dev, nla_get_u32(tb[IFLA_MTU]));
if (err < 0)
-   goto errout_dev;
+   goto errout;
modified = 1;
}
 
@@ -675,7 +642,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
if (ifm->ifi_index > 0 && ifname[0]) {
err = dev_change_name(dev, ifname);
if (err < 0)
-  

[RFC RTNETLINK 04/09]: Link creation API

2007-06-05 Thread Patrick McHardy
[RTNETLINK]: Link creation API

Add rtnetlink API for creating, changing and deleting software devices.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 0323e7d1e7d5042492684264cfcba6d7ff55c473
tree 161530836d43b39ddef42a2c2b48b82187580e3c
parent 40b0b8787c1057d055baa6e3d11ff6db7783c982
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:12 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:12 +0200

 include/linux/if_link.h   |   13 ++
 include/linux/netdevice.h |3 
 include/net/rtnetlink.h   |   57 
 net/core/rtnetlink.c  |  338 -
 4 files changed, 404 insertions(+), 7 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 604c243..e46ed94 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -76,6 +76,8 @@ enum
 #define IFLA_WEIGHT IFLA_WEIGHT
IFLA_OPERSTATE,
IFLA_LINKMODE,
+   IFLA_LINKINFO,
+#define IFLA_LINKINFO IFLA_LINKINFO
__IFLA_MAX
 };
 
@@ -140,4 +142,15 @@ struct ifla_cacheinfo
__u32   retrans_time;
 };
 
+enum
+{
+   IFLA_INFO_UNSPEC,
+   IFLA_INFO_NAME,
+   IFLA_INFO_DATA,
+   IFLA_INFO_XSTATS,
+   __IFLA_INFO_MAX,
+};
+
+#define IFLA_INFO_MAX  (__IFLA_INFO_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3a70f55..e327ccc 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -540,6 +540,9 @@ struct net_device
struct device   dev;
/* space for optional statistics and wireless sysfs groups */
struct attribute_group  *sysfs_groups[3];
+
+   /* rtnetlink link ops */
+   struct rtnl_link_ops *rtnl_link_ops;
 };
 #define to_net_dev(d) container_of(d, struct net_device, dev)
 
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 3b3d474..d744198 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -22,4 +22,61 @@ static inline int rtnl_msg_family(struct nlmsghdr *nlh)
return AF_UNSPEC;
 }
 
+/**
+ * struct rtnl_link_ops - rtnetlink link operations
+ *
+ * @list: Used internally
+ * @name: Identifier
+ * @maxtype: Highest device specific netlink attribute number
+ * @policy: Netlink policy for device specific attribute validation
+ * @validate: Optional validation function for netlink/changelink parameters
+ * @priv_size: sizeof net_device private space
+ * @setup: net_device setup function
+ * @newlink: Function for configuring and registering a new device
+ * @changelink: Function for changing parameters of an existing device
+ * @dellink: Function to remove a device
+ * @get_size: Function to calculate required room for dumping device
+ *specific netlink attributes
+ * @fill_info: Function to dump device specific netlink attributes
+ * @xstats_size: Size of device specific statistics
+ * @fill_xstats: Function to dump device specific statistics
+ */
+struct rtnl_link_ops {
+   struct list_head list;
+
+   const char  *name;
+
+   size_t  priv_size;
+   void (*setup)(struct net_device *dev);
+
+   int maxtype;
+   const struct nla_policy *policy;
+   int (*validate)(struct nlattr *tb[],
+   struct nlattr *data[]);
+
+   int (*newlink)(struct net_device *dev,
+  struct nlattr *tb[],
+  struct nlattr *data[]);
+   int (*changelink)(struct net_device *dev,
+ struct nlattr *tb[],
+ struct nlattr *data[]);
+   void (*dellink)(struct net_device *dev);
+
+   size_t  (*get_size)(struct net_device *dev);
+   int (*fill_info)(struct sk_buff *skb,
+struct net_device *dev);
+
+   size_t  xstats_size;
+   int (*fill_xstats)(struct sk_buff *skb,
+  struct net_device *dev);
+};
+
+extern int __rtnl_link_register(struct rtnl_link_ops *ops);
+extern void __rtnl_link_unregister(struct rtnl_link_ops *ops);
+
+extern int rtnl_link_register(struct rtnl_link_ops *ops);
+extern void rtnl_link_unregister(struct rtnl_link_ops *ops);
+
+#define MODULE_ALIAS_RTNL_LINK(name) MODULE_ALIAS("rtnl-link-" #name)
+
 #endif
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 25ca219..ed17288 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -243,6 +243,141 @@ void rtnl_unregister_all(int protocol)
 
 EXPORT_SYMBOL_GPL(rtnl_unregister_all);
 
+static LIST_HEAD(link_ops);
+

[RFC DUMMY 05/09]: Use dev->stats

2007-06-05 Thread Patrick McHardy
[DUMMY]: Use dev->stats

Use dev->stats instead of netdev_priv().

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit c5178079b5d191e34a516dc111be862e3382e32b
tree 339bf81b1270b51272a1e9ebc643188dc755a2ee
parent 0323e7d1e7d5042492684264cfcba6d7ff55c473
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:28 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:28 +0200

 drivers/net/dummy.c |   16 +++-
 1 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 60673bc..91b474c 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -38,7 +38,6 @@
 static int numdummies = 1;
 
 static int dummy_xmit(struct sk_buff *skb, struct net_device *dev);
-static struct net_device_stats *dummy_get_stats(struct net_device *dev);
 
 static int dummy_set_address(struct net_device *dev, void *p)
 {
@@ -59,7 +58,6 @@ static void set_multicast_list(struct net_device *dev)
 static void __init dummy_setup(struct net_device *dev)
 {
/* Initialize the device structure. */
-   dev->get_stats = dummy_get_stats;
dev->hard_start_xmit = dummy_xmit;
dev->set_multicast_list = set_multicast_list;
dev->set_mac_address = dummy_set_address;
@@ -76,20 +74,13 @@ static void __init dummy_setup(struct net_device *dev)
 
 static int dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 {
-   struct net_device_stats *stats = netdev_priv(dev);
-
-   stats->tx_packets++;
-   stats->tx_bytes+=skb->len;
+   dev->stats.tx_packets++;
+   dev->stats.tx_bytes += skb->len;
 
dev_kfree_skb(skb);
return 0;
 }
 
-static struct net_device_stats *dummy_get_stats(struct net_device *dev)
-{
-   return netdev_priv(dev);
-}
-
 static struct net_device **dummies;
 
 /* Number of dummy devices to be set up by this module. */
@@ -101,8 +92,7 @@ static int __init dummy_init_one(int index)
struct net_device *dev_dummy;
int err;
 
-   dev_dummy = alloc_netdev(sizeof(struct net_device_stats),
-                            "dummy%d", dummy_setup);
+   dev_dummy = alloc_netdev(0, "dummy%d", dummy_setup);
 
if (!dev_dummy)
return -ENOMEM;


[RFC DUMMY 07/09]: Use rtnl_link API

2007-06-05 Thread Patrick McHardy
[DUMMY]: Use rtnl_link API

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit a86d7c15680b4bbeec06f0194c2ca927648e33dd
tree f6f55cbbd72b9643fb2f9c4e5859a3a2d699504a
parent 5785eb9eb2c30be5662261fc115a65d28cc98c17
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:28 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:28 +0200

 drivers/net/dummy.c |   81 ++-
 1 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 2f2cf3c..b1bb7a0 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -35,6 +35,7 @@
 #include <linux/init.h>
 #include <linux/moduleparam.h>
 #include <linux/rtnetlink.h>
+#include <net/rtnetlink.h>
 
 struct dummy_priv {
struct net_device *dev;
@@ -61,12 +62,13 @@ static void set_multicast_list(struct net_device *dev)
 {
 }
 
-static void __init dummy_setup(struct net_device *dev)
+static void dummy_setup(struct net_device *dev)
 {
/* Initialize the device structure. */
dev->hard_start_xmit = dummy_xmit;
dev->set_multicast_list = set_multicast_list;
dev->set_mac_address = dummy_set_address;
+   dev->destructor = free_netdev;
 
/* Fill in device structure with ethernet-generic values. */
ether_setup(dev);
@@ -89,6 +91,37 @@ static int dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static LIST_HEAD(dummies);
 
+static int dummy_newlink(struct net_device *dev,
+                         struct nlattr *tb[], struct nlattr *data[])
+{
+   struct dummy_priv *priv = netdev_priv(dev);
+   int err;
+
+   err = register_netdevice(dev);
+   if (err < 0)
+   return err;
+
+   priv->dev = dev;
+   list_add_tail(&priv->list, &dummies);
+   return 0;
+}
+
+static void dummy_dellink(struct net_device *dev)
+{
+   struct dummy_priv *priv = netdev_priv(dev);
+
+   list_del(&priv->list);
+   unregister_netdevice(dev);
+}
+
+static struct rtnl_link_ops dummy_link_ops = {
+   .name   = "dummy",
+   .priv_size  = sizeof(struct dummy_priv),
+   .setup  = dummy_setup,
+   .newlink= dummy_newlink,
+   .dellink= dummy_dellink,
+};
+
 /* Number of dummy devices to be set up by this module. */
 module_param(numdummies, int, 0);
 MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
@@ -105,25 +138,22 @@ static int __init dummy_init_one(void)
if (!dev_dummy)
return -ENOMEM;
 
-   if ((err = register_netdev(dev_dummy))) {
-   free_netdev(dev_dummy);
-   dev_dummy = NULL;
-   } else {
-   priv = netdev_priv(dev_dummy);
-   priv->dev = dev_dummy;
-   list_add_tail(&priv->list, &dummies);
-   }
+   err = dev_alloc_name(dev_dummy, dev_dummy->name);
+   if (err < 0)
+   goto err;
 
-   return err;
-}
+   err = register_netdevice(dev_dummy);
+   if (err < 0)
+   goto err;
 
-static void dummy_free_one(struct net_device *dev)
-{
-   struct dummy_priv *priv = netdev_priv(dev);
+   priv = netdev_priv(dev_dummy);
+   priv->dev = dev_dummy;
+   list_add_tail(&priv->list, &dummies);
+   return 0;
 
-   list_del(&priv->list);
-   unregister_netdev(dev);
-   free_netdev(dev);
+err:
+   free_netdev(dev_dummy);
+   return err;
 }
 
 static int __init dummy_init_module(void)
@@ -131,12 +161,18 @@ static int __init dummy_init_module(void)
struct dummy_priv *priv, *next;
int i, err = 0;
 
+   rtnl_lock();
+   err = __rtnl_link_register(&dummy_link_ops);
+
for (i = 0; i < numdummies && !err; i++)
err = dummy_init_one();
-   if (err) {
+   if (err < 0) {
list_for_each_entry_safe(priv, next, &dummies, list)
-   dummy_free_one(priv->dev);
+   dummy_dellink(priv->dev);
+   __rtnl_link_unregister(&dummy_link_ops);
}
+   rtnl_unlock();
+
return err;
 }
 
@@ -144,10 +180,15 @@ static void __exit dummy_cleanup_module(void)
 {
struct dummy_priv *priv, *next;
 
+   rtnl_lock();
list_for_each_entry_safe(priv, next, &dummies, list)
-   dummy_free_one(priv->dev);
+   dummy_dellink(priv->dev);
+
+   __rtnl_link_unregister(&dummy_link_ops);
+   rtnl_unlock();
 }
 
 module_init(dummy_init_module);
 module_exit(dummy_cleanup_module);
 MODULE_LICENSE("GPL");
+MODULE_ALIAS_RTNL_LINK(dummy);


[RFC DUMMY 06/09]: Keep dummy devices on list

2007-06-05 Thread Patrick McHardy
[DUMMY]: Keep dummy devices on list

Use a list instead of an array to allow creating new devices.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 5785eb9eb2c30be5662261fc115a65d28cc98c17
tree 51a0c8e4d7522895e2582f4a90330dad7e4d237e
parent c5178079b5d191e34a516dc111be862e3382e32b
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:28 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:28 +0200

 drivers/net/dummy.c |   47 +--
 1 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 91b474c..2f2cf3c 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -34,6 +34,12 @@
 #include <linux/etherdevice.h>
 #include <linux/init.h>
 #include <linux/moduleparam.h>
+#include <linux/rtnetlink.h>
+
+struct dummy_priv {
+   struct net_device *dev;
+   struct list_head list;
+};
 
 static int numdummies = 1;
 
@@ -81,18 +87,20 @@ static int dummy_xmit(struct sk_buff *skb, struct net_device *dev)
return 0;
 }
 
-static struct net_device **dummies;
+static LIST_HEAD(dummies);
 
 /* Number of dummy devices to be set up by this module. */
 module_param(numdummies, int, 0);
 MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
 
-static int __init dummy_init_one(int index)
+static int __init dummy_init_one(void)
 {
struct net_device *dev_dummy;
+   struct dummy_priv *priv;
int err;
 
-   dev_dummy = alloc_netdev(0, "dummy%d", dummy_setup);
+   dev_dummy = alloc_netdev(sizeof(struct dummy_priv), "dummy%d",
+                            dummy_setup);
 
if (!dev_dummy)
return -ENOMEM;
@@ -101,40 +109,43 @@ static int __init dummy_init_one(int index)
free_netdev(dev_dummy);
dev_dummy = NULL;
} else {
-   dummies[index] = dev_dummy;
+   priv = netdev_priv(dev_dummy);
+   priv->dev = dev_dummy;
+   list_add_tail(&priv->list, &dummies);
}
 
return err;
 }
 
-static void dummy_free_one(int index)
+static void dummy_free_one(struct net_device *dev)
 {
-   unregister_netdev(dummies[index]);
-   free_netdev(dummies[index]);
+   struct dummy_priv *priv = netdev_priv(dev);
+
+   list_del(&priv->list);
+   unregister_netdev(dev);
+   free_netdev(dev);
 }
 
 static int __init dummy_init_module(void)
 {
+   struct dummy_priv *priv, *next;
int i, err = 0;
-   dummies = kmalloc(numdummies * sizeof(void *), GFP_KERNEL);
-   if (!dummies)
-   return -ENOMEM;
+
for (i = 0; i < numdummies && !err; i++)
-   err = dummy_init_one(i);
+   err = dummy_init_one();
if (err) {
-   i--;
-   while (--i >= 0)
-   dummy_free_one(i);
+   list_for_each_entry_safe(priv, next, &dummies, list)
+   dummy_free_one(priv->dev);
}
return err;
 }
 
 static void __exit dummy_cleanup_module(void)
 {
-   int i;
-   for (i = 0; i < numdummies; i++)
-   dummy_free_one(i);
-   kfree(dummies);
+   struct dummy_priv *priv, *next;
+
+   list_for_each_entry_safe(priv, next, &dummies, list)
+   dummy_free_one(priv->dev);
 }
 
 module_init(dummy_init_module);


[RFC IFB 08/09]: Keep ifb devices on list

2007-06-05 Thread Patrick McHardy
[IFB]: Keep ifb devices on list

Use a list instead of an array to allow creating new devices.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 279921d3d97ab5f310142bdb27b5459742dbba4f
tree b6602b51779ffefeae65a32c80cc92da90a40a6c
parent a86d7c15680b4bbeec06f0194c2ca927648e33dd
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:29 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:29 +0200

 drivers/net/ifb.c |   36 +---
 1 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 07b4c0d..819945e 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -33,12 +33,15 @@
 #include <linux/etherdevice.h>
 #include <linux/init.h>
 #include <linux/moduleparam.h>
+#include <linux/list.h>
 #include <net/pkt_sched.h>
 
 #define TX_TIMEOUT  (2*HZ)
 
 #define TX_Q_LIMIT 32
 struct ifb_private {
+   struct list_head list;
+   struct net_device   *dev;
struct net_device_stats stats;
struct tasklet_struct   ifb_tasklet;
int tasklet_pending;
@@ -197,7 +200,7 @@ static struct net_device_stats *ifb_get_stats(struct net_device *dev)
return stats;
 }
 
-static struct net_device **ifbs;
+static LIST_HEAD(ifbs);
 
 /* Number of ifb devices to be set up by this module. */
 module_param(numifbs, int, 0);
@@ -229,6 +232,7 @@ static int ifb_open(struct net_device *dev)
 static int __init ifb_init_one(int index)
 {
struct net_device *dev_ifb;
+   struct ifb_private *priv;
int err;
 
dev_ifb = alloc_netdev(sizeof(struct ifb_private),
@@ -241,30 +245,33 @@ static int __init ifb_init_one(int index)
free_netdev(dev_ifb);
dev_ifb = NULL;
} else {
-   ifbs[index] = dev_ifb;
+   priv = netdev_priv(dev_ifb);
+   priv->dev = dev_ifb;
+   list_add_tail(&priv->list, &ifbs);
}
 
return err;
 }
 
-static void ifb_free_one(int index)
+static void ifb_free_one(struct net_device *dev)
 {
-   unregister_netdev(ifbs[index]);
-   free_netdev(ifbs[index]);
+   struct ifb_private *priv = netdev_priv(dev);
+
+   list_del(&priv->list);
+   unregister_netdev(dev);
+   free_netdev(dev);
 }
 
 static int __init ifb_init_module(void)
 {
+   struct ifb_private *priv, *next;
int i, err = 0;
-   ifbs = kmalloc(numifbs * sizeof(void *), GFP_KERNEL);
-   if (!ifbs)
-   return -ENOMEM;
+
for (i = 0; i < numifbs && !err; i++)
err = ifb_init_one(i);
if (err) {
-   i--;
-   while (--i >= 0)
-   ifb_free_one(i);
+   list_for_each_entry_safe(priv, next, &ifbs, list)
+   ifb_free_one(priv->dev);
}
 
return err;
@@ -272,11 +279,10 @@ static int __init ifb_init_module(void)
 
 static void __exit ifb_cleanup_module(void)
 {
-   int i;
+   struct ifb_private *priv, *next;
 
-   for (i = 0; i < numifbs; i++)
-   ifb_free_one(i);
-   kfree(ifbs);
+   list_for_each_entry_safe(priv, next, &ifbs, list)
+   ifb_free_one(priv->dev);
 }
 
 module_init(ifb_init_module);


[RFC IFB 09/09]: Use rtnl_link API

2007-06-05 Thread Patrick McHardy
[IFB]: Use rtnl_link API

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 6a0492fd68c8c5f528e4277bec20bef8047aec2e
tree 7e2ae7fb9e33d7c547c3030b9923de232aaf9ab3
parent 279921d3d97ab5f310142bdb27b5459742dbba4f
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:29 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 15:40:29 +0200

 drivers/net/ifb.c |   79 -
 1 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 819945e..9ec01ff 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -139,13 +139,14 @@ resched:
 
 }
 
-static void __init ifb_setup(struct net_device *dev)
+static void ifb_setup(struct net_device *dev)
 {
/* Initialize the device structure. */
dev->get_stats = ifb_get_stats;
dev->hard_start_xmit = ifb_xmit;
dev->open = ifb_open;
dev->stop = ifb_close;
+   dev->destructor = free_netdev;
 
/* Fill in device structure with ethernet-generic values. */
ether_setup(dev);
@@ -229,6 +230,37 @@ static int ifb_open(struct net_device *dev)
return 0;
 }
 
+static int ifb_newlink(struct net_device *dev,
+  struct nlattr *tb[], struct nlattr *data[])
+{
+   struct ifb_private *priv = netdev_priv(dev);
+   int err;
+
+   err = register_netdevice(dev);
+   if (err < 0)
+   return err;
+
+   priv->dev = dev;
+   list_add_tail(&priv->list, &ifbs);
+   return 0;
+}
+
+static void ifb_dellink(struct net_device *dev)
+{
+   struct ifb_private *priv = netdev_priv(dev);
+
+   list_del(&priv->list);
+   unregister_netdevice(dev);
+}
+
+static struct rtnl_link_ops ifb_link_ops = {
+   .name   = "ifb",
+   .priv_size  = sizeof(struct ifb_private),
+   .setup  = ifb_setup,
+   .newlink= ifb_newlink,
+   .dellink= ifb_dellink,
+};
+
 static int __init ifb_init_one(int index)
 {
struct net_device *dev_ifb;
@@ -241,38 +273,40 @@ static int __init ifb_init_one(int index)
if (!dev_ifb)
return -ENOMEM;
 
-   if ((err = register_netdev(dev_ifb))) {
-   free_netdev(dev_ifb);
-   dev_ifb = NULL;
-   } else {
-   priv = netdev_priv(dev_ifb);
-   priv-dev = dev_ifb;
-   list_add_tail(priv-list, ifbs);
-   }
+   err = dev_alloc_name(dev_ifb, dev_ifb-name);
+   if (err  0)
+   goto err;
 
-   return err;
-}
+   err = register_netdevice(dev_ifb);
+   if (err  0)
+   goto err;
 
-static void ifb_free_one(struct net_device *dev)
-{
-   struct ifb_private *priv = netdev_priv(dev);
+   priv = netdev_priv(dev_ifb);
+   priv-dev = dev_ifb;
+   list_add_tail(priv-list, ifbs);
+   return 0;
 
-   list_del(priv-list);
-   unregister_netdev(dev);
-   free_netdev(dev);
+err:
+   free_netdev(dev_ifb);
+   return err;
 }
 
 static int __init ifb_init_module(void)
 {
struct ifb_private *priv, *next;
-   int i, err = 0;
+   int i, err;
+
+   rtnl_lock();
+   err = __rtnl_link_register(ifb_link_ops);
 
for (i = 0; i  numifbs  !err; i++)
err = ifb_init_one(i);
if (err) {
list_for_each_entry_safe(priv, next, ifbs, list)
-   ifb_free_one(priv-dev);
+   ifb_dellink(priv-dev);
+   __rtnl_link_unregister(ifb_link_ops);
}
+   rtnl_unlock();
 
return err;
 }
@@ -281,11 +315,16 @@ static void __exit ifb_cleanup_module(void)
 {
struct ifb_private *priv, *next;
 
+   rtnl_lock();
    list_for_each_entry_safe(priv, next, &ifbs, list)
-       ifb_free_one(priv->dev);
+       ifb_dellink(priv->dev);
+
+   __rtnl_link_unregister(&ifb_link_ops);
+   rtnl_unlock();
 }
 
 module_init(ifb_init_module);
 module_exit(ifb_cleanup_module);
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Jamal Hadi Salim");
+MODULE_ALIAS_RTNL_LINK("ifb");
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC IPROUTE]: iplink: use netlink for link configuration

2007-06-05 Thread Patrick McHardy
The iproute patch for the rtnl_link API. For simple devices
that take no configuration, like dummy or ifb, no further
changes are needed.

Example:

Create dummy device:

# ip link add type dummy

Show device:

# ip -d link list dummy0

9: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
    link/ether ae:9a:0c:8e:f5:e1 brd ff:ff:ff:ff:ff:ff
    dummy

Delete device again:

# ip link delete dummy0

[IPROUTE]: iplink: use netlink for link configuration

Add support for using netlink for link configuration. Kernel support is
probed; when it is not available, ip falls back to using ioctls.
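
The fallback described above can be sketched in isolation. This is only an
illustration, not the code from the patch: link_op, link_configure and the
three stub functions are hypothetical names, and it assumes an unsupported
request is reported as -EOPNOTSUPP. The patch probes for support rather than
retrying on every request; the sketch folds both steps into one call to keep
it short.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: one way of applying a link change. */
typedef int (*link_op)(const char *ifname);

/*
 * Try the netlink path first. If it reports that the request is not
 * supported (assumed here to be -EOPNOTSUPP), fall back to the ioctl path.
 * *used_ioctl is optional and records which path handled the request.
 */
int link_configure(link_op via_netlink, link_op via_ioctl,
                   const char *ifname, int *used_ioctl)
{
	int err = via_netlink(ifname);

	if (used_ioctl)
		*used_ioctl = 0;
	if (err == -EOPNOTSUPP) {
		if (used_ioctl)
			*used_ioctl = 1;
		err = via_ioctl(ifname);
	}
	return err;
}

/* Stand-ins for a kernel without rtnl_link, a kernel with it, and ioctl. */
int netlink_unsupported(const char *ifname) { (void)ifname; return -EOPNOTSUPP; }
int netlink_ok(const char *ifname)          { (void)ifname; return 0; }
int ioctl_ok(const char *ifname)            { (void)ifname; return 0; }
```

Any error other than "not supported" is returned to the caller unchanged, so
a genuine netlink failure is not hidden by a silent ioctl retry.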

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit e59a7a02053c997a2b7ff9a4436bd3deb4781bf4
tree 0c0a45170d43c0b1bca2560851ccfb3f3ccbebaa
parent b16621cafd599499fdbaa79236266d72a53106bb
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:14:50 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:14:50 +0200

 include/linux/if_link.h |   13 ++
 ip/Makefile |2 
 ip/ip.c |5 +
 ip/ip_common.h  |   13 ++
 ip/ipaddress.c  |   35 +
 ip/iplink.c |  311 +++
 6 files changed, 376 insertions(+), 3 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 2920e8a..aac0df1 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -76,6 +76,8 @@ enum
 #define IFLA_WEIGHT IFLA_WEIGHT
IFLA_OPERSTATE,
IFLA_LINKMODE,
+   IFLA_LINKINFO,
+#define IFLA_LINKINFO IFLA_LINKINFO
__IFLA_MAX
 };
 
@@ -137,4 +139,15 @@ struct ifla_cacheinfo
__u32   retrans_time;
 };
 
+enum
+{
+   IFLA_INFO_UNSPEC,
+   IFLA_INFO_NAME,
+   IFLA_INFO_DATA,
+   IFLA_INFO_XSTATS,
+   __IFLA_INFO_MAX,
+};
+
+#define IFLA_INFO_MAX  (__IFLA_INFO_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/ip/Makefile b/ip/Makefile
index a749993..9a5bfe3 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -22,3 +22,5 @@ install: all
 clean:
rm -f $(ALLOBJ) $(TARGETS)
 
+LDLIBS += -ldl
+LDFLAGS+= -Wl,-export-dynamic
diff --git a/ip/ip.c b/ip/ip.c
index c084292..4bdb83b 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -30,6 +30,7 @@
 
 int preferred_family = AF_UNSPEC;
 int show_stats = 0;
+int show_details = 0;
 int resolve_hosts = 0;
 int oneline = 0;
 int timestamp = 0;
@@ -47,7 +48,7 @@ static void usage(void)
 "       ip [ -force ] [-batch filename\n"
 "where  OBJECT := { link | addr | route | rule | neigh | ntable | tunnel |\n"
 "                   maddr | mroute | monitor | xfrm }\n"
-"       OPTIONS := { -V[ersion] | -s[tatistics] | -r[esolve] |\n"
+"       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "                    -f[amily] { inet | inet6 | ipx | dnet | link } |\n"
 "                    -o[neline] | -t[imestamp] }\n");
exit(-1);
@@ -188,6 +189,8 @@ int main(int argc, char **argv)
        } else if (matches(opt, "-stats") == 0 ||
               matches(opt, "-statistics") == 0) {
            ++show_stats;
+       } else if (matches(opt, "-details") == 0) {
+           ++show_details;
        } else if (matches(opt, "-resolve") == 0) {
            ++resolve_hosts;
        } else if (matches(opt, "-oneline") == 0) {
diff --git a/ip/ip_common.h b/ip/ip_common.h
index 5bfd9b9..642c609 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -45,6 +45,19 @@ static inline int rtm_get_table(struct rtmsg *r, struct rtattr **tb)
 
 extern struct rtnl_handle rth;
 
+struct link_util
+{
+   struct link_util    *next;
+   const char          *id;
+   int                 maxattr;
+   int                 (*parse_opt)(struct link_util *, int, char **,
+                                    struct nlmsghdr *);
+   void                (*print_opt)(struct link_util *, FILE *,
+                                    struct rtattr *[]);
+};
+
+struct link_util *get_link_type(const char *type);
+
 #ifndef INFINITY_LIFE_TIME
 #define INFINITY_LIFE_TIME  0xFFFFFFFFU
 #endif
diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 98effa3..58254ea 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -134,6 +134,37 @@ void print_queuelen(char *name)
    printf("qlen %d", ifr.ifr_qlen);
 }
 
+static void print_linktype(FILE *fp, struct rtattr *tb)
+{
+   struct rtattr *linkinfo[IFLA_INFO_MAX+1];
+   struct link_util *lu;
+   char *type;
+
+   parse_rtattr_nested(linkinfo, IFLA_INFO_MAX, tb);
+
+   if (!linkinfo[IFLA_INFO_NAME])
+       return;
+   type = RTA_DATA(linkinfo[IFLA_INFO_NAME]);
+
+   fprintf(fp, "%s", _SL_);
+   fprintf(fp, "%s ", type);
+
+   lu = get_link_type(type);
+   if (!lu || !lu->print_opt)
+       return;
+
+   if (1) {
+       struct rtattr *attr[lu->maxattr+1], **data = NULL;
+
+       if (linkinfo[IFLA_INFO_DATA]) {
+

[RFC VLAN 00/08]: VLAN netlink support

2007-06-05 Thread Patrick McHardy
These patches convert VLAN to the rtnl_link API as a more complex example.
The patches need some more work, not all VLAN features are supported yet.

 include/linux/if_link.h   |   25 +
 include/linux/if_vlan.h   |8 
 include/linux/rtnetlink.h |3 
 net/8021q/Makefile|2 
 net/8021q/vlan.c  |  698 +-
 net/8021q/vlan.h  |   22 +
 net/8021q/vlan_dev.c  |  170 +++
 net/8021q/vlan_netlink.c  |  194 
 net/8021q/vlanproc.c  |4 
 9 files changed, 627 insertions(+), 499 deletions(-)

Patrick McHardy (8):
  [VLAN]: Move device lookup to ioctl handler
  [VLAN]: Remove unregister_vlan_dev wrapper
  [VLAN]: Add device init callback
  [VLAN]: Move vlan_group allocation to separate function
  [VLAN]: Split up device creation
  [VLAN]: Use 32 bit value for skb->priority mapping
  [VLAN]: Keep track of number of QoS mappings
  [VLAN]: Use rtnl_link API


[RFC VLAN 01/08]: Move device lookup to ioctl handler

2007-06-05 Thread Patrick McHardy
[VLAN]: Move device lookup to ioctl handler

Move the device lookup and checks to the ioctl handler under the RTNL and
change all name-based interfaces to take a struct net_device * instead.

This allows them to be used from a netlink interface, which identifies
devices by ifindex rather than by name. It also avoids races between the
ioctl interface and the (upcoming) netlink interface, since all changes now
happen under the RTNL.

As a nice side effect this greatly simplifies error handling in the helper
functions and fixes a number of incorrect error codes like -EINVAL for
device not found.
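
As a userspace illustration of the split described above (a sketch, not the
patch itself): the handler resolves the name once and the helper only ever
sees a device pointer. struct mock_dev and all mock_* functions are
hypothetical. Using -ENODEV for a missing device is an assumption about the
corrected error code; -EPERM for a non-VLAN device is what the old code
returned.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct net_device. */
struct mock_dev {
	const char *name;
	int is_vlan;     /* stands in for priv_flags & IFF_802_1Q_VLAN */
	int registered;
};

struct mock_dev mock_devices[] = {
	{ "eth0",   0, 1 },
	{ "eth0.5", 1, 1 },
};

/* Name lookup; in this scheme only the ioctl handler calls it. */
struct mock_dev *mock_dev_by_name(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(mock_devices) / sizeof(mock_devices[0]); i++)
		if (strcmp(mock_devices[i].name, name) == 0)
			return &mock_devices[i];
	return NULL;
}

/* Helper: takes the device itself, so it has no lookup errors to handle. */
int mock_unregister_vlan(struct mock_dev *dev)
{
	dev->registered = 0;
	return 0;
}

/* Handler: does the lookup and the checks, then calls the helper. */
int mock_ioctl_del_vlan(const char *name)
{
	struct mock_dev *dev = mock_dev_by_name(name);

	if (!dev)
		return -ENODEV;   /* assumed error code for "device not found" */
	if (!dev->is_vlan)
		return -EPERM;
	return mock_unregister_vlan(dev);
}
```

A netlink front end could call mock_unregister_vlan() directly after
resolving the device by ifindex, without duplicating the helper.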

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit b2981f2b9c99d414e9ea990cda7ca31c5bef2420
tree a50adf705ff34d1fd79dfc67f8f4305e40d686cb
parent 0047f79615df9f8340275f4088e0869398aaf52d
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:15 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:15 +0200

 net/8021q/vlan.c |  148 +-
 net/8021q/vlan.h |   13 +++-
 net/8021q/vlan_dev.c |  143 +---
 3 files changed, 109 insertions(+), 195 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index bd93c45..0d95388 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -280,43 +280,16 @@ static int unregister_vlan_dev(struct net_device *real_dev,
return ret;
 }
 
-static int unregister_vlan_device(const char *vlan_IF_name)
+static int unregister_vlan_device(struct net_device *dev)
 {
-   struct net_device *dev = NULL;
int ret;
 
+   ret = unregister_vlan_dev(VLAN_DEV_INFO(dev)->real_dev,
+                             VLAN_DEV_INFO(dev)->vlan_id);
+   unregister_netdevice(dev);
 
-   dev = dev_get_by_name(vlan_IF_name);
-   ret = -EINVAL;
-   if (dev) {
-       if (dev->priv_flags & IFF_802_1Q_VLAN) {
-           rtnl_lock();
-
-           ret = unregister_vlan_dev(VLAN_DEV_INFO(dev)->real_dev,
-                                     VLAN_DEV_INFO(dev)->vlan_id);
-
-           dev_put(dev);
-           unregister_netdevice(dev);
-
-           rtnl_unlock();
-
-           if (ret == 1)
-               ret = 0;
-       } else {
-           printk(VLAN_ERR
-                  "%s: ERROR:  Tried to remove a non-vlan device "
-                  "with VLAN code, name: %s  priv_flags: %hX\n",
-                  __FUNCTION__, dev->name, dev->priv_flags);
-           dev_put(dev);
-           ret = -EPERM;
-       }
-   } else {
-#ifdef VLAN_DEBUG
-       printk(VLAN_DBG "%s: WARNING: Could not find dev.\n", __FUNCTION__);
-#endif
-       ret = -EINVAL;
-   }
-
+   if (ret == 1)
+       ret = 0;
    return ret;
 }
 
@@ -380,12 +353,11 @@ static struct lock_class_key vlan_netdev_xmit_lock_key;
  *  Returns the device that was created, or NULL if there was
  *  an error of some kind.
  */
-static struct net_device *register_vlan_device(const char *eth_IF_name,
+static struct net_device *register_vlan_device(struct net_device *real_dev,
   unsigned short VLAN_ID)
 {
struct vlan_group *grp;
struct net_device *new_dev;
-   struct net_device *real_dev; /* the ethernet device */
char name[IFNAMSIZ];
int i;
 
@@ -397,15 +369,10 @@ static struct net_device *register_vlan_device(const char *eth_IF_name,
    if (VLAN_ID >= VLAN_VID_MASK)
        goto out_ret_null;
 
-   /* find the device relating to eth_IF_name. */
-   real_dev = dev_get_by_name(eth_IF_name);
-   if (!real_dev)
-       goto out_ret_null;
-
    if (real_dev->features & NETIF_F_VLAN_CHALLENGED) {
        printk(VLAN_DBG "%s: VLANs not supported on %s.\n",
            __FUNCTION__, real_dev->name);
-       goto out_put_dev;
+       goto out_ret_null;
    }
 
    if ((real_dev->features & NETIF_F_HW_VLAN_RX) &&
@@ -413,7 +380,7 @@ static struct net_device *register_vlan_device(const char *eth_IF_name,
         real_dev->vlan_rx_kill_vid == NULL)) {
        printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
            __FUNCTION__, real_dev->name);
-       goto out_put_dev;
+       goto out_ret_null;
    }
 
    if ((real_dev->features & NETIF_F_HW_VLAN_FILTER) &&
@@ -421,24 +388,19 @@ static struct net_device *register_vlan_device(const char *eth_IF_name,
         real_dev->vlan_rx_kill_vid == NULL)) {
        printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
            __FUNCTION__, real_dev->name);
-       goto out_put_dev;
+       goto out_ret_null;
    }
}
 
-   /* From this point on, all the data structures must remain

[RFC VLAN 02/08]: Remove unregister_vlan_dev wrapper

2007-06-05 Thread Patrick McHardy
[VLAN]: Remove unregister_vlan_dev wrapper

Remove the now useless unregister_vlan_dev wrapper by moving the final
unregister_netdev to unregister_vlan_dev itself. Also replace a few
impossible conditions by BUG_ON and don't try to handle them as valid.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 7286bd4222a1a9722b1d3b80988078b0af973a67
tree e9a21434dc634051d99513432b051b3e18c5e5c7
parent b2981f2b9c99d414e9ea990cda7ca31c5bef2420
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:16 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:16 +0200

 net/8021q/vlan.c |  140 +++---
 1 files changed, 48 insertions(+), 92 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 0d95388..87de961 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -51,7 +51,7 @@ static char vlan_buggyright[] = "David S. Miller [EMAIL PROTECTED]";
 
 static int vlan_device_event(struct notifier_block *, unsigned long, void *);
 static int vlan_ioctl_handler(void __user *);
-static int unregister_vlan_dev(struct net_device *, unsigned short );
+static int unregister_vlan_dev(struct net_device *);
 
 static struct notifier_block vlan_notifier_block = {
.notifier_call = vlan_device_event,
@@ -118,12 +118,8 @@ static void __exit vlan_cleanup_devices(void)
 
rtnl_lock();
for_each_netdev_safe(dev, nxt) {
-       if (dev->priv_flags & IFF_802_1Q_VLAN) {
-           unregister_vlan_dev(VLAN_DEV_INFO(dev)->real_dev,
-                               VLAN_DEV_INFO(dev)->vlan_id);
-
-           unregister_netdevice(dev);
-       }
+       if (dev->priv_flags & IFF_802_1Q_VLAN)
+           unregister_vlan_dev(dev);
}
rtnl_unlock();
 }
@@ -202,94 +198,61 @@ static void vlan_rcu_free(struct rcu_head *rcu)
vlan_group_free(container_of(rcu, struct vlan_group, rcu));
 }
 
-
-/* This returns 0 if everything went fine.
- * It will return 1 if the group was killed as a result.
- * A negative return indicates failure.
- *
- * The RTNL lock must be held.
+/* Destroy a VLAN device. Returns 1 if the device destroyed was the
+ * last one on the underlying device and the group was destroyed,
+ * 0 otherwise.
  */
-static int unregister_vlan_dev(struct net_device *real_dev,
-  unsigned short vlan_id)
+static int unregister_vlan_dev(struct net_device *dev)
 {
-   struct net_device *dev = NULL;
-   int real_dev_ifindex = real_dev->ifindex;
+   struct net_device *real_dev = VLAN_DEV_INFO(dev)->real_dev;
+   unsigned short vlan_id = VLAN_DEV_INFO(dev)->vlan_id;
    struct vlan_group *grp;
-   int i, ret;
+   int i, ret = 0;
 
 #ifdef VLAN_DEBUG
    printk(VLAN_DBG "%s: VID: %i\n", __FUNCTION__, vlan_id);
 #endif
+   ASSERT_RTNL();
 
-   /* sanity check */
-   if (vlan_id >= VLAN_VID_MASK)
-       return -EINVAL;
+   grp = __vlan_find_group(real_dev->ifindex);
 
-   ASSERT_RTNL();
-   grp = __vlan_find_group(real_dev_ifindex);
-
-   ret = 0;
-
-   if (grp) {
-       dev = vlan_group_get_device(grp, vlan_id);
-       if (dev) {
-           /* Remove proc entry */
-           vlan_proc_rem_dev(dev);
-
-           /* Take it out of our own structures, but be sure to
-            * interlock with HW accelerating devices or SW vlan
-            * input packet processing.
-            */
-           if (real_dev->features &
-               (NETIF_F_HW_VLAN_RX | NETIF_F_HW_VLAN_FILTER)) {
-               real_dev->vlan_rx_kill_vid(real_dev, vlan_id);
-           }
-
-           vlan_group_set_device(grp, vlan_id, NULL);
-           synchronize_net();
-
-
-           /* Caller unregisters (and if necessary, puts)
-            * VLAN device, but we get rid of the reference to
-            * real_dev here.
-            */
-           dev_put(real_dev);
-
-           /* If the group is now empty, kill off the
-            * group.
-            */
-           for (i = 0; i < VLAN_VID_MASK; i++)
-               if (vlan_group_get_device(grp, i))
-                   break;
-
-           if (i == VLAN_VID_MASK) {
-               if (real_dev->features & NETIF_F_HW_VLAN_RX)
-                   real_dev->vlan_rx_register(real_dev, NULL);
-
-               hlist_del_rcu(&grp->hlist);
-
-               /* Free the group, after all cpu's are done. */
-               call_rcu(&grp->rcu, vlan_rcu_free);
-
-               grp = NULL;
-   ret = 

[RFC VLAN 03/08]: Add device init callback

2007-06-05 Thread Patrick McHardy
[VLAN]: Add device init callback

Move device initialization to ->init callback to fix an incorrect iflink
value in the initial NEWLINK message. Additionally this makes sure the
lockdep class is properly set before the device will be used.
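
Why the callback fixes the NEWLINK contents can be shown with a small mock.
This is a sketch under the assumption that registration runs the ->init
callback before it announces the device; struct sketch_dev and both
functions are hypothetical stand-ins, not kernel code.

```c
#include <stddef.h>

/* Hypothetical reduced device: only the fields needed for the ordering. */
struct sketch_dev {
	int real_ifindex;                  /* ifindex of the underlying device */
	int iflink;                        /* filled in by the init callback */
	int (*init)(struct sketch_dev *);  /* optional, like net_device->init */
	int announced_iflink;              /* value the "NEWLINK" message carried */
};

/* Plays the role of vlan_dev_init(): derive fields from the real device. */
int sketch_vlan_init(struct sketch_dev *dev)
{
	dev->iflink = dev->real_ifindex;
	return 0;
}

/* Plays the role of registration: init first, announce afterwards. */
int sketch_register(struct sketch_dev *dev)
{
	if (dev->init) {
		int err = dev->init(dev);

		if (err)
			return err;
	}
	dev->announced_iflink = dev->iflink;  /* snapshot sent to listeners */
	return 0;
}
```

If iflink were assigned only after sketch_register() returned, listeners
would have seen the stale value, which is the bug the message describes.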

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit c771a6df20db3f21062bbbf56f09f2a64535cb71
tree 7068db9a14adb06df077b6634318db72d0eb6c7d
parent 7286bd4222a1a9722b1d3b80988078b0af973a67
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:16 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:16 +0200

 net/8021q/vlan.c |  103 +++---
 1 files changed, 52 insertions(+), 51 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 87de961..c745530 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -256,6 +256,53 @@ static int unregister_vlan_dev(struct net_device *dev)
return ret;
 }
 
+/*
+ * vlan network devices have devices nesting below it, and are a special
+ * super class of normal network devices; split their locks off into a
+ * separate class since they always nest.
+ */
+static struct lock_class_key vlan_netdev_xmit_lock_key;
+
+static int vlan_dev_init(struct net_device *dev)
+{
+   struct net_device *real_dev = VLAN_DEV_INFO(dev)->real_dev;
+
+   /* IFF_BROADCAST|IFF_MULTICAST; ??? */
+   dev->flags  = real_dev->flags & ~IFF_UP;
+   dev->iflink = real_dev->ifindex;
+   dev->state  = (real_dev->state & ((1<<__LINK_STATE_NOCARRIER) |
+                                     (1<<__LINK_STATE_DORMANT))) |
+                 (1<<__LINK_STATE_PRESENT);
+
+   /* need 4 bytes for extra VLAN header info,
+    * hope the underlying device can handle it.
+    */
+   dev->mtu = real_dev->mtu;
+
+   /* TODO: maybe just assign it to be ETHERNET? */
+   dev->type = real_dev->type;
+
+   memcpy(dev->broadcast, real_dev->broadcast, real_dev->addr_len);
+   memcpy(dev->dev_addr, real_dev->dev_addr, real_dev->addr_len);
+   dev->addr_len = real_dev->addr_len;
+
+   if (real_dev->features & NETIF_F_HW_VLAN_TX) {
+       dev->hard_header     = real_dev->hard_header;
+       dev->hard_header_len = real_dev->hard_header_len;
+       dev->hard_start_xmit = vlan_dev_hwaccel_hard_start_xmit;
+       dev->rebuild_header  = real_dev->rebuild_header;
+   } else {
+       dev->hard_header     = vlan_dev_hard_header;
+       dev->hard_header_len = real_dev->hard_header_len + VLAN_HLEN;
+       dev->hard_start_xmit = vlan_dev_hard_start_xmit;
+       dev->rebuild_header  = vlan_dev_rebuild_header;
+   }
+   dev->hard_header_parse = real_dev->hard_header_parse;
+
+   lockdep_set_class(&dev->_xmit_lock, &vlan_netdev_xmit_lock_key);
+   return 0;
+}
+
 static void vlan_setup(struct net_device *new_dev)
 {
SET_MODULE_OWNER(new_dev);
@@ -276,6 +323,7 @@ static void vlan_setup(struct net_device *new_dev)
 
    /* set up method calls */
    new_dev->change_mtu = vlan_dev_change_mtu;
+   new_dev->init = vlan_dev_init;
    new_dev->open = vlan_dev_open;
    new_dev->stop = vlan_dev_stop;
    new_dev->set_mac_address = vlan_dev_set_mac_address;
@@ -304,14 +352,6 @@ static void vlan_transfer_operstate(const struct net_device *dev, struct net_dev
}
 }
 
-/*
- * vlan network devices have devices nesting below it, and are a special
- * super class of normal network devices; split their locks off into a
- * separate class since they always nest.
- */
-static struct lock_class_key vlan_netdev_xmit_lock_key;
-
-
 /*  Attach a VLAN device to a mac address (ie Ethernet Card).
  *  Returns the device that was created, or NULL if there was
  *  an error of some kind.
@@ -405,52 +445,16 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
 #ifdef VLAN_DEBUG
    printk(VLAN_DBG "Allocated new name -:%s:-\n", new_dev->name);
 #endif
-   /* IFF_BROADCAST|IFF_MULTICAST; ??? */
-   new_dev->flags = real_dev->flags;
-   new_dev->flags &= ~IFF_UP;
-
-   new_dev->state = (real_dev->state & ((1<<__LINK_STATE_NOCARRIER) |
-                                        (1<<__LINK_STATE_DORMANT))) |
-                    (1<<__LINK_STATE_PRESENT);
-
-   /* need 4 bytes for extra VLAN header info,
-    * hope the underlying device can handle it.
-    */
-   new_dev->mtu = real_dev->mtu;
-
-   /* TODO: maybe just assign it to be ETHERNET? */
-   new_dev->type = real_dev->type;
-
-   new_dev->hard_header_len = real_dev->hard_header_len;
-   if (!(real_dev->features & NETIF_F_HW_VLAN_TX)) {
-       /* Regular ethernet + 4 bytes (18 total). */
-       new_dev->hard_header_len += VLAN_HLEN;
-   }
-
-   VLAN_MEM_DBG("new_dev->priv malloc, addr: %p  size: %i\n",
-                new_dev->priv,
-                sizeof(struct vlan_dev_info));
-
-   memcpy(new_dev->broadcast, real_dev->broadcast, 

[RFC VLAN 04/08]: Move vlan_group allocation to separate function

2007-06-05 Thread Patrick McHardy
[VLAN]: Move vlan_group allocation to separate function

Move group allocation to a separate function to clean up the code a bit
and allocate groups before registering the device. Device registration
is globally visible and causes netlink events, so we shouldn't fail
afterwards.
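
The ordering argument above, reduced to a userspace sketch: every step that
can fail runs before the device becomes visible, and a registration failure
only has to undo private state. All names here (sketch_group, sketch_netdev,
sketch_create) are hypothetical, the two fail_* parameters exist only to
simulate errors, and -EBUSY is an arbitrary stand-in for a registration
error.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_group {
	int ifindex;
};

struct sketch_netdev {
	int ifindex;
	int registered;            /* 1 once the device is "globally visible" */
	struct sketch_group *grp;
};

/*
 * Allocate first, publish last. After registered is set there is no
 * remaining step that can fail, so nothing visible ever has to be undone.
 */
int sketch_create(struct sketch_netdev *dev, int fail_alloc, int fail_register)
{
	struct sketch_group *grp;

	/* Step 1: private allocation; failure here is invisible to others. */
	grp = fail_alloc ? NULL : calloc(1, sizeof(*grp));
	if (!grp)
		return -ENOMEM;
	grp->ifindex = dev->ifindex;

	/* Step 2: registration may still fail; only private state is freed. */
	if (fail_register) {
		free(grp);
		return -EBUSY;
	}

	/* Step 3: publish. */
	dev->grp = grp;
	dev->registered = 1;
	return 0;
}
```

With the old order (register, then allocate), an allocation failure would
force an unregister of a device that listeners had already been told about.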

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 1057887432918a89e5b374c29ac36224716cb0e4
tree 280fcd63f25b1c19053285d5b0ebf983a95e08a9
parent c771a6df20db3f21062bbbf56f09f2a64535cb71
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:16 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 15:48:16 +0200

 net/8021q/vlan.c |   78 --
 1 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index c745530..d470913 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -193,6 +193,34 @@ static void vlan_group_free(struct vlan_group *grp)
kfree(grp);
 }
 
+static struct vlan_group *vlan_group_alloc(int ifindex)
+{
+   struct vlan_group *group;
+   unsigned int size;
+   unsigned int i;
+
+   group = kzalloc(sizeof(struct vlan_group), GFP_KERNEL);
+   if (!group)
+   return NULL;
+
+   size = sizeof(struct net_device *) * VLAN_GROUP_ARRAY_PART_LEN;
+
+   for (i = 0; i < VLAN_GROUP_ARRAY_SPLIT_PARTS; i++) {
+       group->vlan_devices_arrays[i] = kzalloc(size, GFP_KERNEL);
+       if (!group->vlan_devices_arrays[i])
+           goto err;
+   }
+
+   group->real_dev_ifindex = ifindex;
+   hlist_add_head_rcu(&group->hlist,
+                      &vlan_group_hash[vlan_grp_hashfn(ifindex)]);
+   return group;
+
+err:
+   vlan_group_free(group);
+   return NULL;
+}
+
 static void vlan_rcu_free(struct rcu_head *rcu)
 {
vlan_group_free(container_of(rcu, struct vlan_group, rcu));
@@ -359,10 +387,9 @@ static void vlan_transfer_operstate(const struct net_device *dev, struct net_dev
 static struct net_device *register_vlan_device(struct net_device *real_dev,
   unsigned short VLAN_ID)
 {
-   struct vlan_group *grp;
+   struct vlan_group *grp, *reg = NULL;
struct net_device *new_dev;
char name[IFNAMSIZ];
-   int i;
 
 #ifdef VLAN_DEBUG
    printk(VLAN_DBG "%s: if_name -:%s:- vid: %i\n",
@@ -459,9 +486,15 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
    printk(VLAN_DBG "About to go find the group for idx: %i\n",
           real_dev->ifindex);
 #endif
+   grp = __vlan_find_group(real_dev->ifindex);
+   if (!grp) {
+       reg = grp = vlan_group_alloc(real_dev->ifindex);
+       if (!grp)
+           goto out_free_newdev;
+   }
 
    if (register_netdevice(new_dev))
-       goto out_free_newdev;
+       goto out_free_group;
 
vlan_transfer_operstate(real_dev, new_dev);
linkwatch_fire_event(new_dev); /* _MUST_ call rfc2863_policy() */
@@ -469,34 +502,8 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
    /* So, got the sucker initialized, now lets place
     * it into our local structure.
     */
-   grp = __vlan_find_group(real_dev->ifindex);
-
-   /* Note, we are running under the RTNL semaphore
-    * so it cannot appear on us.
-    */
-   if (!grp) { /* need to add a new group */
-       grp = kzalloc(sizeof(struct vlan_group), GFP_KERNEL);
-       if (!grp)
-           goto out_free_unregister;
-
-       for (i=0; i < VLAN_GROUP_ARRAY_SPLIT_PARTS; i++) {
-           grp->vlan_devices_arrays[i] = kzalloc(
-               sizeof(struct net_device *)*VLAN_GROUP_ARRAY_PART_LEN,
-               GFP_KERNEL);
-
-           if (!grp->vlan_devices_arrays[i])
-               goto out_free_arrays;
-       }
-
-       /* printk(KERN_ALERT "VLAN REGISTER:  Allocated new group.\n"); */
-       grp->real_dev_ifindex = real_dev->ifindex;
-
-       hlist_add_head_rcu(&grp->hlist,
-                          &vlan_group_hash[vlan_grp_hashfn(real_dev->ifindex)]);
-
-       if (real_dev->features & NETIF_F_HW_VLAN_RX)
-           real_dev->vlan_rx_register(real_dev, grp);
-   }
+   if (reg && real_dev->features & NETIF_F_HW_VLAN_RX)
+       real_dev->vlan_rx_register(real_dev, reg);
 
    vlan_group_set_device(grp, VLAN_ID, new_dev);
 
@@ -514,12 +521,9 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
 #endif
return new_dev;
 
-out_free_arrays:
-   vlan_group_free(grp);
-
-out_free_unregister:
-   unregister_netdev(new_dev);
-   goto out_ret_null;
+out_free_group:
+   if (reg)
+   vlan_group_free(reg);
 
 out_free_newdev:

[RFC VLAN 05/08]: Split up device creation

2007-06-05 Thread Patrick McHardy
[VLAN]: Split up device creation

Split up device creation in allocation/registration functions taking
struct net_device * and a small wrapper for the ioctl handling. This is
needed by the netlink interface to properly set up the device before
registration.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit c59e99d32d8085f8ed1aa794106c28e546220581
tree dca7eeb9815666e863e5739a8c0a4698333026bc
parent 1057887432918a89e5b374c29ac36224716cb0e4
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:52:41 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:52:41 +0200

 net/8021q/vlan.c |  196 ++
 net/8021q/vlan.h |4 +
 2 files changed, 100 insertions(+), 100 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index d470913..e5405cf 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -380,64 +380,110 @@ static void vlan_transfer_operstate(const struct net_device *dev, struct net_dev
}
 }
 
-/*  Attach a VLAN device to a mac address (ie Ethernet Card).
- *  Returns the device that was created, or NULL if there was
- *  an error of some kind.
- */
-static struct net_device *register_vlan_device(struct net_device *real_dev,
-  unsigned short VLAN_ID)
+int vlan_check_device(struct net_device *dev, unsigned short vlan_id)
 {
-   struct vlan_group *grp, *reg = NULL;
-   struct net_device *new_dev;
-   char name[IFNAMSIZ];
-
-#ifdef VLAN_DEBUG
-   printk(VLAN_DBG "%s: if_name -:%s:- vid: %i\n",
-       __FUNCTION__, eth_IF_name, VLAN_ID);
-#endif
-
-   if (VLAN_ID >= VLAN_VID_MASK)
-       goto out_ret_null;
-
-   if (real_dev->features & NETIF_F_VLAN_CHALLENGED) {
+   if (dev->features & NETIF_F_VLAN_CHALLENGED) {
        printk(VLAN_DBG "%s: VLANs not supported on %s.\n",
-           __FUNCTION__, real_dev->name);
-       goto out_ret_null;
+           __FUNCTION__, dev->name);
+       return -EOPNOTSUPP;
    }
 
-   if ((real_dev->features & NETIF_F_HW_VLAN_RX) &&
-       (real_dev->vlan_rx_register == NULL ||
-        real_dev->vlan_rx_kill_vid == NULL)) {
+   if ((dev->features & NETIF_F_HW_VLAN_RX) &&
+       (dev->vlan_rx_register == NULL || dev->vlan_rx_kill_vid == NULL)) {
        printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
-           __FUNCTION__, real_dev->name);
-       goto out_ret_null;
+           __FUNCTION__, dev->name);
+       return -EOPNOTSUPP;
    }
 
-   if ((real_dev->features & NETIF_F_HW_VLAN_FILTER) &&
-       (real_dev->vlan_rx_add_vid == NULL ||
-        real_dev->vlan_rx_kill_vid == NULL)) {
+   if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
+       (dev->vlan_rx_add_vid == NULL || dev->vlan_rx_kill_vid == NULL)) {
        printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
-           __FUNCTION__, real_dev->name);
-       goto out_ret_null;
+           __FUNCTION__, dev->name);
+       return -EOPNOTSUPP;
    }
 
    /* The real device must be up and operating in order to
-    * assosciate a VLAN device with it.
+    * associate a VLAN device with it.
     */
-   if (!(real_dev->flags & IFF_UP))
-       goto out_ret_null;
+   if (!(dev->flags & IFF_UP))
+       return -ENETDOWN;
 
-   if (__find_vlan_dev(real_dev, VLAN_ID) != NULL) {
-       /* was already registered. */
-       printk(VLAN_DBG "%s: ALREADY had VLAN registered\n", __FUNCTION__);
-       goto out_ret_null;
+   if (__find_vlan_dev(dev, vlan_id) != NULL)
+       return -EEXIST;
+
+   return 0;
+}
+
+int vlan_register_dev(struct net_device *dev)
+{
+   struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
+   struct net_device *real_dev = vlan->real_dev;
+   struct vlan_group *group, *reg = NULL;
+   int err;
+
+   group = __vlan_find_group(real_dev->ifindex);
+   if (!group) {
+       err = -ENOMEM;
+       group = reg = vlan_group_alloc(real_dev->ifindex);
+       if (!group)
+           goto err1;
    }
 
-   /* Gotta set up the fields for the device. */
+   err = register_netdevice(dev);
+   if (err < 0)
+       goto err2;
+
+   /* Account for reference in struct vlan_dev_info */
+   dev_hold(real_dev);
+
+   vlan_transfer_operstate(real_dev, dev);
+   linkwatch_fire_event(dev); /* _MUST_ call rfc2863_policy() */
+
+   /* So, got the sucker initialized, now lets place
+    * it into our local structure.
+    */
+   vlan_group_set_device(group, vlan->vlan_id, dev);
+
+   if (reg && real_dev->features & NETIF_F_HW_VLAN_RX)
+       real_dev->vlan_rx_register(real_dev, reg);
+   if (real_dev->features & NETIF_F_HW_VLAN_FILTER)
+   

[RFC VLAN 06/08]: Use 32 bit value for skb->priority mapping

2007-06-05 Thread Patrick McHardy
[VLAN]: Use 32 bit value for skb->priority mapping

skb->priority has only 32 bits and even VLAN uses 32 bit values in its
API.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 24b7a56049e5e9f63e6fc5933d10b47c15294fea
tree 7c1d37c5c3b9f09c3c7d726d6fa922454d57bf16
parent c59e99d32d8085f8ed1aa794106c28e546220581
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:52:48 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:52:48 +0200

 include/linux/if_vlan.h |4 ++--
 net/8021q/vlanproc.c|4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 81e9bc9..1889b29 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -99,7 +99,7 @@ static inline void vlan_group_set_device(struct vlan_group *vg, int vlan_id,
 }
 
 struct vlan_priority_tci_mapping {
-   unsigned long priority;
+   unsigned int priority;
    unsigned short vlan_qos; /* This should be shifted when first set, so we only do it
                              * at provisioning time.
                              * ((skb->priority << 13) & 0xE000)
@@ -112,7 +112,7 @@ struct vlan_dev_info {
    /** This will be the mapping that correlates skb->priority to
     * 3 bits of VLAN QOS tags...
     */
-   unsigned long ingress_priority_map[8];
+   unsigned int ingress_priority_map[8];
    struct vlan_priority_tci_mapping *egress_priority_map[16]; /* hash table */
 
    unsigned short vlan_id;        /*  The VLAN Identifier for this interface. */
diff --git a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c
index d216a64..8693b21 100644
--- a/net/8021q/vlanproc.c
+++ b/net/8021q/vlanproc.c
@@ -342,7 +342,7 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
    seq_printf(seq, "Device: %s", dev_info->real_dev->name);
    /* now show all PRIORITY mappings relating to this VLAN */
    seq_printf(seq,
-          "\nINGRESS priority mappings: 0:%lu  1:%lu  2:%lu  3:%lu  4:%lu  5:%lu  6:%lu 7:%lu\n",
+          "\nINGRESS priority mappings: 0:%u  1:%u  2:%u  3:%u  4:%u  5:%u  6:%u 7:%u\n",
           dev_info->ingress_priority_map[0],
           dev_info->ingress_priority_map[1],
           dev_info->ingress_priority_map[2],
@@ -357,7 +357,7 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
            const struct vlan_priority_tci_mapping *mp
                = dev_info->egress_priority_map[i];
            while (mp) {
-               seq_printf(seq, "%lu:%hu ",
+               seq_printf(seq, "%u:%hu ",
                       mp->priority, ((mp->vlan_qos >> 13) & 0x7));
                mp = mp->next;
            }
}


[RFC VLAN 07/08]: Keep track of number of QoS mappings

2007-06-05 Thread Patrick McHardy
[VLAN]: Keep track of number of QoS mappings

Keep track of the number of configured ingress/egress QoS mappings to
avoid iteration while calculating the netlink attribute size.
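
The bookkeeping can be tried outside the kernel. This is a minimal sketch
that mirrors the ingress half of the patch; struct ingress_map and
ingress_map_set are hypothetical names, and a zero skb priority is treated
as "no mapping", as the patch does.

```c
#include <stdint.h>

/* Hypothetical stand-in for the ingress fields of struct vlan_dev_info. */
struct ingress_map {
	unsigned int nr_mappings;  /* number of non-zero entries in prio[] */
	uint32_t prio[8];          /* VLAN priority (0..7) -> skb priority */
};

/*
 * Store one mapping and keep nr_mappings in sync:
 *   non-zero -> zero     : one mapping fewer
 *   zero     -> non-zero : one mapping more
 *   anything else        : count unchanged
 */
void ingress_map_set(struct ingress_map *m, unsigned int vlan_prio,
                     uint32_t skb_prio)
{
	uint32_t *slot = &m->prio[vlan_prio & 0x7];

	if (*slot && !skb_prio)
		m->nr_mappings--;
	else if (!*slot && skb_prio)
		m->nr_mappings++;
	*slot = skb_prio;
}
```

Code that sizes the netlink message can then read nr_mappings directly
instead of walking the array (or, for egress, the hash chains) on every dump.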

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit fba4579d0d3c365f1ff5473e905dac68c5ec62f3
tree 892fa4a95ab163960573ee6bf1430279a1d697b8
parent 24b7a56049e5e9f63e6fc5933d10b47c15294fea
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:52:48 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:52:48 +0200

 include/linux/if_vlan.h |4 
 net/8021q/vlan_dev.c|   27 +--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 1889b29..ba48340 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -112,9 +112,13 @@ struct vlan_dev_info {
    /** This will be the mapping that correlates skb->priority to
     * 3 bits of VLAN QOS tags...
     */
+   unsigned int nr_ingress_mappings;
    unsigned int ingress_priority_map[8];
+
+   unsigned int nr_egress_mappings;
    struct vlan_priority_tci_mapping *egress_priority_map[16]; /* hash table */
 
+
    unsigned short vlan_id;        /*  The VLAN Identifier for this interface. */
    unsigned short flags;          /* (1 << 0) re_order_header   This option will cause the
                                    *   VLAN code to move around the ethernet header on
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 0b7e03e..1644626 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -537,35 +537,50 @@ int vlan_dev_change_mtu(struct net_device *dev, int new_mtu)
 void vlan_dev_set_ingress_priority(const struct net_device *dev,
                                    u32 skb_prio, short vlan_prio)
 {
-   VLAN_DEV_INFO(dev)->ingress_priority_map[vlan_prio & 0x7] = skb_prio;
+   struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
+
+   if (vlan->ingress_priority_map[vlan_prio & 0x7] && !skb_prio)
+       vlan->nr_ingress_mappings--;
+   else if (!vlan->ingress_priority_map[vlan_prio & 0x7] && skb_prio)
+       vlan->nr_ingress_mappings++;
+
+   vlan->ingress_priority_map[vlan_prio & 0x7] = skb_prio;
 }
 
 int vlan_dev_set_egress_priority(const struct net_device *dev,
                                  u32 skb_prio, short vlan_prio)
 {
+   struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
    struct vlan_priority_tci_mapping *mp = NULL;
    struct vlan_priority_tci_mapping *np;
+   u32 vlan_qos = (vlan_prio << 13) & 0xE000;
 
    /* See if a priority mapping exists.. */
-   mp = VLAN_DEV_INFO(dev)->egress_priority_map[skb_prio & 0xF];
+   mp = vlan->egress_priority_map[skb_prio & 0xF];
    while (mp) {
        if (mp->priority == skb_prio) {
-           mp->vlan_qos = ((vlan_prio << 13) & 0xE000);
+           if (mp->vlan_qos && !vlan_qos)
+               vlan->nr_egress_mappings--;
+           else if (!mp->vlan_qos && vlan_qos)
+               vlan->nr_egress_mappings++;
+           mp->vlan_qos = vlan_qos;
            return 0;
        }
        mp = mp->next;
    }
 
    /* Create a new mapping then. */
-   mp = VLAN_DEV_INFO(dev)->egress_priority_map[skb_prio & 0xF];
+   mp = vlan->egress_priority_map[skb_prio & 0xF];
    np = kmalloc(sizeof(struct vlan_priority_tci_mapping), GFP_KERNEL);
    if (!np)
        return -ENOBUFS;
 
    np->next = mp;
    np->priority = skb_prio;
-   np->vlan_qos = ((vlan_prio << 13) & 0xE000);
-   VLAN_DEV_INFO(dev)->egress_priority_map[skb_prio & 0xF] = np;
+   np->vlan_qos = vlan_qos;
+   vlan->egress_priority_map[skb_prio & 0xF] = np;
+   if (vlan_qos)
+       vlan->nr_egress_mappings++;
    return 0;
 }
 


[RFC VLAN 08/08]: Use rtnl_link API

2007-06-05 Thread Patrick McHardy
[VLAN]: Use rtnl_link API

---
commit 6e11b791711627a87abfe1bd8c21a4f4d369bca6
tree 5fbe3c1ff51dd87c80748b2793878c076ed57ed7
parent fba4579d0d3c365f1ff5473e905dac68c5ec62f3
author Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:53:18 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 29 May 2007 17:53:18 +0200

 include/linux/if_link.h   |   25 ++
 include/linux/rtnetlink.h |3 -
 net/8021q/Makefile|2 
 net/8021q/vlan.c  |   33 +---
 net/8021q/vlan.h  |5 +
 net/8021q/vlan_netlink.c  |  194 +
 6 files changed, 248 insertions(+), 14 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 936bd1b..b77e5a8 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -154,4 +154,29 @@ enum
 
 #define IFLA_INFO_MAX  (__IFLA_INFO_MAX - 1)
 
+enum
+{
+   IFLA_VLAN_UNSPEC,
+   IFLA_VLAN_ID,
+   IFLA_VLAN_EGRESS_QOS,
+   IFLA_VLAN_INGRESS_QOS,
+   __IFLA_VLAN_MAX,
+};
+
+#define IFLA_VLAN_MAX  (__IFLA_VLAN_MAX - 1)
+
+struct ifla_vlan_qos_mapping
+{
+   __u32 from;
+   __u32 to;
+};
+
+enum {
+   IFLA_VLAN_QOS_UNSPEC,
+   IFLA_VLAN_QOS_MAPPING,
+   __IFLA_VLAN_QOS_MAX
+};
+
+#define IFLA_VLAN_QOS_MAX  (__IFLA_VLAN_QOS_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 6f228c7..dcc0202 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -409,7 +409,8 @@ struct rtgenmsg
 
 #define RTNL_LF_DUMMY      128
 #define RTNL_LF_IFB        129
-#define RTNL_LF_MAX        129
+#define RTNL_LF_VLAN       130
+#define RTNL_LF_MAX        130
 
 #define RTNL_LF_OFFSET     (RTNL_LF_COMPAT_END - RTNL_LF_COMPAT_MAX)
 #define RTNL_LF_NUM        (NPROTO + RTNL_LF_MAX - RTNL_LF_COMPAT_END)
diff --git a/net/8021q/Makefile b/net/8021q/Makefile
index 97feb44..10ca7f4 100644
--- a/net/8021q/Makefile
+++ b/net/8021q/Makefile
@@ -4,7 +4,7 @@
 
 obj-$(CONFIG_VLAN_8021Q) += 8021q.o
 
-8021q-objs := vlan.o vlan_dev.o
+8021q-objs := vlan.o vlan_dev.o vlan_netlink.o
 
 ifeq ($(CONFIG_PROC_FS),y)
 8021q-objs += vlanproc.o
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index e5405cf..475ca57 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -51,7 +51,6 @@ static char vlan_buggyright[] = David S. Miller [EMAIL 
PROTECTED];
 
 static int vlan_device_event(struct notifier_block *, unsigned long, void *);
 static int vlan_ioctl_handler(void __user *);
-static int unregister_vlan_dev(struct net_device *);
 
 static struct notifier_block vlan_notifier_block = {
.notifier_call = vlan_device_event,
@@ -97,15 +96,22 @@ static int __init vlan_proto_init(void)
 
    /* Register us to receive netdevice events */
    err = register_netdevice_notifier(&vlan_notifier_block);
-   if (err < 0) {
-       dev_remove_pack(&vlan_packet_type);
-       vlan_proc_cleanup();
-       return err;
-   }
+   if (err < 0)
+       goto err1;
 
-   vlan_ioctl_set(vlan_ioctl_handler);
+   err = vlan_netlink_init();
+   if (err < 0)
+       goto err2;
 
+   vlan_ioctl_set(vlan_ioctl_handler);
    return 0;
+
+err2:
+   unregister_netdevice_notifier(&vlan_notifier_block);
+err1:
+   vlan_proc_cleanup();
+   dev_remove_pack(&vlan_packet_type);
+   return err;
 }
 
 /* Cleanup all vlan devices
@@ -119,7 +125,7 @@ static void __exit vlan_cleanup_devices(void)
    rtnl_lock();
    for_each_netdev_safe(dev, nxt) {
        if (dev->priv_flags & IFF_802_1Q_VLAN)
-           unregister_vlan_dev(dev);
+           vlan_unregister_dev(dev);
    }
    rtnl_unlock();
 }
@@ -132,6 +138,7 @@ static void __exit vlan_cleanup_module(void)
 {
int i;
 
+   vlan_netlink_fini();
vlan_ioctl_set(NULL);
 
/* Un-register us from receiving netdevice events */
@@ -230,7 +237,7 @@ static void vlan_rcu_free(struct rcu_head *rcu)
  * last one on the underlying device and the group was destroyed,
  * 0 otherwise.
  */
-static int unregister_vlan_dev(struct net_device *dev)
+int vlan_unregister_dev(struct net_device *dev)
 {
    struct net_device *real_dev = VLAN_DEV_INFO(dev)->real_dev;
    unsigned short vlan_id = VLAN_DEV_INFO(dev)->vlan_id;
@@ -331,7 +338,7 @@ static int vlan_dev_init(struct net_device *dev)
    return 0;
 }
 
-static void vlan_setup(struct net_device *new_dev)
+void vlan_setup(struct net_device *new_dev)
 {
    SET_MODULE_OWNER(new_dev);
 
@@ -358,6 +365,8 @@ static void vlan_setup(struct net_device *new_dev)
    new_dev->set_multicast_list = vlan_dev_set_multicast_list;
    new_dev->destructor = free_netdev;
    new_dev->do_ioctl = vlan_dev_ioctl;
+
+   new_dev->link_family = RTNL_LF_VLAN;
 }
 
 static void vlan_transfer_operstate(const struct net_device *dev, struct net_device *vlandev)
@@ -595,7 +604,7 

[RFC IPROUTE]: VLAN support

2007-06-05 Thread Patrick McHardy
This patch contains the iproute changes needed for VLAN configuration.

Examples:

Add VLAN device:

# ip link add dev eth0.1000 link eth0 type vlan \
id 1000 \
egress-qos-map 1:2 2:3 \
ingress-qos-map 2:1 3:2

Show new device:

# ip link list eth0.1000

9: [EMAIL PROTECTED]: BROADCAST,MULTICAST mtu 1500 qdisc noop
link/ether 8e:bb:6f:05:29:e9 brd ff:ff:ff:ff:ff:ff
vlan id 1000
  ingress-qos-map { 2:1 3:2 }
  egress-qos-map { 1:2 2:3 }

Delete device:

# ip link delete eth0.1000
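
Each QOS-MAP argument in the examples above is a list of FROM:TO pairs. As
a rough illustration of how one such token can be split and validated, here
is a self-contained sketch. It is not the code from the patch: struct
qos_mapping, parse_u32() and parse_qos_mapping() are hypothetical names, it
accepts decimal numbers only, and it works on a copy of the token rather
than modifying the argument in place.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space mirror of struct ifla_vlan_qos_mapping. */
struct qos_mapping {
    uint32_t from;
    uint32_t to;
};

/* Parse a decimal u32; returns 0 on success, -1 on any malformed input. */
static int parse_u32(const char *s, uint32_t *out)
{
    char *end;
    unsigned long v;

    if (*s < '0' || *s > '9')
        return -1;              /* reject empty strings, signs, spaces */
    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno || *end != '\0' || v > UINT32_MAX)
        return -1;
    *out = (uint32_t)v;
    return 0;
}

/* Parse one "FROM:TO" token without modifying the caller's string. */
static int parse_qos_mapping(const char *token, struct qos_mapping *m)
{
    char buf[32];
    char *colon;

    assert(token != NULL && m != NULL);
    if (strlen(token) >= sizeof(buf))
        return -1;
    strcpy(buf, token);

    colon = strchr(buf, ':');
    if (!colon)
        return -1;
    *colon = '\0';              /* split into "FROM" and "TO" */

    if (parse_u32(buf, &m->from) || parse_u32(colon + 1, &m->to))
        return -1;
    return 0;
}
```

A caller would loop over the remaining arguments and stop at the first token
without a colon, so that the next keyword (for example ingress-qos-map) is
left for the option parser.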

[IPROUTE]: VLAN support

---
commit a00ac9fcbd7b84f442c5290594afa8572cf6fd8b
tree b60f6e564a83d40f8173c8dda62d9d46e51cb655
parent e59a7a02053c997a2b7ff9a4436bd3deb4781bf4
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:20:34 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:20:34 +0200

 include/linux/if_link.h |   25 
 ip/Makefile |2 +
 ip/iplink_vlan.c|  140 +++
 3 files changed, 167 insertions(+), 0 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index aac0df1..4ab8a07 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -150,4 +150,29 @@ enum
 
 #define IFLA_INFO_MAX  (__IFLA_INFO_MAX - 1)
 
+enum
+{
+   IFLA_VLAN_UNSPEC,
+   IFLA_VLAN_ID,
+   IFLA_VLAN_EGRESS_QOS,
+   IFLA_VLAN_INGRESS_QOS,
+   __IFLA_VLAN_MAX,
+};
+
+#define IFLA_VLAN_MAX  (__IFLA_VLAN_MAX - 1)
+
+struct ifla_vlan_qos_mapping
+{
+   __u32 from;
+   __u32 to;
+};
+
+enum {
+   IFLA_VLAN_QOS_UNSPEC,
+   IFLA_VLAN_QOS_MAPPING,
+   __IFLA_VLAN_QOS_MAX
+};
+
+#define IFLA_VLAN_QOS_MAX  (__IFLA_VLAN_QOS_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/ip/Makefile b/ip/Makefile
index 9a5bfe3..b6d8693 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -3,6 +3,8 @@ IPOBJ=ip.o ipaddress.o iproute.o iprule.o \
 ipmaddr.o ipmonitor.o ipmroute.o ipprefix.o \
 ipxfrm.o xfrm_state.o xfrm_policy.o xfrm_monitor.o
 
+IPOBJ += iplink_vlan.o
+
 RTMONOBJ=rtmon.o
 
 ALLOBJ=$(IPOBJ) $(RTMONOBJ)
diff --git a/ip/iplink_vlan.c b/ip/iplink_vlan.c
new file mode 100644
index 000..00fce6c
--- /dev/null
+++ b/ip/iplink_vlan.c
@@ -0,0 +1,140 @@
+/*
+ * iplink_vlan.c   VLAN device support
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ *
+ * Authors: Patrick McHardy [EMAIL PROTECTED]
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "rt_names.h"
+#include "utils.h"
+#include "ip_common.h"
+
+static void explain(void)
+{
+   fprintf(stderr,
+       "Usage: ... vlan id VLANID [ ingress-qos-map QOS-MAP ] [ egress-qos-map QOS-MAP ]\n"
+       "\n"
+       "VLANID := 0-4095\n"
+       "QOS-MAP := [ QOS-MAP ] QOS-MAPPING\n"
+       "QOS-MAPPING := FROM:TO\n"
+   );
+}
+
+static int vlan_parse_qos_map(int *argcp, char ***argvp, struct nlmsghdr *n,
+                             int attrtype)
+{
+   int argc = *argcp;
+   char **argv = *argvp;
+   struct ifla_vlan_qos_mapping m;
+   struct rtattr *tail;
+
+   tail = NLMSG_TAIL(n);
+   addattr_l(n, 1024, attrtype, NULL, 0);
+
+   while (argc > 0) {
+       char *colon = strchr(*argv, ':');
+
+       if (!colon)
+           break;
+       *colon = '\0';
+
+       if (get_u32(&m.from, *argv, 0))
+           return 1;
+       if (get_u32(&m.to, colon + 1, 0))
+           return 1;
+       argc--, argv++;
+
+       addattr_l(n, 1024, IFLA_VLAN_QOS_MAPPING, &m, sizeof(m));
+   }
+
+   tail->rta_len = (void *) NLMSG_TAIL(n) - (void *)tail;
+
+   *argcp = argc;
+   *argvp = argv;
+   return 0;
+}
+
+static int vlan_parse_opt(struct link_util *lu, int argc, char **argv,
+                         struct nlmsghdr *n)
+{
+   __u16 id;
+
+   while (argc > 0) {
+       if (matches(*argv, "id") == 0) {
+           NEXT_ARG();
+           if (get_u16(&id, *argv, 0))
+               invarg("id is invalid", *argv);
+           addattr_l(n, 1024, IFLA_VLAN_ID, &id, 2);
+       } else if (matches(*argv, "ingress-qos-map") == 0) {
+           NEXT_ARG();
+           if (vlan_parse_qos_map(&argc, &argv, n,
+                                  IFLA_VLAN_INGRESS_QOS))
+               invarg("invalid ingress-qos-map", *argv);
+           continue;
+       } else if (matches(*argv, "egress-qos-map") == 0) {
+           NEXT_ARG();
+           if (vlan_parse_qos_map(&argc, &argv, n,
+ 

Re: [PATCH][RFC] network splice receive

2007-06-05 Thread Evgeniy Polyakov
On Tue, Jun 05, 2007 at 10:05:43AM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote:
> Here's an implementation of tcp network splice receive support. It's
> originally based on the patch set that Intel posted some time ago, but
> has been (close to) 100% reworked.
> 
> Now, I'm not a networking guru by any stretch of the imagination, so I'd
> like some input on the direction of the main patch. Is the approach
> feasible? Glaring errors? Missing bits?

  [  263.709262] ------------[ cut here ]------------
  [  263.713932] kernel BUG at include/linux/mm.h:285!
  [  263.718678] invalid opcode:  [1] PREEMPT SMP 
  [  263.723561] CPU 0 
  [  263.725665] Modules linked in: button loop snd_intel8x0
  snd_ac97_codec ac97_bus snd_pcm snd_timer snd soundcore psmouse
  snd_page_alloc k8temp i2c_nforcen
  [  263.755666] Pid: 2709, comm: splice-fromnet Not tainted
  2.6.22-rc4-splice #2
  [  263.762759] RIP: 0010:[8038c60c]  [8038c60c]
  skb_splice_bits+0xac/0x1c9
  [  263.771212] RSP: 0018:81003c79fc88  EFLAGS: 00010246
  [  263.776564] RAX:  RBX: 05a8 RCX:
  81003ff04778
  [  263.783743] RDX: 81003ff04778 RSI: 0ab2 RDI:
  0003d52d
  [  263.790925] RBP: 81003c79fdd8 R08:  R09:
  81003d927b78
  [  263.798104] R10: 803b0181 R11: 81003c79fde8 R12:
  81003d52d000
  [  263.805284] R13: 054e R14: 81003d927b78 R15:
  81003bbc6ea0
  [  263.812463] FS:  2ac4089a86d0() GS:804fb000()
  knlGS:
  [  263.820611] CS:  0010 DS:  ES:  CR0: 8005003b
  [  263.826396] CR2: 2ac4088320e0 CR3: 3c987000 CR4:
  06e0
  [  263.833578] Process splice-fromnet (pid: 2709, threadinfo
  81003c79e000, task 81003755c380)
  [  263.842591] Stack:  81003ff04720 
  81003755c380 0046
  [  263.850897]  00c6 0046 81003b0428b8
  81003d0b5b10
  [  263.858543]  00c6 81003d0b5b10 81003b0428b8
  81003d0b5b10
  [  263.865957] Call Trace:
  [  263.868683]  [803dc630] _read_unlock_irq+0x31/0x4e
  [  263.874393]  [803afb54] tcp_splice_data_recv+0x20/0x22
  [  263.880447]  [803afa2b] tcp_read_sock+0xa2/0x1ab
  [  263.885983]  [803afb34] tcp_splice_data_recv+0x0/0x22
  [  263.891951]  [803b01c1] tcp_splice_read+0xae/0x1a3
  [  263.897655]  [8038920f] sock_def_readable+0x0/0x6f
  [  263.903366]  [80384a65] sock_splice_read+0x15/0x17
  [  263.909072]  [8029e773] do_splice_to+0x76/0x88
  [  263.914432]  [8029fcc8] sys_splice+0x1a8/0x232
  [  263.919795]  [802097ce] system_call+0x7e/0x83
  [  263.925067] 
  [  263.926606] 
  [  263.926607] Code: 0f 0b eb fe 44 89 e6 81 e6 ff 0f 00 00 90 ff 42
  08 48 63 55 
  [  263.936418] RIP  [8038c60c] skb_splice_bits+0xac/0x1c9
  [  263.942516]  RSP 81003c79fc88

This is a VM_BUG_ON in get_page().

> +static inline int spd_fill_page(struct splice_pipe_desc *spd, struct page *page,
> +                                unsigned int len, unsigned int offset)
> +{
> +   struct page *p;
> +
> +   if (unlikely(spd->nr_pages == PIPE_BUFFERS))
> +       return 1;
> +
> +#ifdef NET_COPY_SPLICE
> +   p = alloc_pages(GFP_KERNEL, 0);
> +   if (!p)
> +       return 1;
> +
> +   memcpy(page_address(p) + offset, page_address(page) + offset, len);
> +#else
> +   p = page;
> +   get_page(p);
> +#endif

Some pages have a zero reference count here.

Is the disabled NET_COPY_SPLICE part left over from the old implementation?
It will always be slower than the existing approach due to the allocation
overhead.

-- 
Evgeniy Polyakov


Re: [RFC VLAN 00/08]: VLAN netlink support

2007-06-05 Thread Patrick McHardy
Patrick McHardy wrote:
> These patches convert VLAN to the rtnl_link API as a more complex example.
> The patches need some more work, not all VLAN features are supported yet.


Unfortunately I managed to send the wrong patchset :(
Please ignore (only the VLAN kernel patches), new patches
coming up.



[RFC VLAN 00/10]: VLAN netlink support try 2

2007-06-05 Thread Patrick McHardy
The last VLAN patchset was outdated, sorry about the mixup. These are the
correct patches. The iproute patch I posted was correct, so no repost of
that one.


 include/linux/if_link.h  |   25 ++
 include/linux/if_vlan.h  |7 
 net/8021q/Makefile   |2 
 net/8021q/vlan.c |  537 +++
 net/8021q/vlan.h |   21 +
 net/8021q/vlan_dev.c |  170 +-
 net/8021q/vlan_netlink.c |  209 ++
 net/8021q/vlanproc.c |4 
 8 files changed, 586 insertions(+), 389 deletions(-)

Patrick McHardy (10):
  [VLAN]: Fix off-by-ones in VLAN ID checks
  [VLAN]: Convert name-based configuration functions to struct netdevice *
  [VLAN]: Move some device intialization code to dev->init callback
  [VLAN]: Move vlan_group allocation to seperate function
  [VLAN]: Split up device checks
  [VLAN]: Move device registation to seperate function
  [VLAN]: Return proper error codes in register_vlan_device
  [VLAN]: Use 32 bit value for skb->priority mapping
  [VLAN]: Keep track of number of QoS mappings
  [VLAN]: Use rtnl_link API


[RFC VLAN 01/10]: Fix off-by-ones in VLAN ID checks

2007-06-05 Thread Patrick McHardy
[VLAN]: Fix off-by-ones in VLAN ID checks

The valid range of VLAN IDs is 0-4095, fix using ID 4095.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 576d73325b7ce3d454c544dcdfea0a2e3ae6486e
tree 2039651421302cf1863cd4b70f502d18a04bc7ac
parent 6a0492fd68c8c5f528e4277bec20bef8047aec2e
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200

 net/8021q/vlan.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index ceef57c..f3b1328 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -173,7 +173,7 @@ static struct vlan_group *__vlan_find_group(int real_dev_ifindex)
    return NULL;
 }
 
-/*  Find the protocol handler.  Assumes VID < VLAN_VID_MASK.
+/*  Find the protocol handler.  Assumes VID <= VLAN_VID_MASK.
  *
  * Must be invoked with RCU read lock (no preempt)
  */
@@ -222,7 +222,7 @@ static int unregister_vlan_dev(struct net_device *real_dev,
 #endif
 
    /* sanity check */
-   if (vlan_id >= VLAN_VID_MASK)
+   if (vlan_id > VLAN_VID_MASK)
        return -EINVAL;
 
    ASSERT_RTNL();
@@ -256,11 +256,11 @@ static int unregister_vlan_dev(struct net_device *real_dev,
            /* If the group is now empty, kill off the
             * group.
             */
-           for (i = 0; i < VLAN_VID_MASK; i++)
+           for (i = 0; i <= VLAN_VID_MASK; i++)
                if (vlan_group_get_device(grp, i))
                    break;
 
-           if (i == VLAN_VID_MASK) {
+           if (i > VLAN_VID_MASK) {
                if (real_dev->features & NETIF_F_HW_VLAN_RX)
                    real_dev->vlan_rx_register(real_dev, NULL);
 
@@ -392,7 +392,7 @@ static struct net_device *register_vlan_device(const char *eth_IF_name,
        __FUNCTION__, eth_IF_name, VLAN_ID);
 #endif
 
-   if (VLAN_ID >= VLAN_VID_MASK)
+   if (VLAN_ID > VLAN_VID_MASK)
        goto out_ret_null;
 
/* find the device relating to eth_IF_name. */
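
For readers who want to check the boundary behaviour outside the kernel,
the inclusive range test and the matching "group is empty" scan can be
sketched as follows. This is an illustration only: vlan_id_valid() and
vlan_group_empty() are hypothetical helpers, and the flat slots[] array is
an assumed stand-in for the per-group device table. VLAN_VID_MASK is taken
to be 0xfff, the 12-bit VLAN ID mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VLAN_VID_MASK 0xfff     /* 12-bit VLAN ID field */

/* A VLAN ID is valid when it fits in the 12-bit field: 0..4095 inclusive. */
static bool vlan_id_valid(unsigned int vlan_id)
{
    return vlan_id <= VLAN_VID_MASK;
}

/*
 * Return true when no slot in 0..VLAN_VID_MASK is occupied. `slots` is a
 * hypothetical flat array of VLAN_VID_MASK + 1 pointers standing in for
 * the per-group device table.
 */
static bool vlan_group_empty(void *const slots[VLAN_VID_MASK + 1])
{
    unsigned int i;

    assert(slots != NULL);
    for (i = 0; i <= VLAN_VID_MASK; i++)    /* inclusive: visits 4095 too */
        if (slots[i])
            break;

    return i > VLAN_VID_MASK;               /* loop ran off the end */
}
```

With the old exclusive bounds, ID 4095 would be rejected as invalid and a
device registered in slot 4095 would never be seen by the emptiness scan.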


[RFC VLAN 02/10]: Convert name-based configuration functions to struct netdevice *

2007-06-05 Thread Patrick McHardy
[VLAN]: Convert name-based configuration functions to struct netdevice *

Move the device lookup and checks to the ioctl handler under the RTNL and
change all name-based interfaces to take a struct net_device * instead.

This allows them to be used from a netlink interface, which identifies devices
by ifindex rather than by name. It also avoids races between the ioctl interface
and the (upcoming) netlink interface since all changes now happen under the
RTNL.

As a nice side effect this greatly simplifies error handling in the helper
functions and fixes a number of incorrect error codes like -EINVAL for
device not found.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 9f146bc68a86cb588f3150894b3af5646cfb4fdc
tree f28cb67843e6c566b6b3923aaefc5097c1c82d58
parent 576d73325b7ce3d454c544dcdfea0a2e3ae6486e
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200

 net/8021q/vlan.c |  152 +-
 net/8021q/vlan.h |   13 +++-
 net/8021q/vlan_dev.c |  143 +--
 3 files changed, 109 insertions(+), 199 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index f3b1328..788324a 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -278,43 +278,16 @@ static int unregister_vlan_dev(struct net_device *real_dev,
    return ret;
 }
 
-static int unregister_vlan_device(const char *vlan_IF_name)
+static int unregister_vlan_device(struct net_device *dev)
 {
-   struct net_device *dev = NULL;
    int ret;
 
+   ret = unregister_vlan_dev(VLAN_DEV_INFO(dev)->real_dev,
+                             VLAN_DEV_INFO(dev)->vlan_id);
+   unregister_netdevice(dev);
 
-   dev = dev_get_by_name(vlan_IF_name);
-   ret = -EINVAL;
-   if (dev) {
-       if (dev->priv_flags & IFF_802_1Q_VLAN) {
-           rtnl_lock();
-
-           ret = unregister_vlan_dev(VLAN_DEV_INFO(dev)->real_dev,
-                                     VLAN_DEV_INFO(dev)->vlan_id);
-
-           dev_put(dev);
-           unregister_netdevice(dev);
-
-           rtnl_unlock();
-
-           if (ret == 1)
-               ret = 0;
-       } else {
-           printk(VLAN_ERR
-                  "%s: ERROR:  Tried to remove a non-vlan device "
-                  "with VLAN code, name: %s  priv_flags: %hX\n",
-                  __FUNCTION__, dev->name, dev->priv_flags);
-           dev_put(dev);
-           ret = -EPERM;
-       }
-   } else {
-#ifdef VLAN_DEBUG
-       printk(VLAN_DBG "%s: WARNING: Could not find dev.\n", __FUNCTION__);
-#endif
-       ret = -EINVAL;
-   }
-
+   if (ret == 1)
+       ret = 0;
    return ret;
 }
 
@@ -378,12 +351,11 @@ static struct lock_class_key vlan_netdev_xmit_lock_key;
  *  Returns the device that was created, or NULL if there was
  *  an error of some kind.
  */
-static struct net_device *register_vlan_device(const char *eth_IF_name,
+static struct net_device *register_vlan_device(struct net_device *real_dev,
   unsigned short VLAN_ID)
 {
struct vlan_group *grp;
struct net_device *new_dev;
-   struct net_device *real_dev; /* the ethernet device */
char name[IFNAMSIZ];
int i;
 
@@ -395,46 +367,36 @@ static struct net_device *register_vlan_device(const char *eth_IF_name,
    if (VLAN_ID > VLAN_VID_MASK)
        goto out_ret_null;
 
-   /* find the device relating to eth_IF_name. */
-   real_dev = dev_get_by_name(eth_IF_name);
-   if (!real_dev)
-       goto out_ret_null;
-
    if (real_dev->features & NETIF_F_VLAN_CHALLENGED) {
        printk(VLAN_DBG "%s: VLANs not supported on %s.\n",
            __FUNCTION__, real_dev->name);
-       goto out_put_dev;
+       goto out_ret_null;
    }
 
    if ((real_dev->features & NETIF_F_HW_VLAN_RX) &&
        !real_dev->vlan_rx_register) {
        printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
            __FUNCTION__, real_dev->name);
-       goto out_put_dev;
+       goto out_ret_null;
    }
 
    if ((real_dev->features & NETIF_F_HW_VLAN_FILTER) &&
        (!real_dev->vlan_rx_add_vid || !real_dev->vlan_rx_kill_vid)) {
        printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
            __FUNCTION__, real_dev->name);
-       goto out_put_dev;
+       goto out_ret_null;
    }
 
-   /* From this point on, all the data structures must remain
-    * consistent.
-    */
-   rtnl_lock();
-
    /* The real device must be up and operating in order to
 * assosciate a 

[RFC VLAN 03/10]: Move some device intialization code to dev->init callback

2007-06-05 Thread Patrick McHardy
[VLAN]: Move some device intialization code to dev->init callback

Move some device initialization code to a new dev->init callback to make
it shareable with netlink. Additionally this fixes a minor bug: dev->iflink
was set after registration, which caused an incorrect value in the initial
netlink message.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit ffed416aac143a9431c7d9d6b3214288809eb89a
tree 5c4be92ecd9fddaac6899d5a704398ba95c9bdcb
parent 9f146bc68a86cb588f3150894b3af5646cfb4fdc
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200

 net/8021q/vlan.c |   98 +++---
 1 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 788324a..7da8210 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -291,6 +291,53 @@ static int unregister_vlan_device(struct net_device *dev)
return ret;
 }
 
+/*
+ * vlan network devices have devices nesting below it, and are a special
+ * super class of normal network devices; split their locks off into a
+ * separate class since they always nest.
+ */
+static struct lock_class_key vlan_netdev_xmit_lock_key;
+
+static int vlan_dev_init(struct net_device *dev)
+{
+   struct net_device *real_dev = VLAN_DEV_INFO(dev)->real_dev;
+
+   /* IFF_BROADCAST|IFF_MULTICAST; ??? */
+   dev->flags  = real_dev->flags & ~IFF_UP;
+   dev->iflink = real_dev->ifindex;
+   dev->state  = (real_dev->state & ((1<<__LINK_STATE_NOCARRIER) |
+                                     (1<<__LINK_STATE_DORMANT))) |
+                 (1<<__LINK_STATE_PRESENT);
+
+   /* need 4 bytes for extra VLAN header info,
+    * hope the underlying device can handle it.
+    */
+   dev->mtu = real_dev->mtu;
+
+   /* TODO: maybe just assign it to be ETHERNET? */
+   dev->type = real_dev->type;
+
+   memcpy(dev->broadcast, real_dev->broadcast, real_dev->addr_len);
+   memcpy(dev->dev_addr, real_dev->dev_addr, real_dev->addr_len);
+   dev->addr_len = real_dev->addr_len;
+
+   if (real_dev->features & NETIF_F_HW_VLAN_TX) {
+       dev->hard_header     = real_dev->hard_header;
+       dev->hard_header_len = real_dev->hard_header_len;
+       dev->hard_start_xmit = vlan_dev_hwaccel_hard_start_xmit;
+       dev->rebuild_header  = real_dev->rebuild_header;
+   } else {
+       dev->hard_header     = vlan_dev_hard_header;
+       dev->hard_header_len = real_dev->hard_header_len + VLAN_HLEN;
+       dev->hard_start_xmit = vlan_dev_hard_start_xmit;
+       dev->rebuild_header  = vlan_dev_rebuild_header;
+   }
+   dev->hard_header_parse = real_dev->hard_header_parse;
+
+   lockdep_set_class(&dev->_xmit_lock, &vlan_netdev_xmit_lock_key);
+   return 0;
+}
+
 static void vlan_setup(struct net_device *new_dev)
 {
SET_MODULE_OWNER(new_dev);
@@ -311,6 +358,7 @@ static void vlan_setup(struct net_device *new_dev)
 
    /* set up method calls */
    new_dev->change_mtu = vlan_dev_change_mtu;
+   new_dev->init = vlan_dev_init;
    new_dev->open = vlan_dev_open;
    new_dev->stop = vlan_dev_stop;
    new_dev->set_mac_address = vlan_dev_set_mac_address;
@@ -339,14 +387,6 @@ static void vlan_transfer_operstate(const struct net_device *dev, struct net_dev
    }
 }
 
-/*
- * vlan network devices have devices nesting below it, and are a special
- * super class of normal network devices; split their locks off into a
- * separate class since they always nest.
- */
-static struct lock_class_key vlan_netdev_xmit_lock_key;
-
-
 /*  Attach a VLAN device to a mac address (ie Ethernet Card).
  *  Returns the device that was created, or NULL if there was
  *  an error of some kind.
@@ -437,47 +477,10 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
 
 #ifdef VLAN_DEBUG
    printk(VLAN_DBG "Allocated new name -:%s:-\n", new_dev->name);
-#endif
-   /* IFF_BROADCAST|IFF_MULTICAST; ??? */
-   new_dev->flags = real_dev->flags;
-   new_dev->flags &= ~IFF_UP;
-
-   new_dev->state = (real_dev->state & ((1<<__LINK_STATE_NOCARRIER) |
-                                        (1<<__LINK_STATE_DORMANT))) |
-                    (1<<__LINK_STATE_PRESENT);
-
-   /* need 4 bytes for extra VLAN header info,
-    * hope the underlying device can handle it.
-    */
-   new_dev->mtu = real_dev->mtu;
-
-   /* TODO: maybe just assign it to be ETHERNET? */
-   new_dev->type = real_dev->type;
-
-   new_dev->hard_header_len = real_dev->hard_header_len;
-   if (!(real_dev->features & NETIF_F_HW_VLAN_TX)) {
-       /* Regular ethernet + 4 bytes (18 total). */
-       new_dev->hard_header_len += VLAN_HLEN;
-   }
-
    VLAN_MEM_DBG("new_dev->priv malloc, addr: %p  size: %i\n",
                 new_dev->priv,
 sizeof(struct 

[RFC VLAN 04/10]: Move vlan_group allocation to seperate function

2007-06-05 Thread Patrick McHardy
[VLAN]: Move vlan_group allocation to seperate function

Move group allocation to a separate function to clean up the code a bit
and allocate groups before registering the device. Device registration
is globally visible and causes netlink events, so we shouldn't fail
afterwards.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 77fa17d4f74fcc3a330b95023fc335713eba995a
tree d1fdc97531109725b31bf1feb27440a5af370d4a
parent ffed416aac143a9431c7d9d6b3214288809eb89a
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:44 +0200

 net/8021q/vlan.c |   78 --
 1 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 7da8210..f193d78 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -197,6 +197,34 @@ static void vlan_group_free(struct vlan_group *grp)
kfree(grp);
 }
 
+static struct vlan_group *vlan_group_alloc(int ifindex)
+{
+   struct vlan_group *grp;
+   unsigned int size;
+   unsigned int i;
+
+   grp = kzalloc(sizeof(struct vlan_group), GFP_KERNEL);
+   if (!grp)
+       return NULL;
+
+   size = sizeof(struct net_device *) * VLAN_GROUP_ARRAY_PART_LEN;
+
+   for (i = 0; i < VLAN_GROUP_ARRAY_SPLIT_PARTS; i++) {
+       grp->vlan_devices_arrays[i] = kzalloc(size, GFP_KERNEL);
+       if (!grp->vlan_devices_arrays[i])
+           goto err;
+   }
+
+   grp->real_dev_ifindex = ifindex;
+   hlist_add_head_rcu(&grp->hlist,
+                      &vlan_group_hash[vlan_grp_hashfn(ifindex)]);
+   return grp;
+
+err:
+   vlan_group_free(grp);
+   return NULL;
+}
+
 static void vlan_rcu_free(struct rcu_head *rcu)
 {
vlan_group_free(container_of(rcu, struct vlan_group, rcu));
@@ -394,10 +422,9 @@ static void vlan_transfer_operstate(const struct net_device *dev, struct net_dev
 static struct net_device *register_vlan_device(struct net_device *real_dev,
                                                unsigned short VLAN_ID)
 {
-   struct vlan_group *grp;
+   struct vlan_group *grp, *ngrp = NULL;
    struct net_device *new_dev;
    char name[IFNAMSIZ];
-   int i;
 
 #ifdef VLAN_DEBUG
    printk(VLAN_DBG "%s: if_name -:%s:- vid: %i\n",
@@ -491,9 +518,15 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
    printk(VLAN_DBG "About to go find the group for idx: %i\n",
           real_dev->ifindex);
 #endif
+   grp = __vlan_find_group(real_dev->ifindex);
+   if (!grp) {
+       ngrp = grp = vlan_group_alloc(real_dev->ifindex);
+       if (!grp)
+           goto out_free_newdev;
+   }
 
    if (register_netdevice(new_dev))
-       goto out_free_newdev;
+       goto out_free_group;
 
    vlan_transfer_operstate(real_dev, new_dev);
    linkwatch_fire_event(new_dev); /* _MUST_ call rfc2863_policy() */
@@ -501,34 +534,8 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
    /* So, got the sucker initialized, now lets place
     * it into our local structure.
     */
-   grp = __vlan_find_group(real_dev->ifindex);
-
-   /* Note, we are running under the RTNL semaphore
-    * so it cannot "appear" on us.
-    */
-   if (!grp) { /* need to add a new group */
-       grp = kzalloc(sizeof(struct vlan_group), GFP_KERNEL);
-       if (!grp)
-           goto out_free_unregister;
-
-       for (i=0; i < VLAN_GROUP_ARRAY_SPLIT_PARTS; i++) {
-           grp->vlan_devices_arrays[i] = kzalloc(
-               sizeof(struct net_device *)*VLAN_GROUP_ARRAY_PART_LEN,
-               GFP_KERNEL);
-
-           if (!grp->vlan_devices_arrays[i])
-               goto out_free_arrays;
-       }
-
-       /* printk(KERN_ALERT "VLAN REGISTER:  Allocated new group.\n"); */
-       grp->real_dev_ifindex = real_dev->ifindex;
-
-       hlist_add_head_rcu(&grp->hlist,
-                          &vlan_group_hash[vlan_grp_hashfn(real_dev->ifindex)]);
-
-       if (real_dev->features & NETIF_F_HW_VLAN_RX)
-           real_dev->vlan_rx_register(real_dev, grp);
-   }
+   if (ngrp && real_dev->features & NETIF_F_HW_VLAN_RX)
+       real_dev->vlan_rx_register(real_dev, ngrp);
 
    vlan_group_set_device(grp, VLAN_ID, new_dev);
 
@@ -546,12 +553,9 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
 #endif
return new_dev;
 
-out_free_arrays:
-   vlan_group_free(grp);
-
-out_free_unregister:
-   unregister_netdev(new_dev);
-   goto out_ret_null;
+out_free_group:
+   if (ngrp)
+   vlan_group_free(ngrp);
 
 out_free_newdev:
free_netdev(new_dev);
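
The ordering argument of this patch (do everything that can fail before the
globally visible registration step) can be illustrated with a small
user-space sketch. None of these names exist in the kernel: struct group,
struct device, register_device() and device_create() are hypothetical, and
calloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

struct group  { int ifindex; };
struct device { struct group *grp; int registered; };

/* Hypothetical "publish" step; fails on request to exercise the unwinding. */
static int register_device(struct device *dev, int fail)
{
    if (fail)
        return -1;
    dev->registered = 1;
    return 0;
}

/*
 * Create a device. Every allocation that can fail happens before
 * register_device(), so nothing has to be undone once the device is visible.
 */
static struct device *device_create(int ifindex, int fail_register)
{
    struct device *dev;
    struct group *grp;

    assert(ifindex > 0);

    dev = calloc(1, sizeof(*dev));
    if (!dev)
        goto out_ret_null;

    grp = calloc(1, sizeof(*grp));
    if (!grp)
        goto out_free_dev;
    grp->ifindex = ifindex;

    if (register_device(dev, fail_register))
        goto out_free_group;

    dev->grp = grp;             /* cannot fail past this point */
    return dev;

out_free_group:
    free(grp);
out_free_dev:
    free(dev);
out_ret_null:
    return NULL;
}
```

The error labels run in reverse order of acquisition, so each failure point
jumps to the label that releases exactly what has been acquired so far.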

[RFC VLAN 05/10]: Split up device checks

2007-06-05 Thread Patrick McHardy
[VLAN]: Split up device checks

Move the checks of the underlying device to a separate function.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 2d952c2f38391a43f380028a42106ee8880574be
tree e877b367527691ea5abc0342bf5212fa1e97205d
parent 77fa17d4f74fcc3a330b95023fc335713eba995a
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200

 net/8021q/vlan.c |   56 +++---
 1 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index f193d78..6233885 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -415,57 +415,65 @@ static void vlan_transfer_operstate(const struct 
net_device *dev, struct net_dev
}
 }
 
-/*  Attach a VLAN device to a mac address (ie Ethernet Card).
- *  Returns the device that was created, or NULL if there was
- *  an error of some kind.
- */
-static struct net_device *register_vlan_device(struct net_device *real_dev,
-  unsigned short VLAN_ID)
+static int vlan_check_real_dev(struct net_device *real_dev, unsigned short 
vlan_id)
 {
-   struct vlan_group *grp, *ngrp = NULL;
-   struct net_device *new_dev;
-   char name[IFNAMSIZ];
-
-#ifdef VLAN_DEBUG
-   printk(VLAN_DBG %s: if_name -:%s:- vid: %i\n,
-   __FUNCTION__, eth_IF_name, VLAN_ID);
-#endif
-
-   if (VLAN_ID  VLAN_VID_MASK)
-   goto out_ret_null;
-
if (real_dev->features & NETIF_F_VLAN_CHALLENGED) {
printk(VLAN_DBG "%s: VLANs not supported on %s.\n",
__FUNCTION__, real_dev->name);
-   goto out_ret_null;
+   return -EOPNOTSUPP;
}
 
if ((real_dev->features & NETIF_F_HW_VLAN_RX) &&
!real_dev->vlan_rx_register) {
printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
__FUNCTION__, real_dev->name);
-   goto out_ret_null;
+   return -EOPNOTSUPP;
}
 
if ((real_dev->features & NETIF_F_HW_VLAN_FILTER) &&
(!real_dev->vlan_rx_add_vid || !real_dev->vlan_rx_kill_vid)) {
printk(VLAN_DBG "%s: Device %s has buggy VLAN hw accel.\n",
__FUNCTION__, real_dev->name);
-   goto out_ret_null;
+   return -EOPNOTSUPP;
}
 
/* The real device must be up and operating in order to
 * assosciate a VLAN device with it.
 */
if (!(real_dev->flags & IFF_UP))
-   goto out_ret_null;
+   return -ENETDOWN;
 
-   if (__find_vlan_dev(real_dev, VLAN_ID) != NULL) {
+   if (__find_vlan_dev(real_dev, vlan_id) != NULL) {
/* was already registered. */
printk(VLAN_DBG "%s: ALREADY had VLAN registered\n", __FUNCTION__);
-   goto out_ret_null;
+   return -EEXIST;
}
 
+   return 0;
+}
+
+/*  Attach a VLAN device to a mac address (ie Ethernet Card).
+ *  Returns the device that was created, or NULL if there was
+ *  an error of some kind.
+ */
+static struct net_device *register_vlan_device(struct net_device *real_dev,
+  unsigned short VLAN_ID)
+{
+   struct vlan_group *grp, *ngrp = NULL;
+   struct net_device *new_dev;
+   char name[IFNAMSIZ];
+
+#ifdef VLAN_DEBUG
+   printk(VLAN_DBG "%s: if_name -:%s:- vid: %i\n",
+   __FUNCTION__, eth_IF_name, VLAN_ID);
+#endif
+
+   if (VLAN_ID >= VLAN_VID_MASK)
+   goto out_ret_null;
+
+   if (vlan_check_real_dev(real_dev, VLAN_ID) < 0)
+   goto out_ret_null;
+
/* Gotta set up the fields for the device. */
 #ifdef VLAN_DEBUG
printk(VLAN_DBG "About to allocate name, vlan_name_type: %i\n",


[RFC VLAN 06/10]: Move device registration to separate function

2007-06-05 Thread Patrick McHardy
[VLAN]: Move device registration to separate function

Move device registration and configuration of the underlying device to a
separate function.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 312dc0ee04bed170ad8091f777fcc2108105ca2c
tree 34a83935bc1bfbac67cc50dc83bbdf67645dcc0d
parent 2d952c2f38391a43f380028a42106ee8880574be
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200

 net/8021q/vlan.c |   83 +++---
 1 files changed, 47 insertions(+), 36 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 6233885..0fa071d 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -452,6 +452,51 @@ static int vlan_check_real_dev(struct net_device *real_dev, unsigned short vlan_
return 0;
 }
 
+static int register_vlan_dev(struct net_device *dev)
+{
+   struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
+   struct net_device *real_dev = vlan->real_dev;
+   unsigned short vlan_id = vlan->vlan_id;
+   struct vlan_group *grp, *ngrp = NULL;
+   int err;
+
+   grp = __vlan_find_group(real_dev->ifindex);
+   if (!grp) {
+   ngrp = grp = vlan_group_alloc(real_dev->ifindex);
+   if (!grp)
+   return -ENOBUFS;
+   }
+
+   err = register_netdevice(dev);
+   if (err < 0)
+   goto out_free_group;
+
+   /* Account for reference in struct vlan_dev_info */
+   dev_hold(real_dev);
+
+   vlan_transfer_operstate(real_dev, dev);
+   linkwatch_fire_event(dev); /* _MUST_ call rfc2863_policy() */
+
+   /* So, got the sucker initialized, now lets place
+* it into our local structure.
+*/
+   vlan_group_set_device(grp, vlan_id, dev);
+   if (ngrp && real_dev->features & NETIF_F_HW_VLAN_RX)
+   real_dev->vlan_rx_register(real_dev, ngrp);
+   if (real_dev->features & NETIF_F_HW_VLAN_FILTER)
+   real_dev->vlan_rx_add_vid(real_dev, vlan_id);
+
+   if (vlan_proc_add_dev(dev) < 0)
+   printk(KERN_WARNING "VLAN: failed to add proc entry for %s\n",
+  dev->name);
+   return 0;
+
+out_free_group:
+   if (ngrp)
+   vlan_group_free(ngrp);
+   return err;
+}
+
 /*  Attach a VLAN device to a mac address (ie Ethernet Card).
  *  Returns the device that was created, or NULL if there was
  *  an error of some kind.
@@ -459,7 +504,6 @@ static int vlan_check_real_dev(struct net_device *real_dev, unsigned short vlan_
 static struct net_device *register_vlan_device(struct net_device *real_dev,
   unsigned short VLAN_ID)
 {
-   struct vlan_group *grp, *ngrp = NULL;
struct net_device *new_dev;
char name[IFNAMSIZ];
 
@@ -522,37 +566,8 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
VLAN_DEV_INFO(new_dev)->dent = NULL;
VLAN_DEV_INFO(new_dev)->flags = 1;
 
-#ifdef VLAN_DEBUG
-   printk(VLAN_DBG "About to go find the group for idx: %i\n",
-  real_dev->ifindex);
-#endif
-   grp = __vlan_find_group(real_dev->ifindex);
-   if (!grp) {
-   ngrp = grp = vlan_group_alloc(real_dev->ifindex);
-   if (!grp)
-   goto out_free_newdev;
-   }
-
-   if (register_netdevice(new_dev))
-   goto out_free_group;
-
-   vlan_transfer_operstate(real_dev, new_dev);
-   linkwatch_fire_event(new_dev); /* _MUST_ call rfc2863_policy() */
-
-   /* So, got the sucker initialized, now lets place
-* it into our local structure.
-*/
-   if (ngrp && real_dev->features & NETIF_F_HW_VLAN_RX)
-   real_dev->vlan_rx_register(real_dev, ngrp);
-
-   vlan_group_set_device(grp, VLAN_ID, new_dev);
-
-   if (vlan_proc_add_dev(new_dev)<0)/* create it's proc entry */
-   printk(KERN_WARNING "VLAN: failed to add proc entry for %s\n",
-new_dev->name);
-
-   if (real_dev->features & NETIF_F_HW_VLAN_FILTER)
-   real_dev->vlan_rx_add_vid(real_dev, VLAN_ID);
+   if (register_vlan_dev(new_dev) < 0)
+   goto out_free_newdev;
 
/* Account for reference in struct vlan_dev_info */
dev_hold(real_dev);
@@ -561,10 +576,6 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
 #endif
return new_dev;
 
-out_free_group:
-   if (ngrp)
-   vlan_group_free(ngrp);
-
 out_free_newdev:
free_netdev(new_dev);
 


[RFC VLAN 07/10]: Return proper error codes in register_vlan_device

2007-06-05 Thread Patrick McHardy
[VLAN]: Return proper error codes in register_vlan_device

The returned device is unused, return proper error codes instead and avoid
having the ioctl handler guess the error.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit f2eb0828c4cfaf4d4aed225b55ca65b052109657
tree 967dc78c7ed8931a62bdaaa0e57304e75bfe91b9
parent 312dc0ee04bed170ad8091f777fcc2108105ca2c
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200

 net/8021q/vlan.c |   32 ++--
 1 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 0fa071d..03b67b9 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -498,14 +498,14 @@ out_free_group:
 }
 
 /*  Attach a VLAN device to a mac address (ie Ethernet Card).
- *  Returns the device that was created, or NULL if there was
- *  an error of some kind.
+ *  Returns 0 if the device was created or a negative error code otherwise.
  */
-static struct net_device *register_vlan_device(struct net_device *real_dev,
-  unsigned short VLAN_ID)
+static int register_vlan_device(struct net_device *real_dev,
+   unsigned short VLAN_ID)
 {
struct net_device *new_dev;
char name[IFNAMSIZ];
+   int err;
 
 #ifdef VLAN_DEBUG
printk(VLAN_DBG "%s: if_name -:%s:- vid: %i\n",
@@ -513,10 +513,11 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
 #endif
 
if (VLAN_ID >= VLAN_VID_MASK)
-   goto out_ret_null;
+   return -ERANGE;
 
-   if (vlan_check_real_dev(real_dev, VLAN_ID) < 0)
-   goto out_ret_null;
+   err = vlan_check_real_dev(real_dev, VLAN_ID);
+   if (err < 0)
+   return err;
 
/* Gotta set up the fields for the device. */
 #ifdef VLAN_DEBUG
@@ -552,7 +553,7 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
   vlan_setup);
 
if (new_dev == NULL)
-   goto out_ret_null;
+   return -ENOBUFS;
 
 #ifdef VLAN_DEBUG
printk(VLAN_DBG "Allocated new name -:%s:-\n", new_dev->name);
@@ -566,7 +567,8 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
VLAN_DEV_INFO(new_dev)->dent = NULL;
VLAN_DEV_INFO(new_dev)->flags = 1;
 
-   if (register_vlan_dev(new_dev) < 0)
+   err = register_vlan_dev(new_dev);
+   if (err < 0)
goto out_free_newdev;
 
/* Account for reference in struct vlan_dev_info */
@@ -574,13 +576,11 @@ static struct net_device *register_vlan_device(struct net_device *real_dev,
 #ifdef VLAN_DEBUG
printk(VLAN_DBG "Allocated new device successfully, returning.\n");
 #endif
-   return new_dev;
+   return 0;
 
 out_free_newdev:
free_netdev(new_dev);
-
-out_ret_null:
-   return NULL;
+   return err;
 }
 
 static int vlan_device_event(struct notifier_block *unused, unsigned long event, void *ptr)
@@ -753,11 +753,7 @@ static int vlan_ioctl_handler(void __user *arg)
err = -EPERM;
if (!capable(CAP_NET_ADMIN))
break;
-   if (register_vlan_device(dev, args.u.VID)) {
-   err = 0;
-   } else {
-   err = -EINVAL;
-   }
+   err = register_vlan_device(dev, args.u.VID);
break;
 
case DEL_VLAN_CMD:
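
The pattern this patch moves to can be sketched in user space as follows. This is only an illustration under stated assumptions: every `example_` name, the simplified device struct and the mask value are hypothetical stand-ins, not the kernel's code. It shows each check returning its own negative errno value and the caller forwarding it unchanged, so distinct failures are no longer collapsed into -EINVAL.

```c
#include <assert.h>
#include <errno.h>

#define EXAMPLE_VID_MASK 0xFFF	/* hypothetical stand-in for VLAN_VID_MASK */

/* Hypothetical, simplified view of the underlying device. */
struct example_dev {
	int up;			/* nonzero if the interface is up */
	int vid_in_use;		/* nonzero if the VLAN ID is already registered */
};

/* Each failed check reports its own negative errno value. */
static int example_check_real_dev(const struct example_dev *dev)
{
	if (!dev->up)
		return -ENETDOWN;
	if (dev->vid_in_use)
		return -EEXIST;
	return 0;
}

/* Returns 0 on success or the first negative error code encountered. */
static int example_register(const struct example_dev *dev, unsigned short vid)
{
	int err;

	if (vid >= EXAMPLE_VID_MASK)
		return -ERANGE;

	err = example_check_real_dev(dev);
	if (err < 0)
		return err;

	return 0;
}

/* The caller forwards the code unchanged instead of guessing -EINVAL. */
static int example_ioctl_add(const struct example_dev *dev, unsigned short vid)
{
	return example_register(dev, vid);
}

/* Self-check: distinct failures stay distinguishable for the caller. */
void example_errcode_self_check(void)
{
	struct example_dev ok = { 1, 0 }, down = { 0, 0 }, dup = { 1, 1 };

	assert(example_ioctl_add(&ok, 5) == 0);
	assert(example_ioctl_add(&down, 5) == -ENETDOWN);
	assert(example_ioctl_add(&dup, 5) == -EEXIST);
	assert(example_ioctl_add(&ok, 0xFFF) == -ERANGE);
}
```

With this shape, user space can tell "interface is down" apart from "VLAN already exists" by inspecting errno after the ioctl fails.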


[RFC VLAN 08/10]: Use 32 bit value for skb->priority mapping

2007-06-05 Thread Patrick McHardy
[VLAN]: Use 32 bit value for skb->priority mapping

skb->priority has only 32 bits and even VLAN uses 32 bit values in its API.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 3151f6fdca072a91c2184192dc1d431a4d9248b7
tree 21f76dd1682e385f7d958d5d5ebb66c3585ed2ca
parent f2eb0828c4cfaf4d4aed225b55ca65b052109657
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200

 include/linux/if_vlan.h |4 ++--
 net/8021q/vlanproc.c|4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 81e9bc9..aeddb49 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -99,7 +99,7 @@ static inline void vlan_group_set_device(struct vlan_group *vg, int vlan_id,
 }
 
 struct vlan_priority_tci_mapping {
-   unsigned long priority;
+   u32 priority;
unsigned short vlan_qos; /* This should be shifted when first set, so we only do it
  * at provisioning time.
  * ((skb->priority << 13) & 0xE000)
@@ -112,7 +112,7 @@ struct vlan_dev_info {
/** This will be the mapping that correlates skb->priority to
 * 3 bits of VLAN QOS tags...
 */
-   unsigned long ingress_priority_map[8];
+   u32 ingress_priority_map[8];
struct vlan_priority_tci_mapping *egress_priority_map[16]; /* hash table */
 
unsigned short vlan_id;/*  The VLAN Identifier for this interface. */
diff --git a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c
index d216a64..8693b21 100644
--- a/net/8021q/vlanproc.c
+++ b/net/8021q/vlanproc.c
@@ -342,7 +342,7 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
seq_printf(seq, "Device: %s", dev_info->real_dev->name);
/* now show all PRIORITY mappings relating to this VLAN */
seq_printf(seq,
-  "\nINGRESS priority mappings: 0:%lu  1:%lu  2:%lu  3:%lu  4:%lu  5:%lu  6:%lu 7:%lu\n",
+  "\nINGRESS priority mappings: 0:%u  1:%u  2:%u  3:%u  4:%u  5:%u  6:%u 7:%u\n",
   dev_info->ingress_priority_map[0],
   dev_info->ingress_priority_map[1],
   dev_info->ingress_priority_map[2],
@@ -357,7 +357,7 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
const struct vlan_priority_tci_mapping *mp
= dev_info->egress_priority_map[i];
while (mp) {
-   seq_printf(seq, "%lu:%hu ",
+   seq_printf(seq, "%u:%hu ",
   mp->priority, ((mp->vlan_qos >> 13) & 0x7));
mp = mp->next;
}


[RFC VLAN 09/10]: Keep track of number of QoS mappings

2007-06-05 Thread Patrick McHardy
[VLAN]: Keep track of number of QoS mappings

Keep track of the number of configured ingress/egress QoS mappings to
avoid iteration while calculating the netlink attribute size.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit e38a7f02fcf29d44a81cb1e697f7381a8764c28d
tree 2e24e2b1421f381c2f13bafe9a10e73c97d7ef7f
parent 3151f6fdca072a91c2184192dc1d431a4d9248b7
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200

 include/linux/if_vlan.h |3 +++
 net/8021q/vlan_dev.c|   27 +--
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index aeddb49..b46d422 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -112,7 +112,10 @@ struct vlan_dev_info {
/** This will be the mapping that correlates skb->priority to
 * 3 bits of VLAN QOS tags...
 */
+   unsigned int nr_ingress_mappings;
u32 ingress_priority_map[8];
+
+   unsigned int nr_egress_mappings;
struct vlan_priority_tci_mapping *egress_priority_map[16]; /* hash table */
 
unsigned short vlan_id;/*  The VLAN Identifier for this interface. */
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 0b7e03e..1644626 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -537,35 +537,50 @@ int vlan_dev_change_mtu(struct net_device *dev, int new_mtu)
 void vlan_dev_set_ingress_priority(const struct net_device *dev,
   u32 skb_prio, short vlan_prio)
 {
-   VLAN_DEV_INFO(dev)->ingress_priority_map[vlan_prio & 0x7] = skb_prio;
+   struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
+
+   if (vlan->ingress_priority_map[vlan_prio & 0x7] && !skb_prio)
+   vlan->nr_ingress_mappings--;
+   else if (!vlan->ingress_priority_map[vlan_prio & 0x7] && skb_prio)
+   vlan->nr_ingress_mappings++;
+
+   vlan->ingress_priority_map[vlan_prio & 0x7] = skb_prio;
 }
 
 int vlan_dev_set_egress_priority(const struct net_device *dev,
 u32 skb_prio, short vlan_prio)
 {
+   struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
struct vlan_priority_tci_mapping *mp = NULL;
struct vlan_priority_tci_mapping *np;
+   u32 vlan_qos = (vlan_prio << 13) & 0xE000;
 
/* See if a priority mapping exists.. */
-   mp = VLAN_DEV_INFO(dev)->egress_priority_map[skb_prio & 0xF];
+   mp = vlan->egress_priority_map[skb_prio & 0xF];
while (mp) {
if (mp->priority == skb_prio) {
-   mp->vlan_qos = ((vlan_prio << 13) & 0xE000);
+   if (mp->vlan_qos && !vlan_qos)
+   vlan->nr_egress_mappings--;
+   else if (!mp->vlan_qos && vlan_qos)
+   vlan->nr_egress_mappings++;
+   mp->vlan_qos = vlan_qos;
return 0;
}
mp = mp->next;
}
 
/* Create a new mapping then. */
-   mp = VLAN_DEV_INFO(dev)->egress_priority_map[skb_prio & 0xF];
+   mp = vlan->egress_priority_map[skb_prio & 0xF];
np = kmalloc(sizeof(struct vlan_priority_tci_mapping), GFP_KERNEL);
if (!np)
return -ENOBUFS;
 
np->next = mp;
np->priority = skb_prio;
-   np->vlan_qos = ((vlan_prio << 13) & 0xE000);
-   VLAN_DEV_INFO(dev)->egress_priority_map[skb_prio & 0xF] = np;
+   np->vlan_qos = vlan_qos;
+   vlan->egress_priority_map[skb_prio & 0xF] = np;
+   if (vlan_qos)
+   vlan->nr_egress_mappings++;
return 0;
 }
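
The counting rule used in the ingress path above can be exercised in isolation. The following is a minimal user-space sketch, assuming a hypothetical `struct example_vlan` in place of `struct vlan_dev_info`. The counter changes only when a slot moves between zero and non-zero, so overwriting an existing mapping or clearing an already empty one leaves it untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct vlan_dev_info. */
struct example_vlan {
	unsigned int nr_ingress_mappings;
	uint32_t ingress_priority_map[8];
};

/* Store a mapping and keep the count of non-zero slots up to date. */
static void example_set_ingress(struct example_vlan *vlan,
				uint32_t skb_prio, short vlan_prio)
{
	uint32_t *slot = &vlan->ingress_priority_map[vlan_prio & 0x7];

	if (*slot && !skb_prio)
		vlan->nr_ingress_mappings--;	/* mapping removed */
	else if (!*slot && skb_prio)
		vlan->nr_ingress_mappings++;	/* mapping added */

	*slot = skb_prio;
}

/* Self-check covering add, overwrite, remove and repeated remove. */
void example_qos_self_check(void)
{
	struct example_vlan v = { 0 };

	example_set_ingress(&v, 7, 1);
	assert(v.nr_ingress_mappings == 1);
	example_set_ingress(&v, 9, 1);		/* overwrite: count unchanged */
	assert(v.nr_ingress_mappings == 1);
	example_set_ingress(&v, 3, 2);
	assert(v.nr_ingress_mappings == 2);
	example_set_ingress(&v, 0, 1);		/* clear slot 1 */
	assert(v.nr_ingress_mappings == 1);
	example_set_ingress(&v, 0, 1);		/* already clear: count unchanged */
	assert(v.nr_ingress_mappings == 1);
}
```

The egress path in the patch applies the same zero/non-zero transition test to `vlan_qos`, which is why a reader of the count never has to walk the hash chains.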
 


[RFC VLAN 10/10]: Use rtnl_link API

2007-06-05 Thread Patrick McHardy
[VLAN]: Use rtnl_link API

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 0c8500625d51c2a0c38404d941ab925c23c972b3
tree 9fbf7296bb873dc918d6865aa4437a9dfaf53237
parent e38a7f02fcf29d44a81cb1e697f7381a8764c28d
author Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200
committer Patrick McHardy [EMAIL PROTECTED] Tue, 05 Jun 2007 16:33:45 +0200

 include/linux/if_link.h  |   25 ++
 net/8021q/Makefile   |2 
 net/8021q/vlan.c |   28 --
 net/8021q/vlan.h |8 ++
 net/8021q/vlan_netlink.c |  209 ++
 5 files changed, 261 insertions(+), 11 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index e46ed94..4bb2d5a 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -153,4 +153,29 @@ enum
 
 #define IFLA_INFO_MAX  (__IFLA_INFO_MAX - 1)
 
+enum
+{
+   IFLA_VLAN_UNSPEC,
+   IFLA_VLAN_ID,
+   IFLA_VLAN_EGRESS_QOS,
+   IFLA_VLAN_INGRESS_QOS,
+   __IFLA_VLAN_MAX,
+};
+
+#define IFLA_VLAN_MAX  (__IFLA_VLAN_MAX - 1)
+
+struct ifla_vlan_qos_mapping
+{
+   __u32 from;
+   __u32 to;
+};
+
+enum {
+   IFLA_VLAN_QOS_UNSPEC,
+   IFLA_VLAN_QOS_MAPPING,
+   __IFLA_VLAN_QOS_MAX
+};
+
+#define IFLA_VLAN_QOS_MAX  (__IFLA_VLAN_QOS_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/net/8021q/Makefile b/net/8021q/Makefile
index 97feb44..10ca7f4 100644
--- a/net/8021q/Makefile
+++ b/net/8021q/Makefile
@@ -4,7 +4,7 @@
 
 obj-$(CONFIG_VLAN_8021Q) += 8021q.o
 
-8021q-objs := vlan.o vlan_dev.o
+8021q-objs := vlan.o vlan_dev.o vlan_netlink.o
 
 ifeq ($(CONFIG_PROC_FS),y)
 8021q-objs += vlanproc.o
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 03b67b9..8837d4f 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -97,15 +97,22 @@ static int __init vlan_proto_init(void)
 
/* Register us to receive netdevice events */
err = register_netdevice_notifier(&vlan_notifier_block);
-   if (err < 0) {
-   dev_remove_pack(&vlan_packet_type);
-   vlan_proc_cleanup();
-   return err;
-   }
+   if (err < 0)
+   goto err1;
 
-   vlan_ioctl_set(vlan_ioctl_handler);
+   err = vlan_netlink_init();
+   if (err < 0)
+   goto err2;
 
+   vlan_ioctl_set(vlan_ioctl_handler);
return 0;
+
+err2:
+   unregister_netdevice_notifier(&vlan_notifier_block);
+err1:
+   vlan_proc_cleanup();
+   dev_remove_pack(&vlan_packet_type);
+   return err;
 }
 
 /* Cleanup all vlan devices
@@ -136,6 +143,7 @@ static void __exit vlan_cleanup_module(void)
 {
int i;
 
+   vlan_netlink_fini();
vlan_ioctl_set(NULL);
 
/* Un-register us from receiving netdevice events */
@@ -306,7 +314,7 @@ static int unregister_vlan_dev(struct net_device *real_dev,
return ret;
 }
 
-static int unregister_vlan_device(struct net_device *dev)
+int unregister_vlan_device(struct net_device *dev)
 {
int ret;
 
@@ -366,7 +374,7 @@ static int vlan_dev_init(struct net_device *dev)
return 0;
 }
 
-static void vlan_setup(struct net_device *new_dev)
+void vlan_setup(struct net_device *new_dev)
 {
SET_MODULE_OWNER(new_dev);
 
@@ -415,7 +423,7 @@ static void vlan_transfer_operstate(const struct net_device *dev, struct net_dev
}
 }
 
-static int vlan_check_real_dev(struct net_device *real_dev, unsigned short vlan_id)
+int vlan_check_real_dev(struct net_device *real_dev, unsigned short vlan_id)
 {
if (real_dev->features & NETIF_F_VLAN_CHALLENGED) {
printk(VLAN_DBG "%s: VLANs not supported on %s.\n",
@@ -452,7 +460,7 @@ static int vlan_check_real_dev(struct net_device *real_dev, unsigned short vlan_
return 0;
 }
 
-static int register_vlan_dev(struct net_device *dev)
+int register_vlan_dev(struct net_device *dev)
 {
struct vlan_dev_info *vlan = VLAN_DEV_INFO(dev);
struct net_device *real_dev = vlan->real_dev;
diff --git a/net/8021q/vlan.h b/net/8021q/vlan.h
index b837390..6cb2238 100644
--- a/net/8021q/vlan.h
+++ b/net/8021q/vlan.h
@@ -72,4 +72,12 @@ void vlan_dev_get_realdev_name(const struct net_device *dev, char *result);
 void vlan_dev_get_vid(const struct net_device *dev, unsigned short *result);
 void vlan_dev_set_multicast_list(struct net_device *vlan_dev);
 
+int vlan_check_real_dev(struct net_device *real_dev, unsigned short vlan_id);
+void vlan_setup(struct net_device *dev);
+int register_vlan_dev(struct net_device *dev);
+int unregister_vlan_device(struct net_device *dev);
+
+int vlan_netlink_init(void);
+void vlan_netlink_fini(void);
+
 #endif /* !(__BEN_VLAN_802_1Q_INC__) */
diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
new file mode 100644
index 000..92d6a46
--- /dev/null
+++ b/net/8021q/vlan_netlink.c
@@ -0,0 +1,209 @@
+/*
+ * VLAN netlink control interface
+ *
+ * Copyright (c) 2007 Patrick McHardy [EMAIL PROTECTED]
+ *
+ * This program is 

Re: [3/5] 2.6.22-rc4: known regressions

2007-06-05 Thread Michal Piotrowski

Hi all,

Here is a list of some known regressions in 2.6.22-rc4.

Feel free to add new regressions/remove fixed etc.
http://kernelnewbies.org/known_regressions



Networking

Subject: list_add corruption. prev->next should be next (f7d28794), but was
f0df8ed4 (prev=f0df8ed4) Kernel Bug at lib/list_debug.c:33
References : http://bugzilla.kernel.org/show_bug.cgi?id=8561
Submitter  : Paulo Pereira [EMAIL PROTECTED]
Handled-By : Alan Stern [EMAIL PROTECTED]
Status : problem is being debugged

Subject: no irda0 interface (2.6.21 was OK), smsc does not find chip
References : http://lkml.org/lkml/2007/6/3/16
Submitter  : Andrey Borzenkov [EMAIL PROTECTED]
Handled-By : Samuel Ortiz [EMAIL PROTECTED]
             Bjorn Helgaas [EMAIL PROTECTED]
Status : problem is being debugged

Subject: b44: regression in 2.6.22
References : http://lkml.org/lkml/2007/5/27/108
Submitter  : Maximilian Engelhardt [EMAIL PROTECTED]
Handled-By : Gary Zambrano [EMAIL PROTECTED]
             Michael Buesch [EMAIL PROTECTED]
             Thomas Gleixner [EMAIL PROTECTED]
Status : problem is being debugged



PCMCIA

Subject: libata and legacy ide pcmcia failure
References : http://lkml.org/lkml/2007/5/17/305
Submitter  : Robert de Rooy [EMAIL PROTECTED]
Status : Unknown



Regards,
Michal

--
What I missed most was your silence.
-- Andrzej Sapkowski, "Coś więcej" ("Something More")



Re: [PATCH][RFC] network splice receive

2007-06-05 Thread Evgeniy Polyakov
On Tue, Jun 05, 2007 at 06:31:31PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote:
> > [  263.936418] RIP  [8038c60c] skb_splice_bits+0xac/0x1c9
> > [  263.942516]  RSP 81003c79fc88
> 
> This is a VM_BUG_ON in get_page().
> 
> > +static inline int spd_fill_page(struct splice_pipe_desc *spd, struct page *page,
> > +   unsigned int len, unsigned int offset)
> > +{
> > +   struct page *p;
> > +
> > +   if (unlikely(spd->nr_pages == PIPE_BUFFERS))
> > +   return 1;
> > +
> > +#ifdef NET_COPY_SPLICE
> > +   p = alloc_pages(GFP_KERNEL, 0);
> > +   if (!p)
> > +   return 1;
> > +
> > +   memcpy(page_address(p) + offset, page_address(page) + offset, len);
> > +#else
> > +   p = page;
> > +   get_page(p);
> > +#endif
> 
> Some pages have zero reference counter here.

Very likely the bug with the MAC address is related to this one, and you do
not have VM debugging enabled in your config? Replacing get_page() and
put_page() in spd_fill_page()/sock_pipe_buf_release() with a naive atomic_inc
and atomic_dec_return plus a bug_on < 0 check resulted in a broken file: the
initial data contained 6B and 5A instead of the zeroes that were sent. An even
more naive atomic_add(2, page) ended with:

[   48.273345] page:81003ff22a18 flags:0x0100
mapping: mapcount:0 count:2
[   48.273347] Trying to fix it up, but a reboot is needed
[   48.273349] Backtrace:
[   48.295576] 
[   48.295577] Call Trace:
[   48.299624]  [8025f075] bad_page+0x67/0x95
[   48.304636]  [8025f771] __free_pages_ok+0x76/0x2c1
[   48.310343]  [8025fbec] __free_pages+0x29/0x2b
[   48.315703]  [8025fc38] free_pages+0x4a/0x4f
[   48.320884]  [8027b15f] kmem_freepages+0xd9/0xe2
[   48.326416]  [8027bd93] slab_destroy+0xef/0x114
[   48.331865]  [8027bf15] free_block+0x15d/0x19f
[   48.337227]  [8027c0bb] cache_flusharray+0x95/0xff
[   48.342933]  [8027c36a] kfree+0x1cd/0x1ec
[   48.347863]  [8038b233] skb_release_data+0xab/0xb0
[   48.353567]  [8038aff3] kfree_skbmem+0x11/0x7e
[   48.358927]  [8038b104] __kfree_skb+0xa4/0xa9
[   48.364204]  [803afa9e] tcp_read_sock+0x101/0x1ab
[   48.369823]  [803afb48] tcp_splice_data_recv+0x0/0x22
[   48.375791]  [803b01d5] tcp_splice_read+0xae/0x1a3
[   48.381497]  [8038920f] sock_def_readable+0x0/0x6f
[   48.387209]  [80384a65] sock_splice_read+0x15/0x17
[   48.392913]  [8029e773] do_splice_to+0x76/0x88
[   48.398273]  [8029fcc8] sys_splice+0x1a8/0x232
[   48.403636]  [802097ce] system_call+0x7e/0x83

-- 
Evgeniy Polyakov


[1/2] 2.6.22-rc4: known regressions with patches

2007-06-05 Thread Michal Piotrowski

Hi all,

Here is a list of some known regressions in 2.6.22-rc4
with patches available.

Feel free to add new regressions/remove fixed etc.
http://kernelnewbies.org/known_regressions



Block devices

Subject: loop devices limited to one single device
References : http://lkml.org/lkml/2007/5/16/229
Submitter  : Uwe Bugla [EMAIL PROTECTED]
Handled-By : Ken Chen [EMAIL PROTECTED]
Patch  : http://lkml.org/lkml/2007/5/21/483
Status : patch available



File systems

Subject: JFFS2 issues
References : http://lists.infradead.org/pipermail/linux-mtd/2007-May/018426.html
Submitter  : Haavard Skinnemoen [EMAIL PROTECTED]
Caused-By  : commit 10731f83009e2556f98ffa5c7c2cbffe66dacfb3
             Artem Bityutskiy [EMAIL PROTECTED]
Handled-By : Artem Bityutskiy [EMAIL PROTECTED]
Patch  : http://lists.infradead.org/pipermail/linux-mtd/2007-May/018453.html
Status : patch available



Memory management

Subject: bug in i386 MTRR initialization
References : http://lkml.org/lkml/2007/5/19/93
Submitter  : Andrea Righi [EMAIL PROTECTED]
Status : patch available



Networking

Subject: OOPS iproute2/tc/u32_destroy in 2.6.22-rc3-git6
References : http://lkml.org/lkml/2007/6/3/66
Submitter  : Strobl Anton [EMAIL PROTECTED]
Handled-By : Patrick McHardy [EMAIL PROTECTED]
Patch  : http://lkml.org/lkml/2007/6/3/137
Status : patch available



Regards,
Michal

--
What I missed most was your silence.
-- Andrzej Sapkowski, "Coś więcej" ("Something More")


Fwd: dma_skb_copy_datagram_iovec not EXPORTed

2007-06-05 Thread Sam Hopkins

Hello,

I'm forwarding along a message sent to kernelnewbies as I was directed
to address my question to this list.  I am not a member of netdev, so
please copy me on all discussion of this topic.

Cheers,

Sam

---

Greetings,

I am a developer of the kernel AoE (ATA over Ethernet) driver.  I am
attempting to add DMA offloading of data receive, following in TCP's
footsteps via the config option NET_DMA.  There is a function in
net/core/user_dma.c, dma_skb_copy_datagram_iovec, that appears to be
exactly what I need, but this function is not EXPORTed for use by
external modules.

Since the NET_DMA option is documented as "Network: TCP receive copy
offload", perhaps it was simply an oversight that anyone else might
benefit from it?

Cheers,

Sam


Socket hangs in read() when putting interface down

2007-06-05 Thread chris schlund
Using read() on a socket(AF_INET, SOCK_STREAM) connected via any network
interface (e.g. eth0) and putting this interface down (ifconfig) while
the socket is in read() causes a hang (the read() never returns).
Even if I install a sigaction signal handler, I get no signal when the
interface is removed.

This may be OK on a major interface like eth0, but it is a problem when
using a ppp interface via a GSM data or GPRS connection.
Killing pppd while the socket waits in read() produces the same
behaviour. A broken connection also looks the same.
It is reproducible on several 2.6 kernel versions, including the latest
stable 2.6.21.

In my understanding, ifconfig down should cause read() to fail with an error.

Unfortunately I am not very familiar with in-depth kernel programming
and I am still wondering about the magic inside the kernel:

ifconfig -> net/dev.c:dev_close() notifies a call chain which probably
could/should handle this.
But I cannot find the connection to the net/ipv4/tcp.c:tcp_recvmsg() function.
This is where I think my receive is blocking.

Can someone give me a hint where to look?

Thanks and kind regards
Chris


p.s.
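This happens only when the socket is in blocking mode.
I am using a simple example like:
s = socket(AF_INET, SOCK_STREAM, 0);
connect(s, (struct sockaddr *)&addr, sizeof(addr));
while (1) {
  read(s, buf, sizeof(buf));  /* never returns once the interface is down */
}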
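
For reference, one application-level way to keep such a loop from blocking forever is a receive timeout, optionally combined with TCP keepalive. The sketch below is not a statement about what the kernel should do on interface down. It only assumes the standard SO_RCVTIMEO and SO_KEEPALIVE socket options, and the `example_` helper names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Make blocking reads on fd fail with EAGAIN/EWOULDBLOCK after `seconds`. */
int example_set_recv_timeout(int fd, long seconds)
{
	struct timeval tv;

	tv.tv_sec = seconds;
	tv.tv_usec = 0;
	return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Ask the stack to probe an idle peer so a dead connection is eventually
 * reported as an error (the default idle time before probing is long). */
int example_enable_keepalive(int fd)
{
	int on = 1;

	return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* read() wrapper: returns what read() returns, flags a timeout separately. */
ssize_t example_read(int fd, void *buf, size_t len, int *timed_out)
{
	ssize_t n = read(fd, buf, len);

	*timed_out = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
	return n;
}

/* Self-check on a local socket pair: nothing is written, so the read
 * must give up after roughly one second instead of hanging. */
void example_timeout_self_check(void)
{
	int sv[2], timed_out = 0;
	char buf[16];

	assert(socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == 0);
	assert(example_set_recv_timeout(sv[0], 1) == 0);
	assert(example_read(sv[0], buf, sizeof(buf), &timed_out) == -1);
	assert(timed_out == 1);
	close(sv[0]);
	close(sv[1]);
}
```

In the loop above, a timed-out read would let the application decide whether to retry, check the link state, or close the socket.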




Re: [1/2] 2.6.22-rc4: known regressions with patches

2007-06-05 Thread David Woodhouse
On Tue, 2007-06-05 at 16:54 +0200, Michal Piotrowski wrote:
> File systems
> 
> Subject: JFFS2 issues
> References : http://lists.infradead.org/pipermail/linux-mtd/2007-May/018426.html
> Submitter  : Haavard Skinnemoen [EMAIL PROTECTED]
> Caused-By  : commit 10731f83009e2556f98ffa5c7c2cbffe66dacfb3
>              Artem Bityutskiy [EMAIL PROTECTED]
> Handled-By : Artem Bityutskiy [EMAIL PROTECTED]
> Patch  : http://lists.infradead.org/pipermail/linux-mtd/2007-May/018453.html
> Status : patch available
 

This is fixed in 2.6.22-rc4 with this commit:
http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ea55d30798ac206c9f584ac264b6b8eb093d237a

-- 
dwmw2



Re: e100 resume failure

2007-06-05 Thread Kok, Auke

Andrew Morton wrote:

I was doing some suspend-to-ram testing on the Vaio with the 2.6.22-rc3-mm1
lineup.  After 10 or 15 cycles a resume failed:

[  357.119436] Suspending device full
[  357.120450] Suspending device zero
[  358.084978] Suspending device port
[  358.085664] Suspending device null
[  358.086432] Suspending device kmem
[  358.087200] Suspending device mem
[  358.087975] Suspending device 00:09
[  358.088764] Suspending device 00:08
[  358.089546] Suspending device 00:07
[  358.090343] Suspending device 00:06
[  358.091125] Suspending device 00:05
[  358.091912] Suspending device 00:04
[  358.092702] Suspending device 00:03
[  358.093486] Suspending device 00:02
[  358.094272] Suspending device 00:01
[  358.095074] Suspending device 00:00
[  358.095863] Suspending device pnp0
[  358.096672] Suspending device :06
[  358.097482] Suspending device :07
[  358.098275] Suspending device :06:0b.0
[  358.101644] Suspending device :06:08.0
[18014750.543703] ATA: abnormal status 0x7F on port 0x000118af
[18014750.555105] e100: eth0: e100_exec_cb_wait: ucode load failed

and the machine hung.


I've been wondering about this one for a while but apart from a freak of nature 
nothing comes up and I have not seen this before. I'll try to keep an eye open 
for it.


Auke


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread Milton Miller


First, a question especially to Auke and Jeff:

Since this patch both reverts the broken change that is currently in 
-rc and creates the fixed driver, I'm not sure I like the subject 
stating "on ARM", although that is the feature of the rewrite, and was 
the intent of merging the previous patch.  This is actually a fix 
for all systems relative to current, including those where DMA is not 
cache coherent (unlike a simple revert).


Should we just put a comment about reverting the previous patch early 
in the change log?


Something like this:

Fix the e100 receiver handling, supporting cache incoherent DMA.

Discard the concept of setting the S (suspend) bit with the EL bit 
introduced in commit d52df4a35af569071fda3f4eb08e47cc7023f094.  In 
addition to it not setting either bit, the hardware doesn't work that 
way.



Thoughts?



Here is the changelog portion of the latest patch (quoted), with my 
comments thrown in:



On the ARM, there is a race condition between software allocating a 
new receive


On systems that have cache incoherent DMA, including ARM,

buffer and hardware writing into a buffer.  The two race on touching 
the last
Receive Frame Descriptor (RFD).  It has its el-bit set and its next 
link equal
to 0.  When hardware encounters this buffer it attempts to write data 
to it
and then update Status Word bits and Actual Count in the RFD.  At the 
same time
software may try to clear the el-bit and set the link address to a new 
buffer.

Since the entire RFD is one cache line, the two write operations can
collide.  This can lead to the receive unit stalling or freed receive 
buffers

getting written to.


This can lead to the receive unit stalling or interpreting random 
memory as its receive area.




The fix is to set the el-bit on and the size to 0 on the next to last 
buffer in the chain.  When the hardware encounters this buffer it stops 
and does not write to it at all.  The hardware issues an RNR interrupt 
with the receive unit in the No Resources state.  When software 
allocates buffers, it can update the tail of the list when either it 
knows the hardware has stopped or the one previous to the new one is 
marked.


Software can write to the tail of the list because it knows hardware 
will stop on the previous descriptor that was marked as the end of 
list.



Once it has a new next to last buffer prepared, it can clear the el-bit
and set the size on the previous one.  The race on this buffer is safe
since the link already points to a valid next buffer.


and we can handle the race setting the size (assuming aligned 16 bit 
writes are atomic with respect to the DMA read).



This paragraph changed from third person (the software or hardware) to 
first person (we).



  We keep flags
in our software descriptor to note if the el bit is set and if the size
was 0.  When we clear the RFD's el bit and set its size, we also clear
the el flag but we leave the size was 0 bit set.  This way we can find
this buffer again later.


This way software can identify them when the race may have occurred 
when cleaning the ring.  On these descriptors, it looks ahead and if 
the next one is complete then hardware must have skipped the current 
one.  Logic is added to prevent two packets in a row being marked while 
the receiver is running to avoid running in lockstep with the hardware 
and thereby limiting the required lookahead.



If the hardware sees the el-bit cleared without the size set, it will
move on to the next buffer and skip this one.  If it sees
the size set but the el-bit still set, it will complete that buffer
and then RNR interrupt and wait.


These sentences should be moved to the mention of the race above to 
reduce mixing descriptions of the hardware and the software.
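
To make the four hardware cases above concrete, here is a minimal user-space sketch. It is not the e100 driver code: the struct layout, the RFD_EL bit position, RFD_BUF_SIZE and all function names are assumptions chosen for illustration, and a real driver would also need the DMA sync and ordering that this model ignores.

```c
#include <assert.h>
#include <stdint.h>

#define RFD_EL       0x8000u  /* assumed position of the end-of-list bit */
#define RFD_BUF_SIZE 1518u    /* assumed receive buffer size */

/* Simplified receive frame descriptor; the field layout is illustrative. */
struct rfd {
	uint16_t status;
	uint16_t command;  /* carries the EL bit */
	uint32_t link;
	uint16_t actual;
	uint16_t size;
};

enum hw_action {
	HW_STOP,           /* EL set, size 0: stop without writing (RNR) */
	HW_SKIP,           /* EL clear, size 0: move on to the next RFD */
	HW_FILL_THEN_STOP, /* EL set, size set: complete it, then RNR */
	HW_FILL            /* EL clear, size set: complete it and continue */
};

/* What the hardware does with an RFD, per the cases described above. */
static enum hw_action hw_decide(const struct rfd *r)
{
	int el = (r->command & RFD_EL) != 0;

	if (r->size == 0)
		return el ? HW_STOP : HW_SKIP;
	return el ? HW_FILL_THEN_STOP : HW_FILL;
}

/* Make r the stopping point: size 0 and EL set. */
static void rfd_mark_stop(struct rfd *r)
{
	r->size = 0;
	r->command = (uint16_t)(r->command | RFD_EL);
}

/* Hand a former stopping point back: clear EL first, then restore size. */
static void rfd_release(struct rfd *r)
{
	r->command = (uint16_t)(r->command & ~RFD_EL);
	r->size = RFD_BUF_SIZE;
}
```

Whichever of the two writes in rfd_release() the hardware observes first, it lands in one of the enumerated cases, which is the point of keeping the link valid before touching either field.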



milton



Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread Milton Miller

On Jun 5, 2007, at 8:34 AM, David Acker wrote:

Milton Miller wrote:

On Jun 1, 2007, at 3:45 PM, David Acker wrote:
Ok, I took a stab at coding and testing these ideas.  Below is a 
patch against 2.6.22-rc3.

Let me know what you think.
I think you got most of the ideas.   As Auke noted, your coding style 
is showing again.   And your mailer again munged whitespace (fixed by 
s/^<space><space>/<space>/ s/^$/<space>/).
Sorry about the coding style.  I instinctively followed what was there 
instead of kernel coding convention.  I will look into how whitespace 
is getting screwed up.


I have to watch my coding style too (I like to indent the closing 
brace).


At least the white space damage seems to be reversible.  More than I 
can say for this mailer.



Find a buffer that is complete with rx-el not set and rx-s0 set.
It appears that hardware can read the rfd's el-bit, then 
software can clear the rfd el-bit and set the rfd size, and then 
hardware can come in and read the size.
Yes, since the size is after the EL flag in the descriptor, this can 
happen since the pci read is not atomic.
I am reading the status back, although I don't think that I have to 
in this instance.
Actually, you are reading it when the rfd still has EL set.  Since 
the cpu will never encounter that case, the if condition is never 
satisfied.
In my tests, every time I found a completed rfd with the el-bit set, 
the receiver was in the out of resources state.


Yes, if the EL was set, it would be a real hard race to find the 
completed packet with EL but not RNR.   I was trying to refer to where 
you find a completed packet and then check for EL in the RFD.  That is 
what I was claiming can not be observed by the cpu (unless the card 
writes the EL bit back, and not just the status u16).


If the unless ... above is true, then please put a comment that the 
device can write RFD-EL back to 1 if we raced.



How about creating a state unknown, for when we think we should check 
whether the device is running.
If we are in this state and then encounter a received packet without 
s0 set, we can set it back to running.   We set it when we rx a packet 
with s0 set.
We then move both io_status reads to the caller.

I can look into that as I clean this up.


I am testing a version of this code patched against 2.6.18.4 on my 
PXA 255 based system.  I will let you all know how it goes.
The testing I did so far did well.  I will try to get some more going 
tonight, hopefully on a cleaned up patch.


Good to hear our expectations match reality.




I'm assuming this is why the cleanup of the receiver start to always 
start on rx_to_clean got dropped again. :-)

Yep.  I will get that in the next patch.


Ok.

Also, I would like a few sentences in the Driver Operation section IV 
Receive big comment.  Something like
In order to keep updates to the RFD link field from colliding with 
hardware writes to mark packets complete, we use the feature that 
hardware will not write to a size 0 descriptor and mark the previous 
packet as end-of-list (EL).   After updating the link, we remove EL 
and only then restore the size such that hardware may use the 
previous-to-end RFD.
at the end of the first paragraph, and insert software before no 
locking is required in the second.

Sounds good to me.

I will see if I can get into a cleaned up patch today and get it out 
by tomorrow.  Thanks for dealing with me...I have been around kernel 
code for awhile but posting official patches to linux is new to me.

-Ack


I've just learned by watching the lists over the last several years.  
Well, and actually writing the odd patch here and there.


It occurs to me that I have been focusing on the code and not the 
changelog.   I'll send a separate reply on that thread shortly.


One more thing I'll state here ... as per the perfect patch guidelines [1], 
it is preferred that the meta-discussion about the patch and its 
history go after the change log, separated from it by a line of ---  
so that the patch application scripts can just extract the email 
subject as the title and through the first line of --- as the commit 
log.  (This saves some manual editing).


[1] http://kernelnewbies.org/UpstreamMerge

milton



Re: [RFC VLAN 01/10]: Fix off-by-ones in VLAN ID checks

2007-06-05 Thread Ben Greear

Patrick McHardy wrote:

[VLAN]: Fix off-by-ones in VLAN ID checks

The valid range of VLAN IDs is 0-4095, fix using ID 4095.
  

Per http://standards.ieee.org/getieee802/download/802.1Q-1998.pdf
page 69

0xFFF is reserved for future use, so we should not be using VLANs with 
that VID.
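
As a rough illustration of the check being discussed, a range test that rejects the reserved value could look like the sketch below. The names VLAN_VID_RESERVED and vlan_vid_usable are made up for this example and are not taken from the VLAN code; accepting VID 0 simply follows the range given in the patch description.

```c
#include <assert.h>
#include <stdbool.h>

#define VLAN_VID_RESERVED 0xFFFu  /* 4095, reserved per the 802.1Q reference above */

/* Hypothetical helper: true for VIDs 0..4094, false for 4095 and above. */
static bool vlan_vid_usable(unsigned int vid)
{
	return vid < VLAN_VID_RESERVED;
}
```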


Thanks,
Ben


--
Ben Greear [EMAIL PROTECTED] 
Candela Technologies Inc  http://www.candelatech.com





Re: [RFC VLAN 00/10]: VLAN netlink support try 2

2007-06-05 Thread Ben Greear

Patrick McHardy wrote:

The last VLAN patchset was outdated, sorry about the mixup. These are the
correct patches. The iproute patch I posted was correct, so no repost of
that one.

  
Is there any significant performance penalty in creating VLANs using 
these patches?


If you have a test station handy, could you let us know what 'time' 
shows for a script creating 4000 VLANs with the old vconfig/ioctl 
method vs. this new method?

I'll try to look through the rest of the patches in detail later today.

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED] 
Candela Technologies Inc  http://www.candelatech.com





Re: PROBLEM: network interface regularly stops working

2007-06-05 Thread Kacper Bielecki
 Can you try 2.6.22-rc3 (or -rc4) + patches ..0013 from
 http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.22-rc3/r8169-20070527

 Which motherboard is it ?

 Please Cc: netdev@vger.kernel.org

The motherboard is ASRock ConRoe945G-DVI
I will let you know when I try the new kernel.


Re: [PATCH 2.6.22 0/5] cxgb3 - bug fixes

2007-06-05 Thread Divy Le Ray

Divy Le Ray wrote:

Divy Le Ray wrote:

Hi Jeff,

I'm submitting a set of bug fixes for inclusion in 2.6.22.
The patches are built against Linus' git tree.



The mail title was incomplete. sorry about that.

Cheers,
Divy



Hi Jeff,

Did you get a chance to review the series I posted for driver fixes and 
updates ?


Cheers,
Divy


[BUG] UDP : bind() checks are not complete

2007-06-05 Thread Eric Dumazet
David

I discovered one big problem with UDP binding in 2.6.22-rc4 :

Consider you have eth0 with addr 192.168.0.1

Consider one UDP socket was bound to 192.168.0.1:32769. It will be stored on a 
slot != 1

Another UDP socket is created and bound to (0.0.0.0:0)

__udp_lib_get_port() is called with snum=0 and ANY_ADDR

We try to find a hash chain with the lowest count of sockets.

If we find an empty chain (slot=1 result=32769 for example), we consider we 
finished our checks.

This is not true since the last UDP lookup changes. 

If we allow the new socket to get port 32769, and source address of
outgoing message is set to 192.168.0.1, then answers 
(incoming messages to 192.168.0.1:32769) will go to first socket, because 

We really should check no socket is bound to XXX.XXX.XXX.XXX:32769. With 
current hashing, 
it means checking all slots in udptable[] :(

Our choices are :

1) Drop all those patches and re-think them for 2.6.23 eventually
2) Add the extra check for ANY_ADDR sockets and perform a full scan

What do you think ?
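
To illustrate option 2, here is a small user-space model of the problem. It is only a sketch: the table size of 128, the fixed-size chains, the hash in slot_for() and every name below are assumptions for illustration, not the actual udp hashing code. It shows a chain-only check missing a socket bound to a specific address, and a full scan catching it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS     128u  /* assumed table size */
#define CHAIN_MAX 8u    /* fixed chain capacity, for the sketch only */
#define ADDR_ANY  0u

struct bound { uint32_t addr; uint16_t port; };
struct slot  { struct bound socks[CHAIN_MAX]; size_t count; };

/* Hypothetical hash: wildcard sockets hash on port only, others mix in the address. */
static size_t slot_for(uint32_t addr, uint16_t port)
{
	uint32_t fold = addr ^ (addr >> 16);

	return (fold ^ port) & (SLOTS - 1u);
}

/* Record a bound socket; returns false if the chain is full. */
static bool table_add(struct slot *t, uint32_t addr, uint16_t port)
{
	struct slot *s = &t[slot_for(addr, port)];

	if (s->count >= CHAIN_MAX)
		return false;
	s->socks[s->count].addr = addr;
	s->socks[s->count].port = port;
	s->count++;
	return true;
}

/* Incomplete check: looks only at the chain a wildcard bind would land on. */
static bool port_free_in_chain(const struct slot *t, uint16_t port)
{
	const struct slot *s = &t[slot_for(ADDR_ANY, port)];

	for (size_t i = 0; i < s->count; i++)
		if (s->socks[i].port == port)
			return false;
	return true;
}

/* Option 2: a wildcard bind scans every slot for the port. */
static bool port_free_full_scan(const struct slot *t, uint16_t port)
{
	for (size_t n = 0; n < SLOTS; n++)
		for (size_t i = 0; i < t[n].count; i++)
			if (t[n].socks[i].port == port)
				return false;
	return true;
}
```

With 192.168.0.1:32769 added, port_free_in_chain() still reports 32769 as free because the wildcard chain (slot 1 in this model) is empty, while port_free_full_scan() finds the conflict at the cost of walking the whole table.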


Re: Fwd: dma_skb_copy_datagram_iovec not EXPORTed

2007-06-05 Thread Randy Dunlap
On Tue, 5 Jun 2007 10:56:58 -0400 Sam Hopkins wrote:

 Hello,
 
 I'm forwarding along a message sent to kernelnewbies as I was directed
 to address my question to this list.  I am not a member of netdev, so
 please copy me on all discussion of this topic.
 
 Cheers,
 
 Sam
 
 ---
 
 Greetings,
 
 I am a developer of the kernel AoE (ATA over Ethernet) driver.  I am
 attempting to add dma offloading of data receive, following in TCP's
 footsteps via config option NET_DMA.  There is a function in
 net/core/user_dma.c, dma_skb_copy_datagram_iovec, that appears to be
 exactly what I need, but this function is not EXPORTed for use by
 external modules.
 
 Since the NET_DMA option is documented as Network: TCP receive copy
 offload perhaps it's just an oversight that anyone else might benefit
 from it?

Makes sense to me.  Maybe someone in the hierarchy will agree,
or you may need to just send a patch and see if it is accepted.


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread Kok, Auke

Milton Miller wrote:

First, a question especially to Auke and Jeff:

Since this patch both reverts the broken change that is currently in 
-rc and creates the fixed driver, I'm not sure I like the subject 
stating "on ARM", although that is the feature of the rewrite, and was 
the intent of merging the previous patch.  This is actually a fix 
for all systems relative to current, including those where dma is not 
cache coherent, (unlike a simple revert).


Should we just put a comment about reverting the previous patch early 
in the change log?


yes


Something like this:

Fix the e100 receiver handling, supporting cache incoherent DMA.

Discard the concept of setting the S (suspend) bit with the EL bit 
introduced in commit d52df4a35af569071fda3f4eb08e47cc7023f094.  In 
addition to it not setting either bit, the hardware doesn't work that 
way.



Thoughts?


the same comment I made about the coding style counts for this too: I will clean 
up the patch and gladly adjust the topic, which in this case seems the right 
thing to do. I am too grateful that you guys are digging into this so deeply to 
send you back with comments on style - I'll gladly fix that up :)


Here is the changelog portion of the latest patch (quoted), with my 
comments thrown in:


OK, I will buffer this info and make sure this gets picked up on the final 
version.

this opens up another question:

We need to make sure that now that we're getting closer to 2.6.22 we don't end 
up killing e100 in it. Should we drop the current fixes in it to be on the safe 
side and aim for 2.6.23? I would hate to see an untested codepath breaking e100 
on something like ppc or mips... that will be very painful


Jeff, your thoughts on that?

Auke


Re: Socket hangs in read() when putting interface down

2007-06-05 Thread Rick Jones

chris schlund wrote:

Using read() on a socket(AF_INET, SOCK_STREAM) connected via any network
interface (e.g eth0) and putting this interface down (ifconfig) while
the socket is in read() produces a hanger (the read() will not return
anymore).
Even if I install sigaction signal handler I got no signal when the
interface is removed.


I'd not think there was a direct tie-in between an application signal 
handler and an interface being downed.



This may be ok on a major interface like eth0 but it is a problem when
using a ppp interface via GSM data or GPRS connection.
Killing the pppd while the socket waits in read() produces the same
behaviour. Also a broken connection looks the same.
It is reproducible on several 2.6 kernel versions including the latest
stable 2.6.21.

In my understanding ifconfig down should lead to a read() error.


Why?  Can't traffic for the endpoint arrive via some other interface on 
the system?  I would think that getting out of a blocking read() 
requires the application to either have a timer running (and so get out 
of read() with EINTR) or to have keepalives enabled on the TCP 
connection (I'm assuming TCP for an AF_INET, SOCK_STREAM although I 
suppose it could be SCTP)
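
For the keepalive route, a minimal sketch on Linux might look like the following. The helper name enable_keepalive and the example timings are assumptions; SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the standard Linux socket options. Once the probes go unanswered, a blocked read() should return with an error rather than wait forever.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Turn on TCP keepalives for fd.
 * idle_s:  seconds of idle time before the first probe
 * intvl_s: seconds between probes
 * probes:  unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno set by setsockopt).
 */
static int enable_keepalive(int fd, int idle_s, int intvl_s, int probes)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
		return -1;
	return 0;
}
```

Something like enable_keepalive(fd, 30, 5, 3) would be called once after connect(); the numbers are only an example and would need tuning for a GSM or GPRS link.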


rick jones


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread David Acker

Jeff Garzik wrote:

On Tue, Jun 05, 2007 at 10:27:19AM -0700, Kok, Auke wrote:
We need to make sure that now that we're getting closer to 2.6.22 we don't 
end up killing e100 in it. Should we drop the current fixes in it to be on 
the safe side and aim for 2.6.23? I would hate to see an untested codepath 
breaking e100 on something like ppc or mips... that will be very painful


I certainly agree with this assessment...

I've been wondering if, based on all this recent work, we should revert
the s-bit stuff and wait for 2.6.23.


I think so.  It will be no worse than it was and make the patch that 
fixes it clearer.

-Ack


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread Kok, Auke

Jeff Garzik wrote:

On Tue, Jun 05, 2007 at 10:27:19AM -0700, Kok, Auke wrote:
We need to make sure that now that we're getting closer to 2.6.22 we don't 
end up killing e100 in it. Should we drop the current fixes in it to be on 
the safe side and aim for 2.6.23? I would hate to see an untested codepath 
breaking e100 on something like ppc or mips... that will be very painful


I certainly agree with this assessment...

I've been wondering if, based on all this recent work, we should revert
the s-bit stuff and wait for 2.6.23.


Yes, that's my point. If Milton and David agree I think we should do so 
immediately.

If so, do you want me to write a revert-patch or do you have some magic to do 
that for me?


Auke


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread Milton Miller


On Jun 5, 2007, at 12:43 PM, Kok, Auke wrote:


Jeff Garzik wrote:

On Tue, Jun 05, 2007 at 10:27:19AM -0700, Kok, Auke wrote:
We need to make sure that now that we're getting closer to 2.6.22 we 
don't end up killing e100 in it. Should we drop the current fixes in 
it to be on the safe side and aim for 2.6.23? I would hate to see an 
untested codepath breaking e100 on something like ppc or mips... 
that will be very painful

I certainly agree with this assessment...
I've been wondering if, based on all this recent work, we should 
revert the s-bit stuff and wait for 2.6.23.


Yes, that's my point. If Milton and David agree I think we should do 
so immediately.


We definitely need something other than what is in now.


If so, do you want me to write a revert-patch or do you have some 
magic to do that for me?




The simple git revert won't work because there have been other changes 
(ioread for instance) that conflict.


milton



Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-05 Thread Jeff Garzik
On Tue, Jun 05, 2007 at 10:27:19AM -0700, Kok, Auke wrote:
 We need to make sure that now that we're getting closer to 2.6.22 we don't 
 end up killing e100 in it. Should we drop the current fixes in it to be on 
 the safe side and aim for 2.6.23? I would hate to see an untested codepath 
 breaking e100 on something like ppc or mips... that will be very painful

I certainly agree with this assessment...

I've been wondering if, based on all this recent work, we should revert
the s-bit stuff and wait for 2.6.23.

Jeff





Re: [PATCH]: Add security check before flushing SAD/SPD

2007-06-05 Thread Joy Latten
On Mon, 2007-06-04 at 15:44 -0400, James Morris wrote:
 On Mon, 4 Jun 2007, Eric Paris wrote:
 
  Some time ago this thread bounced back and forth and seemed to come to
  rest with the patch below, I cleaned up the comments and put all the
  ACKs it received in one place below, but so much time has passed I doubt
  if they should still count for free.  I also rediffed the patch against
  the latest miller tree.  Is the idea or patch in any way flawed or
  unacceptable to people at the moment?
  
  Anyone willing to step up and re-ack the patch to get it moving into the
  tree?
 
 Looks good to me.
 
 Acked-by: James Morris [EMAIL PROTECTED]
 
 

I have also tested with 2.6.22-rc3-git7 and all appears to be working as
expected. 

Acked-by: Joy Latten [EMAIL PROTECTED]




Re: [RFC VLAN 01/10]: Fix off-by-ones in VLAN ID checks

2007-06-05 Thread Patrick McHardy
Ben Greear wrote:
 Patrick McHardy wrote:
 
 [VLAN]: Fix off-by-ones in VLAN ID checks

 The valid range of VLAN IDs is 0-4095, fix using ID 4095.
   
 
 Per http://standards.ieee.org/getieee802/download/802.1Q-1998.pdf
 page 69
 
 0xFFF is reserved for future use, so we should not be using VLANs with
 that VID.


Thanks, I just noticed id 4095 didn't work and assumed it was a bug.
I'm going to drop that patch.


Re: [RFC VLAN 00/10]: VLAN netlink support try 2

2007-06-05 Thread Patrick McHardy
Ben Greear wrote:
 Patrick McHardy wrote:
 
 The last VLAN patchset was outdated, sorry about the mixup. These are the
 correct patches. The iproute patch I posted was correct, so no repost of
 that one.

   
 
 Is there any significant performance penalty in creating VLANs using
 these patches?
 
 If you have a test station handy, could you let us know what 'time'
 shows for a script
 creating 4000 VLANs with the old vconfig/ioctl method v/s this new method?


Not significantly more, for 1000 VLANs I get:

ip link add:

real    0m22.836s
user    0m0.100s
sys     0m5.850s

vconfig add:

real    0m19.739s
user    0m0.090s
sys     0m3.600s


ip -b (batch add):

real    0m5.239s
user    0m0.280s
sys     0m3.480s

 I'll try to look through the rest of the patches in detail later today.

Thanks.


Re: [RFC] Failover-friendly TCP retransmission

2007-06-05 Thread Andi Kleen
 Your suggestion, to utilize NET_XMIT_* code returned from an
 underlying layer, is done in tcp_transmit_skb.
 
 But my problem is that tcp_transmit_skb is not called during a
 certain period of time.  So I'm suggesting to cap RTO value so
 that tcp_transmit_skb gets called more frequently.

The transmit code controls the transmission timeout. Or at least
it could change it if it really wanted.

What I wanted to say is: if the loss still happens under control
of the sending end device and TCP knows this then it could change
the retransmit timer to fire earlier or even just wait for an 
event from the device that tells it to retransmit early.

I admit I have not thought through all the implications of this,
but it would seem to me a better approach than capping RTO or
doing other intrusive TCP changes.

The problem with capping RTO is that when there is a loss
in the network for some other reasons (and there is no reason
bonding can't be used when talking to the internet) you
might be too aggressive or not aggressive enough anymore
to get the data through.

But if you only change behaviour when you detect a local
loss this cannot happen.

Just using a very short timeout of one jiffy on local loss might work 
(the stack already does this sometimes). Upcalls would be more 
complicated and might have some bad side effects (like not 
interacting well with qdiscs or possibly being unfair if there 
are a lot of sockets). But that might be solvable too. 

In virtualized environments it might also be necessary 
to pass NET_XMIT_* through the paravirtual driver interface.
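
Very roughly, the timer choice I have in mind is sketched below. This is not code from the TCP stack: RTO_MAX_JIFFIES and next_retransmit_timeout are invented for illustration, and the local_loss flag stands in for whatever would be derived from the NET_XMIT_* return code.

```c
#include <assert.h>
#include <stdbool.h>

#define RTO_MAX_JIFFIES 120000u  /* hypothetical upper bound for the backoff */

/*
 * Pick the next retransmit timeout, in jiffies.
 * local_loss: the packet was dropped before leaving the sending host.
 */
static unsigned int next_retransmit_timeout(unsigned int rto, bool local_loss)
{
	if (local_loss)
		return 1;                   /* retry after one jiffy */
	if (rto > RTO_MAX_JIFFIES / 2)
		return RTO_MAX_JIFFIES;     /* clamp the usual doubling */
	return rto * 2;                     /* ordinary exponential backoff */
}
```

Loss in the network still gets the normal backoff; only the locally detected case is shortened.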

-Andi



Re: [RFC RTNETLINK 00/09]: Netlink link creation API

2007-06-05 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue,  5 Jun 2007 16:12:51 +0200 (MEST)

 A few words about the API:
 
 Drivers wishing to use the API register a struct rtnl_link_ops, which
 contains a few function pointers for device setup, registration, changing
 and deletion, as well as netlink attribute validation and device dumping.
 
 All netlink communication happens within the AF_UNSPEC family. I
 initially introduced new netlink families for this, but removed them
 again since that would require adding new protocol families that serve
 no further purpose for most drivers. Additionally we currently use
 RTM.*LINK messages with ifi_family != AF_UNSPEC for information that
 is related to the device, but doesn't come from the driver that created
 the device itself, like bridge port state, IPv6 device configuration etc.
 
 The device specific attributes are nested within a new attribute
 IFLA_LINKINFO. I didn't use IFLA_PROTINFO since userspace can reasonably
 expect to have IFLA_PROTINFO unset for AF_UNSPEC messages, and the
 userspace STP daemon does that. Identification of the driver happens
 by name, stored in the IFLA_INFO_NAME attribute. IFLA_INFO_DATA contains
 driver specific attributes, IFLA_INFO_XSTATS driver specific statistics.
 
 The API does *not* use the existing RTM_SETLINK message type, instead
 it adds support for receiving RTM_NEWLINK within the kernel. I did this
 because of three reasons: 
 
 - RTM_SETLINK does not follow the usual rtnetlink conventions and ignores
   all netlink flags
 
 - Other rtnetlink subsystems use the same message type for dumps and
   notifications from the kernel as for configuration from userspace,
   which usually allows to recreate an object by simply setting the
   NLM_F_REQUEST flag on message received from the kernel and sending
   it back.
 
 - Easier for userspace to detect support for the new features
 
 The RTM_NEWLINK message type is a superset of RTM_SETLINK, it allows
 to change both driver specific and generic attributes of the device.
 The set of generic device attributes that may be supplied during
 device creation is limited to a few simple ones, it currently does
 not support specifying link layer address/broadcast address as well
 as device flags. The change operation can change all device attributes.
 
 Not sure what else to say .. comments welcome.

This excellent description of the APIs (particularly the background
and reasoning) belongs in a file under Documentation/networking/ :-)
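
As a starting point for such a document, the nesting of driver attributes inside IFLA_LINKINFO could be illustrated with a small user-space sketch like the one below. It only mimics the length/type header and 4-byte alignment of netlink attributes; the type numbers, struct names and helper functions are placeholders invented for this example and are not the rtnetlink definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATTR_ALIGN(n) (((n) + 3) & ~(size_t)3)

/* Placeholder attribute type numbers, for illustration only. */
enum { X_IFLA_LINKINFO = 100, X_INFO_NAME = 1, X_INFO_DATA = 2 };

struct attr_hdr { uint16_t len; uint16_t type; };
struct msgbuf   { unsigned char data[256]; size_t used; };

/* Append one attribute; returns the offset of its header, or (size_t)-1 if it does not fit. */
static size_t attr_put(struct msgbuf *m, uint16_t type, const void *payload, size_t plen)
{
	size_t len = sizeof(struct attr_hdr) + plen;
	size_t off = m->used;
	struct attr_hdr h;

	if (len > UINT16_MAX || off + ATTR_ALIGN(len) > sizeof(m->data))
		return (size_t)-1;
	h.len = (uint16_t)len;
	h.type = type;
	memset(m->data + off, 0, ATTR_ALIGN(len));   /* zero the padding too */
	memcpy(m->data + off, &h, sizeof(h));
	if (plen)
		memcpy(m->data + off + sizeof(h), payload, plen);
	m->used = off + ATTR_ALIGN(len);
	return off;
}

/* Open a nested attribute: an empty header whose length is fixed up later. */
static size_t attr_nest_start(struct msgbuf *m, uint16_t type)
{
	return attr_put(m, type, NULL, 0);
}

/* Close a nest: its length covers everything appended since attr_nest_start(). */
static void attr_nest_end(struct msgbuf *m, size_t nest_off)
{
	struct attr_hdr h;

	memcpy(&h, m->data + nest_off, sizeof(h));
	h.len = (uint16_t)(m->used - nest_off);
	memcpy(m->data + nest_off, &h, sizeof(h));
}
```

A request for a dummy device would open the X_IFLA_LINKINFO nest, put the driver name in X_INFO_NAME, optionally add a X_INFO_DATA nest with driver specific attributes, and close the nest again.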


Re: [RFC NETLINK 01/09]: Mark netlink policies const

2007-06-05 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue,  5 Jun 2007 16:12:52 +0200 (MEST)

 [NETLINK]: Mark netlink policies const
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

This one looks good enough for 2.6.22, as I consider more
accurate typing a way to prevent and find bugs.

So I'll apply this one now, thanks!


Re: [RFC RTNETLINK 02/09]: ifindex 0 does not exist

2007-06-05 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue,  5 Jun 2007 16:12:54 +0200 (MEST)

 [RTNETLINK]: ifindex 0 does not exist
 
 ifindex == 0 does not exist and implies we should do a lookup by name if
 one was given.
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

This one is also more like a bug fix, thus applied.

I thought for a moment that perhaps we should signal an
error on ifindex==0 but I think this behavior you choose
makes more sense.
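
In other words, the lookup behaves roughly like the sketch below. This is a user-space model with made-up names (struct dev, dev_lookup), not the rtnetlink code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dev { int ifindex; const char *name; };

/*
 * Find a device: a positive ifindex wins; ifindex 0 means "not given",
 * so fall back to the name if one was supplied.
 */
static const struct dev *dev_lookup(const struct dev *devs, size_t n,
				    int ifindex, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (ifindex > 0) {
			if (devs[i].ifindex == ifindex)
				return &devs[i];
		} else if (name && strcmp(devs[i].name, name) == 0) {
			return &devs[i];
		}
	}
	return NULL;
}
```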


Re: [RFC RTNETLINK 03/09]: Split up rtnl_setlink

2007-06-05 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue,  5 Jun 2007 16:12:55 +0200 (MEST)

 [RTNETLINK]: Split up rtnl_setlink
 
 Split up rtnl_setlink into a function performing validation and a function
 performing the actual changes. This allows to share the modification logic
 with rtnl_newlink, which is introduced by the next patch.
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

This looks fine to me.


Re: [RFC RTNETLINK 04/09]: Link creation API

2007-06-05 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue,  5 Jun 2007 16:12:57 +0200 (MEST)

 [RTNETLINK]: Link creation API
 
 Add rtnetlink API for creating, changing and deleting software devices.
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

Looks mostly fine, perhaps you can make even more use of 'const'
for instances of struct rtnl_link_ops * at least as function
arguments deeper in the implementation.


Re: [RFC DUMMY 05/09]: Use dev->stats

2007-06-05 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue,  5 Jun 2007 16:12:59 +0200 (MEST)

 [DUMMY]: Use dev->stats
 
 Use dev->stats instead of netdev_priv().
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

Looks good.


Re: [RFC DUMMY 06/09]: Keep dummy devices on list

2007-06-05 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue,  5 Jun 2007 16:13:00 +0200 (MEST)

 [DUMMY]: Keep dummy devices on list
 
 Use a list instead of an array to allow creating new devices.
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

Looks good.

