Re: [Bugme-new] [Bug 8057] New: slab corruption running ip6sic

2007-04-23 Thread Jarek Poplawski
On Fri, Apr 20, 2007 at 04:35:15PM -0700, David Miller wrote:
> From: Jarek Poplawski [EMAIL PROTECTED]
> Date: Mon, 12 Mar 2007 11:24:03 +0100
>
> > > the ipcomp handler is xfrm6_rcv(), which calls xfrm6_rcv_spi(), which
> > > contrary to all other handlers returns -1 instead of 0 after calling
> > > kfree_skb() on the skb. Changing the return value to 0 in
> > > xfrm6_input.c:xfrm6_rcv_spi() fixes the problem.
> > > But I got no clue at all if this would be a correct fix
> >
> > I think your diagnosis is correct (all return -1 should be
> > changed to return 0 in xfrm6_input.c).

I have already corrected this:

Sorry! Of course should be:
I think your diagnosis is correct (all return -1 should be
changed to return 0 in xfrm6_rcv_spi()).

It's just like Eric diagnosed:

xfrm6_rcv() calls tunnel6_rcv(), which calls handlers->handler()
and if handler() returns anything but 0, skb is kfreed. But
handler: xfrm6_tunnel_rcv() calls xfrm6_rcv_spi() and returns its
return without changing, which is only 1 and -1. It seems, in
every -1 case skb is kfreed by xfrm6_rcv_spi() or by functions
called by it, probably meaning skb was handled (delivered or
kfreed). The only path where skb is not kfreed returns 1.

tunnel6_rcv() treats both returns the same way - so some skbs
are kfreed 2 times.
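
The ownership problem described above can be shown with a small userspace toy. This is only a sketch of the reasoning, not kernel code: struct toy_skb, toy_kfree_skb(), toy_rcv_spi() and toy_tunnel_rcv() are made-up stand-ins, and "freeing" merely bumps a counter so that a double free becomes visible.

```c
#include <assert.h>

/* Toy stand-in for struct sk_buff: counts how often it was "freed". */
struct toy_skb {
	int free_count;
};

/* Stand-in for kfree_skb(); it only counts, so a double free is observable. */
static void toy_kfree_skb(struct toy_skb *skb)
{
	skb->free_count++;
}

/* Models the error path of xfrm6_rcv_spi(): it frees the skb itself and
 * then reports `ret` to its caller (-1 before the fix, 0 after it). */
static int toy_rcv_spi(struct toy_skb *skb, int ret)
{
	toy_kfree_skb(skb);
	return ret;
}

/* Models the caller as described in the message: any nonzero return is
 * taken as "not consumed", so the caller frees the skb as well.
 * Returns how many times the skb ended up being freed. */
static int toy_tunnel_rcv(struct toy_skb *skb, int handler_ret)
{
	if (toy_rcv_spi(skb, handler_ret) != 0)
		toy_kfree_skb(skb);
	return skb->free_count;
}
```

With a return value of -1 the toy skb is freed twice, with 0 only once, which is the behaviour the proposed change relies on.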

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Getting the new RxRPC patches upstream

2007-04-23 Thread David Howells

> We only care when del_timer() returns true. In that case, if the timer
> function still runs (possible for single-threaded wqs), it has already
> passed __queue_work().

Why do you assume that?

David


Suspicious fackets_out handling

2007-04-23 Thread Ilpo Järvinen
Hi,


While looking through the users of fackets_out, I found this in
tcp_fragment(...):

	/* If this packet has been sent out already, we must
	 * adjust the various packet counters.
	 */
	if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq)) {
		int diff = old_factor - tcp_skb_pcount(skb) -
			tcp_skb_pcount(buff);

		[...snip...]

		if (diff > 0) {
			/* Adjust Reno SACK estimate. */
			if (!tp->rx_opt.sack_ok) {
				tp->sacked_out -= diff;
				if ((int)tp->sacked_out < 0)
					tp->sacked_out = 0;
				tcp_sync_left_out(tp);
			}

			tp->fackets_out -= diff;
			if ((int)tp->fackets_out < 0)
				tp->fackets_out = 0;
		}
	}
[...]

There are IMHO two problems in it. First of all, nothing ensures that the
skb TCP is fragmenting is actually below the forwardmost SACK block (and
thus is included in fackets_out)... What I'm not sure of, though, is how
to fix this in net-2.6(.22), because there is no pointer/seq that can be
used to test for it, unlike tcp-2.6, which has highest_sack. The second
problem is even more obvious: if the adjustment here is done and the
sacktag code then uses the fastpath when the next ACK arrives, the sacktag
code will use a stale value from fastpath_cnt_hint and fail to notice all
the math TCP did here, unless we start clearing fastpath_skb_hint.
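
The second problem can be pictured with a deliberately tiny toy. It is not the kernel's sacktag code and makes no claim about its exact control flow: struct toy_tp and both functions are hypothetical, and the only thing modelled is "a cached count is reused unless the cached position was cleared".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the relevant tcp_sock fields. */
struct toy_tp {
	int fackets_out;       /* counter adjusted by the fragmenting path */
	const void *skb_hint;  /* stands in for fastpath_skb_hint */
	int cnt_hint;          /* stands in for fastpath_cnt_hint */
};

/* Fragmenting path: lower the counter by diff, clamped at zero.
 * When clear_hint is nonzero, the cached fast-path position is dropped too. */
static void toy_fragment_adjust(struct toy_tp *tp, int diff, int clear_hint)
{
	tp->fackets_out -= diff;
	if (tp->fackets_out < 0)
		tp->fackets_out = 0;
	if (clear_hint)
		tp->skb_hint = NULL;
}

/* Next ACK: the fast path resumes from the cached count when a hint is
 * present; otherwise the caller-supplied fresh recount is used. */
static int toy_sacktag_start_count(const struct toy_tp *tp, int recount)
{
	return tp->skb_hint ? tp->cnt_hint : recount;
}
```

In this toy, adjusting the counter without clearing the hint leaves the fast path starting from the old count; clearing the hint forces the recount.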


-- 
 i.


[XFRM]: beet: fix pseudo header length value

2007-04-23 Thread Patrick McHardy
[XFRM]: beet: fix pseudo header length value

draft-nikander-esp-beet-mode-07.txt is not entirely clear on how the length
value of the pseudo header should be calculated. It states "The Header Length
field contains the length of the pseudo header, IPv4 options, and padding in
8 octets units.", but also states "Length in octets (Header Len + 1) * 8".
draft-nikander-esp-beet-mode-08-pre1.txt [1] clarifies this: the header length
should not include the first 8 bytes.

This change affects backwards compatibility, but option encapsulation didn't
work until very recently anyway.
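
As a worked example of the two readings, here is a small userspace sketch of the arithmetic. It rests on assumptions that are not spelled out in this message: a 4-byte struct ip_beet_phdr, IPV4_BEET_PHMAXLEN equal to 8, option lengths that are multiples of 4, and the pad computation taken to be 4 - (optlen & 4). All toy_* names are made up for illustration.

```c
#include <assert.h>

/* Assumed sizes (not stated in the message): sizeof(struct ip_beet_phdr)
 * and IPV4_BEET_PHMAXLEN. */
enum { TOY_PH_SIZE = 4, TOY_PHMAXLEN = 8 };

/* Padding length, assuming the expression is 4 - (optlen & 4). */
static int toy_padlen(int optlen)
{
	return 4 - (optlen & 4);
}

/* Bytes actually placed on the wire: pseudo header, padding, options. */
static int toy_total_len(int optlen)
{
	return TOY_PH_SIZE + toy_padlen(optlen) + optlen;
}

/* Old computation: counts the whole thing in 8-octet units. */
static int toy_hdrlen_old(int optlen)
{
	return (optlen + toy_padlen(optlen) + TOY_PH_SIZE) / 8;
}

/* New computation from the patch: the first 8 bytes are not counted. */
static int toy_hdrlen_new(int optlen)
{
	return optlen / 8;
}

/* Input side after the patch: recover optlen from hdrlen and padlen. */
static int toy_optlen_from_wire(int hdrlen, int padlen)
{
	int phlen = TOY_PH_SIZE + padlen;

	return hdrlen * 8 + (TOY_PHMAXLEN - phlen);
}
```

Under these assumptions, 12 bytes of options give a wire length of 16 bytes; the old code would store 2, the new code stores 1, and (1 + 1) * 8 matches the 16 bytes, as the second statement in the draft requires.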

[1] http://users.piuha.net/jmelen/BEET/draft-nikander-esp-beet-mode-08-pre1.txt

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 4b33b63040d82c55a90eb8ff2d6054bf719514f6
tree c23f5b82272fcef02e804314971b55728185cab7
parent 0f851021c0f91e5073fa89f26b5ac68e23df8e11
author Patrick McHardy [EMAIL PROTECTED] Mon, 23 Apr 2007 13:26:27 +0200
committer Patrick McHardy [EMAIL PROTECTED] Mon, 23 Apr 2007 13:26:27 +0200

 net/ipv4/xfrm4_mode_beet.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/xfrm4_mode_beet.c b/net/ipv4/xfrm4_mode_beet.c
index f68dfd8..d419e15 100644
--- a/net/ipv4/xfrm4_mode_beet.c
+++ b/net/ipv4/xfrm4_mode_beet.c
@@ -52,7 +52,7 @@ static int xfrm4_beet_output(struct xfrm_state *x, struct sk_buff *skb)
 
 		ph = (struct ip_beet_phdr *)skb->h.raw;
 		ph->padlen = 4 - (optlen & 4);
-		ph->hdrlen = (optlen + ph->padlen + sizeof(*ph)) / 8;
+		ph->hdrlen = optlen / 8;
 		ph->nexthdr = top_iph->protocol;
 		if (ph->padlen)
 			memset(ph + 1, IPOPT_NOP, ph->padlen);
@@ -85,7 +85,7 @@ static int xfrm4_beet_input(struct xfrm_state *x, struct sk_buff *skb)
 		ph = (struct ip_beet_phdr *)(skb->h.ipiph + 1);
 
 		phlen = sizeof(*ph) + ph->padlen;
-		optlen = ph->hdrlen * 8 - phlen;
+		optlen = ph->hdrlen * 8 + (IPV4_BEET_PHMAXLEN - phlen);
 		if (optlen < 0 || optlen & 3 || optlen > 250)
 			goto out;
 


[PATCH/RESEND] ehea: fix for dlpar and sysfs entries

2007-04-23 Thread Jan-Bernd Themann
This patch includes:
- dlpar fix: 
certain resources may only be allocated when first
logical port is available, and must be removed when
last logical port has been removed

- sysfs entries:
create symbolic link from each logical port to ehea driver
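
The dlpar rule above (set up the shared resource with the first logical port, tear it down with the last) can be sketched in userspace as follows. This is a toy under stated assumptions, not the driver code: struct toy_adapter, TOY_MAX_PORTS and the helper names are invented, and the "resource" is just a flag.

```c
#include <assert.h>

enum { TOY_MAX_PORTS = 4 };   /* made-up limit for the sketch */

struct toy_adapter {
	int port_present[TOY_MAX_PORTS];
	int shared_registered;    /* stands in for the adapter-wide resource */
};

/* Returns 1 if at least one logical port currently exists. */
static int toy_any_port(const struct toy_adapter *a)
{
	int i;

	for (i = 0; i < TOY_MAX_PORTS; i++)
		if (a->port_present[i])
			return 1;
	return 0;
}

/* Adding a port: acquire the shared resource only for the first port. */
static void toy_add_port(struct toy_adapter *a, int idx)
{
	if (!toy_any_port(a))
		a->shared_registered = 1;
	a->port_present[idx] = 1;
}

/* Removing a port: release the shared resource only after the last one. */
static void toy_remove_port(struct toy_adapter *a, int idx)
{
	a->port_present[idx] = 0;
	if (!toy_any_port(a))
		a->shared_registered = 0;
}
```

The scan over all ports mirrors the style of the helpers in the patch; a plain counter would work just as well in a toy like this.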

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
---


This patch applies on top of the netdev upstream branch for 2.6.22



diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h
index 1405d0b..173994d 100644
--- a/drivers/net/ehea/ehea.h
+++ b/drivers/net/ehea/ehea.h
@@ -39,7 +39,7 @@ #include <asm/abs_addr.h>
 #include <asm/io.h>
 
 #define DRV_NAME	"ehea"
-#define DRV_VERSION	"EHEA_0055"
+#define DRV_VERSION	"EHEA_0056"
 
 #define EHEA_MSG_DEFAULT (NETIF_MSG_LINK | NETIF_MSG_TIMER \
 	| NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR)
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index a36a023..f9f3133 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -78,6 +78,28 @@ MODULE_PARM_DESC(sq_entries, " Number of
 		 __MODULE_STRING(EHEA_DEF_ENTRIES_SQ) ")");
 MODULE_PARM_DESC(use_mcs, " 0:NAPI, 1:Multiple receive queues, Default = 1 ");
 
+static int port_name_cnt = 0;
+
+static int __devinit ehea_probe_adapter(struct ibmebus_dev *dev,
+const struct of_device_id *id);
+
+static int __devexit ehea_remove(struct ibmebus_dev *dev);
+
+static struct of_device_id ehea_device_table[] = {
+	{
+		.name = "lhea",
+		.compatible = "IBM,lhea",
+	},
+	{},
+};
+
+static struct ibmebus_driver ehea_driver = {
+	.name = "ehea",
+	.id_table = ehea_device_table,
+	.probe = ehea_probe_adapter,
+	.remove = ehea_remove,
+};
+
 void ehea_dump(void *adr, int len, char *msg) {
int x;
unsigned char *deb = adr;
@@ -2108,6 +2130,28 @@ static int ehea_clean_all_portres(struct
return ret;
 }
 
+static void ehea_remove_adapter_mr (struct ehea_adapter *adapter)
+{
+	int i;
+
+	for (i=0; i < EHEA_MAX_PORTS; i++)
+		if (adapter->port[i])
+			return;
+
+	ehea_rem_mr(&adapter->mr);
+}
+
+static int ehea_add_adapter_mr (struct ehea_adapter *adapter)
+{
+	int i;
+
+	for (i=0; i < EHEA_MAX_PORTS; i++)
+		if (adapter->port[i])
+			return 0;
+
+	return ehea_reg_kernel_mr(adapter, &adapter->mr);
+}
+
 static int ehea_up(struct net_device *dev)
 {
 	int ret, i;
@@ -2361,6 +2405,34 @@ static void __devinit logical_port_relea
 	of_node_put(port->ofdev.node);
 }
 
+static int ehea_driver_sysfs_add(struct device *dev,
+				 struct device_driver *driver)
+{
+	int ret;
+
+	ret = sysfs_create_link(&driver->kobj, &dev->kobj,
+				kobject_name(&dev->kobj));
+	if (ret == 0) {
+		ret = sysfs_create_link(&dev->kobj, &driver->kobj,
+					"driver");
+		if (ret)
+			sysfs_remove_link(&driver->kobj,
+					  kobject_name(&dev->kobj));
+	}
+	return ret;
+}
+
+static void ehea_driver_sysfs_remove(struct device *dev,
+				     struct device_driver *driver)
+{
+	struct device_driver *drv = driver;
+
+	if (drv) {
+		sysfs_remove_link(&drv->kobj, kobject_name(&dev->kobj));
+		sysfs_remove_link(&dev->kobj, "driver");
+	}
+}
+
 static struct device *ehea_register_port(struct ehea_port *port,
 struct device_node *dn)
 {
@@ -2368,8 +2440,9 @@ static struct device *ehea_register_port
 
 	port->ofdev.node = of_node_get(dn);
 	port->ofdev.dev.parent = &port->adapter->ebus_dev->ofdev.dev;
+	port->ofdev.dev.bus = &ibmebus_bus_type;
 
-	sprintf(port->ofdev.dev.bus_id, "port%d", port->logical_port_id);
+	sprintf(port->ofdev.dev.bus_id, "port%d", port_name_cnt++);
 	port->ofdev.dev.release = logical_port_release;
 
 	ret = of_device_register(&port->ofdev);
@@ -2384,8 +2457,16 @@ static struct device *ehea_register_port
 		goto out_unreg_of_dev;
 	}
 
+	ret = ehea_driver_sysfs_add(&port->ofdev.dev, &ehea_driver.driver);
+	if (ret) {
+		ehea_error("failed to register sysfs driver link");
+		goto out_rem_dev_file;
+	}
+
 	return &port->ofdev.dev;
 
+out_rem_dev_file:
+	device_remove_file(&port->ofdev.dev, &dev_attr_log_port_id);
 out_unreg_of_dev:
 	of_device_unregister(&port->ofdev);
 out:
@@ -2394,6 +2475,7 @@ out:
 
 static void ehea_unregister_port(struct ehea_port *port)
 {
+	ehea_driver_sysfs_remove(&port->ofdev.dev, &ehea_driver.driver);
 	device_remove_file(&port->ofdev.dev, &dev_attr_log_port_id);
 	of_device_unregister(&port->ofdev);
 }
@@ -2520,7 +2602,6 @@ static int ehea_setup_ports(struct ehea_
struct 

Async processing using NetFilters

2007-04-23 Thread Lokesh

Hi All,
I posted this to the Netfilter list but got no response.
I don't know if this is the right place to ask this question, but any
answer/clue will be helpful to me.

Thanks
-Lokesh



I'm using Red Hat Linux 9, kernel version 2.4.20-8.
I want to do some asynchronous processing of packets in my kernel
module, which is registered as a Netfilter hook. I tried to make use of
NF_QUEUE but found that only one callback function per protocol family
(PF) is provided, and some module (probably the NET_LINK socket
interface) is already using it for PF_INET. Is there any way to use the
NF_QUEUE facility without disturbing NET_LINK? Why is per-hook
registration of the NF_QUEUE callback function not provided?
How should one proceed to do async processing of packets, so that
packets can be queued to a crypto hardware processing unit and safely
reinjected into the kernel later, so that other registered hooks can
get the packets?

Any help is highly appreciated.
Thanks







Intoto Inc. 




[PATCH 0/5 2.6.21-rc7] l2tp: introduce PPP over L2TP driver

2007-04-23 Thread James Chapman
This patch set adds a driver for PPP over L2TP. Patches to follow.

The following changes have been made since the previous version
submitted 23-MAR-07, addressing review comments received so far:-

- Fold makefile changes into the patches that add new source files.

- Make API structures 64-bit safe by avoiding holes inserted by the
  compiler for field alignment.

- Fix list locking.

- Use inlines rather than macros for obtaining local context
  pointers. (Previous code obscured use of variables and labels in
  macros.)

- Use endian-annotated types where appropriate.

- Use skb_queue_walk() for walking skb lists.

- Add check that session doesn't already exist before allowing it to
  be created.

- Remove unnecessary try_module_get() / put_module() calls.

- No need to check for NULL before calling kfree().

- Use macros in jiffies.h for time/jiffy conversion.

- Minimize size of L2TP's private data that is stored in skb->cb[].

- Change transmit path to build UDP header locally and transmit using
  ip_queue_xmit() rather than use the UDP socket's sendmsg(). This
  change also avoids the use of a kernel thread to do the transmit.

- Fix seq file code to avoid potential buffer overflow when there are
  lots of tunnels/sessions.

- Fix some socket refcount/lock bugs. Fixes derived from recent pppoe
  patches. 
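
The "64-bit safe" item in the list above refers to compiler-inserted padding. A minimal userspace sketch of the idea follows; both structs are invented for illustration and are not the driver's API structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout depends on the ABI: where 64-bit integers are aligned to 8 bytes
 * the compiler inserts 4 bytes of padding after `id`; on ABIs that align
 * them to 4 bytes (for example 32-bit x86) it does not. */
struct toy_stats_holey {
	uint32_t id;
	uint64_t packets;
};

/* The hole is filled by hand, so the 64-bit member starts at offset 8
 * whether 64-bit integers are aligned to 4 or to 8 bytes. */
struct toy_stats_filled {
	uint32_t id;
	uint32_t reserved;
	uint64_t packets;
};
```

A structure shaped like the second one has the same size and offsets for a 32-bit process and a 64-bit kernel, which is what an ioctl or sockaddr structure shared between them needs.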

The implementation uses the existing PPPoX subsystem that is currently
used only by PPPoE. A userspace daemon handles all L2TP control
protocol messages, while a PPPoX socket carries user data. The system
architecture is similar to PPPoE, where a pppd plugin uses the PPPoX
socket to send/receive PPP frames over an L2TP tunnel. PPP control
frames are delivered to pppd while data frames are handled entirely by
the kernel.

There are 5 patches in the series:-

1 - pppol2tp driver core. New code. Requires patches 2 & 3.

2 - API changes for L2TP. Adds definitions for L2TP in existing headers.

3 - Introduce skb_queue_walk_safe() macro.

4 - Change to PPPoX core to allow PPPoX protocol modules such as
pppol2tp to be autoloaded via modprobe.conf.

5 - Adds an entry to MAINTAINERS file.

The driver is being used on x86, ia64, ppc, arm, mips32 and possibly
other architectures. It is also known to work on multi-core SMP boxes.

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development


[PATCH 2/5 2.6.21-rc7] l2tp: pppol2tp changes to existing ppp kernel headers

2007-04-23 Thread James Chapman
[L2TP]: Modify kernel headers for L2TP.

This patch adds L2TP definitions to existing PPP and socket headers.

Signed-off-by: James Chapman [EMAIL PROTECTED]

---

Add struct sockaddr_pppol2tp to carry L2TP-specific address
information for the PPPoX (PPPoL2TP) socket. Unfortunately we can't
use the union inside struct sockaddr_pppox because the L2TP-specific
data is larger than the current size of the union and we must preserve
the size of struct sockaddr_pppox for binary compatibility.

Also add an ioctl to allow userspace to obtain L2TP counters and state
from the kernel.

Index: linux-2.6.21-rc7/include/linux/if_ppp.h
===
--- linux-2.6.21-rc7.orig/include/linux/if_ppp.h
+++ linux-2.6.21-rc7/include/linux/if_ppp.h
@@ -110,6 +110,21 @@ struct ifpppcstatsreq {
struct ppp_comp_stats stats;
 };
 
+/* For PPPIOCGL2TPSTATS */
+struct pppol2tp_ioc_stats {
+   __u16   tunnel_id;  /* redundant */
+   __u16   session_id; /* if zero, get tunnel stats */
+   __u32   using_ipsec:1;  /* valid only for session_id == 0 */
+   __u64   tx_packets;
+   __u64   tx_bytes;
+   __u64   tx_errors;
+   __u64   rx_packets;
+   __u64   rx_bytes;
+   __u64   rx_seq_discards;
+   __u64   rx_oos_packets;
+   __u64   rx_errors;
+};
+
 #define ifr__name   b.ifr_ifrn.ifrn_name
 #define stats_ptr   b.ifr_ifru.ifru_data
 
@@ -146,6 +161,7 @@ struct ifpppcstatsreq {
 #define PPPIOCDISCONN	_IO('t', 57)		/* disconnect channel */
 #define PPPIOCATTCHAN	_IOW('t', 56, int)	/* attach to ppp channel */
 #define PPPIOCGCHAN	_IOR('t', 55, int)	/* get ppp channel number */
+#define PPPIOCGL2TPSTATS _IOR('t', 54, struct pppol2tp_ioc_stats)
 
 #define SIOCGPPPSTATS   (SIOCDEVPRIVATE + 0)
 #define SIOCGPPPVER (SIOCDEVPRIVATE + 1)   /* NEVER change this!! */
Index: linux-2.6.21-rc7/include/linux/if_pppox.h
===
--- linux-2.6.21-rc7.orig/include/linux/if_pppox.h
+++ linux-2.6.21-rc7/include/linux/if_pppox.h
@@ -18,7 +18,19 @@
 
 
 #include <asm/types.h>
+#ifdef __KERNEL__
 #include <asm/byteorder.h>
+#else
+#include <endian.h>
+#include <byteswap.h>
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define __LITTLE_ENDIAN_BITFIELD
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define __BIG_ENDIAN_BITFIELD
+#else
+#error Adjust your endian.h defines.
+#endif
+#endif
 
 #ifdef  __KERNEL__
 #include <linux/if_ether.h>
@@ -27,6 +39,7 @@
 #include <asm/semaphore.h>
 #include <linux/ppp_channel.h>
 #endif /* __KERNEL__ */
+#include <linux/if_pppol2tp.h>
 
 /* For user-space programs to pick up these definitions
  * which they wouldn't get otherwise without defining __KERNEL__
@@ -50,8 +63,9 @@ struct pppoe_addr{ 
  * Protocols supported by AF_PPPOX 
  */ 
 #define PX_PROTO_OE    0 /* Currently just PPPoE */
-#define PX_MAX_PROTO   1
-
+#define PX_PROTO_OL2TP 1 /* Now L2TP also */
+#define PX_MAX_PROTO   2
+
 struct sockaddr_pppox {
 	sa_family_t	sa_family;	/* address family, AF_PPPOX */
 	unsigned int	sa_protocol;	/* protocol identifier */
@@ -60,6 +74,16 @@ struct sockaddr_pppox { 
}sa_addr; 
 }__attribute__ ((packed)); 
 
+/* The use of the above union isn't viable because the size of this
+ * struct must stay fixed over time -- applications use sizeof(struct
+ * sockaddr_pppox) to fill it. We use a protocol specific sockaddr
+ * type instead.
+ */
+struct sockaddr_pppol2tp {
+   sa_family_t sa_family;  /* address family, AF_PPPOX */
+	unsigned int	sa_protocol;	/* protocol identifier */
+   struct pppol2tp_addr pppol2tp;
+}__attribute__ ((packed));
 
 /*
  *
Index: linux-2.6.21-rc7/include/linux/socket.h
===
--- linux-2.6.21-rc7.orig/include/linux/socket.h
+++ linux-2.6.21-rc7/include/linux/socket.h
@@ -284,6 +284,7 @@ struct ucred {
 #define SOL_DCCP	269
 #define SOL_NETLINK	270
 #define SOL_TIPC	271
+#define SOL_PPPOL2TP   272
 
 /* IPX options */
 #define IPX_TYPE   1


[PATCH 3/5 2.6.21-rc7] skb: introduce skb_queue_walk_safe()

2007-04-23 Thread James Chapman
[SKB]: Introduce skb_queue_walk_safe()

This patch provides a method for walking skb lists while inserting or
removing skbs from the list.
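
To show why the extra temporary pointer matters, here is a userspace sketch of the same loop shape on a toy circular list. Only the form of the for loop follows the macro in this patch; struct toy_node and the helper functions are invented, and the list head is simply a node used as a sentinel.

```c
#include <assert.h>
#include <stddef.h>

/* Toy circular doubly linked list; the queue head is itself a node. */
struct toy_node {
	struct toy_node *next, *prev;
	int value;
};

/* Same loop shape as skb_queue_walk_safe(): the successor is saved in
 * tmp before the body runs, so the body may unlink n. */
#define toy_walk_safe(queue, n, tmp)                    \
	for (n = (queue)->next, tmp = n->next;          \
	     n != (queue);                              \
	     n = tmp, tmp = n->next)

static void toy_init(struct toy_node *q)
{
	q->next = q->prev = q;
}

static void toy_add_tail(struct toy_node *q, struct toy_node *n)
{
	n->prev = q->prev;
	n->next = q;
	q->prev->next = n;
	q->prev = n;
}

static void toy_unlink(struct toy_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;  /* a walk reading n->next afterwards would break */
}

/* Unlink every node with an even value; return how many nodes remain. */
static int toy_drop_even(struct toy_node *q)
{
	struct toy_node *n, *tmp;
	int left = 0;

	toy_walk_safe(q, n, tmp) {
		if (n->value % 2 == 0)
			toy_unlink(n);
		else
			left++;
	}
	return left;
}
```

A plain walk that advances with n = n->next after the body would follow the cleared pointer of an unlinked node; the saved tmp avoids that.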

Signed-off-by: James Chapman [EMAIL PROTECTED]

Index: linux-2.6.21-rc7/include/linux/skbuff.h
===
--- linux-2.6.21-rc7.orig/include/linux/skbuff.h
+++ linux-2.6.21-rc7/include/linux/skbuff.h
@@ -1299,6 +1299,11 @@ static inline int pskb_trim_rcsum(struct
 		     prefetch(skb->next), (skb != (struct sk_buff *)(queue));	\
 		     skb = skb->next)
 
+#define skb_queue_walk_safe(queue, skb, tmp)					\
+		for (skb = (queue)->next, tmp = skb->next;			\
+		     skb != (struct sk_buff *)(queue);				\
+		     skb = tmp, tmp = skb->next)
+
 #define skb_queue_reverse_walk(queue, skb) \
 		for (skb = (queue)->prev;					\
 		     prefetch(skb->prev), (skb != (struct sk_buff *)(queue));	\


Re: [PATCH/RESEND] ehea: fix for dlpar and sysfs entries

2007-04-23 Thread Arnd Bergmann
On Monday 23 April 2007, Jan-Bernd Themann wrote:
> - dlpar fix:
>   certain resources may only be allocated when first
>   logical port is available, and must be removed when
>   last logical port has been removed
>
> - sysfs entries:
>   create symbolic link from each logical port to ehea driver
>

I can't see anything wrong with the patch contents, but if you know that there
are two changes, you really should make it two separate patches.

Arnd 


Re: [PATCH 0/5 2.6.21-rc7] l2tp: introduce PPP over L2TP driver

2007-04-23 Thread Patrick McHardy
James Chapman wrote:
> This patch set adds a driver for PPP over L2TP. Patches to follow.
>
> The following changes have been made since the previous version
> submitted 23-MAR-07, addressing review comments received so far:-
>
> - Fold makefile changes into the patches that add new source files.


The reason to do this was to keep the tree compiling at all stages,
which this patchset still doesn't do. You should first add your changes
to other files (like skb_queue_walk_safe, ppp headers, ...), then
add the code that uses them.



[PATCH 0/5 2.6.21-rc7] l2tp: introduce PPP over L2TP driver

2007-04-23 Thread James Chapman
This patch set adds a driver for PPP over L2TP. Patches to follow.

The following changes have been made since the previous version
submitted 23-MAR-07, addressing review comments received so far:-

- Fold makefile changes into the patches that add new source files.

- Make API structures 64-bit safe by avoiding holes inserted by the
  compiler for field alignment.

- Fix list locking.

- Use inlines rather than macros for obtaining local context
  pointers. (Previous code obscured use of variables and labels in
  macros.)

- Use endian-annotated types where appropriate.

- Use skb_queue_walk() for walking skb lists.

- Add check that session doesn't already exist before allowing it to
  be created.

- Remove unnecessary try_module_get() / put_module() calls.

- No need to check for NULL before calling kfree().

- Use macros in jiffies.h for time/jiffy conversion.

- Minimize size of L2TP's private data that is stored in skb->cb[].

- Change transmit path to build UDP header locally and transmit using
  ip_queue_xmit() rather than use the UDP socket's sendmsg(). This
  change also avoids the use of a kernel thread to do the transmit.

- Fix seq file code to avoid potential buffer overflow when there are
  lots of tunnels/sessions.

- Fix some socket refcount/lock bugs. Fixes derived from recent pppoe
  patches. 

The implementation uses the existing PPPoX subsystem that is currently
used only by PPPoE. A userspace daemon handles all L2TP control
protocol messages, while a PPPoX socket carries user data. The system
architecture is similar to PPPoE, where a pppd plugin uses the PPPoX
socket to send/receive PPP frames over an L2TP tunnel. PPP control
frames are delivered to pppd while data frames are handled entirely by
the kernel.

There are 5 patches in the series:-

1 - Introduce skb_queue_walk_safe() macro.

2 - API changes for L2TP. Adds definitions for L2TP in existing headers.

3 - pppol2tp driver core. New code. Requires patches 1 & 2.

4 - Change to PPPoX core to allow PPPoX protocol modules such as
pppol2tp to be autoloaded via modprobe.conf.

5 - Adds an entry to MAINTAINERS file.

The driver is being used on x86, ia64, ppc, arm, mips32 and possibly
other architectures. It is also known to work on multi-core SMP boxes.

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development


[PATCH 1/5 2.6.21-rc7] skb: introduce skb_queue_walk_safe()

2007-04-23 Thread James Chapman
[SKB]: Introduce skb_queue_walk_safe()

This patch provides a method for walking skb lists while inserting or
removing skbs from the list.

Signed-off-by: James Chapman [EMAIL PROTECTED]

Index: linux-2.6.21-rc7/include/linux/skbuff.h
===
--- linux-2.6.21-rc7.orig/include/linux/skbuff.h
+++ linux-2.6.21-rc7/include/linux/skbuff.h
@@ -1299,6 +1299,11 @@ static inline int pskb_trim_rcsum(struct
 		     prefetch(skb->next), (skb != (struct sk_buff *)(queue));	\
 		     skb = skb->next)
 
+#define skb_queue_walk_safe(queue, skb, tmp)					\
+		for (skb = (queue)->next, tmp = skb->next;			\
+		     skb != (struct sk_buff *)(queue);				\
+		     skb = tmp, tmp = skb->next)
+
 #define skb_queue_reverse_walk(queue, skb) \
 		for (skb = (queue)->prev;					\
 		     prefetch(skb->prev), (skb != (struct sk_buff *)(queue));	\


[PATCH 2/5 2.6.21-rc7] l2tp: pppol2tp changes to existing ppp kernel headers

2007-04-23 Thread James Chapman
[L2TP]: Modify kernel headers for L2TP.

This patch adds L2TP definitions to existing PPP and socket headers.

Signed-off-by: James Chapman [EMAIL PROTECTED]

---

Add struct sockaddr_pppol2tp to carry L2TP-specific address
information for the PPPoX (PPPoL2TP) socket. Unfortunately we can't
use the union inside struct sockaddr_pppox because the L2TP-specific
data is larger than the current size of the union and we must preserve
the size of struct sockaddr_pppox for binary compatibility.

Also add an ioctl to allow userspace to obtain L2TP counters and state
from the kernel.

Index: linux-2.6.21-rc7/include/linux/if_ppp.h
===
--- linux-2.6.21-rc7.orig/include/linux/if_ppp.h
+++ linux-2.6.21-rc7/include/linux/if_ppp.h
@@ -110,6 +110,21 @@ struct ifpppcstatsreq {
struct ppp_comp_stats stats;
 };
 
+/* For PPPIOCGL2TPSTATS */
+struct pppol2tp_ioc_stats {
+   __u16   tunnel_id;  /* redundant */
+   __u16   session_id; /* if zero, get tunnel stats */
+   __u32   using_ipsec:1;  /* valid only for session_id == 0 */
+   __u64   tx_packets;
+   __u64   tx_bytes;
+   __u64   tx_errors;
+   __u64   rx_packets;
+   __u64   rx_bytes;
+   __u64   rx_seq_discards;
+   __u64   rx_oos_packets;
+   __u64   rx_errors;
+};
+
 #define ifr__name   b.ifr_ifrn.ifrn_name
 #define stats_ptr   b.ifr_ifru.ifru_data
 
@@ -146,6 +161,7 @@ struct ifpppcstatsreq {
 #define PPPIOCDISCONN	_IO('t', 57)		/* disconnect channel */
 #define PPPIOCATTCHAN	_IOW('t', 56, int)	/* attach to ppp channel */
 #define PPPIOCGCHAN	_IOR('t', 55, int)	/* get ppp channel number */
+#define PPPIOCGL2TPSTATS _IOR('t', 54, struct pppol2tp_ioc_stats)
 
 #define SIOCGPPPSTATS   (SIOCDEVPRIVATE + 0)
 #define SIOCGPPPVER (SIOCDEVPRIVATE + 1)   /* NEVER change this!! */
Index: linux-2.6.21-rc7/include/linux/if_pppox.h
===
--- linux-2.6.21-rc7.orig/include/linux/if_pppox.h
+++ linux-2.6.21-rc7/include/linux/if_pppox.h
@@ -18,7 +18,19 @@
 
 
 #include <asm/types.h>
+#ifdef __KERNEL__
 #include <asm/byteorder.h>
+#else
+#include <endian.h>
+#include <byteswap.h>
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define __LITTLE_ENDIAN_BITFIELD
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define __BIG_ENDIAN_BITFIELD
+#else
+#error Adjust your endian.h defines.
+#endif
+#endif
 
 #ifdef  __KERNEL__
 #include <linux/if_ether.h>
@@ -27,6 +39,7 @@
 #include <asm/semaphore.h>
 #include <linux/ppp_channel.h>
 #endif /* __KERNEL__ */
+#include <linux/if_pppol2tp.h>
 
 /* For user-space programs to pick up these definitions
  * which they wouldn't get otherwise without defining __KERNEL__
@@ -50,8 +63,9 @@ struct pppoe_addr{ 
  * Protocols supported by AF_PPPOX 
  */ 
 #define PX_PROTO_OE    0 /* Currently just PPPoE */
-#define PX_MAX_PROTO   1
-
+#define PX_PROTO_OL2TP 1 /* Now L2TP also */
+#define PX_MAX_PROTO   2
+
 struct sockaddr_pppox {
 	sa_family_t	sa_family;	/* address family, AF_PPPOX */
 	unsigned int	sa_protocol;	/* protocol identifier */
@@ -60,6 +74,16 @@ struct sockaddr_pppox { 
}sa_addr; 
 }__attribute__ ((packed)); 
 
+/* The use of the above union isn't viable because the size of this
+ * struct must stay fixed over time -- applications use sizeof(struct
+ * sockaddr_pppox) to fill it. We use a protocol specific sockaddr
+ * type instead.
+ */
+struct sockaddr_pppol2tp {
+   sa_family_t sa_family;  /* address family, AF_PPPOX */
+	unsigned int	sa_protocol;	/* protocol identifier */
+   struct pppol2tp_addr pppol2tp;
+}__attribute__ ((packed));
 
 /*
  *
Index: linux-2.6.21-rc7/include/linux/socket.h
===
--- linux-2.6.21-rc7.orig/include/linux/socket.h
+++ linux-2.6.21-rc7/include/linux/socket.h
@@ -284,6 +284,7 @@ struct ucred {
 #define SOL_DCCP	269
 #define SOL_NETLINK	270
 #define SOL_TIPC	271
+#define SOL_PPPOL2TP   272
 
 /* IPX options */
 #define IPX_TYPE   1


[PATCH 4/5 2.6.21-rc7] l2tp: pppox protocol module load

2007-04-23 Thread James Chapman
[L2TP]: Add the ability to autoload a pppox protocol module.

This patch allows a name pppox-proto-nnn to be used in modprobe.conf
to autoload a PPPoX protocol nnn.
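
As a small illustration of the alias name, the userspace sketch below builds the string that would be looked up. toy_pppox_alias() is a made-up helper; it uses snprintf() with an explicit length check, which is a choice made for this sketch rather than what the patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the module alias for a PPPoX protocol number.
 * Returns 0 on success, -1 if the name did not fit into buf. */
static int toy_pppox_alias(char *buf, size_t len, int protocol)
{
	int n = snprintf(buf, len, "pppox-proto-%d", protocol);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With PX_PROTO_OL2TP defined as 1 in patch 2/5, the name requested for L2TP would be pppox-proto-1. A modprobe.conf alias line mapping that name to the driver module would then trigger the autoload; the module name to put there is not given in this message.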

Signed-off-by: James Chapman [EMAIL PROTECTED]

Index: linux-2.6.21-rc7/drivers/net/pppox.c
===
--- linux-2.6.21-rc7.orig/drivers/net/pppox.c
+++ linux-2.6.21-rc7/drivers/net/pppox.c
@@ -114,6 +114,13 @@ static int pppox_create(struct socket *s
goto out;
 
rc = -EPROTONOSUPPORT;
+#ifdef CONFIG_KMOD
+	if (!pppox_protos[protocol]) {
+		char buffer[32];
+		sprintf(buffer, "pppox-proto-%d", protocol);
+		request_module(buffer);
+	}
+#endif
 	if (!pppox_protos[protocol] ||
 	    !try_module_get(pppox_protos[protocol]->owner))
 		goto out;


[PATCH 5/5 2.6.21-rc7] l2tp: add pppol2tp maintainer

2007-04-23 Thread James Chapman
[L2TP]: Update maintainers file for PPP over L2TP.

Signed-off-by: James Chapman [EMAIL PROTECTED]

Index: linux-2.6.21-rc7/MAINTAINERS
===
--- linux-2.6.21-rc7.orig/MAINTAINERS
+++ linux-2.6.21-rc7/MAINTAINERS
@@ -2709,6 +2709,11 @@ P:   Michal Ostrowski
 M: [EMAIL PROTECTED]
 S: Maintained
 
+PPP OVER L2TP
+P: James Chapman
+M: [EMAIL PROTECTED]
+S: Maintained
+
 PREEMPTIBLE KERNEL
 P: Robert Love
 M: [EMAIL PROTECTED]


Re: [PATCH 2/5 2.6.21-rc7] l2tp: pppol2tp changes to existing ppp kernel headers

2007-04-23 Thread Dale Farnsworth
James Chapman wrote:
> [L2TP]: Modify kernel headers for L2TP.

Since the Subject: line becomes the first line of the commit comments,
the above line is redundant.

> This patch adds L2TP definitions to existing PPP and socket headers.
>
> Signed-off-by: James Chapman [EMAIL PROTECTED]
>
> ---
>
> Add struct sockaddr_pppol2tp to carry L2TP-specific address
> information for the PPPoX (PPPoL2TP) socket. Unfortunately we can't
> use the union inside struct sockaddr_pppox because the L2TP-specific
> data is larger than the current size of the union and we must preserve
> the size of struct sockaddr_pppox for binary compatibility.
>
> Also add an ioctl to allow userspace to obtain L2TP counters and state
> from the kernel.

In my opinion, the above info is sufficiently useful to be included
in the commit comments.

-Dale Farnsworth


Re: Getting the new RxRPC patches upstream

2007-04-23 Thread Oleg Nesterov
On 04/23, David Howells wrote:
 
> > We only care when del_timer() returns true. In that case, if the timer
> > function still runs (possible for single-threaded wqs), it has already
> > passed __queue_work().
>
> Why do you assume that?

If del_timer() returns true, the timer was pending. This means it was started
by work->func() (note that __run_timers() clears timer_pending() before calling
timer->function). This in turn means that delayed_work_timer_fn() has already
called __queue_work(dwork), otherwise work->func() has no chance to run.

When del_timer() returns true and delayed_work_timer_fn() doesn't run, we are
safe; this doesn't differ from del_timer_sync().

Oleg.
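
The ordering argument above can be illustrated with a small user-space model.
This is only a sketch: every toy_* name is hypothetical, the model is
single-threaded, and it is not the kernel's workqueue code. It shows just the
two facts the argument relies on: the pending flag is cleared before the timer
function is called, and work->func() can only re-arm the timer after the timer
function has queued the work.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded stand-ins for the kernel objects; not kernel code. */
struct toy_timer {
	bool pending;                          /* models timer_pending() */
	void (*function)(struct toy_timer *);
};

struct toy_dwork {
	struct toy_timer timer;                /* must stay the first member */
	int queued;                            /* times the timer fn queued the work */
};

/* Models delayed_work_timer_fn(): its only job is to queue the work. */
static void toy_timer_fn(struct toy_timer *t)
{
	struct toy_dwork *dw = (struct toy_dwork *)t; /* timer is the first member */

	dw->queued++;
}

/* Arms the timer, as queueing a delayed work item would. */
static void toy_arm(struct toy_dwork *dw)
{
	dw->timer.function = toy_timer_fn;
	dw->timer.pending = true;
}

/* Models __run_timers(): pending is cleared BEFORE the function is called. */
static void toy_run_timer(struct toy_dwork *dw)
{
	if (!dw->timer.pending)
		return;
	dw->timer.pending = false;
	dw->timer.function(&dw->timer);
}

/* Models del_timer(): returns true only if the timer was still pending. */
static bool toy_del_timer(struct toy_dwork *dw)
{
	bool was_pending = dw->timer.pending;

	dw->timer.pending = false;
	return was_pending;
}

/* Models work->func() re-arming its own delayed work. */
static void toy_work_func(struct toy_dwork *dw)
{
	assert(dw->queued > 0); /* the work can only run after being queued */
	toy_arm(dw);
}
```

In this model, once the timer has fired, toy_del_timer() returns false; it
returns true again only after toy_work_func() has re-armed the timer, which in
turn requires that the timer function already queued the work.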



Re: [Bugme-new] [Bug 8342] New: sctp_getsockopt_local_addrs_old() calls copy_to_user() while a spinlock is held

2007-04-23 Thread Vlad Yasevich
Andrew Morton wrote:
 On Mon, 16 Apr 2007 14:34:22 -0700
 [EMAIL PROTECTED] wrote:
 
 http://bugzilla.kernel.org/show_bug.cgi?id=8342

Summary: sctp_getsockopt_local_addrs_old() calls copy_to_user()
 while a spinlock is held
 Kernel Version: 2.6.20
 Status: NEW
   Severity: normal
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]


 Problem Description:

 sctp_getsockopt_local_addrs_old() in net/sctp/socket.c calls copy_to_user()
 while the spinlock addr_lock is held. this should not be done as copy_to_user()
 might sleep. the call to sctp_copy_laddrs_to_user() while holding the lock is
 also problematic as it calls copy_to_user()

 
 yup.

Thanks for reporting.

The code around this particular lock is quite ugly and will need to be cleaned up.
In the meantime, here is a patch that fixes this for now.

-vlad
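
The approach the patch takes (pack the addresses into local storage while the
lock is held, and copy to the user only after the lock is dropped) can be
sketched in user space as follows. This is a hypothetical model, not the SCTP
code: the lock is a plain flag so the rule can be asserted, model_copy_out()
stands in for copy_to_user(), and struct addr_entry and all function names are
invented for the illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: the "lock" is a flag so the locking rule can be asserted. */
static int lock_held;

static void model_lock(void)   { assert(!lock_held); lock_held = 1; }
static void model_unlock(void) { assert(lock_held);  lock_held = 0; }

/* Stand-in for copy_to_user(): may sleep, so it must never run under the lock. */
static int model_copy_out(void *dst, const void *src, size_t len)
{
	assert(!lock_held);
	memcpy(dst, src, len);
	return 0;
}

struct addr_entry {
	unsigned char bytes[16];
	size_t len;                 /* number of valid bytes, at most 16 */
};

/* Phase 1 (lock held): pack entries into a local buffer using memcpy only. */
static int pack_addrs(const struct addr_entry *list, int n,
		      unsigned char *buf, size_t space_left,
		      size_t *bytes_copied)
{
	int cnt = 0;

	for (int i = 0; i < n; i++) {
		assert(list[i].len <= sizeof(list[i].bytes));
		if (space_left < list[i].len)
			return -1;
		memcpy(buf, list[i].bytes, list[i].len);
		buf += list[i].len;
		space_left -= list[i].len;
		*bytes_copied += list[i].len;
		cnt++;
	}
	return cnt;
}

/* Phase 2 (lock dropped): hand the packed buffer to the "user". */
static int get_local_addrs(const struct addr_entry *list, int n,
			   void *user_buf, size_t user_len)
{
	size_t bytes_copied = 0;
	unsigned char *tmp = malloc(user_len ? user_len : 1);
	int cnt;

	if (!tmp)
		return -1;

	model_lock();
	cnt = pack_addrs(list, n, tmp, user_len, &bytes_copied);
	model_unlock();

	if (cnt >= 0 && model_copy_out(user_buf, tmp, bytes_copied))
		cnt = -1;
	free(tmp);
	return cnt;
}
```

The cost of this design is one temporary allocation per call; in exchange, the
only work done under the lock is memcpy() into memory that cannot fault.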
[PATCH] [SCTP] Fix sctp_getsockopt_local_addrs_old() to use local storage

sctp_getsockopt_local_addrs_old() in net/sctp/socket.c calls copy_to_user()
while the spinlock addr_lock is held. This should not be done, as copy_to_user()
might sleep. The call to sctp_copy_laddrs_to_user() while holding the lock is
also problematic, as it calls copy_to_user().

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/socket.c |   96 +
 1 files changed, 60 insertions(+), 36 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 6bfae12..56ef543 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3849,7 +3849,7 @@ static int sctp_getsockopt_peer_addrs(struct sock *sk, int len,
memcpy(&temp, &from->ipaddr, sizeof(temp));
sctp_get_pf_specific(sk->sk_family)->addr_v4map(sp, &temp);
addrlen = sctp_get_af_specific(sk->sk_family)->sockaddr_len;
-   if(space_left < addrlen)
+   if (space_left < addrlen)
return -ENOMEM;
if (copy_to_user(to, &temp, addrlen))
return -EFAULT;
@@ -3938,8 +3938,9 @@ done:
 /* Helper function that copies local addresses to user and returns the number
  * of addresses copied.
  */
-static int sctp_copy_laddrs_to_user_old(struct sock *sk, __u16 port, int max_addrs,
-   void __user *to)
+static int sctp_copy_laddrs_old(struct sock *sk, __u16 port,
+   int max_addrs, void *to,
+   size_t *bytes_copied)
 {
struct list_head *pos, *next;
struct sctp_sockaddr_entry *addr;
@@ -3956,10 +3957,10 @@ static int sctp_copy_laddrs_to_user_old(struct sock *sk, __u16 port, int max_add
sctp_get_pf_specific(sk->sk_family)->addr_v4map(sctp_sk(sk),
&temp);
addrlen = sctp_get_af_specific(temp.sa.sa_family)->sockaddr_len;
-   if (copy_to_user(to, &temp, addrlen))
-   return -EFAULT;
+   memcpy(to, &temp, addrlen);
 
to += addrlen;
+   *bytes_copied += addrlen;
cnt ++;
if (cnt >= max_addrs) break;
}
@@ -3967,8 +3968,8 @@ static int sctp_copy_laddrs_to_user_old(struct sock *sk, __u16 port, int max_add
return cnt;
 }
 
-static int sctp_copy_laddrs_to_user(struct sock *sk, __u16 port,
-   void __user **to, size_t space_left)
+static int sctp_copy_laddrs(struct sock *sk, __u16 port, void *to,
+   size_t space_left, size_t *bytes_copied)
 {
struct list_head *pos, *next;
struct sctp_sockaddr_entry *addr;
@@ -3985,14 +3986,14 @@ static int sctp_copy_laddrs_to_user(struct sock *sk, __u16 port,
sctp_get_pf_specific(sk->sk_family)->addr_v4map(sctp_sk(sk),
&temp);
addrlen = sctp_get_af_specific(temp.sa.sa_family)->sockaddr_len;
-   if(space_left<addrlen)
+   if (space_left < addrlen)
return -ENOMEM;
-   if (copy_to_user(*to, &temp, addrlen))
-   return -EFAULT;
+   memcpy(to, &temp, addrlen);
 
-   *to += addrlen;
+   to += addrlen;
cnt ++;
space_left -= addrlen;
+   *bytes_copied += addrlen;
}
 
return cnt;
@@ -4016,6 +4017,8 @@ static int sctp_getsockopt_local_addrs_old(struct sock *sk, int len,
int addrlen;
rwlock_t *addr_lock;
int err = 0;
+   void *addrs;
+   size_t bytes_copied = 0;
 
if (len != sizeof(struct sctp_getaddrs_old))
return -EINVAL;
@@ -4043,6 +4046,15 @@ static int sctp_getsockopt_local_addrs_old(struct sock *sk, int len,
 
to = getaddrs.addrs;
 
+   /* Allocate space for a local instance of packed 

[PATCH] [SCTP] Verify all destination ports in sctp_connectx.

2007-04-23 Thread Vlad Yasevich
Hi David

Please apply.  This was recently uncovered on the sctp-implementers
list as 'linux doing something stupid...'

Thanks
-vlad

[SCTP] Verify all destination ports in sctp_connectx.

We need to make sure that all destination ports are the same, since
the association really must not connect to multiple different ports
at once.  This was reported on the sctp-impl list.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/socket.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 536298c..6bfae12 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -966,6 +966,7 @@ static int __sctp_connect(struct sock* sk,
int walk_size = 0;
union sctp_addr *sa_addr;
void *addr_buf;
+   unsigned short port;
 
sp = sctp_sk(sk);
ep = sp->ep;
@@ -986,6 +987,7 @@ static int __sctp_connect(struct sock* sk,
while (walk_size < addrs_size) {
sa_addr = (union sctp_addr *)addr_buf;
af = sctp_get_af_specific(sa_addr->sa.sa_family);
+   port = ntohs(sa_addr->v4.sin_port);
 
/* If the address family is not supported or if this address
 * causes the address buffer to overflow return EINVAL.
@@ -999,6 +1001,12 @@ static int __sctp_connect(struct sock* sk,
if (err)
goto out_free;
 
+   /* Make sure the destination port is set to the same 
+* value in each address.
+*/
+   if (asoc && asoc->peer.port && asoc->peer.port != port)
+   goto out_free;
+
memcpy(&to, sa_addr, af->sockaddr_len);
 
/* Check if there already is a matching association on the
-- 
1.5.0.3.438.gc49b2
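
The rule the patch enforces (every address passed to one connect call must
carry the same destination port) can be sketched outside the kernel as below.
This is a simplified illustration, assuming a flat array of IPv4 addresses;
check_same_port() is a hypothetical helper and is not part of the SCTP code,
which walks a packed buffer of mixed address families instead.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Returns 0 if every address carries the same destination port, -1 otherwise. */
static int check_same_port(const struct sockaddr_in *addrs, size_t n)
{
	unsigned short first = 0;

	assert(addrs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* sin_port is stored in network byte order. */
		unsigned short port = ntohs(addrs[i].sin_port);

		if (i == 0)
			first = port;      /* the first address sets the expected port */
		else if (port != first)
			return -1;         /* mismatch: reject the whole list */
	}
	return 0;
}
```

A caller would run this check on the full address list before starting the
association and reject the request on a mismatch.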



bug in my understanding (was Re: bug in tcp?)

2007-04-23 Thread Sebastian Kuzminsky
Sebastian Kuzminsky [EMAIL PROTECTED] wrote:
 Philip Craig [EMAIL PROTECTED] wrote:
  Change the timeout or use keepalives.  I can't think of any other way.
  The 5 days is a compromise between keeping valid connections and
  timing out dead connections.  There will always be connections for
  which it times out too fast or too slow.  I don't think there are
  any drawbacks to increasing the timeout if you aren't a router,
  but as long as there is a timeout, you need keepalives to be sure.
 
 Thanks!  I'll add keepalives and rerun the tests, and I expect the
 problem to go away.

I reran the tests with keepalive enabled and it worked just fine.
Thanks for all your help, and sorry for the false alarm!


-- 
Sebastian Kuzminsky
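
For reference, enabling keepalives on a TCP socket can look like the sketch
below. The helper name enable_keepalive() and the timing values a caller passes
are assumptions for illustration; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT
are Linux-specific socket options. The idle time should be chosen well below
whatever timeout sits on the path between the two endpoints.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable TCP keepalive probes on an existing socket (Linux-specific knobs). */
static int enable_keepalive(int fd, int idle_s, int interval_s, int probes)
{
	int on = 1;

	assert(idle_s > 0 && interval_s > 0 && probes > 0);

	/* Turn keepalive on for this socket. */
	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	/* Seconds of idle time before the first probe is sent. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
		return -1;
	/* Seconds between successive probes. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) < 0)
		return -1;
	/* Unanswered probes before the connection is declared dead. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
		return -1;
	return 0;
}
```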


Re: Problem with commit a0ee18b9b7d3847976c6fb315c06a34fb296de0e

2007-04-23 Thread David Miller
From: Ismail Dönmez [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 15:06:34 +0300

 --- a/net/ipv4/fib_semantics.c
 +++ b/net/ipv4/fib_semantics.c
 @@ -89,7 +89,7 @@ static const struct
  {
 int error;
 u8  scope;
 -} fib_props[RTA_MAX + 1] = {
 +} fib_props[RTN_MAX + 1] = {

This fix is already in the net-2.6.22 tree.
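
The quoted one-line change sizes the table by RTN_MAX instead of RTA_MAX. The
general idea (size a lookup table by the maximum of the enum that indexes it)
can be sketched as follows. All toy_* names and the scope values are
hypothetical and only mirror the shape of the quoted struct; this is not the
fib_semantics.c table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical route-type enum with a trailing sentinel, as the kernel does. */
enum toy_rtn {
	TOY_RTN_UNSPEC,
	TOY_RTN_UNICAST,
	TOY_RTN_LOCAL,
	TOY_RTN_AFTER_LAST          /* keep last */
};
#define TOY_RTN_MAX (TOY_RTN_AFTER_LAST - 1)

struct toy_prop {
	int error;
	unsigned char scope;
};

/* Sized by the enum that indexes it, so every valid type has exactly one slot. */
static const struct toy_prop toy_props[TOY_RTN_MAX + 1] = {
	[TOY_RTN_UNSPEC]  = { .error = 0, .scope = 255 },
	[TOY_RTN_UNICAST] = { .error = 0, .scope = 0 },
	[TOY_RTN_LOCAL]   = { .error = 0, .scope = 254 },
};

/* Bounds-checked lookup; returns NULL for a type outside the table. */
static const struct toy_prop *toy_prop_lookup(unsigned int type)
{
	if (type > (unsigned int)TOY_RTN_MAX)
		return NULL;
	return &toy_props[type];
}
```

Sizing the array by an unrelated constant still compiles, which is why this
kind of mistake is easy to miss in review.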


[PATCH] update MAINTAINERS for wireless mailing list

2007-04-23 Thread John W. Linville
From: Johannes Berg [EMAIL PROTECTED]

This patch adds the linux-wireless mailing list to all appropriate
entries in the MAINTAINERS file.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
Signed-off-by: John W. Linville [EMAIL PROTECTED]
---
 MAINTAINERS |   15 ---
 1 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6876c8b..a15ec1a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -656,6 +656,7 @@ S:  Supported
 ATMEL WIRELESS DRIVER
 P: Simon Kelley
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 W: http://www.thekelleys.org.uk/atmel
 W: http://atmelwlandriver.sourceforge.net/
 S: Maintained
@@ -711,6 +712,7 @@ P:  Larry Finger
 M: [EMAIL PROTECTED]
 P: Stefano Brivio
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 W: http://bcm43xx.berlios.de/
 S: Maintained
 
@@ -1552,6 +1554,7 @@ S:Supported
 HOST AP DRIVER
 P: Jouni Malinen
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 W: http://hostap.epitest.fi/
 S: Maintained
@@ -1838,6 +1841,7 @@ P:Yi Zhu
 M: [EMAIL PROTECTED]
 P: James Ketrenos
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 L: http://lists.sourceforge.net/mailman/listinfo/ipw2100-devel
 W: http://ipw2100.sourceforge.net
@@ -1848,6 +1852,7 @@ P:Yi Zhu
 M: [EMAIL PROTECTED]
 P: James Ketrenos
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 L: http://lists.sourceforge.net/mailman/listinfo/ipw2100-devel
 W: http://ipw2200.sourceforge.net
@@ -2540,6 +2545,7 @@ P:Pavel Roskin
 M: [EMAIL PROTECTED]
 P: David Gibson
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 W: http://www.nongnu.org/orinoco/
@@ -2719,7 +2725,7 @@ S:Supported
 PRISM54 WIRELESS DRIVER
 P: Prism54 Development Team
 M: [EMAIL PROTECTED]
-L: netdev@vger.kernel.org
+L: [EMAIL PROTECTED]
 W: http://prism54.org
 S: Maintained
 
@@ -2790,7 +2796,7 @@ S:Maintained
 RAYLINK/WEBGEAR 802.11 WIRELESS LAN DRIVER
 P: Corey Thomas
 M: [EMAIL PROTECTED]
-L: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 S: Maintained
 
 RANDOM NUMBER DRIVER
@@ -3053,7 +3059,7 @@ M:[EMAIL PROTECTED]
 P: Daniel Drake
 M: [EMAIL PROTECTED]
 W: http://softmac.sipsolutions.net/
-L: netdev@vger.kernel.org
+L: [EMAIL PROTECTED]
 S: Maintained
 
 SOFTWARE RAID (Multiple Disks) SUPPORT
@@ -3758,6 +3764,7 @@ S:Maintained
 WAVELAN NETWORK DRIVER & WIRELESS EXTENSIONS
 P: Jean Tourrilhes
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 W: http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/
 S: Maintained
 
@@ -3775,6 +3782,7 @@ S:Maintained
 WL3501 WIRELESS PCMCIA CARD DRIVER
 P: Arnaldo Carvalho de Melo
 M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
 W: http://oops.ghostprotocols.net:81/blog
 S: Maintained
 
@@ -3838,6 +3846,7 @@ M:[EMAIL PROTECTED]
 P: Ulrich Kunitz
 M: [EMAIL PROTECTED]
 W: http://zd1211.ath.cx/wiki/DriverRewrite
+L: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED] (subscribers-only)
 S: Maintained
 
-- 
1.5.0.6

-- 
John W. Linville
[EMAIL PROTECTED]


[PATCH] refactor wireless Kconfig

2007-04-23 Thread John W. Linville
From: Johannes Berg [EMAIL PROTECTED]

This patch refactors the wireless Kconfig throughout and introduces
net/wireless/Kconfig, which holds just the WEXT bit for now; the
cfg80211 patch will add to it.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
Signed-off-by: John W. Linville [EMAIL PROTECTED]
---
 drivers/net/Makefile  |2 +-
 drivers/net/wireless/Kconfig  |  120 ++---
 drivers/net/wireless/bcm43xx/Kconfig  |3 +-
 drivers/net/wireless/hostap/Kconfig   |3 +-
 drivers/net/wireless/zd1211rw/Kconfig |3 +-
 net/Kconfig   |   11 ++-
 net/Makefile  |2 +
 net/ieee80211/Kconfig |3 +-
 net/wireless/Kconfig  |   23 ++
 net/wireless/Makefile |1 +
 10 files changed, 92 insertions(+), 79 deletions(-)
 create mode 100644 net/wireless/Kconfig
 create mode 100644 net/wireless/Makefile

diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 33af833..5852732 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -206,7 +206,7 @@ obj-$(CONFIG_TR) += tokenring/
 obj-$(CONFIG_WAN) += wan/
 obj-$(CONFIG_ARCNET) += arcnet/
 obj-$(CONFIG_NET_PCMCIA) += pcmcia/
-obj-$(CONFIG_NET_RADIO) += wireless/
+obj-y += wireless/
 obj-$(CONFIG_NET_TULIP) += tulip/
 obj-$(CONFIG_HAMRADIO) += hamradio/
 obj-$(CONFIG_IRDA) += irda/
diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
index ece3d9c..880c628 100644
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -2,47 +2,21 @@
 # Wireless LAN device configuration
 #
 
-menu "Wireless LAN (non-hamradio)"
-   depends on NETDEVICES
-
-config NET_RADIO
-   bool "Wireless LAN drivers (non-hamradio) & Wireless Extensions"
-   select WIRELESS_EXT
-   ---help---
- Support for wireless LANs and everything having to do with radio,
- but not with amateur radio or FM broadcasting.
-
- Saying Y here also enables the Wireless Extensions (creates
- /proc/net/wireless and enables iwconfig access). The Wireless
- Extension is a generic API allowing a driver to expose to the user
- space configuration and statistics specific to common Wireless LANs.
- The beauty of it is that a single set of tool can support all the
- variations of Wireless LANs, regardless of their type (as long as
- the driver supports Wireless Extension). Another advantage is that
- these parameters may be changed on the fly without restarting the
- driver (or Linux). If you wish to use Wireless Extensions with
- wireless PCMCIA (PC-) cards, you need to say Y here; you can fetch
- the tools from
- http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/Tools.html.
+menu "Wireless LAN"
 
-config NET_WIRELESS_RTNETLINK
-   bool "Wireless Extension API over RtNetlink"
-   depends on NET_RADIO
+config WLAN_PRE80211
+   bool "Wireless LAN (pre-802.11)"
+   depends on NETDEVICES
---help---
- Support the Wireless Extension API over the RtNetlink socket
- in addition to the traditional ioctl interface (selected above).
+ Say Y if you have any pre-802.11 wireless LAN hardware.
 
- For now, few tools use this facility, but it might grow in the
- future. The only downside is that it adds 4.5 kB to your kernel.
-
-# Note : the cards are obsolete (can't buy them anymore), but the drivers
-# are not, as people are still using them...
-comment "Obsolete Wireless cards support (pre-802.11)"
-   depends on NET_RADIO && (INET || ISA || PCMCIA)
+ This option does not affect the kernel build, it only
+ let's you choose drivers.
 
 config STRIP
tristate "STRIP (Metricom starmode radio IP)"
-   depends on NET_RADIO && INET
+   depends on INET && WLAN_PRE80211
+   select WIRELESS_EXT
---help---
  Say Y if you have a Metricom radio and intend to use Starmode Radio
  IP. STRIP is a radio protocol developed for the MosquitoNet project
@@ -65,7 +39,8 @@ config STRIP
 
 config ARLAN
tristate "Aironet Arlan 655 & IC2200 DS support"
-   depends on NET_RADIO && ISA && !64BIT
+   depends on ISA && !64BIT && WLAN_PRE80211
+   select WIRELESS_EXT
---help---
  Aironet makes Arlan, a class of wireless LAN adapters. These use the
  www.Telxon.com chip, which is also used on several similar cards.
@@ -80,7 +55,8 @@ config ARLAN
 
 config WAVELAN
tristate "AT&T/Lucent old WaveLAN & DEC RoamAbout DS ISA support"
-   depends on NET_RADIO && ISA
+   depends on ISA && WLAN_PRE80211
+   select WIRELESS_EXT
---help---
  The Lucent WaveLAN (formerly NCR and AT&T; or DEC RoamAbout DS) is
  a Radio LAN (wireless Ethernet-like Local Area Network) using the
@@ -107,7 +83,8 @@ config WAVELAN
 
 config PCMCIA_WAVELAN
tristate ATT/Lucent old 

[PATCH] cfg80211: new wireless config infrastructure

2007-04-23 Thread John W. Linville
From: Johannes Berg [EMAIL PROTECTED]

This patch creates the core cfg80211 code along with some sysfs bits.
This is a stripped down version to allow mac80211 to function, but
doesn't include any configuration yet except for creating and removing
virtual interfaces.

This patch includes the nl80211 header file but it only contains the
interface types which the cfg80211 interface for creating virtual
interfaces relies on.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
Signed-off-by: John W. Linville [EMAIL PROTECTED]
---
 CREDITS   |6 ++
 MAINTAINERS   |6 ++
 include/linux/Kbuild  |1 +
 include/linux/netdevice.h |4 +
 include/linux/nl80211.h   |   38 
 include/net/cfg80211.h|   36 
 include/net/wireless.h|  139 ++
 net/wireless/Kconfig  |3 +
 net/wireless/Makefile |4 +-
 net/wireless/core.c   |  209 +
 net/wireless/core.h   |   49 +++
 net/wireless/sysfs.c  |   80 +
 net/wireless/sysfs.h  |9 ++
 13 files changed, 583 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/nl80211.h
 create mode 100644 include/net/cfg80211.h
 create mode 100644 include/net/wireless.h
 create mode 100644 net/wireless/core.c
 create mode 100644 net/wireless/core.h
 create mode 100644 net/wireless/sysfs.c
 create mode 100644 net/wireless/sysfs.h

diff --git a/CREDITS b/CREDITS
index e3e7271..dede114 100644
--- a/CREDITS
+++ b/CREDITS
@@ -317,6 +317,12 @@ S: 2322 37th Ave SW
 S: Seattle, Washington 98126-2010
 S: USA
 
+N: Johannes Berg
+E: [EMAIL PROTECTED]
+W: http://johannes.sipsolutions.net/
+P: 1024D/9AB78CA5 AD02 0176 4E29 C137 1DF6 08D2 FC44 CF86 9AB7 8CA5
+D: powerpc & 802.11 hacker
+
 N: Stephen R. van den Berg (AKA BuGless)
 E: [EMAIL PROTECTED]
 D: General kernel, gcc, and libc hacker
diff --git a/MAINTAINERS b/MAINTAINERS
index a15ec1a..3994eee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -894,6 +894,12 @@ M: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 S: Maintained
 
+CFG80211 and NL80211
+P: Johannes Berg
+M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
+S: Maintained
+
 COMMON INTERNET FILE SYSTEM (CIFS)
 P: Steve French
 M: [EMAIL PROTECTED]
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index ea86f2e..4ff0f57 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -113,6 +113,7 @@ header-y += netrom.h
 header-y += nfs2.h
 header-y += nfs4_mount.h
 header-y += nfs_mount.h
+header-y += nl80211.h
 header-y += oom.h
 header-y += param.h
 header-y += pci_regs.h
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 71fc8ff..584c199 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -42,6 +42,8 @@
 struct vlan_group;
 struct ethtool_ops;
 struct netpoll_info;
+/* 802.11 specific */
+struct wireless_dev;
/* source back-compat hooks */
 #define SET_ETHTOOL_OPS(netdev,ops) \
( (netdev)->ethtool_ops = (ops) )
@@ -400,6 +402,8 @@ struct net_device
void*ip6_ptr;   /* IPv6 specific data */
void*ec_ptr;/* Econet specific data */
void*ax25_ptr;  /* AX.25 specific data */
+   struct wireless_dev *ieee80211_ptr; /* IEEE 802.11 specific data,
+  assign before registering */
 
 /*
  * Cache line mostly used on receive path (including eth_type_trans())
diff --git a/include/linux/nl80211.h b/include/linux/nl80211.h
new file mode 100644
index 000..9a30ba2
--- /dev/null
+++ b/include/linux/nl80211.h
@@ -0,0 +1,38 @@
+#ifndef __LINUX_NL80211_H
+#define __LINUX_NL80211_H
+/*
+ * 802.11 netlink interface public header
+ *
+ * Copyright 2006, 2007 Johannes Berg [EMAIL PROTECTED]
+ */
+
+/**
+ * enum nl80211_iftype - (virtual) interface types
+ * @NL80211_IFTYPE_UNSPECIFIED: unspecified type, driver decides
+ * @NL80211_IFTYPE_ADHOC: independent BSS member
+ * @NL80211_IFTYPE_STATION: managed BSS member
+ * @NL80211_IFTYPE_AP: access point
+ * @NL80211_IFTYPE_AP_VLAN: VLAN interface for access points
+ * @NL80211_IFTYPE_WDS: wireless distribution interface
+ * @NL80211_IFTYPE_MONITOR: monitor interface receiving all frames
+ * @__NL80211_IFTYPE_AFTER_LAST: internal use
+ *
+ * These values are used with the NL80211_ATTR_IFTYPE
+ * to set the type of an interface.
+ *
+ */
+enum nl80211_iftype {
+   NL80211_IFTYPE_UNSPECIFIED,
+   NL80211_IFTYPE_ADHOC,
+   NL80211_IFTYPE_STATION,
+   NL80211_IFTYPE_AP,
+   NL80211_IFTYPE_AP_VLAN,
+   NL80211_IFTYPE_WDS,
+   NL80211_IFTYPE_MONITOR,
+
+   /* keep last */
+   __NL80211_IFTYPE_AFTER_LAST
+};
+#define NL80211_IFTYPE_MAX (__NL80211_IFTYPE_AFTER_LAST - 1)
+
+#endif /* __LINUX_NL80211_H */
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
new file mode 

[PATCH] remove wext over netlink

2007-04-23 Thread John W. Linville
From: Johannes Berg [EMAIL PROTECTED]

As scheduled, this patch removes the pointless wext over netlink
code.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]
Signed-off-by: John W. Linville [EMAIL PROTECTED]
---
 Documentation/feature-removal-schedule.txt |   12 -
 include/net/iw_handler.h   |   10 -
 net/core/rtnetlink.c   |   31 --
 net/core/wireless.c|  735 
 net/wireless/Kconfig   |   10 -
 5 files changed, 0 insertions(+), 798 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 9817b60..976c8a1 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -285,18 +285,6 @@ Who:   Richard Purdie [EMAIL PROTECTED]
 
 ---
 
-What:  Wireless extensions over netlink (CONFIG_NET_WIRELESS_RTNETLINK)
-When:  with the merge of wireless-dev, 2.6.22 or later
-Why:   The option/code is
-* not enabled on most kernels
-* not required by any userspace tools (except an experimental one,
-  and even there only for some parts, others use ioctl)
-* pointless since wext is no longer evolving and the ioctl
-  interface needs to be kept
-Who:   Johannes Berg [EMAIL PROTECTED]
-

-
 What:  i8xx_tco watchdog driver
 When:  in 2.6.22
 Why:   the i8xx_tco watchdog driver has been replaced by the iTCO_wdt
diff --git a/include/net/iw_handler.h b/include/net/iw_handler.h
index 8a83018..909ca87 100644
--- a/include/net/iw_handler.h
+++ b/include/net/iw_handler.h
@@ -440,16 +440,6 @@ extern int dev_get_wireless_info(char * buffer, char **start, off_t offset,
 /* Handle IOCTLs, called in net/core/dev.c */
 extern int wireless_process_ioctl(struct ifreq *ifr, unsigned int cmd);
 
-/* Handle RtNetlink requests, called in net/core/rtnetlink.c */
-extern int wireless_rtnetlink_set(struct net_device *  dev,
- char *data,
- int   len);
-extern int wireless_rtnetlink_get(struct net_device *  dev,
- char *data,
- int   len,
- char **   p_buf,
- int * p_len);
-
 /* Second : functions that may be called by driver modules */
 
 /* Send a single event to user space */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 75cea8e..4fe0f4b 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -51,10 +51,6 @@
 #include <net/pkt_sched.h>
 #include <net/fib_rules.h>
 #include <net/rtnetlink.h>
-#ifdef CONFIG_NET_WIRELESS_RTNETLINK
-#include <linux/wireless.h>
-#include <net/iw_handler.h>
-#endif /* CONFIG_NET_WIRELESS_RTNETLINK */
 
 struct rtnl_link
 {
@@ -684,17 +680,6 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
modified = 1;
}
 
-#ifdef CONFIG_NET_WIRELESS_RTNETLINK
-   if (tb[IFLA_WIRELESS]) {
-   /* Call Wireless Extensions.
-* Various stuff checked in there... */
-   err = wireless_rtnetlink_set(dev, nla_data(tb[IFLA_WIRELESS]),
-nla_len(tb[IFLA_WIRELESS]));
-   if (err < 0)
-   goto errout_dev;
-   }
-#endif /* CONFIG_NET_WIRELESS_RTNETLINK */
-
if (tb[IFLA_BROADCAST]) {
nla_memcpy(dev->broadcast, tb[IFLA_BROADCAST], dev->addr_len);
send_addr_notify = 1;
@@ -758,22 +743,6 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
} else
return -EINVAL;
 
-
-#ifdef CONFIG_NET_WIRELESS_RTNETLINK
-   if (tb[IFLA_WIRELESS]) {
-   /* Call Wireless Extensions. We need to know the size before
-* we can alloc. Various stuff checked in there... */
-   err = wireless_rtnetlink_get(dev, nla_data(tb[IFLA_WIRELESS]),
-nla_len(tb[IFLA_WIRELESS]),
-&iw_buf, &iw_buf_len);
-   if (err < 0)
-   goto errout;
-
-   /* Payload is at an offset in buffer */
-   iw = iw_buf + IW_EV_POINT_OFF;
-   }
-#endif /* CONFIG_NET_WIRELESS_RTNETLINK */
-
nskb = nlmsg_new(if_nlmsg_size(iw_buf_len), GFP_KERNEL);
if (nskb == NULL) {
err = -ENOBUFS;
diff --git a/net/core/wireless.c b/net/core/wireless.c
index 86db63d..fba295e 100644
--- a/net/core/wireless.c
+++ b/net/core/wireless.c
@@ -104,12 +104,10 @@
 
 /* Debugging stuff */
 #undef WE_IOCTL_DEBUG  /* Debug IOCTL API */
-#undef WE_RTNETLINK_DEBUG  /* Debug RtNetlink API */
 #undef WE_EVENT_DEBUG  /* Debug Event dispatcher 

Re: AF_PACKET how to get the original netdev from a packet received from a bonded master

2007-04-23 Thread Laurent Chavey

Thank you for the pointer. I am looking at the patch to see if it helps.
In my case it will do the trick, but as Dave mentioned, it may not have
its place in a general distro.

Personally, I would rather clone the skb received on the bond and pass the
clone plus the original skb to the tap (do not flame me, it is not my
intent to open a long thread :-). This gives symmetrical behavior to
the tap.

Some of the points made for LLDP etc. may not require this
patch if one adds the processing as a protocol.

Thank you.

On 4/18/07, Chris Leech [EMAIL PROTECTED] wrote:

On 4/18/07, David Miller [EMAIL PROTECTED] wrote:
 From: Laurent Chavey [EMAIL PROTECTED]
 Date: Wed, 18 Apr 2007 16:05:27 -0700

  When using PF_PACKET socket with bonded interfaces, there is no
  way to get the slave interface (physical interface) the packet was
  actually received on.

 That's right.

 There isn't a real good solution to this problem either.
 Decapsulation can happen to arbitrary levels, how many
 devices should we remember and how to do the reference
 counting on that correctly?

 As such, I don't think this will ever be provided.

PJ Waskiewicz posted a patch that introduced a new socket option for
PF_PACKET sockets to return the orig_dev->ifindex instead of
dev->ifindex in sockaddr_ll.  He posted the patch on March 8th.

http://marc.info/?l=linux-kernel&m=117340678318939&w=2

He also re-posted it on March 19th, at which point you (Dave) told him
that it was in your backlog and he didn't need to keep resending it.

To me it seems like a decent opt-in solution for applications that
might care, without changing existing behavior.  Laurent, would that
change help you?  Dave, is there a reason you can see why this
wouldn't work?

- Chris




Re: Problem with commit a0ee18b9b7d3847976c6fb315c06a34fb296de0e

2007-04-23 Thread Ismail Dönmez
Hi,
On Monday 23 April 2007 21:51:25 David Miller wrote:
 From: Ismail Dönmez [EMAIL PROTECTED]
 Date: Mon, 23 Apr 2007 15:06:34 +0300

  --- a/net/ipv4/fib_semantics.c
  +++ b/net/ipv4/fib_semantics.c
  @@ -89,7 +89,7 @@ static const struct
   {
  int error;
  u8  scope;
  -} fib_props[RTA_MAX + 1] = {
  +} fib_props[RTN_MAX + 1] = {

 This fix is already in the net-2.6.22 tree.

Yes, I know the fix is in, but I wondered why it's creating such problems with
the 2.6.18 kernel. I guess it depends on some other commits.

Regards,
ismail

-- 
Life is a game, and if you aren't in it to win,
what the heck are you still doing here?

-- Linus Torvalds (talking about open source development)


net-2.6.22 UDP stalls/hangs

2007-04-23 Thread Andrew Morton

I have a problem here.  To eliminate other -mm things I tested bare
git+ssh://master.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.22.git
as of 15 minutes ago and the problem is there too.

The machine is x86_64 running FC6.  The config is based on RH's own FC6
config and it's at http://userweb.kernel.org/~akpm/config-akpm2.txt

Symptoms are that networking-related initscripts take a long time. 
statd and cups take maybe a minute and when it gets to starting sendmail,
things appear to hang permanently - I hit the switch after a few minutes.

A sysrq-T was taken during the statd bringup stall:
http://userweb.kernel.org/~akpm/dmesg-akpm2.txt

The interesting bit is:


Apr 23 12:01:15 akpm2 kernel: rpc.statd S 001f2b1f297b 0  3479   
3478 (NOTLB)
Apr 23 12:01:15 akpm2 kernel:  81024ef2fb28 0082 
 000900c6
Apr 23 12:01:15 akpm2 kernel:  0246 004c 
81025eba8040 81025fe08100
Apr 23 12:01:15 akpm2 kernel:  81025eba8258 00075e936000 
29e4 0286
Apr 23 12:01:15 akpm2 kernel: Call Trace:
Apr 23 12:01:15 akpm2 kernel:  [8048c513] udp_poll+0x0/0x104
Apr 23 12:01:15 akpm2 kernel:  [8025e759] schedule_timeout+0x8a/0xad
Apr 23 12:01:15 akpm2 kernel:  [8028d7a3] process_timeout+0x0/0x5
Apr 23 12:01:15 akpm2 kernel:  [8022ed7b] do_sys_poll+0x27a/0x35c
Apr 23 12:01:15 akpm2 kernel:  [8021e6a6] __pollwait+0x0/0xdd
Apr 23 12:01:15 akpm2 kernel:  [80284486] default_wake_function+0x0/0xe
Apr 23 12:01:15 akpm2 kernel:  [80333c56] socket_has_perm+0x5b/0x68
Apr 23 12:01:15 akpm2 kernel:  [802510ea] sock_sendmsg+0xea/0x107
Apr 23 12:01:15 akpm2 kernel:  [8048eacb] arp_bind_neighbour+0x6b/0x9f
Apr 23 12:01:15 akpm2 kernel:  [8029696e] autoremove_wake_function+0x0/0x2e
Apr 23 12:01:15 akpm2 kernel:  [80254bf4] __ip_route_output_key+0x709/0x7c4
Apr 23 12:01:15 akpm2 kernel:  [804565a5] sys_sendto+0x128/0x151
Apr 23 12:01:15 akpm2 kernel:  [80331502] file_has_perm+0x48/0xa3
Apr 23 12:01:15 akpm2 kernel:  [80249319] sys_poll+0x32/0x3b
Apr 23 12:01:15 akpm2 kernel:  [8025911e] system_call+0x7e/0x83
Apr 23 12:01:15 akpm2 kernel: 
Apr 23 12:01:51 akpm2 rpc.statd[3479]: gethostbyname error for akpm2.corp.google.com

I think I saw the same problem maybe 1.5 weeks ago on this machine, but I
didn't have time to investigate further.  So it's not some recent thing.


[PATCH] drivers/net/wireless/Kconfig: correct minor typo

2007-04-23 Thread John W. Linville
Correct minor typo in drivers/net/wireless/Kconfig identified by
Stefano Brivio [EMAIL PROTECTED].

Signed-off-by: John W. Linville [EMAIL PROTECTED]
---
diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
index 880c628..4426841 100644
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -11,7 +11,7 @@ config WLAN_PRE80211
  Say Y if you have any pre-802.11 wireless LAN hardware.
 
  This option does not affect the kernel build, it only
- let's you choose drivers.
+ lets you choose drivers.
 
 config STRIP
tristate STRIP (Metricom starmode radio IP)
@@ -112,7 +112,7 @@ config WLAN_80211
  Say Y if you have any 802.11 wireless LAN hardware.
 
  This option does not affect the kernel build, it only
- let's you choose drivers.
+ lets you choose drivers.
 
 config PCMCIA_RAYCS
tristate Aviator/Raytheon 2.4MHz wireless support

-- 
John W. Linville
[EMAIL PROTECTED]


Re: [PATCH] drivers/net/wireless/Kconfig: correct minor typo

2007-04-23 Thread David Miller
From: John W. Linville [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 15:53:13 -0400

 Correct minor typo in drivers/net/wireless/Kconfig identified by
 Stefano Brivio [EMAIL PROTECTED].
 
 Signed-off-by: John W. Linville [EMAIL PROTECTED]

Applied, thanks John.


Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 13:27:19 -0700

 On Mon, 23 Apr 2007 13:18:10 -0700 (PDT)
 David Miller [EMAIL PROTECTED] wrote:
 
  From: Andrew Morton [EMAIL PROTECTED]
  Date: Mon, 23 Apr 2007 13:07:34 -0700
  
   The interesting bit is:
   ...
   I think I saw the same problem maybe 1.5 weeks ago on this machine, but I
   didn't have time to investigate further.  So it's not some recent thing.
  
  My initial reaction is that DNS responses are being lost or dropped
  for some reason.
 
 Plausible.   I'll try booting it with the ethernet unplugged.

That won't test the same scenario.

If the network cable is unplugged, ARP responses won't arrive and
therefore sendmsg() calls will return with a host unreachable error.

The situation you need to recreate is specifically UDP packets getting
dropped.

The reason I wanted the tcpdump trace is so that we can see whether
the problem is UDP packets going out or going in which are being
mangled/dropped.

You don't need a hub to get a dump.  Instead you can run a caching
named on some other system, configure your FC6 box to use that system
for DNS via /etc/resolv.conf, then run tcpdump on the caching named
machine.


Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread Andrew Morton
On Mon, 23 Apr 2007 13:37:30 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Andrew Morton [EMAIL PROTECTED]
 Date: Mon, 23 Apr 2007 13:27:19 -0700
 
  On Mon, 23 Apr 2007 13:18:10 -0700 (PDT)
  David Miller [EMAIL PROTECTED] wrote:
  
   From: Andrew Morton [EMAIL PROTECTED]
   Date: Mon, 23 Apr 2007 13:07:34 -0700
   
    The interesting bit is:
    ...
    I think I saw the same problem maybe 1.5 weeks ago on this machine, but I
    didn't have time to investigate further.  So it's not some recent thing.
   
   My initial reaction is that DNS responses are being lost or dropped
   for some reason.
  
  Plausible.   I'll try booting it with the ethernet unplugged.
 
 That won't test the same scenario.
 
 If the network cable is unplugged, ARP responses won't arrive and
 therefore sendmsg() calls will return with a host unreachable error.
 
 The situation you need to recreate is specifically UDP packets getting
 dropped.
 
 The reason I wanted the tcpdump trace is so that we can see whether
 the problem is UDP packets going out or going in which are being
 mangled/dropped.
 
 You don't need a hub to get a dump.  Instead you can run a caching
 named on some other system, configure your FC6 box to use that system
 for DNS via /etc/resolv.conf, then run tcpdump on the caching named
 machine.

hm, fancy.



I unplugged the cable and the machine booted normally.  Lots of commands
were hanging when I plugged it back in.

I plugged the cable back in and on one console ran

tcpdump -l -i eth0

but of course tcpdump didn't do anything because it wants to do reverse
lookups.  But interestingly, tcpdump was taking maybe 15 seconds to respond
to ^c and to killall.  tcpdump was stuck in udp_poll(), like statd was. 
But I think it's significant that we're not taking signals while in that
interruptible sleep.

I am able to ping the test machine from another host on the same network.

On the test machine I used `tcpdump -l -n -i eth0' and on another vt, ran
`ping www.google.com'.  The test machine is 172.18.116.155

13:40:51.120004 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
13:40:51.489171 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
13:40:52.567615 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
13:40:53.489201 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
13:40:53.755655 arp who-has 172.18.119.254 tell 172.18.116.155
13:40:53.755991 arp reply 172.18.119.254 is-at 00:00:0c:07:ac:01
13:40:53.755997 IP 172.18.116.155.32806 > 172.24.0.7.domain:  42807+ A? www.google.com. (32)
13:40:53.991979 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
13:40:55.435664 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
13:40:55.514942 IP 172.18.116.45.netbios-dgm > 172.18.119.255.netbios-dgm: NBT UDP PACKET(138)
13:40:55.710092 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
13:40:56.463086 arp who-has 172.18.119.254 tell 172.18.116.45
13:40:56.856033 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
13:40:57.709673 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
13:40:58.331717 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
13:40:58.751949 IP 172.18.116.155.32807 > 172.25.146.107.domain:  42807+ A? www.google.com. (32)
13:40:59.276068 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-unknown (3) 16: state=initial group=2 [|hsrp]
13:40:59.709703 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
13:40:59.716492 IP 172.18.119.178.netbios-dgm > 172.18.119.255.netbios-dgm: NBT UDP PACKET(138)
13:40:59.814742 arp who-has 172.18.119.254 tell 172.18.116.206
13:40:59.844096 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
13:41:01.215791 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
13:41:01.709583 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
13:41:01.751918 IP 172.18.116.199.ipp > 172.18.119.255.ipp: UDP, length 124
13:41:02.776596 arp who-has 172.18.119.254 tell 172.18.117.227
13:41:02.836204 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
13:41:03.709613 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15

so it looks like we tried to send the query but we didn't see anything come 
back.



Which means I need 

[RFC] SIOCGIFCOUNT implementation

2007-04-23 Thread Stephen Hemminger
Trivial implementation of SIOCGIFCOUNT.
If you search the archives, this was suggested and rejected
6 years ago. But there may be legacy Unix applications that
could actually use it.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 net/core/dev.c |   50 --
 1 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 4dc93cc..6779aa4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1987,7 +1987,7 @@ int register_gifconf(unsigned int family, gifconf_func_t * gifconf)
  * match.  --pb
  */
 
-static int dev_ifname(struct ifreq __user *arg)
+static int dev_ifname(void __user *arg)
 {
 	struct net_device *dev;
 	struct ifreq ifr;
@@ -2020,7 +2020,7 @@ static int dev_ifname(struct ifreq __user *arg)
  * Thus we will need a 'compatibility mode'.
  */
 
-static int dev_ifconf(char __user *arg)
+static int dev_ifconf(void __user *arg)
 {
 	struct ifconf ifc;
 	struct net_device *dev;
@@ -2071,6 +2071,32 @@ static int dev_ifconf(char __user *arg)
 	return copy_to_user(arg, &ifc, sizeof(struct ifconf)) ? -EFAULT : 0;
 }
 
+/*
+ * Perform SIOCGIFCOUNT: give an estimate of the number of interfaces.
+ * This shows the poor design of the Berkeley ioctl interface, because the
+ * number of interfaces can change at any time.
+ */
+static int dev_ifcount(char __user *arg)
+{
+	struct ifconf ifc;
+	struct net_device *dev;
+	int i, total = 0;
+
+	for (dev = dev_base; dev; dev = dev->next)
+		for (i = 0; i < NPROTO; i++)
+			if (gifconf_list[i]) {
+				int done = gifconf_list[i](dev, NULL, 0);
+				if (done > 0)
+					total += done;
+
+			}
+
+	memset(&ifc, 0, sizeof(ifc));
+	ifc.ifc_len = total;
+	return copy_to_user(arg, &ifc, sizeof(struct ifconf)) ? -EFAULT : 0;
+}
+
+
 #ifdef CONFIG_PROC_FS
 /*
  * This is invoked by the /proc filesystem handler to display a device
@@ -2643,19 +2669,23 @@ int dev_ioctl(unsigned int cmd, void __user *arg)
 	int ret;
 	char *colon;
 
-	/* One special case: SIOCGIFCONF takes ifconf argument
-	   and requires shared lock, because it sleeps writing
-	   to user space.
-	 */
+	/* Special case ioctl's that don't relate to a specific device */
+	switch(cmd) {
+	case SIOCGIFCONF:
+		rtnl_lock();
+		ret = dev_ifconf(arg);
+		rtnl_unlock();
+		return ret;
 
-	if (cmd == SIOCGIFCONF) {
+	case SIOCGIFCOUNT:
 		rtnl_lock();
-		ret = dev_ifconf((char __user *) arg);
+		ret = dev_ifcount(arg);
 		rtnl_unlock();
 		return ret;
+
+	case SIOCGIFNAME:
+		return dev_ifname(arg);
 	}
-	if (cmd == SIOCGIFNAME)
-		return dev_ifname((struct ifreq __user *)arg);
 
 	if (copy_from_user(&ifr, arg, sizeof(struct ifreq)))
 		return -EFAULT;
-- 
1.5.0.6
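
For applications that only need to size a buffer, something close to what
SIOCGIFCOUNT would report is already reachable from userspace. The following
is a minimal sketch, assuming the Linux behaviour documented in netdevice(7),
where SIOCGIFCONF called with a NULL ifc_req reports the required length in
ifc_len. The helper names ifconf_len_to_entries() and count_ifconf_entries()
are made up for illustration and are not part of the patch.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <net/if.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Convert a byte length reported in ifc_len into a number of ifreq slots. */
static int ifconf_len_to_entries(int len)
{
	assert(len >= 0);
	return len / (int)sizeof(struct ifreq);
}

/*
 * Ask SIOCGIFCONF how much buffer space it would need by passing a NULL
 * buffer.  Returns the number of ifreq entries, or -1 if the socket or
 * the ioctl fails.
 */
static int count_ifconf_entries(void)
{
	struct ifconf ifc;
	int fd, ret;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&ifc, 0, sizeof(ifc));
	ifc.ifc_req = NULL;	/* NULL buffer: only the length is filled in */

	ret = ioctl(fd, SIOCGIFCONF, &ifc);
	close(fd);
	if (ret < 0)
		return -1;

	return ifconf_len_to_entries(ifc.ifc_len);
}
```

Like the value computed in the patch, this is only an estimate: interfaces can
appear or disappear between the sizing call and the real SIOCGIFCONF call.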



Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 13:56:39 -0700

 On Mon, 23 Apr 2007 13:37:30 -0700 (PDT)
 David Miller [EMAIL PROTECTED] wrote:
 
  From: Andrew Morton [EMAIL PROTECTED]
  Date: Mon, 23 Apr 2007 13:27:19 -0700
  
   On Mon, 23 Apr 2007 13:18:10 -0700 (PDT)
   David Miller [EMAIL PROTECTED] wrote:
   
    From: Andrew Morton [EMAIL PROTECTED]
    Date: Mon, 23 Apr 2007 13:07:34 -0700
    
     The interesting bit is:
     ...
     I think I saw the same problem maybe 1.5 weeks ago on this machine, but I
     didn't have time to investigate further.  So it's not some recent thing.
    
    My initial reaction is that DNS responses are being lost or dropped
    for some reason.
   
   Plausible.   I'll try booting it with the ethernet unplugged.
  
  That won't test the same scenario.
  
  If the network cable is unplugged, ARP responses won't arrive and
  therefore sendmsg() calls will return with a host unreachable error.
  
  The situation you need to recreate is specifically UDP packets getting
  dropped.
  
  The reason I wanted the tcpdump trace is so that we can see whether
  the problem is UDP packets going out or going in which are being
  mangled/dropped.
  
  You don't need a hub to get a dump.  Instead you can run a caching
  named on some other system, configure your FC6 box to use that system
  for DNS via /etc/resolv.conf, then run tcpdump on the caching named
  machine.
 
 hm, fancy.
 
 
 
 I unplugged the cable and the machine booted normally.  Lots of commands
 were hanging when I plugged it back in.
 
 I plugged the cable back in and on one console ran
 
   tcpdump -l -i eth0
 
 but of course tcpdump didn't do anything because it wants to do reverse
 lookups.  But interestingly, tcpdump was taking maybe 15 seconds to respond
 to ^c and to killall.  tcpdump was stuck in udp_poll(), like statd was. 
 But I think it's significant that we're not taking signals while in that
 interruptible sleep.
 
 I am able to ping the test machine from another host on the same network.
 
 On the test machine I used `tcpdump -l -n -i eth0' and on another vt, ran
 `ping www.google.com'.  The test machine is 172.18.116.155
 
 13:40:53.755997 IP 172.18.116.155.32806 > 172.24.0.7.domain:  42807+ A? www.google.com. (32)

...

no reply from 172.24.0.7

 13:40:58.751949 IP 172.18.116.155.32807 > 172.25.146.107.domain:  42807+ A? www.google.com. (32)

...

no reply from 172.25.146.107

 so it looks like we tried to send the query but we didn't see anything come 
 back.

Right.
 
 Is nscd the caching named which you're referring to?

I would respond, but I first checked how many responses show up when
giving caching named fedora to google, and decided that you can
figure it out yourself :-)

More seriously, you need to install the caching-nameserver package
it appears, on Fedora.

nscd is not named, nscd is a part of glibc

named is part of the 'bind' package, you know, the standard DNS daemon
implementation for the past say 15 years or so... 

Apparently this 'caching-nameserver' package will bring in 'bind' plus
some configuration files that will give you a caching nameserver
setup.

You might have to tweak things for bind to allow non-local
connections.  On the machine where you install 'caching-nameserver'
use 127.0.0.1 in /etc/resolv.conf and make sure DNS lookups work, then
you can test on the FC6 system by using the other system's IP
address.

And that's enough sysadmin FAQ'age for me for one day... :-/


Re: Problem with commit a0ee18b9b7d3847976c6fb315c06a34fb296de0e

2007-04-23 Thread Thomas Graf
* Ismail Dönmez [EMAIL PROTECTED] 2007-04-23 22:09
 Yes I know the fix is in but I wondered why it's creating such problems with 
 2.6.18 kernel, guess it depends on some other commits.

As long as you apply the complete patch including the additional
sanity check for RTN_MAX it should work perfectly fine on 2.6.18.

I can't think of any connection between the patch and the errors
you are seeing.

Are you absolutely sure the errors you see are directly connected
to applying the patch?


is CONFIG_NET_CLS_ACT always set ?

2007-04-23 Thread Laurent Chavey

if CONFIG_NET_CLS_ACT is not defined, then is the
code below correct ?


netif_receive_skb()
---
	list_for_each_entry_rcu(ptype, &ptype_all, list) {
		if (!ptype->dev || ptype->dev == skb->dev) {
			if (pt_prev)
				ret = deliver_skb(skb, pt_prev);
			pt_prev = ptype;
		}
	}

#ifdef CONFIG_NET_CLS_ACT
	if (pt_prev) {
		ret = deliver_skb(skb, pt_prev);
		pt_prev = NULL; /* noone else should process this after*/
	} else {
		skb->tc_verd = SET_TC_OK2MUNGE(skb->tc_verd);
	}

	ret = ing_filter(skb);

	if (ret == TC_ACT_SHOT || (ret == TC_ACT_STOLEN)) {
		kfree_skb(skb);
		goto out;
	}

	skb->tc_verd = 0;
ncls:
#endif
	handle_diverter(skb);

	if (handle_bridge(&skb, &pt_prev, &ret))
		goto out;


Re: Problem with commit a0ee18b9b7d3847976c6fb315c06a34fb296de0e

2007-04-23 Thread Ismail Dönmez
On Tuesday 24 April 2007 00:17:40 Thomas Graf wrote:
 * Ismail Dönmez [EMAIL PROTECTED] 2007-04-23 22:09

  Yes I know the fix is in but I wondered why it's creating such problems
  with 2.6.18 kernel, guess it depends on some other commits.

 As long as you apply the complete patch including the additional
 sanity check for RTN_MAX it should work perfectly fine on 2.6.18.

The sanity check part doesn't seem to apply to 2.6.18.

 I can't think of any connection between the patch and the errors
 you are seeing.

 Are you absolutely sure the errors you see are directly connected
 to applying the patch?

Yes actually I am but I'll re-test and see. Thanks.

Regards,
ismail

-- 
Life is a game, and if you aren't in it to win,
what the heck are you still doing here?

-- Linus Torvalds (talking about open source development)




Re: [RFC] SIOCGIFCOUNT implementation

2007-04-23 Thread Andi Kleen
 
 Please also tell me what legacy application could actually
 use this thing when BSD itself doesn't even implement it?

Some version of java seems to. I see a lot of:

ioctl32(java:30851): Unknown cmd fd(3) cmd(8938){00} arg(bfbb87c4) on socket:[334628709]

But just rejecting it is probably ok.

-Andi


Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread Andrew Morton
On Mon, 23 Apr 2007 14:17:06 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

  Is nscd the caching named which you're referring to?
 
 I would respond, but I first checked how many responses show up when
 giving caching named fedora to google, and decided that you can
 figure it out yourself :-)
 
 More seriously, you need to install the caching-nameserver package
 it appears, on Fedora.

The machine on which I'd need to run the caching nameserver is in fact
running hacked-about Ubuntu.  It's on the corporate network so I risk being
chased by angry people with pointy sticks and `apt-cache search nameserver'
and `apt-cache search caching' don't come up with anything useful.

Sigh.  Looks like I'll need to drag in the hub from home tomorrow.

Or git-bisect.  Seems that rc7-mm1 isn't getting any closer.  I guess I'll
need to drop git-net and git-wireless and a bunch of other stuff for now so
I can get in and diagnose all the other bugs.


Let me play around with udpspam a bit.  It doesn't _have_ to be the
resolver.  Are there any simple UDP-based client/server test apps around?
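
As a rough illustration of the kind of simple UDP request/response test asked
about above, here is a minimal sketch that sends one datagram over loopback and
waits for it to be echoed back, using a receive timeout so that a lost reply
shows up as a failure rather than a hang. It only exercises 127.0.0.1, so it
would not touch the e1000 path; the function name udp_loopback_roundtrip() and
the payload are assumptions made for the example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if one request and its echoed reply both arrive, -1 otherwise. */
static int udp_loopback_roundtrip(int timeout_ms)
{
	struct sockaddr_in srv_addr, peer;
	socklen_t len;
	struct timeval tv;
	char buf[64];
	const char msg[] = "ping";
	int srv, cli, rc = -1;
	ssize_t n;

	assert(timeout_ms > 0);

	srv = socket(AF_INET, SOCK_DGRAM, 0);
	cli = socket(AF_INET, SOCK_DGRAM, 0);
	if (srv < 0 || cli < 0)
		goto out;

	/* Bind the "server" to an ephemeral loopback port and look it up. */
	memset(&srv_addr, 0, sizeof(srv_addr));
	srv_addr.sin_family = AF_INET;
	srv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	srv_addr.sin_port = 0;
	if (bind(srv, (struct sockaddr *)&srv_addr, sizeof(srv_addr)) < 0)
		goto out;
	len = sizeof(srv_addr);
	if (getsockname(srv, (struct sockaddr *)&srv_addr, &len) < 0)
		goto out;

	/* A receive timeout turns a dropped datagram into an error return. */
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;
	if (setsockopt(srv, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0 ||
	    setsockopt(cli, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
		goto out;

	/* Request: client to server. */
	if (sendto(cli, msg, sizeof(msg), 0,
		   (struct sockaddr *)&srv_addr, sizeof(srv_addr)) < 0)
		goto out;
	len = sizeof(peer);
	n = recvfrom(srv, buf, sizeof(buf), 0, (struct sockaddr *)&peer, &len);
	if (n != (ssize_t)sizeof(msg))
		goto out;

	/* Response: server echoes the payload back to the sender. */
	if (sendto(srv, buf, (size_t)n, 0, (struct sockaddr *)&peer, len) < 0)
		goto out;
	n = recv(cli, buf, sizeof(buf), 0);
	if (n == (ssize_t)sizeof(msg) && memcmp(buf, msg, sizeof(msg)) == 0)
		rc = 0;
out:
	if (srv >= 0)
		close(srv);
	if (cli >= 0)
		close(cli);
	return rc;
}
```

To test a real interface, the same request and echo halves would have to run on
two different machines, as netperf and netserver do.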


[2/3] 2.6.21-rc7: known regressions (v2)

2007-04-23 Thread Adrian Bunk
This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way
possibly involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : ACPI: Crashes + hangs during modprobe processor
             unless processor.nocst
References : http://bugzilla.kernel.org/show_bug.cgi?id=8346
Submitter  : Olaf Kirch [EMAIL PROTECTED]
Status     : unknown


Subject    : kernels fail to boot with drives on ATIIXP controller
             (ACPI/IRQ related)
References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229621
             http://lkml.org/lkml/2007/3/4/257
Submitter  : Michal Jaegermann [EMAIL PROTECTED]
Status     : unknown


Subject    : ali_pata: boot from CD fails
References : http://lkml.org/lkml/2007/3/31/160
Submitter  : Stephen Clark [EMAIL PROTECTED]
Status     : unknown


Subject    : boot failure: rtl8139: exception in interrupt routine
References : http://lkml.org/lkml/2007/3/31/160
Submitter  : Stephen Clark [EMAIL PROTECTED]
Status     : unknown


Subject    : qla2xxx driver no longer works
References : http://lkml.org/lkml/2007/4/16/49
             http://lkml.org/lkml/2007/4/20/224
Submitter  : David Miller [EMAIL PROTECTED]
             Robert Peterson [EMAIL PROTECTED]
Handled-By : David Miller [EMAIL PROTECTED]
Patch      : http://lkml.org/lkml/2007/4/16/234
Status     : patch was suggested




Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread Andrew Morton
On Mon, 23 Apr 2007 14:45:57 -0700
Andrew Morton [EMAIL PROTECTED] wrote:

 Let me play around with udpspam a bit.

tcpdump does show stuff coming in when I run udpspam against the test
machine from another host.

More rtnl weirdness.  Running `ifup eth0' gave me:


Apr 23 14:53:57 localhost smartd[4051]: smartd has fork()ed into background 
mode. New PID=4051. 
Apr 23 14:56:47 localhost kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Apr 23 14:56:47 localhost kernel: e1000: eth0: e1000_watchdog_task: NIC Link is 
Up 1000 Mbps Full Duplex, Flow Control: RX
Apr 23 14:56:47 localhost kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes 
ready
Apr 23 14:56:48 localhost dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 
67
Apr 23 14:56:48 localhost dhclient: DHCPACK from 172.18.119.253
Apr 23 14:56:48 localhost avahi-daemon[3971]: New relevant interface eth0.IPv4 
for mDNS.
Apr 23 14:56:48 localhost avahi-daemon[3971]: Joining mDNS multicast group on 
interface eth0.IPv4 with address 172.18.116.155.
Apr 23 14:56:48 localhost avahi-daemon[3971]: Registering new address record 
for 172.18.116.155 on eth0.
Apr 23 14:56:48 localhost avahi-daemon[3971]: Withdrawing address record for 
172.18.116.155 on eth0.
Apr 23 14:56:48 localhost avahi-daemon[3971]: Leaving mDNS multicast group on 
interface eth0.IPv4 with address 172.18.116.155.
Apr 23 14:56:48 localhost avahi-daemon[3971]: iface.c: 
interface_mdns_mcast_join() called but no local address available.
Apr 23 14:56:48 localhost avahi-daemon[3971]: Interface eth0.IPv4 no longer 
relevant for mDNS.
Apr 23 14:56:48 localhost avahi-daemon[3971]: New relevant interface eth0.IPv4 
for mDNS.
Apr 23 14:56:48 localhost avahi-daemon[3971]: Joining mDNS multicast group on 
interface eth0.IPv4 with address 172.18.116.155.
Apr 23 14:56:48 localhost kernel: RTNL: assertion failed at net/ipv4/igmp.c 
(1205)
Apr 23 14:56:48 localhost kernel: 
Apr 23 14:56:48 localhost kernel: Call Trace:
Apr 23 14:56:48 localhost kernel:  [8049340c] 
ip_mc_inc_group+0x3e/0x1f2
Apr 23 14:56:48 localhost kernel:  [80493b2b] 
ip_mc_join_group+0xca/0xe8
Apr 23 14:56:48 localhost kernel:  [8047e441] 
do_ip_setsockopt+0x6db/0x9d7
Apr 23 14:56:48 localhost kernel:  [8029696e] 
autoremove_wake_function+0x0/0x2e
Apr 23 14:56:48 localhost kernel:  [80336018] 
selinux_inode_getattr+0x50/0x5e
Apr 23 14:56:48 localhost kernel:  [80333c56] 
socket_has_perm+0x5b/0x68
Apr 23 14:56:48 localhost kernel:  [8047e7e5] ip_setsockopt+0x22/0x86
Apr 23 14:56:48 localhost kernel:  [8045587b] sys_setsockopt+0x8f/0xb5
Apr 23 14:56:48 localhost kernel:  [8025911e] system_call+0x7e/0x83
Apr 23 14:56:48 localhost kernel: 
Apr 23 14:56:48 localhost NET[4351]: /sbin/dhclient-script : updated 
/etc/resolv.conf
Apr 23 14:56:48 localhost avahi-daemon[3971]: Registering new address record 
for 172.18.116.155 on eth0.
Apr 23 14:56:48 localhost dhclient: bound to 172.18.116.155 -- renewal in 3205 
seconds.
Apr 23 14:56:49 localhost avahi-daemon[3971]: New relevant interface eth0.IPv6 
for mDNS.
Apr 23 14:56:49 localhost avahi-daemon[3971]: Joining mDNS multicast group on 
interface eth0.IPv6 with address fe80::204:23ff:fec6:d7d2.
Apr 23 14:56:49 localhost avahi-daemon[3971]: Registering new address record 
for fe80::204:23ff:fec6:d7d2 on eth0.

which is just stupid.  The rtnl_lock() is right there in ip_mc_join_group().
And this is a different architecture and config and compiler from yesterday's
fun.  And no scheduler patches involved here.


Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 15:12:40 -0700

 which is just stupid.  The rtnl_lock() is right there in ip_mc_join_group().
 And this is a different architecture and config and compiler from yesterday's
 fun.  And no scheduler patches involved here.

Perhaps something on another cpu is dropping the rtnl semaphore one
time too many.

Recently a bug of this nature was discovered in the wireless stack.
But unless you have a wireless device in this box too, it's probably
unrelated.


Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread Andrew Morton
On Mon, 23 Apr 2007 15:15:31 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Andrew Morton [EMAIL PROTECTED]
 Date: Mon, 23 Apr 2007 15:12:40 -0700
 
  which is just stupid.  The rtnl_lock() is right there in ip_mc_join_group().
  And this is a different architecture and config and compiler from yesterday's
  fun.  And no scheduler patches involved here.
 
 Perhaps something on another cpu is dropping the rtnl semaphore one
 time too many.
 
 Recently a bug of this nature was discovered in the wireless stack.
 But unless you have a wireless device in this box too, it's probably
 unrelated.

Could be.  But I'd expect the mutex code to whine about the extra unlock,
and about an unlock from a different thread, if I have that option
turned on.
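
The kind of checking expected from the mutex debugging code here can be
modelled in a few lines. This is a userspace sketch only, not the kernel's
mutex implementation; struct dbg_mutex, the error codes and the integer thread
ids are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

enum dbg_mutex_err {
	DBG_OK = 0,
	DBG_ERR_NOT_LOCKED = -1,	/* unlock without a matching lock */
	DBG_ERR_NOT_OWNER = -2,		/* unlock by a thread that does not hold it */
	DBG_ERR_BUSY = -3,		/* lock attempted while already held */
};

struct dbg_mutex {
	int locked;	/* non-zero while held */
	int owner;	/* id of the holder, only meaningful while locked */
};

/* Take the mutex for thread 'tid' if it is free. */
static int dbg_mutex_trylock(struct dbg_mutex *m, int tid)
{
	assert(m != NULL);
	if (m->locked)
		return DBG_ERR_BUSY;
	m->locked = 1;
	m->owner = tid;
	return DBG_OK;
}

/* Release the mutex, reporting the two misuse cases instead of ignoring them. */
static int dbg_mutex_unlock(struct dbg_mutex *m, int tid)
{
	assert(m != NULL);
	if (!m->locked)
		return DBG_ERR_NOT_LOCKED;
	if (m->owner != tid)
		return DBG_ERR_NOT_OWNER;
	m->locked = 0;
	return DBG_OK;
}
```

An extra unlock returns DBG_ERR_NOT_LOCKED and an unlock from the wrong thread
returns DBG_ERR_NOT_OWNER, which are the two complaints one would hope to see
logged if something really were dropping the lock once too often.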

Oh well, one thing at a time.  The good news is that I can reproduce the
problem with netperf.

kpm:/usr/src/netperf-2.4.3 netperf -H akpm2 -t UDP_RR
UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to akpm2 
(172.18.116.155) port 0 AF_INET
netperf: receive_response: no response received. errno 0 counter 0

That's running netserver on the test machine.

The machine running netperf is 172.18.116.160 and the test machine running
netserver is 172.18.116.155

tcpdump from the test machine:

15:24:37.924210 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:38.859309 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:39.078273 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:39.924074 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:40.017081 IP 172.24.0.7.domain > 172.18.116.57.37456:  59635 4/7/6 CNAME[|domain]
15:24:41.383433 IP 172.18.116.160.33137 > 172.18.116.155.12865: S 2760291763:2760291763(0) win 5840 <mss 1460,sackOK,timestamp 1967355840 0,nop,wscale 8>
15:24:41.383479 IP 172.18.116.155.12865 > 172.18.116.160.33137: S 1640262480:1640262480(0) ack 2760291764 win 5792 <mss 1460,sackOK,timestamp 7714 1967355840,nop,wscale 7>
15:24:41.383683 IP 172.18.116.160.33137 > 172.18.116.155.12865: . ack 1 win 23 <nop,nop,timestamp 1967355840 7714>
15:24:41.383883 IP 172.18.116.160.33137 > 172.18.116.155.12865: P 1:257(256) ack 1 win 23 <nop,nop,timestamp 1967355840 7714>
15:24:41.383902 IP 172.18.116.155.12865 > 172.18.116.160.33137: . ack 257 win 54 <nop,nop,timestamp 7714 1967355840>
15:24:41.384065 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 7714 1967355840>
15:24:41.587266 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 7765 1967355840>
15:24:41.839234 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:41.924303 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:41.995285 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 7867 1967355840>
15:24:42.030341 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:42.811330 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 8071 1967355840>
15:24:43.924183 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:44.121880 IP 172.24.0.7.domain > 172.18.116.22.46700:  52073* 1/4/4 A[|domain]
15:24:44.443419 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 8479 1967355840>
15:24:44.723257 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:44.886356 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:45.924263 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:47.659300 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:47.707599 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 9295 1967355840>
15:24:47.874419 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:47.952350 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:48.037569 IP 172.24.0.7.domain > 172.18.117.18.46665:  59092 2/7/6 CNAME[|domain]

So I think we did a bit of TCP chatter then no UDP at all?

It's interesting that the test machine can see other people's DNS queries
go past.



Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 15:37:14 -0700

 So I think we did a bit of TCP chatter then no UDP at all?
 
 It's interesting that the test machine can see other people's DNS queries
 go past.

It's mysterious alright.

I can't say that the UDP's are going out corrupted because tcpdump
seems to decode the DNS queries just fine.  Hmmm, if you're sending
this out on the broken machine we can't rule out corrupted checksums.

And if tcpdump doesn't see the UDP replies it means that it isn't even
reaching the device, let alone the stack.  At least that rules out
the stack dropping UDP packets for some reason.

It's possible we've stuffed up some expectation the e1000 driver
has for TX checksum offload.  Turn off TX checksums with
"ethtool -K eth0 tx off" and see if that makes the problem
go away.  Next, try "ethtool -K eth0 rx off".

I suspect skb_transport_offset() might be wrong for UDP packets
for some reason, as that's what drivers/net/e1000/e1000_main.c:
e1000_tx_csum() depends upon.

Either that or some error in Herbert's recent checksum offload
handling changes, such as, in fact I am highly suspicious of
the second change listed below, you may want to try reverting
just that one:

commit 8952d6c988ec31070732117f353666a4b9a09fea
Author: Herbert Xu [EMAIL PROTECTED]
Date:   Mon Apr 9 11:59:39 2007 -0700

[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY

When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
treat it as such in the stack.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]

commit 7f8be19f5a5737ce6ad670756183235c71b560bb
Author: Herbert Xu [EMAIL PROTECTED]
Date:   Mon Apr 9 11:59:07 2007 -0700

[NET]: Use csum_start offset instead of skb_transport_header

The skb transport pointer is currently used to specify the start
of the checksum region for transmit checksum offload.  Unfortunately,
the same pointer is also used during receive side processing.

This creates a problem when we want to retransmit a received
packet with partial checksums since the skb transport pointer
would be overwritten.

This patch solves this problem by creating a new 16-bit csum_start
offset value to replace the skb transport header for the purpose
of checksums.  This offset is calculated from skb->head so that
it does not have to change when skb->data changes.

No extra space is required since csum_offset itself fits within
a 16-bit word so we can use the other 16 bits for csum_start.

For backwards compatibility, just before we push a packet with
partial checksums off into the device driver, we set the skb
transport header to what it would have been under the old scheme.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]
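
A simplified model of the csum_start/csum_offset scheme described in the second
commit message may help when reasoning about where a bad checksum could come
from: the checksum covers everything from csum_start to the end of the packet,
and the result is stored csum_offset bytes into that region. This is a
plain-buffer sketch under those assumptions, not the kernel's code;
ones_sum16() and finish_partial_csum() are illustrative names, and the
pseudo-header seeding used for real UDP/TCP is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement sum over big-endian words, folded, not inverted. */
static uint16_t ones_sum16(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((data[i] << 8) | data[i + 1]);
	if (len & 1)
		sum += (uint32_t)(data[len - 1] << 8);	/* zero-pad odd byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* fold carries back in */
	return (uint16_t)sum;
}

/*
 * Complete a "partial" checksum in software: sum the region starting at
 * head + csum_start and store the inverted result at
 * head + csum_start + csum_offset.
 */
static void finish_partial_csum(uint8_t *head, size_t len,
				size_t csum_start, size_t csum_offset)
{
	uint16_t csum;

	assert(head != NULL);
	assert((csum_offset & 1) == 0);			/* word aligned in region */
	assert(csum_start + csum_offset + 2 <= len);	/* field fits in packet */

	csum = (uint16_t)~ones_sum16(head + csum_start, len - csum_start);
	head[csum_start + csum_offset] = (uint8_t)(csum >> 8);
	head[csum_start + csum_offset + 1] = (uint8_t)(csum & 0xff);
}
```

If csum_start pointed at the wrong byte, the stored value would no longer make
the region sum to 0xffff and a receiver would drop the packet, which is the
failure mode being suspected in this thread.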



Re: is CONFIG_NET_CLS_ACT always set ?

2007-04-23 Thread jamal
On Mon, 2007-23-04 at 14:19 -0700, Laurent Chavey wrote:
 if CONFIG_NET_CLS_ACT is not defined, then is the
 code below correct ?

The simple answer is yes; 
Is something not working for you?

It is hard to read your question my good friend because it comes out
like a trick question;->
It is more useful to say:
 I think this code is broken because 
And even more useful if you add:
I have run the following testcases to illustrate blah ...
Then if something is broken, and you want to be popular, you say:
netdevers, here's the fix

Anyways, what is the problem?

cheers,
jamal



Re: is CONFIG_NET_CLS_ACT always set ?

2007-04-23 Thread Laurent Chavey

Sorry for the trick question.

I was simply making a modification to the skb tap code
to allow tcpdump to work on the slave interface and not
just the bonded one. The code change requires a change to
the for loop code, and it follows on to the call to deliver_skb().

In the process of adding the code, I noticed that (at least
that is how I read the code) if
CONFIG_NET_CLS_ACT is not defined, then
the last pt_prev is called after the diverter and bridge code
have handled the skb.  This would only occur for a bridged skb
that one wanted to tap.





Ethernet driver for SNI RM200/RM400 machines

2007-04-23 Thread Thomas Bogendoerfer
I'm currently working on getting Linux supported on SNI RM200/300/400
machines (MIPS based servers and workstations). Most of the changes
are already in Linus' tree. What's now missing are some device drivers.
One is an ethernet driver for the older EISA only RM200 and RM400
machines. They are using an Intel 82596 ethernet chip. Currently
there are 3 more or less different 82596 drivers in the tree, so
I didn't want to add another modified copy. I took lasi_82596.c,
extracted the common parts, put that in its own file and used
that for lasi_82596.c and sni_82596.c. The patch is about 89k
big, therefore I've placed it offline:

http://www.alpha.franken.de/patches/sni_82596.diff

The driver is tested on a HP B132 and a SNI RM400-220.

I've already looked at 82596.c, which contains a 82596 driver for
MVME boards. It's no big deal to use lib82596.c as well for
that driver. Right now my VME gear is not usable for testing,
so if anybody has hardware to test such a modified driver, please
mail me.

It would be great if this driver could be included during the 2.6.22
release cycle.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.[ RFC1925, 2.3 ]


Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread Rick Jones

Oh well, one thing at a time.  The good news is that I can reproduce the
problem with netperf.

kpm:/usr/src/netperf-2.4.3 netperf -H akpm2 -t UDP_RR
UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to akpm2 
(172.18.116.155) port 0 AF_INET
netperf: receive_response: no response received. errno 0 counter 0

That's running netserver on the test machine.

The machine running netperf is 172.18.116.160 and the test machine running
netserver is 172.18.116.155

tcpdump from the test machine:

15:24:37.924210 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:38.859309 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:39.078273 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:39.924074 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:40.017081 IP 172.24.0.7.domain > 172.18.116.57.37456:  59635 4/7/6 CNAME[|domain]
15:24:41.383433 IP 172.18.116.160.33137 > 172.18.116.155.12865: S 2760291763:2760291763(0) win 5840 <mss 1460,sackOK,timestamp 1967355840 0,nop,wscale 8>
15:24:41.383479 IP 172.18.116.155.12865 > 172.18.116.160.33137: S 1640262480:1640262480(0) ack 2760291764 win 5792 <mss 1460,sackOK,timestamp 7714 1967355840,nop,wscale 7>
15:24:41.383683 IP 172.18.116.160.33137 > 172.18.116.155.12865: . ack 1 win 23 <nop,nop,timestamp 1967355840 7714>
15:24:41.383883 IP 172.18.116.160.33137 > 172.18.116.155.12865: P 1:257(256) ack 1 win 23 <nop,nop,timestamp 1967355840 7714>
15:24:41.383902 IP 172.18.116.155.12865 > 172.18.116.160.33137: . ack 257 win 54 <nop,nop,timestamp 7714 1967355840>
15:24:41.384065 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 7714 1967355840>
15:24:41.587266 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 7765 1967355840>
15:24:41.839234 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:41.924303 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:41.995285 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 7867 1967355840>
15:24:42.030341 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:42.811330 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 8071 1967355840>
15:24:43.924183 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:44.121880 IP 172.24.0.7.domain > 172.18.116.22.46700:  52073* 1/4/4 A[|domain]
15:24:44.443419 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 8479 1967355840>
15:24:44.723257 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:44.886356 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:45.924263 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:47.659300 IP 172.18.119.252.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=standby group=1 addr=172.18.119.254
15:24:47.707599 IP 172.18.116.155.12865 > 172.18.116.160.33137: P 1:257(256) ack 257 win 54 <nop,nop,timestamp 9295 1967355840>
15:24:47.874419 IP 172.18.119.253.hsrp > 224.0.0.2.hsrp: HSRPv0-hello 20: state=active group=1 addr=172.18.119.254
15:24:47.952350 802.1d config 8000.00:18:74:5d:04:66.80ae root 0066.00:15:c7:20:57:c0 pathcost 4 age 1 max 20 hello 2 fdelay 15
15:24:48.037569 IP 172.24.0.7.domain > 172.18.117.18.46665:  59092 2/7/6 CNAME[|domain]


So I think we did a bit of TCP chatter then no UDP at all?


Looks that way, and on top of that it got no results back from netserver on 
the control (TCP, port 12865) connection.  Adding some -d's to the 
global options will cause netperf to regurgitate what messages it is 
sending and such.


I'd have expected that even if no UDP traffic could flow between netperf 
and netserver the timer running in the netserver _should_ have gotten it 
out of the recv()/recvfrom() call in recv_udp_rr() (src/nettest_bsd.c) 
and that netperf would then report a normal result of just 0 
transactions per second.


Either that timer didn't get set, didn't fire, or was insufficient to 
get netserver out of that recv() on the UDP socket, or comms between the 
two systems got fubar for TCP too.
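
For anyone unfamiliar with the mechanism being described, the following is a rough sketch of how an interval timer can pull a process out of a blocking recv(). It is an illustration under assumptions, not netserver's actual code: `recv_with_timer()`, `on_alarm()` and `timer_fired` are made-up names, and the only claim is the standard POSIX behaviour that a signal handler installed without SA_RESTART makes recv() fail with EINTR.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Set by the SIGALRM handler so the caller can tell a timeout from other errors. */
static volatile sig_atomic_t timer_fired;

static void on_alarm(int signo)
{
    (void)signo;
    timer_fired = 1;
}

/*
 * Block in recv() for at most timeout_ms milliseconds.
 * Returns the byte count from recv(), or -1 with errno set;
 * errno is EINTR when the timer cut the wait short.
 */
static ssize_t recv_with_timer(int fd, void *buf, size_t len, long timeout_ms)
{
    struct sigaction sa, old_sa;
    struct itimerval it, off;
    ssize_t n;
    int saved_errno;

    assert(timeout_ms > 0);     /* an all-zero itimerval would disarm the timer */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* deliberately no SA_RESTART */
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    timer_fired = 0;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_sec = timeout_ms / 1000;
    it.it_value.tv_usec = (timeout_ms % 1000) * 1000;
    if (setitimer(ITIMER_REAL, &it, NULL) < 0) {
        saved_errno = errno;
        sigaction(SIGALRM, &old_sa, NULL);
        errno = saved_errno;
        return -1;
    }

    n = recv(fd, buf, len, 0);
    saved_errno = errno;

    /* Disarm the timer and put the previous handler back. */
    memset(&off, 0, sizeof(off));
    setitimer(ITIMER_REAL, &off, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    errno = saved_errno;
    return n;
}
```

If a wait like this never returns on an idle socket, the three possibilities listed above (timer not set, not fired, or not sufficient to interrupt the call) are the ones to check.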


rick jones




Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread Andrew Morton
On Mon, 23 Apr 2007 15:45:09 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Andrew Morton [EMAIL PROTECTED]
 Date: Mon, 23 Apr 2007 15:37:14 -0700
 
  So I think we did a bit of TCP chatter then no UDP at all?
  
  It's interesting that the test machine can see other people's DNS queries
  go past.
 
 It's mysterious alright.
 
 I can't say that the UDP's are going out corrupted because tcpdump
 seems to decode the DNS queries just fine.  Hmmm, if you're sending
 this out on the broken machine we can't rule out corrupted checksums.
 
 And if tcpdump doesn't see the UDP replies it means that it isn't even
 reaching the device, let alone the stack.  At least that rules out
 the stack dropping UDP packets for some reason.
 
 It's possible we've stuffed up some expectation the e1000 driver
 has for TX checksum offload.  Turn off TX checksums with
 "ethtool -K eth0 tx off" and see if that makes the problem
 go away.  Next, try "ethtool -K eth0 rx off".
 
 I suspect skb_transport_offset() might be wrong for UDP packets
 for some reason, as that's what drivers/net/e1000/e1000_main.c
 e1000_tx_csum() depends upon.
 
 Either that or some error in Herbert's recent checksum offload
 handling changes, such as, in fact I am highly suspicious of
 the second change listed below, you may want to try reverting
 just that one:

Bingo.

 commit 8952d6c988ec31070732117f353666a4b9a09fea
 Author: Herbert Xu [EMAIL PROTECTED]
 Date:   Mon Apr 9 11:59:39 2007 -0700
 
 [NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY
 
 When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
 maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
 treat it as such in the stack.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]
 Signed-off-by: David S. Miller [EMAIL PROTECTED]

 commit 7f8be19f5a5737ce6ad670756183235c71b560bb
 Author: Herbert Xu [EMAIL PROTECTED]
 Date:   Mon Apr 9 11:59:07 2007 -0700
 
 [NET]: Use csum_start offset instead of skb_transport_header
 
 The skb transport pointer is currently used to specify the start
 of the checksum region for transmit checksum offload.  Unfortunately,
 the same pointer is also used during receive side processing.
 
 This creates a problem when we want to retransmit a received
 packet with partial checksums since the skb transport pointer
 would be overwritten.
 
 This patch solves this problem by creating a new 16-bit csum_start
 offset value to replace the skb transport header for the purpose
 of checksums.  This offset is calculated from skb->head so that
 it does not have to change when skb->data changes.
 
 No extra space is required since csum_offset itself fits within
 a 16-bit word so we can use the other 16 bits for csum_start.
 
 For backwards compatibility, just before we push a packet with
 partial checksums off into the device driver, we set the skb
 transport header to what it would have been under the old scheme.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]
 Signed-off-by: David S. Miller [EMAIL PROTECTED]

Reverting both 8952d6c988ec31070732117f353666a4b9a09fea and
7f8be19f5a5737ce6ad670756183235c71b560bb fixes it.  Reverting only
7f8be19f5a5737ce6ad670756183235c71b560bb also fixes it.

Thanks.


[PATCH 21/25] xen: Add the Xen virtual network device driver.

2007-04-23 Thread Jeremy Fitzhardinge
The network device frontend driver allows the kernel to access network
devices exported by a virtual machine containing a physical
network device driver.

Signed-off-by: Ian Pratt [EMAIL PROTECTED]
Signed-off-by: Christian Limpach [EMAIL PROTECTED]
Signed-off-by: Chris Wright [EMAIL PROTECTED]
Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED]
Cc: netdev@vger.kernel.org
Cc: Jeff Garzik [EMAIL PROTECTED]
Cc: Stephen Hemminger [EMAIL PROTECTED]
---
 drivers/net/Kconfig|   12 
 drivers/net/Makefile   |2 
 drivers/net/xen-netfront.c | 1957 
 3 files changed, 1971 insertions(+)

===
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2508,6 +2508,18 @@ source "drivers/atm/Kconfig"
 
 source "drivers/s390/net/Kconfig"
 
+config XEN_NETDEV_FRONTEND
+   tristate "Xen network device frontend driver"
+   depends on XEN
+   default y
+   help
+ The network device frontend driver allows the kernel to
+ access network devices exported by a virtual
+ machine containing a physical network device driver. The
+ frontend driver is intended for unprivileged guest domains;
+ if you are compiling a kernel for a Xen guest, you almost
+ certainly want to enable this.
+
 config ISERIES_VETH
tristate "iSeries Virtual Ethernet driver support"
depends on PPC_ISERIES
===
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -218,3 +218,5 @@ obj-$(CONFIG_FS_ENET) += fs_enet/
 obj-$(CONFIG_FS_ENET) += fs_enet/
 
 obj-$(CONFIG_NETXEN_NIC) += netxen/
+
+obj-$(CONFIG_XEN_NETDEV_FRONTEND) += xen-netfront.o
===
--- /dev/null
+++ b/drivers/net/xen-netfront.c
@@ -0,0 +1,1957 @@
+/**
+ * Virtual network driver for conversing with remote driver backends.
+ *
+ * Copyright (c) 2002-2005, K A Fraser
+ * Copyright (c) 2005, XenSource Ltd
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <linux/module.h>
+#include <linux/version.h>
+#include <linux/kernel.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/skbuff.h>
+#include <linux/ethtool.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/moduleparam.h>
+#include <linux/mm.h>
+#include <xen/xenbus.h>
+#include <xen/interface/io/netif.h>
+#include <xen/interface/memory.h>
+#ifdef CONFIG_XEN_BALLOON
+#include <xen/balloon.h>
+#endif
+#include <xen/interface/grant_table.h>
+
+#include <xen/events.h>
+#include <xen/page.h>
+#include <xen/grant_table.h>
+
+/*
+ * Mutually-exclusive module options to select receive data path:
+ *  rx_copy : Packets are copied by network backend into local memory
+ *  rx_flip : Page containing packet data is transferred to our ownership
+ * For fully-virtualised guests there is no option - copying must be used.
+ * For paravirtualised guests, flipping is the default.
+ */
+static int rx_copy;
+module_param(rx_copy, bool, 0);
+MODULE_PARM_DESC(rx_copy, "Copy packets from network card (rather than flip)");
+static int rx_flip;
+module_param(rx_flip, bool, 0);
+MODULE_PARM_DESC(rx_flip, "Flip packets from network card (rather than copy)");
+
+#define RX_COPY_THRESHOLD 256
+
+#define GRANT_INVALID_REF  0
+
+#define NET_TX_RING_SIZE __RING_SIZE((struct netif_tx_sring *)0, PAGE_SIZE)
+#define NET_RX_RING_SIZE __RING_SIZE((struct netif_rx_sring *)0, PAGE_SIZE)
+
+struct netfront_info {
+   struct list_head 

Re: net-2.6.22 UDP stalls/hangs

2007-04-23 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 24 Apr 2007 10:04:58 +1000

 On Mon, Apr 23, 2007 at 03:45:09PM -0700, David Miller wrote:
  
  Either that or some error in Herbert's recent checksum offload
  handling changes, such as, in fact I am highly suspicious of
  the second change listed below, you may want to try reverting
  just that one:
 
 Indeed.  My change depended on drivers correctly using csum_offset
 instead of the old csum field.  That was wrong anyway since sparse
 would now warn against that usage.  However, prior to my change it
 was harmless.
 
 [NETDRV]: Perform missing csum_offset conversions
 
 When csum_offset was introduced we did a conversion from csum to
 csum_offset where applicable.  A couple of drivers were missed in
 this process.
 
 It was harmless to begin with since the two fields coincided.  Now
 that we've made them different with the addition of csum_start, the
 missed drivers must be converted or they can't send packets out at
 all that require checksum offload.
  
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Applied, thanks a lot Herbert.
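
The "two fields coincided" remark in the changelog quoted above can be pictured with a small userspace sketch. The two layouts below are my reading of the description (a 32-bit csum aliased with csum_offset before, and with a csum_start/csum_offset pair of 16-bit halves after); the `toy_*` names are hypothetical and this is not the real struct sk_buff.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout before csum_start existed: both names alias the same 32 bits. */
struct toy_skb_old {
    union {
        uint32_t csum;
        uint32_t csum_offset;
    };
};

/* Assumed layout afterwards: the same 32 bits split into two 16-bit halves. */
struct toy_skb_new {
    union {
        uint32_t csum;
        struct {
            uint16_t csum_start;
            uint16_t csum_offset;
        };
    };
};

/* What an unconverted driver effectively does: read the offset through csum. */
static uint32_t toy_stale_offset_old(const struct toy_skb_old *skb)
{
    return skb->csum;
}

static uint32_t toy_stale_offset_new(const struct toy_skb_new *skb)
{
    return skb->csum;
}
```

With the old layout the stale read returns the right offset by accident. With the new one it returns csum_start and csum_offset packed together, so a driver that was never converted programs the hardware with a bogus offset.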


Re: 2.6.21-rc7: BUG: sleeping function called from invalid context at net/core/sock.c:1523

2007-04-23 Thread Herbert Xu
Jiri Kosina [EMAIL PROTECTED] wrote:
 
 Hmm, *sigh*. I guess the patch below fixes the problem, but it is a 
 masterpiece in the field of ugliness. And I am not sure whether it is 
 completely correct either. Are there any immediate ideas for better 
 solution with respect to how struct sock locking works?

Please cc such patches to netdev.  Thanks.

 diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
 index 71f5cfb..c5c93cd 100644
 --- a/net/bluetooth/hci_sock.c
 +++ b/net/bluetooth/hci_sock.c
 @@ -656,7 +656,10 @@ static int hci_sock_dev_event(struct notifier_block *this, unsigned long event,
/* Detach sockets from device */
read_lock(&hci_sk_list.lock);
sk_for_each(sk, node, &hci_sk_list.head) {
 -   lock_sock(sk);
 +   if (in_atomic())
 +   bh_lock_sock(sk);
 +   else
 +   lock_sock(sk);

This doesn't do what you think it does.  bh_lock_sock can still succeed
even with lock_sock held by someone else.

Does this need to occur immediately when an event occurs? If not I'd
suggest moving this into a workqueue.
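
To spell out why the two calls are not interchangeable, here is a single-threaded userspace model of the socket lock as I understand it: a spinlock plus an "owned" flag, where lock_sock() sets the flag and then drops the spinlock, while bh_lock_sock() only takes the spinlock. All `toy_*` names are hypothetical and the real kernel code sleeps and spins where this model simply returns false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Model of the socket lock: a spinlock plus a process-context ownership flag. */
struct toy_sock_lock {
    atomic_flag slock;  /* stands in for the spinlock */
    bool owned;         /* true while a process-context user holds the socket */
};

static void toy_sock_lock_init(struct toy_sock_lock *l)
{
    atomic_flag_clear(&l->slock);
    l->owned = false;
}

static bool toy_spin_trylock(struct toy_sock_lock *l)
{
    return !atomic_flag_test_and_set(&l->slock);
}

static void toy_spin_unlock(struct toy_sock_lock *l)
{
    atomic_flag_clear(&l->slock);
}

/* Process context: take the spinlock, mark the socket owned, drop the spinlock. */
static bool toy_lock_sock(struct toy_sock_lock *l)
{
    if (!toy_spin_trylock(l))
        return false;
    if (l->owned) {             /* the real code would sleep here */
        toy_spin_unlock(l);
        return false;
    }
    l->owned = true;
    toy_spin_unlock(l);         /* ownership persists without the spinlock */
    return true;
}

static void toy_release_sock(struct toy_sock_lock *l)
{
    while (!toy_spin_trylock(l))
        ;                       /* spin until the spinlock is free */
    l->owned = false;
    toy_spin_unlock(l);
}

/* Softirq context: only the spinlock is taken; the owned flag is not consulted. */
static bool toy_bh_lock_sock(struct toy_sock_lock *l)
{
    return toy_spin_trylock(l);
}

static void toy_bh_unlock_sock(struct toy_sock_lock *l)
{
    toy_spin_unlock(l);
}

static bool toy_sock_owned_by_user(const struct toy_sock_lock *l)
{
    return l->owned;
}
```

In this model toy_bh_lock_sock() succeeds right after toy_lock_sock(), so a caller that takes only the bh variant has to check the owned flag itself before touching the socket.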

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH 0/4] TCP related patches for net-2.6.22

2007-04-23 Thread Stephen Hemminger
A bunch of TCP congestion control updates for 2.6.22.

The first one might be a bug fix worth backporting.
It addresses a problem that causes vegas not to work right
when using the setsockopt() setup.
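
For reference, the setsockopt() setup mentioned above looks roughly like this from userspace on Linux. This is a sketch only: the wrapper names `set_congestion_control()` and `get_congestion_control()` are invented for illustration, while TCP_CONGESTION itself is the documented socket option.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/* Select a congestion control algorithm by name; returns 0 or -1 with errno set. */
static int set_congestion_control(int fd, const char *name)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                      name, (socklen_t)strlen(name));
}

/* Read back the algorithm currently attached to the socket. */
static int get_congestion_control(int fd, char *buf, size_t len)
{
    socklen_t optlen = (socklen_t)len;

    if (len == 0)
        return -1;
    memset(buf, 0, len);
    if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &optlen) < 0)
        return -1;
    buf[len - 1] = '\0';        /* guarantee termination for the caller */
    return 0;
}
```

Calling set_congestion_control(fd, "vegas") on a freshly created socket, before connect(), is the case the first patch is concerned with; it needs the vegas module to be available and permitted.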

-- 



[PATCH 2/4] TCP Illinois update

2007-04-23 Thread Stephen Hemminger
This version more closely matches the paper, and fixes several
math errors. The biggest difference is that it updates alpha/beta
once per RTT.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
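
A userspace sketch of the per-RTT bookkeeping this description refers to: samples are accumulated during one RTT, the average and maximum queuing delay are derived from them, and the accumulators are reset for the next RTT. The field names follow the patch below, but `struct toy_illinois` and the `toy_*` functions are illustrative only; the alpha/beta computation itself is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace mirror of the RTT bookkeeping fields; values are in microseconds. */
struct toy_illinois {
    uint64_t sum_rtt;   /* sum of RTT samples within the current RTT */
    uint16_t cnt_rtt;   /* number of samples within the current RTT */
    uint32_t base_rtt;  /* minimum RTT seen so far */
    uint32_t max_rtt;   /* maximum RTT seen so far */
};

static void toy_illinois_init(struct toy_illinois *ca)
{
    ca->sum_rtt = 0;
    ca->cnt_rtt = 0;
    ca->base_rtt = UINT32_MAX;
    ca->max_rtt = 0;
}

/* Called once per ACK with the measured RTT. */
static void toy_rtt_sample(struct toy_illinois *ca, uint32_t rtt)
{
    if (ca->base_rtt > rtt)
        ca->base_rtt = rtt;
    if (ca->max_rtt < rtt)
        ca->max_rtt = rtt;
    ++ca->cnt_rtt;
    ca->sum_rtt += rtt;
}

/* Maximum queuing delay observed so far. */
static uint32_t toy_max_delay(const struct toy_illinois *ca)
{
    return ca->max_rtt - ca->base_rtt;
}

/* Average queuing delay over the samples of the current RTT. */
static uint32_t toy_avg_delay(const struct toy_illinois *ca)
{
    assert(ca->cnt_rtt > 0);    /* avoid dividing by zero */
    return (uint32_t)(ca->sum_rtt / ca->cnt_rtt) - ca->base_rtt;
}

/* Called once per RTT, after the parameters have been updated. */
static void toy_rtt_reset(struct toy_illinois *ca)
{
    ca->cnt_rtt = 0;
    ca->sum_rtt = 0;
}
```

Note that the reset clears only the per-RTT accumulators; base_rtt and max_rtt persist across RTTs, which matches the "TODO: age max_rtt?" comment in the patch.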
---
 net/ipv4/tcp_illinois.c |  298 +---
 1 file changed, 186 insertions(+), 112 deletions(-)

--- net-2.6.22.orig/net/ipv4/tcp_illinois.c
+++ net-2.6.22/net/ipv4/tcp_illinois.c
@@ -23,74 +23,106 @@
 #define ALPHA_MIN  ((3*ALPHA_SCALE)/10)/* ~0.3 */
 #define ALPHA_MAX  (10*ALPHA_SCALE)/* 10.0 */
 #define ALPHA_BASE ALPHA_SCALE /* 1.0 */
+#define U32_MAX    ((u32)~0U)
+#define RTT_MAX    (U32_MAX / ALPHA_MAX)   /* 3.3 secs */
 
 #define BETA_SHIFT 6
 #define BETA_SCALE (1u<<BETA_SHIFT)
-#define BETA_MIN   (BETA_SCALE/8)  /* 0.8 */
-#define BETA_MAX   (BETA_SCALE/2)
-#define BETA_BASE  BETA_MAX/* 0.5 */
-
-#define THETA  5
+#define BETA_MIN   (BETA_SCALE/8)  /* 0.125 */
+#define BETA_MAX   (BETA_SCALE/2)  /* 0.5 */
+#define BETA_BASE  BETA_MAX
 
 static int win_thresh __read_mostly = 15;
-module_param(win_thresh, int, 0644);
+module_param(win_thresh, int, 0);
 MODULE_PARM_DESC(win_thresh, "Window threshold for starting adaptive sizing");
 
-#define MAX_RTT    0x7fffffff
+static int theta __read_mostly = 5;
+module_param(theta, int, 0);
+MODULE_PARM_DESC(theta, "# of fast RTT's before full growth");
 
 /* TCP Illinois Parameters */
-struct tcp_illinois {
-   u32 last_alpha;
-   u32 min_rtt;
-   u32 max_rtt;
-   u32 rtt_low;
-   u32 rtt_cnt;
-   u64 sum_rtt;
+struct illinois {
+   u64 sum_rtt;/* sum of rtt's measured within last rtt */
+   u16 cnt_rtt;/* # of rtts measured within last rtt */
+   u32 base_rtt;   /* min of all rtt in usec */
+   u32 max_rtt;/* max of all rtt in usec */
+   u32 end_seq;/* right edge of current RTT */
+   u32 alpha;  /* Additive increase */
+   u32 beta;   /* Muliplicative decrease */
+   u16 acked;  /* # packets acked by current ACK */
+   u8  rtt_above;  /* average rtt has gone above threshold */
+   u8  rtt_low;/* # of rtts measurements below threshold */
 };
 
+static void rtt_reset(struct sock *sk)
+{
+   struct tcp_sock *tp = tcp_sk(sk);
+   struct illinois *ca = inet_csk_ca(sk);
+
+   ca->end_seq = tp->snd_nxt;
+   ca->cnt_rtt = 0;
+   ca->sum_rtt = 0;
+
+   /* TODO: age max_rtt? */
+}
+
 static void tcp_illinois_init(struct sock *sk)
 {
-   struct tcp_illinois *ca = inet_csk_ca(sk);
+   struct illinois *ca = inet_csk_ca(sk);
+
+   ca->alpha = ALPHA_MAX;
+   ca->beta = BETA_BASE;
+   ca->base_rtt = 0x7fffffff;
+   ca->max_rtt = 0;
 
-   ca->last_alpha = ALPHA_BASE;
-   ca->min_rtt = 0x7fffffff;
+   ca->acked = 0;
+   ca->rtt_low = 0;
+   ca->rtt_above = 0;
+
+   rtt_reset(sk);
 }
 
-/*
- * Keep track of min, max and average RTT
- */
-static void tcp_illinois_rtt_calc(struct sock *sk, u32 rtt)
+/* Measure RTT for each ack. */
+static void tcp_illinois_rtt_sample(struct sock *sk, u32 rtt)
 {
-   struct tcp_illinois *ca = inet_csk_ca(sk);
+   struct illinois *ca = inet_csk_ca(sk);
+
+   /* ignore bogus values, this prevents wraparound in alpha math */
+   if (rtt > RTT_MAX)
+   rtt = RTT_MAX;
+
+   /* keep track of minimum RTT seen so far */
+   if (ca->base_rtt > rtt)
+   ca->base_rtt = rtt;
 
-   if (rtt < ca->min_rtt)
-   ca->min_rtt = rtt;
-   if (rtt > ca->max_rtt)
+   /* and max */
+   if (ca->max_rtt < rtt)
ca->max_rtt = rtt;
 
-   if (++ca->rtt_cnt == 1)
-   ca->sum_rtt = rtt;
-   else
-   ca->sum_rtt += rtt;
+   ++ca->cnt_rtt;
+   ca->sum_rtt += rtt;
 }
 
-/* max queuing delay */
-static inline u32 max_delay(const struct tcp_illinois *ca)
+/* Capture count of packets covered by ack, to adjust for delayed acks */
+static void tcp_illinois_acked(struct sock *sk, u32 pkts_acked)
 {
-   return ca->max_rtt - ca->min_rtt;
+   struct illinois *ca = inet_csk_ca(sk);
+   ca->acked = pkts_acked;
 }
 
-/* average queueing delay */
-static u32 avg_delay(struct tcp_illinois *ca)
+/* Maximum queuing delay */
+static inline u32 max_delay(const struct illinois *ca)
 {
-   u64 avg_rtt = ca->sum_rtt;
-
-   do_div(avg_rtt, ca->rtt_cnt);
+   return ca->max_rtt - ca->base_rtt;
+}
 
-   ca->sum_rtt = 0;
-   ca->rtt_cnt = 0;
+/* Average queuing delay */
+static inline u32 avg_delay(const struct illinois *ca)
+{
+   u64 t = ca->sum_rtt;
 
-   return avg_rtt - ca->min_rtt;
+   do_div(t, ca->cnt_rtt);
+   return t - ca->base_rtt;
 }
 
 /*
@@ -101,32 +133,31 @@ static u32 avg_delay(struct tcp_illinois
  * A. If 

[PATCH 3/4] tcp: congestion control API update

2007-04-23 Thread Stephen Hemminger
Do some simple changes to make congestion control API faster/cleaner.
* use ktime_t rather than timeval
* merge rtt sampling into existing ack callback
  this means one indirect call versus two per ack.
* use flags bits to store options/settings

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
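
The "flags bits" item above replaces a dedicated int member with bit operations on one word. A small userspace sketch of the three operations the patch uses (set, clear, test) follows; the two flag values are taken from the patch below, while `struct toy_cong_ops` and the `toy_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in the patch. */
#define TCP_CONG_NON_RESTRICTED 0x1
#define TCP_CONG_RTT_STAMP      0x2

/* Hypothetical stand-in for the flags member of struct tcp_congestion_ops. */
struct toy_cong_ops {
    unsigned long flags;
    const char *name;
};

/* Was: ca->non_restricted = 1; */
static void toy_allow(struct toy_cong_ops *ca)
{
    ca->flags |= TCP_CONG_NON_RESTRICTED;
}

/* Was: ca->non_restricted = 0; other bits must survive the clear. */
static void toy_restrict(struct toy_cong_ops *ca)
{
    ca->flags &= ~TCP_CONG_NON_RESTRICTED;
}

/* Was: if (ca->non_restricted) */
static bool toy_is_allowed(const struct toy_cong_ops *ca)
{
    return (ca->flags & TCP_CONG_NON_RESTRICTED) != 0;
}
```

The point of the clear operation using "&= ~" rather than plain assignment is that unrelated bits such as TCP_CONG_RTT_STAMP keep their value.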

---
 include/linux/skbuff.h  |5 +
 include/net/tcp.h   |9 +
 net/ipv4/tcp_bic.c  |2 +-
 net/ipv4/tcp_cong.c |   14 +++---
 net/ipv4/tcp_cubic.c|2 +-
 net/ipv4/tcp_htcp.c |2 +-
 net/ipv4/tcp_illinois.c |   16 +++-
 net/ipv4/tcp_input.c|   25 -
 net/ipv4/tcp_lp.c   |8 +---
 net/ipv4/tcp_output.c   |2 +-
 net/ipv4/tcp_vegas.c|   10 +++---
 net/ipv4/tcp_veno.c |   10 +++---
 net/ipv4/tcp_westwood.c |2 +-
 net/ipv4/tcp_yeah.c |6 --
 net/ipv4/tcp_yeah.h |7 +--
 15 files changed, 65 insertions(+), 55 deletions(-)

--- net-2.6.22.orig/include/linux/skbuff.h
+++ net-2.6.22/include/linux/skbuff.h
@@ -1569,6 +1569,11 @@ static inline void __net_timestamp(struc
skb->tstamp = ktime_get_real();
 }
 
+static inline ktime_t net_timedelta(ktime_t t)
+{
+   return ktime_sub(ktime_get_real(), t);
+}
+
 
 extern __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len);
 extern __sum16 __skb_checksum_complete(struct sk_buff *skb);
--- net-2.6.22.orig/include/net/tcp.h
+++ net-2.6.22/include/net/tcp.h
@@ -629,9 +629,12 @@ enum tcp_ca_event {
 #define TCP_CA_MAX 128
 #define TCP_CA_BUF_MAX (TCP_CA_NAME_MAX*TCP_CA_MAX)
 
+#define TCP_CONG_NON_RESTRICTED 0x1
+#define TCP_CONG_RTT_STAMP 0x2
+
 struct tcp_congestion_ops {
struct list_head list;
-   int non_restricted;
+   unsigned long flags;
 
/* initialize private data (optional) */
void (*init)(struct sock *sk);
@@ -645,8 +648,6 @@ struct tcp_congestion_ops {
/* do new cwnd calculation (required) */
void (*cong_avoid)(struct sock *sk, u32 ack,
   u32 rtt, u32 in_flight, int good_ack);
-   /* round trip time sample per acked packet (optional) */
-   void (*rtt_sample)(struct sock *sk, u32 usrtt);
/* call before changing ca_state (optional) */
void (*set_state)(struct sock *sk, u8 new_state);
/* call when cwnd event occurs (optional) */
@@ -654,7 +655,7 @@ struct tcp_congestion_ops {
/* new value of cwnd after loss (optional) */
u32  (*undo_cwnd)(struct sock *sk);
/* hook for packet ack accounting (optional) */
-   void (*pkts_acked)(struct sock *sk, u32 num_acked);
+   void (*pkts_acked)(struct sock *sk, u32 num_acked, ktime_t last);
/* get info for inet_diag (optional) */
void (*get_info)(struct sock *sk, u32 ext, struct sk_buff *skb);
 
--- net-2.6.22.orig/net/ipv4/tcp_bic.c
+++ net-2.6.22/net/ipv4/tcp_bic.c
@@ -206,7 +206,7 @@ static void bictcp_state(struct sock *sk
 /* Track delayed acknowledgment ratio using sliding window
  * ratio = (15*ratio + sample) / 16
  */
-static void bictcp_acked(struct sock *sk, u32 cnt)
+static void bictcp_acked(struct sock *sk, u32 cnt, ktime_t last)
 {
const struct inet_connection_sock *icsk = inet_csk(sk);
 
--- net-2.6.22.orig/net/ipv4/tcp_cong.c
+++ net-2.6.22/net/ipv4/tcp_cong.c
@@ -126,7 +126,7 @@ int tcp_set_default_congestion_control(c
 #endif
 
if (ca) {
-   ca->non_restricted = 1; /* default is always allowed */
+   ca->flags |= TCP_CONG_NON_RESTRICTED;   /* default is always allowed */
list_move(&ca->list, &tcp_cong_list);
ret = 0;
}
@@ -181,7 +181,7 @@ void tcp_get_allowed_congestion_control(
*buf = '\0';
rcu_read_lock();
list_for_each_entry_rcu(ca, &tcp_cong_list, list) {
-   if (!ca->non_restricted)
+   if (!(ca->flags & TCP_CONG_NON_RESTRICTED))
continue;
offs += snprintf(buf + offs, maxlen - offs,
 "%s%s",
@@ -212,16 +212,16 @@ int tcp_set_allowed_congestion_control(c
}
}
 
-   /* pass 2 clear */
+   /* pass 2 clear old values */
list_for_each_entry_rcu(ca, &tcp_cong_list, list)
-   ca->non_restricted = 0;
+   ca->flags &= ~TCP_CONG_NON_RESTRICTED;
 
/* pass 3 mark as allowed */
while ((name = strsep(&val, " ")) && *name) {
ca = tcp_ca_find(name);
WARN_ON(!ca);
if (ca)
-   ca->non_restricted = 1;
+   ca->flags |= TCP_CONG_NON_RESTRICTED;
}
 out:
spin_unlock(&tcp_cong_list_lock);
@@ -256,7 +256,7 @@ int tcp_set_congestion_control(struct so
if (!ca)
err = -ENOENT;
 
-   else if (!(ca->non_restricted || capable(CAP_NET_ADMIN)))
+   else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) || capable(CAP_NET_ADMIN)))
  

[PATCH 1/4] tcp: congestion control initialization

2007-04-23 Thread Stephen Hemminger
Change to defer congestion control initialization.

If setsockopt() was used to change TCP_CONGESTION before
connection is established, then protocols that use sequence numbers
to keep track of one RTT interval (vegas, illinois, ...) get confused.

Change the init hook to be called after handshake.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 net/ipv4/tcp_cong.c |   23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

--- net-2.6.22.orig/net/ipv4/tcp_cong.c
+++ net-2.6.22/net/ipv4/tcp_cong.c
@@ -79,18 +79,19 @@ void tcp_init_congestion_control(struct 
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_congestion_ops *ca;
 
-   if (icsk->icsk_ca_ops != &tcp_init_congestion_ops)
-   return;
+   /* if no choice made yet assign the current value set as default */
+   if (icsk->icsk_ca_ops == &tcp_init_congestion_ops) {
+   rcu_read_lock();
+   list_for_each_entry_rcu(ca, &tcp_cong_list, list) {
+   if (try_module_get(ca->owner)) {
+   icsk->icsk_ca_ops = ca;
+   break;
+   }
 
-   rcu_read_lock();
-   list_for_each_entry_rcu(ca, &tcp_cong_list, list) {
-   if (try_module_get(ca->owner)) {
-   icsk->icsk_ca_ops = ca;
-   break;
+   /* fallback to next available */
}
-
+   rcu_read_unlock();
}
-   rcu_read_unlock();
 
if (icsk->icsk_ca_ops->init)
icsk->icsk_ca_ops->init(sk);
@@ -238,6 +239,7 @@ int tcp_set_congestion_control(struct so
 
rcu_read_lock();
ca = tcp_ca_find(name);
+
/* no change asking for existing value */
if (ca == icsk->icsk_ca_ops)
goto out;
@@ -263,7 +265,8 @@ int tcp_set_congestion_control(struct so
else {
tcp_cleanup_congestion_control(sk);
icsk->icsk_ca_ops = ca;
-   if (icsk->icsk_ca_ops->init)
+
+   if (sk->sk_state != TCP_CLOSE && icsk->icsk_ca_ops->init)
icsk->icsk_ca_ops->init(sk);
}
  out:

-- 



[PATCH 4/4] TCP YEAH: use vegas dont copy it

2007-04-23 Thread Stephen Hemminger
Rather than using a copy of vegas code, the YEAH code should just
have it exported so there is common code.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


---
 net/ipv4/tcp_vegas.c |   31 +---
 net/ipv4/tcp_vegas.h |   24 +
 net/ipv4/tcp_yeah.c  |   53 +---
 net/ipv4/tcp_yeah.h  |  131 ---
 4 files changed, 61 insertions(+), 178 deletions(-)

--- net-2.6.22.orig/net/ipv4/tcp_vegas.c
+++ net-2.6.22/net/ipv4/tcp_vegas.c
@@ -38,6 +38,8 @@
 
 #include <net/tcp.h>
 
+#include "tcp_vegas.h"
+
 /* Default values of the Vegas variables, in fixed-point representation
  * with V_PARAM_SHIFT bits to the right of the binary point.
  */
@@ -54,17 +56,6 @@ module_param(gamma, int, 0644);
 MODULE_PARM_DESC(gamma, "limit on increase (scale by 2)");
 
 
-/* Vegas variables */
-struct vegas {
-   u32 beg_snd_nxt;/* right edge during last RTT */
-   u32 beg_snd_una;/* left edge  during last RTT */
-   u32 beg_snd_cwnd;   /* saves the size of the cwnd */
-   u8  doing_vegas_now;/* if true, do vegas for this RTT */
-   u16 cntRTT; /* # of RTTs measured within last RTT */
-   u32 minRTT; /* min of RTTs measured within last RTT (in usec) */
-   u32 baseRTT;/* the min of all Vegas RTT measurements seen (in usec) */
-};
-
 /* There are several situations when we must re-start Vegas:
  *
  *  o when a connection is established
@@ -81,7 +72,7 @@ struct vegas {
  * Instead we must wait until the completion of an RTT during
  * which we actually receive ACKs.
  */
-static inline void vegas_enable(struct sock *sk)
+static void vegas_enable(struct sock *sk)
 {
const struct tcp_sock *tp = tcp_sk(sk);
struct vegas *vegas = inet_csk_ca(sk);
@@ -104,13 +95,14 @@ static inline void vegas_disable(struct 
vegas->doing_vegas_now = 0;
 }
 
-static void tcp_vegas_init(struct sock *sk)
+void tcp_vegas_init(struct sock *sk)
 {
struct vegas *vegas = inet_csk_ca(sk);
 
vegas->baseRTT = 0x7fff;
vegas_enable(sk);
 }
+EXPORT_SYMBOL_GPL(tcp_vegas_init);
 
 /* Do RTT sampling needed for Vegas.
  * Basically we:
@@ -120,7 +112,7 @@ static void tcp_vegas_init(struct sock *
  *   o min-filter RTT samples from a much longer window (forever for now)
  * to find the propagation delay (baseRTT)
  */
-static void tcp_vegas_pkts_acked(struct sock *sk, u32 cnt, ktime_t last)
+void tcp_vegas_pkts_acked(struct sock *sk, u32 cnt, ktime_t last)
 {
struct vegas *vegas = inet_csk_ca(sk);
u32 vrtt;
@@ -138,8 +130,9 @@ static void tcp_vegas_pkts_acked(struct 
vegas->minRTT = min(vegas->minRTT, vrtt);
vegas->cntRTT++;
 }
+EXPORT_SYMBOL_GPL(tcp_vegas_pkts_acked);
 
-static void tcp_vegas_state(struct sock *sk, u8 ca_state)
+void tcp_vegas_state(struct sock *sk, u8 ca_state)
 {
 
if (ca_state == TCP_CA_Open)
@@ -147,6 +140,7 @@ static void tcp_vegas_state(struct sock 
else
vegas_disable(sk);
 }
+EXPORT_SYMBOL_GPL(tcp_vegas_state);
 
 /*
  * If the connection is idle and we are restarting,
@@ -157,12 +151,13 @@ static void tcp_vegas_state(struct sock 
  * packets, _then_ we can make Vegas calculations
  * again.
  */
-static void tcp_vegas_cwnd_event(struct sock *sk, enum tcp_ca_event event)
+void tcp_vegas_cwnd_event(struct sock *sk, enum tcp_ca_event event)
 {
if (event == CA_EVENT_CWND_RESTART ||
event == CA_EVENT_TX_START)
tcp_vegas_init(sk);
 }
+EXPORT_SYMBOL_GPL(tcp_vegas_cwnd_event);
 
 static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack,
 u32 seq_rtt, u32 in_flight, int flag)
@@ -339,8 +334,7 @@ static void tcp_vegas_cong_avoid(struct 
 }
 
 /* Extract info for Tcp socket info provided via netlink. */
-static void tcp_vegas_get_info(struct sock *sk, u32 ext,
-  struct sk_buff *skb)
+void tcp_vegas_get_info(struct sock *sk, u32 ext, struct sk_buff *skb)
 {
const struct vegas *ca = inet_csk_ca(sk);
if (ext & (1 << (INET_DIAG_VEGASINFO - 1))) {
@@ -354,6 +348,7 @@ static void tcp_vegas_get_info(struct so
nla_put(skb, INET_DIAG_VEGASINFO, sizeof(info), &info);
}
 }
+EXPORT_SYMBOL_GPL(tcp_vegas_get_info);
 
 static struct tcp_congestion_ops tcp_vegas = {
.flags  = TCP_CONG_RTT_STAMP,
--- /dev/null
+++ net-2.6.22/net/ipv4/tcp_vegas.h
@@ -0,0 +1,24 @@
+/*
+ * TCP Vegas congestion control interface
+ */
+#ifndef __TCP_VEGAS_H
+#define __TCP_VEGAS_H 1
+
+/* Vegas variables */
+struct vegas {
+   u32 beg_snd_nxt;/* right edge during last RTT */
+   u32 beg_snd_una;/* left edge  during last RTT */
+   u32 beg_snd_cwnd;   /* saves the size of the cwnd */
+   u8  doing_vegas_now;/* if true, do vegas for this RTT */
+   u16 cntRTT; /* # of RTTs measured within last RTT */
+   u32 minRTT; 

[NET]: Warn about GSO/checksum abuse

2007-04-23 Thread Herbert Xu
Hi Dave:

[NET]: Warn about GSO/checksum abuse

Now that Patrick has added the code to deal with GSO in netfilter,
we no longer need the crutch that computes partial checksums just
before transmission.

This patch turns this into a warning again.  If this goes OK, we
can then turn it into a BUG_ON and remove the gso_send_check cruft.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/core/dev.c b/net/core/dev.c
index 18c51b4..d82d00f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1202,7 +1202,7 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb, int features)
skb->mac_len = skb->network_header - skb->mac_header;
__skb_pull(skb, skb->mac_len);
 
-   if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
+   if (WARN_ON(skb->ip_summed != CHECKSUM_PARTIAL)) {
if (skb_header_cloned(skb) &&
(err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC)))
return ERR_PTR(err);


Re: [PATCH 2/4] TCP Illinois update

2007-04-23 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 20:31:19 -0700

 This version more closely matches the paper, and fixes several
 math errors. The biggest difference is that it updates alpha/beta
 once per RTT
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Applied, thanks.


Re: [PATCH 4/4] TCP YEAH: use vegas dont copy it

2007-04-23 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 20:31:21 -0700

 Rather than using a copy of vegas code, the YEAH code should just
 have it exported so there is common code.
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Excellent cleanup.

Applied, thanks Stephen.
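
For readers skimming the archive, here is a rough userspace sketch of the
sharing pattern the patch describes: one algorithm keeps the state struct and
helpers in a common header, and a second algorithm embeds that struct as its
first member so it can hand the same pointer to the shared helpers. Every name
below (mock_vegas, mock_yeah, extra_state, the function names) is hypothetical
and only stands in for the kernel code; this is not the tcp_yeah.c
implementation.

```c
#include <assert.h>
#include <stdint.h>

/* --- what a shared header might declare (hypothetical names) --- */
struct mock_vegas {
	uint32_t base_rtt;        /* min of all RTT samples seen */
	uint32_t min_rtt;         /* min of RTT samples in the current window */
	uint16_t cnt_rtt;         /* number of samples in the current window */
	uint8_t  doing_vegas_now; /* nonzero while the algorithm is active */
};

void mock_vegas_init(void *ca);
void mock_vegas_pkts_acked(void *ca, uint32_t rtt_us);

/* --- the shared implementation, written once --- */
void mock_vegas_init(void *ca)
{
	struct mock_vegas *v = ca;

	v->base_rtt = UINT32_MAX;
	v->min_rtt = UINT32_MAX;
	v->cnt_rtt = 0;
	v->doing_vegas_now = 1;
}

void mock_vegas_pkts_acked(void *ca, uint32_t rtt_us)
{
	struct mock_vegas *v = ca;

	/* Min-filter the sample into both the long and short windows. */
	if (rtt_us < v->base_rtt)
		v->base_rtt = rtt_us;
	if (rtt_us < v->min_rtt)
		v->min_rtt = rtt_us;
	v->cnt_rtt++;
}

/* --- a second algorithm reusing the helpers instead of copying them --- */
struct mock_yeah {
	struct mock_vegas vegas;  /* must stay first so the pointer can be shared */
	uint32_t extra_state;     /* hypothetical algorithm-specific field */
};

static void mock_yeah_init(void *ca)
{
	struct mock_yeah *y = ca;

	mock_vegas_init(ca);      /* shared code sets up the embedded struct */
	y->extra_state = 0;
}

static void mock_yeah_pkts_acked(void *ca, uint32_t pkts, uint32_t rtt_us)
{
	struct mock_yeah *y = ca;

	y->extra_state = pkts;
	mock_vegas_pkts_acked(ca, rtt_us);
}
```

Keeping the shared struct as the first member is what lets both algorithms
pass the same private-area pointer around without a conversion step.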


Re: [PATCH 1/4] tcp: congestion control initialization

2007-04-23 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 20:31:18 -0700

 Change to defer congestion control initialization.
 
 If setsockopt() was used to change TCP_CONGESTION before
 connection is established, then protocols that use sequence numbers
 to keep track of one RTT interval (vegas, illinois, ...) get confused.
 
 Change the init hook to be called after handshake.
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Applied.

I'll look this over for backporting, thanks Stephen.
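
To illustrate the ordering problem described in the changelog, here is a small
userspace model, not kernel code. It assumes a per-RTT marker that the
algorithm's init hook copies from the send sequence number; all mock_* names
and the mock_handshake_done() entry point are hypothetical. Selecting the
algorithm on a closed socket only records the choice, and the init hook runs
once the handshake has produced real sequence numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket state and congestion ops. */
enum mock_state { MOCK_CLOSE, MOCK_ESTABLISHED };

struct mock_sock;

struct mock_ca_ops {
	const char *name;
	void (*init)(struct mock_sock *sk);   /* optional, may be NULL */
};

struct mock_sock {
	enum mock_state state;
	const struct mock_ca_ops *ca_ops;
	unsigned int snd_nxt;      /* next sequence number to send */
	unsigned int beg_snd_nxt;  /* per-RTT marker kept by the algorithm */
	int init_calls;
};

/* An init hook that snapshots the sequence number to delimit one RTT. */
static void mock_ca_init(struct mock_sock *sk)
{
	sk->beg_snd_nxt = sk->snd_nxt;
	sk->init_calls++;
}

static const struct mock_ca_ops mock_rtt_based = { "mock", mock_ca_init };

/* Models the setsockopt path: skip init while the socket is closed. */
static void mock_set_congestion_control(struct mock_sock *sk,
					const struct mock_ca_ops *ca)
{
	sk->ca_ops = ca;
	if (sk->state != MOCK_CLOSE && sk->ca_ops->init)
		sk->ca_ops->init(sk);
}

/* Models handshake completion: sequence numbers are valid, so init now. */
static void mock_handshake_done(struct mock_sock *sk, unsigned int isn)
{
	sk->snd_nxt = isn + 1;
	sk->state = MOCK_ESTABLISHED;
	if (sk->ca_ops && sk->ca_ops->init)
		sk->ca_ops->init(sk);
}

/* Walks through the sequence: choose early, initialise late. */
static int demo_deferred_init(void)
{
	struct mock_sock sk = { MOCK_CLOSE, NULL, 0, 0, 0 };

	mock_set_congestion_control(&sk, &mock_rtt_based);
	assert(sk.init_calls == 0);        /* deferred: nothing ran yet */

	mock_handshake_done(&sk, 1000);
	assert(sk.beg_snd_nxt == 1001);    /* marker taken from a real sequence */
	return sk.init_calls;
}
```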


Re: [NET]: Warn about GSO/checksum abuse

2007-04-23 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 24 Apr 2007 13:55:42 +1000

 [NET]: Warn about GSO/checksum abuse
 
 Now that Patrick has added the code to deal with GSO in netfilter,
 we no longer need the crutch that computes partial checksums just
 before transmission.
 
 This patch turns this into a warning again.  If this goes OK, we
 can then turn it into a BUG_ON and remove the gso_send_check cruft.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Thanks for keeping track of this.

Applied, thanks Herbert.
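
The one-line change works because the kernel's WARN_ON() evaluates to the
truth value of its condition, so it can sit inside an if () in place of
unlikely(). Below is a userspace approximation of that pattern; MOCK_WARN_ON,
mock_gso_segment and the enum values are hypothetical stand-ins, and the
"fallback" is reduced to a return code rather than any real checksum work.

```c
#include <assert.h>
#include <stdio.h>

static int mock_warn_count;   /* how many warnings have fired */

/* Report the condition if it is true, then hand its value back. */
static int mock_warn_on(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		mock_warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}

#define MOCK_WARN_ON(cond) mock_warn_on(!!(cond), #cond, __FILE__, __LINE__)

enum mock_ip_summed { MOCK_CHECKSUM_NONE, MOCK_CHECKSUM_PARTIAL };

/* Returns 1 when the fallback path ran, 0 on the expected path. */
static int mock_gso_segment(enum mock_ip_summed ip_summed)
{
	if (MOCK_WARN_ON(ip_summed != MOCK_CHECKSUM_PARTIAL)) {
		/* The fallback still runs, but the caller is now reported. */
		return 1;
	}
	return 0;
}
```

Because the fallback stays in place, a misbehaving caller is only logged at
this stage; that matches the plan quoted above of tightening it to a BUG_ON
later if no warnings show up.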


Re: [PATCH 3/4] tcp: congestion control API update

2007-04-23 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 20:31:20 -0700

 Do some simple changes to make congestion control API faster/cleaner.
 * use ktime_t rather than timeval
 * merge rtt sampling into existing ack callback
   this means one indirect call versus two per ack.
 * use flags bits to store options/settings
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Nice work Stephen.

Applied, thanks a lot.
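
As a rough picture of the "one indirect call versus two per ack" point, the
userspace sketch below folds the RTT sample into the ack callback and uses a
flags word to say whether a sample is wanted. It is an illustration under
assumptions, not the kernel API: the mock_* names, the flag's meaning, and the
use of a plain 64-bit nanosecond count in place of ktime_t are all made up for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option bit, modelled on the .flags field in the patch. */
#define MOCK_CONG_RTT_STAMP 0x1u

struct mock_conn {
	uint32_t acked_total;   /* packets acknowledged so far */
	int64_t  min_rtt_ns;    /* smallest RTT sample seen, -1 if none */
};

struct mock_cong_ops {
	unsigned int flags;
	/* One callback carries both the ack count and the RTT sample. */
	void (*pkts_acked)(struct mock_conn *c, uint32_t cnt, int64_t rtt_ns);
};

static void mock_pkts_acked(struct mock_conn *c, uint32_t cnt, int64_t rtt_ns)
{
	c->acked_total += cnt;
	if (rtt_ns >= 0 && (c->min_rtt_ns < 0 || rtt_ns < c->min_rtt_ns))
		c->min_rtt_ns = rtt_ns;
}

static const struct mock_cong_ops mock_ops = {
	.flags = MOCK_CONG_RTT_STAMP,
	.pkts_acked = mock_pkts_acked,
};

/* Ack processing: a single indirect call per ack. */
static void mock_ack_received(struct mock_conn *c,
			      const struct mock_cong_ops *ops,
			      uint32_t cnt, int64_t sent_ns, int64_t now_ns)
{
	int64_t rtt_ns = -1;    /* -1 means "no sample" */

	if (ops->flags & MOCK_CONG_RTT_STAMP)
		rtt_ns = now_ns - sent_ns;
	if (ops->pkts_acked)
		ops->pkts_acked(c, cnt, rtt_ns);
}
```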