date:20060622

[RFC] net/ipv{4,6} UDP-Lite: code sharing between udp.c and udplite.c

2006-06-22 Thread Gerrit Renker

I would greatly value comments on a suggestion made earlier by Yoshifuji
regarding sharing code between udp.c and udplite.c which I am pursuing. 

I reduced diffs between ipv4/udp{,lite}.c to the minimum possible and
performed a line-by-line comparison between udp.c and udplite.c.

Result: out of 45 functions which re-appear from udp.c in udplite.c,
* 26 can be derived without human thinking at all (sed/perl)
* 10 require trivial interaction (sockopt/header names)
*  8 require genuine modifications (in control flow and algorithm)
*  1 function is missing in udplite.c (no equivalent of udp_check())


Problem: The UDP code (and in particular the replicated functions) operates
 on the following globally visible symbols:

 EXPORT_SYMBOL(udp_hash);        /* would be udplite_hash       */
 EXPORT_SYMBOL(udp_hash_lock);   /* would be udplite_hash_lock  */
 EXPORT_SYMBOL(udp_port_rover);  /* would be udplite_port_rover */

This would lead to clashes if udp.c/udplite.c use the same names.

  
Suggestion: #include code from udp.c in a much-reduced udplite.c, after
re-defining symbols, so that the top of udplite.c looks like e.g.

  #include <linux/udplite.h>
  #define   udp_hash        udplite_hash
  #define   udp_port_rover  udplite_port_rover
  #include "udp.c"  /* include the source code */
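
A single listing cannot show one file including another, so the following is only an analogue of the idea, assuming a macro that stamps out one shared body under two symbol prefixes. The names DEFINE_PORT_ROVER and *_next_port, and the starting port 1024, are hypothetical and do not come from udp.c.

```c
/*
 * Hypothetical single-file analogue of "#define the names, then
 * #include udp.c": one shared body, instantiated twice under
 * different symbol names so the two instances cannot clash.
 */
#define DEFINE_PORT_ROVER(prefix, first_port)                       \
    static int prefix##_port_rover = (first_port);                  \
    static int prefix##_next_port(void)                             \
    {                                                               \
        /* Each instance advances its own private rover. */         \
        return prefix##_port_rover++;                               \
    }

DEFINE_PORT_ROVER(udp, 1024)      /* defines udp_port_rover, udp_next_port */
DEFINE_PORT_ROVER(udplite, 1024)  /* defines udplite_port_rover, ...       */
```

Calling udp_next_port() twice and then udplite_next_port() once shows that the two rovers advance independently, which is the property the renaming is meant to provide.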


Inputs: The benefits are a much deflated patch, code reuse, increased
clarity (only the diffs are visible). This comes at the cost
of introducing a few #ifdefs in udp.c (otherwise no changes).
However, I am not sure whether such an approach would find 
acceptance and therefore I am asking for input. As currently the
porting to ipv6/udplite.c is under way, I would like to take any
suggestions on board which can reduce dependencies and inflated
code.

Many thanks in advance,
--Gerrit 


NB: Details of the code analysis can be found on 
 http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/udplite-comparison.html
and the diff-minimized variant of ipv4/udplite.c is in the latest tarball,
 http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/udplite_linux.tar.gz
(any future patches will have the line lengths cut to 80 chars).
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[NET]: Prevent transmission after dev_deactivate

2006-06-22 Thread Herbert Xu

Hi Dave:

I found a bug in my GSO patches with the shutdown handling in
dev_deactivate.  It provided enough impetus for me to finally
clean up this function :)

This patch is against Linus's tree.

[NET]: Prevent transmission after dev_deactivate

The dev_deactivate function has bit-rotted since the introduction of
lockless drivers.  In particular, the spin_unlock_wait call at the end
has no effect on the xmit routine of lockless drivers.

With a little bit of work, we can make it much more useful by providing
the guarantee that when it returns, no more calls to the xmit routine
of the underlying driver will be made.

The idea is simple.  There are two entry points into the xmit routine.
The first comes from dev_queue_xmit.  That one is easily stopped by
using synchronize_rcu.  This works because we set the qdisc to noop_qdisc
before the synchronize_rcu call.  That in turn causes all subsequent
packets sent to dev_queue_xmit to be dropped.  The synchronize_rcu call
also ensures all outstanding calls leave their critical section.

The other entry point is from qdisc_run.  Since we now have a bit that
indicates whether it's running, all we have to do is to wait until the
bit is off.

I've removed the loop to wait for __LINK_STATE_SCHED to clear.  This is
useless because netif_wake_queue can cause it to be set again.  It is
also harmless because we've disarmed qdisc_run.

I've also removed the spin_unlock_wait on xmit_lock because its only
purpose, making sure that all outstanding xmit_lock holders have
exited, is already served by dev_watchdog_down.
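
The ordering described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: struct model_dev and the model_* functions are hypothetical stand-ins, not kernel code, and the RCU grace period is represented by a comment only.

```c
#include <stdbool.h>

/* Hypothetical, single-threaded model of the device state used above. */
struct model_dev {
    bool qdisc_is_noop;   /* stands in for dev->qdisc == &noop_qdisc   */
    bool qdisc_running;   /* stands in for __LINK_STATE_QDISC_RUNNING  */
    int  xmit_calls;      /* how often the driver's xmit routine ran   */
};

/* Entry point 1: models dev_queue_xmit(); packets are dropped once noop. */
static int model_queue_xmit(struct model_dev *dev)
{
    if (dev->qdisc_is_noop)
        return -1;            /* dropped, the driver never sees it */
    dev->xmit_calls++;
    return 0;
}

/* Entry point 2: models __qdisc_run(); does no work once noop. */
static void model_qdisc_run(struct model_dev *dev)
{
    dev->qdisc_running = true;
    if (!dev->qdisc_is_noop)
        dev->xmit_calls++;
    dev->qdisc_running = false;
}

/* Models dev_deactivate(): disarm first, then wait for stragglers. */
static void model_deactivate(struct model_dev *dev)
{
    dev->qdisc_is_noop = true;
    /* The real code calls synchronize_rcu() at this point. */
    while (dev->qdisc_running)
        ;  /* the real code yields until the bit clears */
}
```

In this model, any call to either entry point made after model_deactivate() returns leaves xmit_calls unchanged, which mirrors the guarantee the patch aims for.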

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
1201ce3ea54baa35bcecf9925bf9d788e084d895
diff --git a/net/core/dev.c b/net/core/dev.c
index ab39fe1..29e3888 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1295,7 +1295,7 @@ int dev_queue_xmit(struct sk_buff *skb)
/* Disable soft irqs for various locks below. Also 
 * stops preemption for RCU. 
 */
-   local_bh_disable(); 
+   rcu_read_lock_bh(); 
 
/* Updates of qdisc are serialized by queue_lock. 
 * The struct Qdisc which is pointed to by qdisc is now a 
@@ -1369,13 +1369,13 @@ #endif
}
 
rc = -ENETDOWN;
-   local_bh_enable();
+   rcu_read_unlock_bh();
 
 out_kfree_skb:
kfree_skb(skb);
return rc;
 out:
-   local_bh_enable();
+   rcu_read_unlock_bh();
return rc;
 }
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d7aca8e..7aad012 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -181,9 +181,13 @@ requeue:
 
 void __qdisc_run(struct net_device *dev)
 {
+	if (unlikely(dev->qdisc == &noop_qdisc))
+		goto out;
+
 	while (qdisc_restart(dev) < 0 && !netif_queue_stopped(dev))
 		/* NOTHING */;
 
+out:
 	clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
 }
 
@@ -583,10 +587,12 @@ void dev_deactivate(struct net_device *d
 
dev_watchdog_down(dev);
 
-	while (test_bit(__LINK_STATE_SCHED, &dev->state))
-		yield();
+	/* Wait for outstanding dev_queue_xmit calls. */
+	synchronize_rcu();
 
-	spin_unlock_wait(&dev->_xmit_lock);
+	/* Wait for outstanding qdisc_run calls. */
+	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
+		yield();
 }
 
 void dev_init_scheduler(struct net_device *dev)

[0/5] GSO: Generic Segmentation Offload

2006-06-22 Thread Herbert Xu

Hi:

This is a repost of the GSO patches.  The main change is the fix to a bug
in the way dev->gso_skb is freed.  This series requires the dev_deactivate
patch that I just posted.

Here is the original description:

This series adds Generic Segmentation Offload (GSO) support to the Linux
networking stack.

Many people have observed that a lot of the savings in TSO come from
traversing the networking stack once rather than many times for each
super-packet.  These savings can be obtained without hardware support.
In fact, the concept can be applied to other protocols such as TCPv6,
UDP, or even DCCP.

The key to minimising the cost in implementing this is to postpone the
segmentation as late as possible.  In the ideal world, the segmentation
would occur inside each NIC driver where they would rip the super-packet
apart and either produce SG lists which are directly fed to the hardware,
or linearise each segment into pre-allocated memory to be fed to the NIC.
This would eliminate segmented skb's altogether.

Unfortunately this requires modifying each and every NIC driver so it
would take quite some time.  A much easier solution is to perform the
segmentation just before the entry into the driver's xmit routine.  This
series of patches does this.
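
As a rough illustration of what "performing the segmentation" amounts to, the sketch below splits a payload length into chunks of at most one MSS. It is a simplified model operating on lengths only; split_by_mss is a hypothetical helper, not a function from this series.

```c
#include <stddef.h>

/*
 * Split a payload of `len` bytes into chunks of at most `mss` bytes.
 * Stores up to `max` chunk lengths in `out` and returns the total
 * number of chunks (which may exceed `max`; extra lengths are not stored).
 */
static size_t split_by_mss(size_t len, size_t mss, size_t *out, size_t max)
{
    size_t offset = 0;
    size_t n = 0;

    if (mss == 0)
        return 0;             /* nothing sensible to do without an MSS */

    while (offset < len) {
        size_t chunk = len - offset;

        if (chunk > mss)
            chunk = mss;      /* every chunk but the last is a full MSS */
        if (n < max)
            out[n] = chunk;
        offset += chunk;
        n++;
    }
    return n;
}
```

For example, a 4000-byte payload with an MSS of 1460 yields three chunks of 1460, 1460 and 1080 bytes.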

I've attached some numbers to demonstrate the savings brought on by
doing this.  The best scenario is obviously the case where the underlying
NIC supports SG.  This means that we simply have to manipulate the SG
entries and place them into individual skb's before passing them to the
driver.  The attached file lo-res shows this.

The test was performed through the loopback device which is a fairly good
approximation of an SG-capable NIC.

GSO, like TSO, is only effective if the MTU is significantly less than the
maximum value of 64K.  So only the case where the MTU was set to 1500 is
of interest.  There we can see that the throughput improved by 17.5%
(3061.05Mb/s => 3598.17Mb/s).  The actual saving in transmission cost is
in fact a lot more than that as the majority of the time here is spent on
the RX side which still has to deal with 1500-byte packets.
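
The quoted percentage can be reproduced from the two throughput figures. The helper below is a hypothetical one-liner written for this check, not part of netperf or the patch series.

```c
/* Relative throughput change in percent, going from `before` to `after`. */
static double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

percent_change(3061.05, 3598.17) evaluates to about 17.5. Applied to the attached loopback figures for mtu 16436 (7535.04 with GSO off, 7359.95 with GSO on) it gives roughly -2.3, consistent with the remark that GSO only pays off when the MTU is well below 64K.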

The worst-case scenario is where the NIC does not support SG and the user
uses write(2) which means that we have to copy the data twice.  The files
gso-off/gso-on provide data for this case (the test was carried out on
e100).  As you can see, the cost of the extra copy is mostly offset by the
reduction in the cost of going through the networking stack.

For now GSO is off by default but can be enabled through ethtool.  It is
conceivable that with enough optimisation GSO could be a win in most cases
and we could enable it by default.

However, even without enabling GSO explicitly it can still function on
bridged and forwarded packets.  As it is, passing TSO packets through a
bridge only works if all constituents support TSO.  GSO provides
a fallback so that we may enable TSO for a bridge even if some of its
constituents do not support TSO.

This provides massive savings for Xen as it uses a bridge-based architecture
and TSO/GSO produces a much larger effective MTU for internal traffic between
domains.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
$ sudo ./ethtool -K lo gso on
$ sudo ifconfig lo mtu 1500
$ netperf -t TCP_STREAM
TCP STREAM TEST to localhost
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    3598.17
$ sudo ./ethtool -K lo gso off
$ netperf -t TCP_STREAM
TCP STREAM TEST to localhost
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    3061.05
$ sudo ifconfig lo mtu 6
$ netperf -t TCP_STREAM
TCP STREAM TEST to localhost
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    8245.05
$ sudo ./ethtool -K lo gso on
$ netperf -t TCP_STREAM
TCP STREAM TEST to localhost
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    8563.36
$ sudo ifconfig lo mtu 16436
$ netperf -t TCP_STREAM
TCP STREAM TEST to localhost
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    7359.95
$ sudo ./ethtool -K lo gso off
$ netperf -t TCP_STREAM
TCP STREAM TEST to localhost
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    7535.04
$
CPU: PIII,

[1/5] [NET]: Merge TSO/UFO fields in sk_buff

2006-06-22 Thread Herbert Xu

Hi:

[NET]: Merge TSO/UFO fields in sk_buff

Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP).  So
let's merge them.

They were used to tell the protocol of a packet.  This function has been
subsumed by the new gso_type field.  This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb.  As such it's easy to tell whether a given device can process a GSO
skb: you just have to AND the gso_type field with the netdev's features
field.

I've made gso_type a conjunction.  The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, its drivers
would declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
to be emulated in software.
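
A minimal sketch of the check described above, assuming the feature bits for GSO types sit 16 bits above the corresponding gso_type bits. All EX_* names and bit positions are hypothetical placeholders chosen for the example; the real values live in the patch.

```c
#include <stdbool.h>

/* Hypothetical bit layout: GSO-related feature bits start at bit 16. */
#define EX_GSO_SHIFT      16
#define EX_GSO_TCPV4      (1u << 0)
#define EX_GSO_TCPV4_ECN  (1u << 1)
#define EX_F_TSO          (EX_GSO_TCPV4 << EX_GSO_SHIFT)
#define EX_F_TSO_ECN      (EX_GSO_TCPV4_ECN << EX_GSO_SHIFT)

/* True if every capability the skb needs is advertised by the device. */
static bool ex_dev_can_segment(unsigned int features, unsigned int gso_type)
{
    unsigned int needed = gso_type << EX_GSO_SHIFT;

    return (features & needed) == needed;
}
```

With these definitions a device advertising only EX_F_TSO accepts plain EX_GSO_TCPV4 packets but not EX_GSO_TCPV4 | EX_GSO_TCPV4_ECN ones, so only the latter would need software emulation.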

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -792,7 +792,7 @@ static int cp_start_xmit (struct sk_buff
 	entry = cp->tx_head;
 	eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
 	if (dev->features & NETIF_F_TSO)
-		mss = skb_shinfo(skb)->tso_size;
+		mss = skb_shinfo(skb)->gso_size;
 
 	if (skb_shinfo(skb)->nr_frags == 0) {
 		struct cp_desc *txd = &cp->tx_ring[entry];
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -1640,7 +1640,7 @@ bnx2_tx_int(struct bnx2 *bp)
 		skb = tx_buf->skb;
 #ifdef BCM_TSO 
 		/* partial BD completions possible with TSO packets */
-		if (skb_shinfo(skb)->tso_size) {
+		if (skb_shinfo(skb)->gso_size) {
 			u16 last_idx, last_ring_idx;
 
 			last_idx = sw_cons +
@@ -4428,7 +4428,7 @@ bnx2_start_xmit(struct sk_buff *skb, str
 			(TX_BD_FLAGS_VLAN_TAG | (vlan_tx_tag_get(skb) << 16));
 	}
 #ifdef BCM_TSO 
-	if ((mss = skb_shinfo(skb)->tso_size) &&
+	if ((mss = skb_shinfo(skb)->gso_size) &&
 		(skb->len > (bp->dev->mtu + ETH_HLEN))) {
 		u32 tcp_opt_len, ip_tcp_len;
 
diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c
--- a/drivers/net/chelsio/sge.c
+++ b/drivers/net/chelsio/sge.c
@@ -1418,7 +1418,7 @@ int t1_start_xmit(struct sk_buff *skb, s
struct cpl_tx_pkt *cpl;
 
 #ifdef NETIF_F_TSO
-	if (skb_shinfo(skb)->tso_size) {
+	if (skb_shinfo(skb)->gso_size) {
 		int eth_type;
 		struct cpl_tx_pkt_lso *hdr;
 
@@ -1433,7 +1433,7 @@ int t1_start_xmit(struct sk_buff *skb, s
 		hdr->ip_hdr_words = skb->nh.iph->ihl;
 		hdr->tcp_hdr_words = skb->h.th->doff;
 		hdr->eth_type_mss = htons(MK_ETH_TYPE_MSS(eth_type,
-						skb_shinfo(skb)->tso_size));
+						skb_shinfo(skb)->gso_size));
 		hdr->len = htonl(skb->len - sizeof(*hdr));
 		cpl = (struct cpl_tx_pkt *)hdr;
 		sge->stats.tx_lso_pkts++;
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2394,7 +2394,7 @@ e1000_tso(struct e1000_adapter *adapter,
uint8_t ipcss, ipcso, tucss, tucso, hdr_len;
int err;
 
-	if (skb_shinfo(skb)->tso_size) {
+	if (skb_shinfo(skb)->gso_size) {
 		if (skb_header_cloned(skb)) {
 			err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
 			if (err)
@@ -2402,7 +2402,7 @@ e1000_tso(struct e1000_adapter *adapter,
 		}
 
 		hdr_len = ((skb->h.raw - skb->data) + (skb->h.th->doff << 2));
-		mss = skb_shinfo(skb)->tso_size;
+		mss = skb_shinfo(skb)->gso_size;
 		if (skb->protocol == htons(ETH_P_IP)) {
 			skb->nh.iph->tot_len = 0;
 			skb->nh.iph->check = 0;
@@ -2519,7 +2519,7 @@ e1000_tx_map(struct e1000_adapter *adapt
 		 * tso gets written back prematurely before the data is fully
 		 * DMA'd to the controller */
 		if (!skb->data_len && tx_ring->last_tx_tso &&
-		    !skb_shinfo(skb)->tso_size) {
+		    !skb_shinfo(skb)->gso_size) {
 			tx_ring->last_tx_tso = 0;
 			size -= 4;
 		}
@@ -2757,7 +2757,7 @@ e1000_xmit_frame(struct sk_buff *skb, st

[2/5] [NET]: Add generic segmentation offload

2006-06-22 Thread Herbert Xu

Hi:

[NET]: Add generic segmentation offload

This patch adds the infrastructure for generic segmentation offload.
The idea is to tap into the potential savings of TSO without hardware
support by postponing the allocation of segmented skb's until just
before the entry point into the NIC driver.

The same structure can be used to support software IPv6 TSO, as well as
UFO and segmentation offload for other relevant protocols, e.g., DCCP.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -406,6 +406,9 @@ struct net_device
 	struct list_head	qdisc_list;
 	unsigned long		tx_queue_len;	/* Max frames per queue allowed */
 
+	/* Partially transmitted GSO packet. */
+	struct sk_buff		*gso_skb;
+
 	/* ingress path synchronizer */
 	spinlock_t		ingress_lock;
 	struct Qdisc		*qdisc_ingress;
@@ -540,6 +543,7 @@ struct packet_type {
 					 struct net_device *,
 					 struct packet_type *,
 					 struct net_device *);
+	struct sk_buff		*(*gso_segment)(struct sk_buff *skb, int sg);
 	void			*af_packet_priv;
 	struct list_head	list;
 };
@@ -690,7 +694,8 @@ extern int  dev_change_name(struct net_d
 extern int		dev_set_mtu(struct net_device *, int);
 extern int		dev_set_mac_address(struct net_device *,
 					    struct sockaddr *);
-extern void		dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);
+extern int		dev_hard_start_xmit(struct sk_buff *skb,
+					    struct net_device *dev);
 
 extern void		dev_init(void);
 
@@ -964,6 +969,7 @@ extern int  netdev_max_backlog;
 extern int		weight_p;
 extern int		netdev_set_master(struct net_device *dev, struct net_device *master);
 extern int skb_checksum_help(struct sk_buff *skb, int inward);
+extern struct sk_buff *skb_gso_segment(struct sk_buff *skb, int sg);
 #ifdef CONFIG_BUG
 extern void netdev_rx_csum_fault(struct net_device *dev);
 #else
diff --git a/net/core/dev.c b/net/core/dev.c
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -116,6 +116,7 @@
 #include <asm/current.h>
 #include <linux/audit.h>
 #include <linux/dmaengine.h>
+#include <linux/err.h>
 
 /*
  * The list of packet types we will receive (as opposed to discard)
@@ -1048,7 +1049,7 @@ static inline void net_timestamp(struct 
  * taps currently in use.
  */
 
-void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
+static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
 {
struct packet_type *ptype;
 
@@ -1186,6 +1187,40 @@ out: 
return ret;
 }
 
+/**
+ * skb_gso_segment - Perform segmentation on skb.
+ * @skb: buffer to segment
+ * @sg: whether scatter-gather is supported on the target.
+ *
+ * This function segments the given skb and returns a list of segments.
+ */
+struct sk_buff *skb_gso_segment(struct sk_buff *skb, int sg)
+{
+	struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT);
+	struct packet_type *ptype;
+	int type = skb->protocol;
+
+	BUG_ON(skb_shinfo(skb)->frag_list);
+	BUG_ON(skb->ip_summed != CHECKSUM_HW);
+
+	skb->mac.raw = skb->data;
+	skb->mac_len = skb->nh.raw - skb->data;
+	__skb_pull(skb, skb->mac_len);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) {
+		if (ptype->type == type && !ptype->dev && ptype->gso_segment) {
+			segs = ptype->gso_segment(skb, sg);
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return segs;
+}
+
+EXPORT_SYMBOL(skb_gso_segment);
+
 /* Take action when hardware reception checksum errors are detected. */
 #ifdef CONFIG_BUG
 void netdev_rx_csum_fault(struct net_device *dev)
@@ -1222,6 +1257,86 @@ static inline int illegal_highdma(struct
 #define illegal_highdma(dev, skb)  (0)
 #endif
 
+struct dev_gso_cb {
+	void (*destructor)(struct sk_buff *skb);
+};
+
+#define DEV_GSO_CB(skb) ((struct dev_gso_cb *)(skb)->cb)
+
+static void dev_gso_skb_destructor(struct sk_buff *skb)
+{
+	struct dev_gso_cb *cb;
+
+	do {
+		struct sk_buff *nskb = skb->next;
+
+		skb->next = nskb->next;
+		nskb->next = NULL;
+		kfree_skb(nskb);
+	} while (skb->next);
+
+	cb = DEV_GSO_CB(skb);
+	if (cb->destructor)
+		cb->destructor(skb);
+}
+
+/**
+

[4/5] [NET]: Added GSO toggle

2006-06-22 Thread Herbert Xu

Hi:

[NET]: Added GSO toggle

This patch adds a generic segmentation offload toggle that can be turned
on/off for each net device.  For now it only supports TCPv4.
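
In essence the toggle sets or clears one bit in the device's feature mask. The sketch below shows that bit manipulation in isolation; the ex_* helpers are hypothetical, and only the value 2048 is taken from the NETIF_F_GSO definition in the patch.

```c
#define EX_NETIF_F_GSO 2048ul   /* same value as NETIF_F_GSO in the patch */

/* Return `features` with the GSO bit set (enable != 0) or cleared. */
static unsigned long ex_set_gso(unsigned long features, int enable)
{
    if (enable)
        features |= EX_NETIF_F_GSO;
    else
        features &= ~EX_NETIF_F_GSO;
    return features;
}

/* Return nonzero if the GSO bit is set in `features`. */
static int ex_get_gso(unsigned long features)
{
    return (features & EX_NETIF_F_GSO) != 0;
}
```

Other feature bits are left untouched by either operation, so enabling and then disabling GSO returns the mask to its starting value.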

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -408,6 +408,8 @@ struct ethtool_ops {
 #define ETHTOOL_GPERMADDR  0x0020 /* Get permanent hardware address */
 #define ETHTOOL_GUFO   0x0021 /* Get UFO enable (ethtool_value) */
 #define ETHTOOL_SUFO   0x0022 /* Set UFO enable (ethtool_value) */
+#define ETHTOOL_GGSO   0x0023 /* Get GSO enable (ethtool_value) */
+#define ETHTOOL_SGSO   0x0024 /* Set GSO enable (ethtool_value) */
 
 /* compatibility with older code */
 #define SPARC_ETH_GSET ETHTOOL_GSET
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -309,6 +309,7 @@ struct net_device
 #define NETIF_F_HW_VLAN_RX	256	/* Receive VLAN hw acceleration */
 #define NETIF_F_HW_VLAN_FILTER	512	/* Receive filtering on VLAN */
 #define NETIF_F_VLAN_CHALLENGED	1024	/* Device cannot handle VLAN packets */
+#define NETIF_F_GSO		2048	/* Enable software GSO. */
 #define NETIF_F_LLTX		4096	/* LockLess TX */
 
/* Segmentation offload features */
diff --git a/include/net/sock.h b/include/net/sock.h
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1031,9 +1031,13 @@ static inline void sk_setup_caps(struct 
 {
 	__sk_dst_set(sk, dst);
 	sk->sk_route_caps = dst->dev->features;
+	if (sk->sk_route_caps & NETIF_F_GSO)
+		sk->sk_route_caps |= NETIF_F_TSO;
 	if (sk->sk_route_caps & NETIF_F_TSO) {
 		if (sock_flag(sk, SOCK_NO_LARGESEND) || dst->header_len)
 			sk->sk_route_caps &= ~NETIF_F_TSO;
+		else 
+			sk->sk_route_caps |= NETIF_F_SG | NETIF_F_HW_CSUM;
 	}
 }
 
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -376,15 +376,20 @@ void br_features_recompute(struct net_br
 	features = br->feature_mask & ~NETIF_F_ALL_CSUM;
 
 	list_for_each_entry(p, &br->port_list, list) {
-		if (checksum & NETIF_F_NO_CSUM &&
-		    !(p->dev->features & NETIF_F_NO_CSUM))
+		unsigned long feature = p->dev->features;
+
+		if (checksum & NETIF_F_NO_CSUM && !(feature & NETIF_F_NO_CSUM))
 			checksum ^= NETIF_F_NO_CSUM | NETIF_F_HW_CSUM;
-		if (checksum & NETIF_F_HW_CSUM &&
-		    !(p->dev->features & NETIF_F_HW_CSUM))
+		if (checksum & NETIF_F_HW_CSUM && !(feature & NETIF_F_HW_CSUM))
 			checksum ^= NETIF_F_HW_CSUM | NETIF_F_IP_CSUM;
-		if (!(p->dev->features & NETIF_F_IP_CSUM))
+		if (!(feature & NETIF_F_IP_CSUM))
 			checksum = 0;
-		features &= p->dev->features;
+
+		if (feature & NETIF_F_GSO)
+			feature |= NETIF_F_TSO;
+		feature |= NETIF_F_GSO;
+
+		features &= feature;
 	}
 
 	br->dev->features = features | checksum | NETIF_F_LLTX;
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -614,6 +614,29 @@ static int ethtool_set_ufo(struct net_de
 	return dev->ethtool_ops->set_ufo(dev, edata.data);
 }
 
+static int ethtool_get_gso(struct net_device *dev, char __user *useraddr)
+{
+	struct ethtool_value edata = { ETHTOOL_GGSO };
+
+	edata.data = dev->features & NETIF_F_GSO;
+	if (copy_to_user(useraddr, &edata, sizeof(edata)))
+		 return -EFAULT;
+	return 0;
+}
+
+static int ethtool_set_gso(struct net_device *dev, char __user *useraddr)
+{
+	struct ethtool_value edata;
+
+	if (copy_from_user(&edata, useraddr, sizeof(edata)))
+		return -EFAULT;
+	if (edata.data)
+		dev->features |= NETIF_F_GSO;
+	else
+		dev->features &= ~NETIF_F_GSO;
+	return 0;
+}
+
 static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
 {
struct ethtool_test test;
@@ -905,6 +928,12 @@ int dev_ethtool(struct ifreq *ifr)
case ETHTOOL_SUFO:
rc = ethtool_set_ufo(dev, useraddr);
break;
+   case ETHTOOL_GGSO:
+   rc = ethtool_get_gso(dev, useraddr);
+   break;
+   case ETHTOOL_SGSO:
+   rc = ethtool_set_gso(dev, useraddr);
+   break;
default:
rc =  -EOPNOTSUPP;
}

[5/5] [IPSEC]: Handle GSO packets

2006-06-22 Thread Herbert Xu

Hi:

[IPSEC]: Handle GSO packets

This patch segments GSO packets received by the IPsec stack.  This can
happen when a NIC driver injects GSO packets into the stack which are
then forwarded to another host.

The primary application of this is going to be Xen where its backend
driver may inject GSO packets into dom0.

Of course this also can be used by other virtualisation schemes such as
VMWare or UML since the tap device could be modified to inject GSO packets
received through splice.
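
The pattern used when handing a list of segments to an output routine (detach each segment, pass it on, and free whatever remains on the first error) can be sketched in plain C as follows. struct ex_seg and the ex_* functions are hypothetical stand-ins for sk_buff handling, with malloc/free in place of the kernel allocators.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a segment; only the list link matters here. */
struct ex_seg {
    struct ex_seg *next;
    int id;
};

static int ex_sent;   /* number of segments successfully sent */

/* Build a list of `n` segments with ids 0..n-1; NULL if n <= 0 or on OOM. */
static struct ex_seg *ex_make_segs(int n)
{
    struct ex_seg *head = NULL;
    struct ex_seg **link = &head;
    int i;

    for (i = 0; i < n; i++) {
        struct ex_seg *s = malloc(sizeof(*s));

        if (!s) {
            while (head) {
                struct ex_seg *next = head->next;
                free(head);
                head = next;
            }
            return NULL;
        }
        s->next = NULL;
        s->id = i;
        *link = s;
        link = &s->next;
    }
    return head;
}

/* Example output routine: consumes the segment, fails on id 2. */
static int ex_output(struct ex_seg *seg)
{
    int err = (seg->id == 2) ? -1 : 0;

    if (!err)
        ex_sent++;
    free(seg);
    return err;
}

/* Send each segment in turn; on the first error free the remainder. */
static int ex_output_segs(struct ex_seg *segs, int (*output)(struct ex_seg *))
{
    while (segs) {
        struct ex_seg *next = segs->next;
        int err;

        segs->next = NULL;      /* detach before handing over */
        err = output(segs);     /* callee owns the segment from here on */
        if (err) {
            while ((segs = next)) {
                next = segs->next;
                free(segs);
            }
            return err;
        }
        segs = next;
    }
    return 0;
}
```

With a five-segment list, ex_output_segs() sends ids 0 and 1, gets an error on id 2, frees ids 3 and 4 unsent, and returns the error.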

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -9,6 +9,8 @@
  */
 
 #include <linux/compiler.h>
+#include <linux/if_ether.h>
+#include <linux/kernel.h>
 #include <linux/skbuff.h>
 #include <linux/spinlock.h>
 #include <linux/netfilter_ipv4.h>
@@ -97,16 +99,10 @@ error_nolock:
goto out_exit;
 }
 
-static int xfrm4_output_finish(struct sk_buff *skb)
+static int xfrm4_output_finish2(struct sk_buff *skb)
 {
int err;
 
-#ifdef CONFIG_NETFILTER
-	if (!skb->dst->xfrm) {
-		IPCB(skb)->flags |= IPSKB_REROUTED;
-		return dst_output(skb);
-	}
-#endif
 	while (likely((err = xfrm4_output_one(skb)) == 0)) {
 		nf_reset(skb);
 
@@ -119,7 +115,7 @@ static int xfrm4_output_finish(struct sk
 			return dst_output(skb);
 
 		err = nf_hook(PF_INET, NF_IP_POST_ROUTING, &skb, NULL,
-			      skb->dst->dev, xfrm4_output_finish);
+			      skb->dst->dev, xfrm4_output_finish2);
 		if (unlikely(err != 1))
 			break;
}
@@ -127,6 +123,48 @@ static int xfrm4_output_finish(struct sk
return err;
 }
 
+static int xfrm4_output_finish(struct sk_buff *skb)
+{
+	struct sk_buff *segs;
+
+#ifdef CONFIG_NETFILTER
+	if (!skb->dst->xfrm) {
+		IPCB(skb)->flags |= IPSKB_REROUTED;
+		return dst_output(skb);
+	}
+#endif
+
+	if (!skb_shinfo(skb)->gso_size)
+		return xfrm4_output_finish2(skb);
+
+	skb->protocol = htons(ETH_P_IP);
+	segs = skb_gso_segment(skb, 0);
+	kfree_skb(skb);
+	if (unlikely(IS_ERR(segs)))
+		return PTR_ERR(segs);
+
+	do {
+		struct sk_buff *nskb = segs->next;
+		int err;
+
+		segs->next = NULL;
+		err = xfrm4_output_finish2(segs);
+
+		if (unlikely(err)) {
+			while ((segs = nskb)) {
+				nskb = segs->next;
+				segs->next = NULL;
+				kfree_skb(segs);
+			}
+			return err;
+		}
+
+		segs = nskb;
+	} while (segs);
+
+	return 0;
+}
+
 int xfrm4_output(struct sk_buff *skb)
 {
 	return NF_HOOK_COND(PF_INET, NF_IP_POST_ROUTING, skb, NULL, skb->dst->dev,
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -94,7 +94,7 @@ error_nolock:
goto out_exit;
 }
 
-static int xfrm6_output_finish(struct sk_buff *skb)
+static int xfrm6_output_finish2(struct sk_buff *skb)
 {
int err;
 
@@ -110,7 +110,7 @@ static int xfrm6_output_finish(struct sk
 			return dst_output(skb);
 
 		err = nf_hook(PF_INET6, NF_IP6_POST_ROUTING, &skb, NULL,
-			      skb->dst->dev, xfrm6_output_finish);
+			      skb->dst->dev, xfrm6_output_finish2);
 		if (unlikely(err != 1))
 			break;
}
@@ -118,6 +118,41 @@ static int xfrm6_output_finish(struct sk
return err;
 }
 
+static int xfrm6_output_finish(struct sk_buff *skb)
+{
+	struct sk_buff *segs;
+
+	if (!skb_shinfo(skb)->gso_size)
+		return xfrm6_output_finish2(skb);
+
+	skb->protocol = htons(ETH_P_IP);
+	segs = skb_gso_segment(skb, 0);
+	kfree_skb(skb);
+	if (unlikely(IS_ERR(segs)))
+		return PTR_ERR(segs);
+
+	do {
+		struct sk_buff *nskb = segs->next;
+		int err;
+
+		segs->next = NULL;
+		err = xfrm6_output_finish2(segs);
+
+		if (unlikely(err)) {
+			while ((segs = nskb)) {
+				nskb = segs->next;
+				segs->next = NULL;
+				kfree_skb(segs);
+			}
+			return err;
+		}
+
+		segs = nskb;
+	} while (segs);
+
+	return 0;
+}
+
 int xfrm6_output(struct sk_buff *skb)
 {
return NF_HOOK(PF_INET6,

Re: [0/5] GSO: Generic Segmentation Offload

2006-06-22 Thread Herbert Xu

Hi:

If anyone is interested here is the incremental patch against the previous
series.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/core/dev.c b/net/core/dev.c
index 9c68ab8..d293e0f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1415,7 +1415,7 @@ gso:
/* Disable soft irqs for various locks below. Also 
 * stops preemption for RCU. 
 */
-   local_bh_disable(); 
+   rcu_read_lock_bh(); 
 
/* Updates of qdisc are serialized by queue_lock. 
 * The struct Qdisc which is pointed to by qdisc is now a 
@@ -1486,13 +1486,13 @@ #endif
}
 
rc = -ENETDOWN;
-   local_bh_enable();
+   rcu_read_unlock_bh();
 
 out_kfree_skb:
kfree_skb(skb);
return rc;
 out:
-   local_bh_enable();
+   rcu_read_unlock_bh();
return rc;
 }
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 472cb5a..4cdd6ca 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -185,9 +185,13 @@ requeue:
 
 void __qdisc_run(struct net_device *dev)
 {
+	if (unlikely(dev->qdisc == &noop_qdisc))
+		goto out;
+
 	while (qdisc_restart(dev) < 0 && !netif_queue_stopped(dev))
 		/* NOTHING */;
 
+out:
 	clear_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
 }
 
@@ -581,20 +585,24 @@ void dev_deactivate(struct net_device *d
 	spin_lock_bh(&dev->queue_lock);
 	qdisc = dev->qdisc;
 	dev->qdisc = &noop_qdisc;
-	skb = dev->gso_skb;
-	dev->gso_skb = NULL;
 
 	qdisc_reset(qdisc);
 
 	spin_unlock_bh(&dev->queue_lock);
 
-	kfree_skb(skb);
 	dev_watchdog_down(dev);
 
-	while (test_bit(__LINK_STATE_SCHED, &dev->state))
+	/* Wait for outstanding dev_queue_xmit calls. */
+	synchronize_rcu();
+
+	/* Wait for outstanding qdisc_run calls. */
+	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
 		yield();
 
-	spin_unlock_wait(&dev->_xmit_lock);
+	if (dev->gso_skb) {
+		kfree_skb(dev->gso_skb);
+		dev->gso_skb = NULL;
+	}
 }
 
 void dev_init_scheduler(struct net_device *dev)

[3/5] [NET]: Add software TSOv4

2006-06-22 Thread Herbert Xu

Hi:

[NET]: Add software TSOv4

This patch adds the GSO implementation for IPv4 TCP.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1299,6 +1299,7 @@ extern void  skb_split(struct sk_b
 struct sk_buff *skb1, const u32 len);
 
 extern void   skb_release_data(struct sk_buff *skb);
+extern struct sk_buff *skb_segment(struct sk_buff *skb, int sg);
 
 static inline void *skb_header_pointer(const struct sk_buff *skb, int offset,
   int len, void *buffer)
diff --git a/include/net/protocol.h b/include/net/protocol.h
--- a/include/net/protocol.h
+++ b/include/net/protocol.h
@@ -37,6 +37,7 @@
 struct net_protocol {
 	int			(*handler)(struct sk_buff *skb);
 	void			(*err_handler)(struct sk_buff *skb, u32 info);
+	struct sk_buff	       *(*gso_segment)(struct sk_buff *skb, int sg);
 	int			no_policy;
 };
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1087,6 +1087,8 @@ extern struct request_sock_ops tcp_reque
 
 extern int tcp_v4_destroy_sock(struct sock *sk);
 
+extern struct sk_buff *tcp_tso_segment(struct sk_buff *skb, int sg);
+
 #ifdef CONFIG_PROC_FS
 extern int  tcp4_proc_init(void);
 extern void tcp4_proc_exit(void);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1826,6 +1826,132 @@ unsigned char *skb_pull_rcsum(struct sk_
 
 EXPORT_SYMBOL_GPL(skb_pull_rcsum);
 
+/**
+ * skb_segment - Perform protocol segmentation on skb.
+ * @skb: buffer to segment
+ * @sg: whether scatter-gather can be used for generated segments
+ *
+ * This function performs segmentation on the given skb.  It returns
+ * the segment at the given position.  It returns NULL if there are
+ * no more segments to generate, or when an error is encountered.
+ */
+struct sk_buff *skb_segment(struct sk_buff *skb, int sg)
+{
+	struct sk_buff *segs = NULL;
+	struct sk_buff *tail = NULL;
+	unsigned int mss = skb_shinfo(skb)->gso_size;
+	unsigned int doffset = skb->data - skb->mac.raw;
+	unsigned int offset = doffset;
+	unsigned int headroom;
+	unsigned int len;
+	int nfrags = skb_shinfo(skb)->nr_frags;
+	int err = -ENOMEM;
+	int i = 0;
+	int pos;
+
+	__skb_push(skb, doffset);
+	headroom = skb_headroom(skb);
+	pos = skb_headlen(skb);
+
+	do {
+		struct sk_buff *nskb;
+		skb_frag_t *frag;
+		int hsize, nsize;
+		int k;
+		int size;
+
+		len = skb->len - offset;
+		if (len > mss)
+			len = mss;
+
+		hsize = skb_headlen(skb) - offset;
+		if (hsize < 0)
+			hsize = 0;
+		nsize = hsize + doffset;
+		if (nsize > len + doffset || !sg)
+			nsize = len + doffset;
+
+		nskb = alloc_skb(nsize + headroom, GFP_ATOMIC);
+		if (unlikely(!nskb))
+			goto err;
+
+		if (segs)
+			tail->next = nskb;
+		else
+			segs = nskb;
+		tail = nskb;
+
+		nskb->dev = skb->dev;
+		nskb->priority = skb->priority;
+		nskb->protocol = skb->protocol;
+		nskb->dst = dst_clone(skb->dst);
+		memcpy(nskb->cb, skb->cb, sizeof(skb->cb));
+		nskb->pkt_type = skb->pkt_type;
+		nskb->mac_len = skb->mac_len;
+
+		skb_reserve(nskb, headroom);
+		nskb->mac.raw = nskb->data;
+		nskb->nh.raw = nskb->data + skb->mac_len;
+		nskb->h.raw = nskb->nh.raw + (skb->h.raw - skb->nh.raw);
+		memcpy(skb_put(nskb, doffset), skb->data, doffset);
+
+		if (!sg) {
+			nskb->csum = skb_copy_and_csum_bits(skb, offset,
+							    skb_put(nskb, len),
+							    len, 0);
+			continue;
+		}
+
+		frag = skb_shinfo(nskb)->frags;
+		k = 0;
+
+		nskb->ip_summed = CHECKSUM_HW;
+		nskb->csum = skb->csum;
+		memcpy(skb_put(nskb, hsize), skb->data + offset, hsize);
+
+		while (pos < offset + len) {
+			BUG_ON(i >= nfrags);
+
+			*frag = skb_shinfo(skb)->frags[i];
+			get_page(frag->page);
+

Re: [3/5] [NET]: Add software TSOv4

2006-06-22 Thread Herbert Xu

On Thu, Jun 22, 2006 at 06:14:00PM +1000, herbert wrote:
 
 [NET]: Add software TSOv4

Doh, forgot to remove an unused declaration.  Here is an updated version.

[NET]: Add software TSOv4

This patch adds the GSO implementation for IPv4 TCP.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -406,6 +406,9 @@ struct net_device
 	struct list_head	qdisc_list;
 	unsigned long		tx_queue_len;	/* Max frames per queue allowed */
 
+	/* Partially transmitted GSO packet. */
+	struct sk_buff		*gso_skb;
+
 	/* ingress path synchronizer */
 	spinlock_t		ingress_lock;
 	struct Qdisc		*qdisc_ingress;
@@ -540,6 +543,7 @@ struct packet_type {
 					 struct net_device *,
 					 struct packet_type *,
 					 struct net_device *);
+	struct sk_buff		*(*gso_segment)(struct sk_buff *skb, int sg);
 	void			*af_packet_priv;
 	struct list_head	list;
 };
@@ -690,7 +694,8 @@ extern int  dev_change_name(struct net_d
 extern int		dev_set_mtu(struct net_device *, int);
 extern int		dev_set_mac_address(struct net_device *,
 					    struct sockaddr *);
-extern void		dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);
+extern int		dev_hard_start_xmit(struct sk_buff *skb,
+					    struct net_device *dev);
 
 extern void		dev_init(void);
 
@@ -964,6 +969,7 @@ extern int  netdev_max_backlog;
 extern int weight_p;
 extern int		netdev_set_master(struct net_device *dev, struct net_device *master);
 extern int skb_checksum_help(struct sk_buff *skb, int inward);
+extern struct sk_buff *skb_gso_segment(struct sk_buff *skb, int sg);
 #ifdef CONFIG_BUG
 extern void netdev_rx_csum_fault(struct net_device *dev);
 #else
diff --git a/net/core/dev.c b/net/core/dev.c
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -116,6 +116,7 @@
 #include <asm/current.h>
 #include <linux/audit.h>
 #include <linux/dmaengine.h>
+#include <linux/err.h>
 
 /*
  * The list of packet types we will receive (as opposed to discard)
@@ -1048,7 +1049,7 @@ static inline void net_timestamp(struct 
  * taps currently in use.
  */
 
-void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
+static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct packet_type *ptype;
 
@@ -1186,6 +1187,40 @@ out: 
 	return ret;
 }
 
+/**
+ * skb_gso_segment - Perform segmentation on skb.
+ * @skb: buffer to segment
+ * @sg: whether scatter-gather is supported on the target.
+ *
+ * This function segments the given skb and returns a list of segments.
+ */
+struct sk_buff *skb_gso_segment(struct sk_buff *skb, int sg)
+{
+	struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT);
+	struct packet_type *ptype;
+	int type = skb->protocol;
+
+	BUG_ON(skb_shinfo(skb)->frag_list);
+	BUG_ON(skb->ip_summed != CHECKSUM_HW);
+
+	skb->mac.raw = skb->data;
+	skb->mac_len = skb->nh.raw - skb->data;
+	__skb_pull(skb, skb->mac_len);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) {
+		if (ptype->type == type && !ptype->dev && ptype->gso_segment) {
+			segs = ptype->gso_segment(skb, sg);
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return segs;
+}
+
+EXPORT_SYMBOL(skb_gso_segment);
+
 /* Take action when hardware reception checksum errors are detected. */
 #ifdef CONFIG_BUG
 void netdev_rx_csum_fault(struct net_device *dev)
@@ -1222,6 +1257,86 @@ static inline int illegal_highdma(struct
 #define illegal_highdma(dev, skb)  (0)
 #endif
 
+struct dev_gso_cb {
+	void (*destructor)(struct sk_buff *skb);
+};
+
+#define DEV_GSO_CB(skb) ((struct dev_gso_cb *)(skb)->cb)
+
+static void dev_gso_skb_destructor(struct sk_buff *skb)
+{
+	struct dev_gso_cb *cb;
+
+	do {
+		struct sk_buff *nskb = skb->next;
+
+		skb->next = nskb->next;
+		nskb->next = NULL;
+		kfree_skb(nskb);
+	} while (skb->next);
+
+	cb = DEV_GSO_CB(skb);
+	if (cb->destructor)
+		cb->destructor(skb);
+}
+
+/**
+ * dev_gso_segment - Perform emulated hardware segmentation on skb.
+ * @skb: buffer to segment
+ *
+ * This function segments the given skb and stores the list of segments
+ * in

Re: [patch] ipv6 source address selection in addrconf.c (2.6.17)

2006-06-22 Thread David Miller

From: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Date: Thu, 22 Jun 2006 01:12:57 +0900 (JST)

 I think it is trivial enough to push this to -stable as well.

 Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]

Ok, done.

Thanks a lot!
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY

2006-06-22 Thread David Miller

From: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Date: Thu, 22 Jun 2006 00:23:41 +0900 (JST)

 We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY,
 because there is still a less significant rule to apply: longest match.

Applied, thank you.

 I think it is suitable for -stable as well.

Agreed, I have pushed it to -stable.

Re: [-mm patch] drivers/net/ni5010.c: fix compile error

2006-06-22 Thread Andreas Mohr

Hi,

On Wed, Jun 21, 2006 at 05:10:57PM +0200, Adrian Bunk wrote:
 On Wed, Jun 21, 2006 at 03:48:57AM -0700, Andrew Morton wrote:
 ...
  Changes since 2.6.17-rc6-mm2:
 ...
  +ni5010-netcard-cleanup.patch
  
   netdev cleanup
 ...
 
 This patch fixes the following compile error with CONFIG_NI5010=y:

Doh, thanks!
(that should teach me to do non-module runs, too)

Andreas Mohr

Re: Memory corruption in 8390.c ? (was Re: Possible leaks in network drivers)

2006-06-22 Thread Jeff Garzik


Herbert Xu wrote:

 This patch uses pskb_expand_head to expand the existing skb and linearize

Seems sane to me.

 it if needed.  Actually, someone should sift through every instance of
 skb_pad on a non-linear skb as they do not fit the reasons why this was
 originally created.

Non-linear skbs smaller than ETH_ZLEN seem unlikely.

Overall, the skb_pad() changes were made over a short span of time,
often to older and under-used drivers, so I would not be surprised to
find rough edges or the occasional bug.


Jeff



Re: Memory corruption in 8390.c ?

2006-06-22 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Thu, 22 Jun 2006 12:30:29 +1000

 On Thu, Jun 22, 2006 at 10:55:44AM +1000, Herbert Xu wrote:

  I think skb_padto simply shouldn't allocate a new skb.  It only needs
  to extend the data area.

 OK, here is a patch to make it do that.

 [NET]: Avoid allocating skb in skb_pad
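
The idea in the quoted message, extending the existing data area instead of allocating a new buffer, can be sketched in user space roughly as follows. This is only an analogy under stated assumptions: struct pkt_buf and pkt_pad_to are hypothetical names, not the skb API, and the sketch simply fails when there is no spare room, where a real implementation would have to grow the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer; not the kernel's sk_buff. */
struct pkt_buf {
	unsigned char *data;	/* start of packet data */
	size_t len;		/* bytes currently in use */
	size_t cap;		/* total bytes available at data */
};

/* Pad the packet with zeros up to min_len, in place.
 * Returns 0 on success, -1 if the spare room is too small. */
static int pkt_pad_to(struct pkt_buf *b, size_t min_len)
{
	if (b->len >= min_len)
		return 0;		/* already long enough */
	if (min_len > b->cap)
		return -1;		/* no room: caller must handle this */
	memset(b->data + b->len, 0, min_len - b->len);
	b->len = min_len;
	return 0;
}
```

Padding a 42-byte frame held in a 64-byte buffer to 60 bytes (the value of ETH_ZLEN) would zero bytes 42 to 59 and leave the rest of the buffer untouched.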

Want me to let this cook in 2.6.18 for a while before sending
it off to -stable?

Re: Memory corruption in 8390.c ? (was Re: Possible leaks in network drivers)

2006-06-22 Thread Herbert Xu

On Thu, Jun 22, 2006 at 04:22:22AM -0400, Jeff Garzik wrote:
 
 it if needed.  Actually, someone should sift through every instance of
 skb_pad on a non-linear skb as they do not fit the reasons why this was
 originally created.
 
 Non-linear skbs smaller than ETH_ZLEN seem unlikely.

When I was grepping it seems that a few drivers were using it with
a length other than ETH_ZLEN.  I've just done another grep and here
are the potential suspects:

cassini.c
starfire.c
yellowfin.c

Also, the skb_pad in drivers/s390/net/claw.c didn't check for errors
at all.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: Memory corruption in 8390.c ?

2006-06-22 Thread Herbert Xu

On Thu, Jun 22, 2006 at 01:26:09AM -0700, David Miller wrote:

 Want me to let this cook in 2.6.18 for a while before sending
 it off to -stable?

You know I'm never one to push anything quickly so absolutely yes :)
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: Memory corruption in 8390.c ?

2006-06-22 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Thu, 22 Jun 2006 18:30:37 +1000

 On Thu, Jun 22, 2006 at 01:26:09AM -0700, David Miller wrote:
 
  Want me to let this cook in 2.6.18 for a while before sending
  it off to -stable?
 
 You know I'm never one to push anything quickly so absolutely yes :)

Ok, applied to net-2.6.18 for now :)

Re: [-mm patch] drivers/net/ni5010.c: fix compile error

2006-06-22 Thread Adrian Bunk

On Thu, Jun 22, 2006 at 10:13:16AM +0200, Andreas Mohr wrote:
 Hi,
 
 On Wed, Jun 21, 2006 at 05:10:57PM +0200, Adrian Bunk wrote:
  On Wed, Jun 21, 2006 at 03:48:57AM -0700, Andrew Morton wrote:
  ...
   Changes since 2.6.17-rc6-mm2:
  ...
   +ni5010-netcard-cleanup.patch
   
netdev cleanup
  ...
  
  This patch fixes the following compile error with CONFIG_NI5010=y:
 
 Doh, thanks!
 (that should teach me to do non-module runs, too)

And change the driver to no longer use Space.c?  ;-)

 Andreas Mohr

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: No interfaces under /proc/sys/net/ipv4/conf/

2006-06-22 Thread Hasso Tepper

Hasso Tepper wrote:
 After upgrade to 2.6.16.20 from 2.6.11 I discovered that no dynamic
 interfaces (vlans, tunnels) appear under /proc/sys/net/ipv4/conf/.
 /proc/sys/net/ipv6/conf/ is OK.

OK, I realised that it's a feature. Entries in /proc/sys/net/*/conf/ are
not created if the interface doesn't have at least one ipv4/ipv6 address.

I can think of workarounds for most of the problems (although it breaks a hell
of a lot of software here), but how am I supposed to configure ipv6 settings for
interfaces which have to obtain a global ipv6 address via autoconf so that
it will work even if the cable is not plugged in? I did it via /etc/sysctl.conf,
but now ... the machine boots with no link => no link-local address =>
no /proc/sys/net/ipv6/conf/interface => configuration fails.
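
The failure mode described above can be detected from user space by checking whether the per-interface entry exists before trying to write to it. A minimal sketch, assuming hypothetical helper names (ipv6_conf_path, ipv6_conf_present) and using accept_ra only as an example key:

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Build "/proc/sys/net/ipv6/conf/<ifname>/<key>" into buf.
 * Returns 0 on success, -1 if the path does not fit. */
static int ipv6_conf_path(char *buf, size_t size,
			  const char *ifname, const char *key)
{
	int n = snprintf(buf, size, "/proc/sys/net/ipv6/conf/%s/%s",
			 ifname, key);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Returns 1 if the sysctl entry currently exists, 0 otherwise. */
static int ipv6_conf_present(const char *ifname, const char *key)
{
	char path[256];

	if (ipv6_conf_path(path, sizeof(path), ifname, key) < 0)
		return 0;
	return access(path, F_OK) == 0;
}
```

A boot script built on such a check could retry later, once the interface has an address, rather than failing outright; whether that is acceptable depends on the setup.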


regards,

-- 
Hasso Tepper


Re: [NET]: Prevent transmission after dev_deactivate

2006-06-22 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Thu, 22 Jun 2006 18:08:56 +1000

 I found a bug in my GSO patches with the shutdown handling in
 dev_deactivate.  It provided enough impetus for me to finally
 clean up this function :)

 This patch is against Linus's tree.

 [NET]: Prevent transmission after dev_deactivate

Looks good, applied, thanks a lot!

Re: Memory corruption in 8390.c ? (was Re: Possible leaks in network drivers)

2006-06-22 Thread Jeff Garzik


Herbert Xu wrote:

 On Thu, Jun 22, 2006 at 04:22:22AM -0400, Jeff Garzik wrote:

  it if needed.  Actually, someone should sift through every instance of
  skb_pad on a non-linear skb as they do not fit the reasons why this was
  originally created.

  Non-linear skbs smaller than ETH_ZLEN seem unlikely.

 When I was grepping it seems that a few drivers were using it with
 a length other than ETH_ZLEN.  I've just done another grep and here
 are the potential suspects:

 cassini.c
 starfire.c
 yellowfin.c

That doesn't really invalidate the point :)  These drivers are still
only padding very small packets.


Jeff




Re: [RFC 2/7] NetLabel: core network changes

2006-06-22 Thread David Miller

From: [EMAIL PROTECTED]
Date: Wed, 21 Jun 2006 15:42:37 -0400

 Index: linux-2.6.17.i686-quilt/include/linux/netlink.h
 ===
 --- linux-2.6.17.i686-quilt.orig/include/linux/netlink.h
 +++ linux-2.6.17.i686-quilt/include/linux/netlink.h
 @@ -21,6 +21,7 @@
  #define NETLINK_DNRTMSG  14  /* DECnet routing messages */
  #define NETLINK_KOBJECT_UEVENT   15  /* Kernel messages to userspace */
  #define NETLINK_GENERIC  16
 +#define NETLINK_NETLABEL 17  /* Network packet labeling */

  #define MAX_LINKS 32 

Please use generic netlink.

Jamal posted a very nice document recently on how to use it
properly.  You can read that thread here:

http://marc.theaimsgroup.com/?l=linux-netdev&m=115072450928755&w=2

Thanks.

Re: Memory corruption in 8390.c ? (was Re: Possible leaks in network drivers)

2006-06-22 Thread Herbert Xu

On Thu, Jun 22, 2006 at 04:57:39AM -0400, Jeff Garzik wrote:

 Non-linear skbs smaller than ETH_ZLEN seem unlikely.
 
 When I was grepping it seems that a few drivers were using it with
 a length other than ETH_ZLEN.  I've just done another grep and here
 are the potential suspects:
 
 cassini.c
 starfire.c
 yellowfin.c
 
 That doesn't really invalidate the point :)  These drivers are still 
 only padding very small packets.

Hmm, at least cassini pads it to 255 for gigabit...

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RFC 2/7] NetLabel: core network changes

2006-06-22 Thread David Miller

From: [EMAIL PROTECTED]
Date: Wed, 21 Jun 2006 15:42:37 -0400

 Index: linux-2.6.17.i686-quilt/net/ipv4/Makefile
 ===
 --- linux-2.6.17.i686-quilt.orig/net/ipv4/Makefile
 +++ linux-2.6.17.i686-quilt/net/ipv4/Makefile
 @@ -42,6 +42,9 @@ obj-$(CONFIG_TCP_CONG_HYBLA) += tcp_hybl
  obj-$(CONFIG_TCP_CONG_HTCP) += tcp_htcp.o
  obj-$(CONFIG_TCP_CONG_VEGAS) += tcp_vegas.o
  obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o
 +ifeq ($(CONFIG_NETLABEL_CIPSOV4),y)
 +obj-y += cipso_ipv4.o
 +endif

Why not obj-$(CONFIG_NETLABEL_CIPSOV4) += cipso_ipv4.o?

The whole idea behind the obj-$(CONFIG_OPTION) technique is
to avoid conditionals all over the makefile.

 Index: linux-2.6.17.i686-quilt/net/ipv4/af_inet.c
 ===
 --- linux-2.6.17.i686-quilt.orig/net/ipv4/af_inet.c
 +++ linux-2.6.17.i686-quilt/net/ipv4/af_inet.c
 @@ -114,6 +114,7 @@
  #ifdef CONFIG_IP_MROUTE
  #include <linux/mroute.h>
  #endif
 +#include <net/netlabel.h>

  DEFINE_SNMP_STAT(struct linux_mib, net_statistics) __read_mostly;

 @@ -616,6 +617,8 @@ int inet_accept(struct socket *sock, str

   sock_graft(sk2, newsock);

 + netlbl_socket_inet_accept(sock, newsock);
 +
   newsock->state = SS_CONNECTED;
   err = 0;
   release_sock(sk2);

Neither the netlabel.h header nor the implementation of
the netlbl_socket_inet_accept() function exist at this
point in your patch set.

At each patch point, the tree must build and function
properly.

This means you have to split up and order your changes
correctly, gradually building up the infrastructure and
then finally plugging it in and making use of it.

Nobody can test your work in an incremental fashion, and
thus it's not possible to determine if a bug or behavior
gets introduced at patch 2, 3 or 4, for example.

 + if (cipso_v4_validate(optptr)) {
 + pp_ptr = optptr;
 + goto error;
 + }
 + break;

Same thing here, cipso_v4_validate() doesn't exist in the
tree at this point in the patch set, so the tree doesn't
build after applying this patch.

Please split up your submission properly.

I really can't sanely review the rest of this until you dice up your
changes properly.

Re: Memory corruption in 8390.c ? (was Re: Possible leaks in network drivers)

2006-06-22 Thread Herbert Xu

On Thu, Jun 22, 2006 at 07:02:27PM +1000, herbert wrote:

  cassini.c
  starfire.c
  yellowfin.c
  
  That doesn't really invalidate the point :)  These drivers are still 
  only padding very small packets.
 
 Hmm, at least cassini pads it to 255 for gigabit...

The one in starfire looks especially dodgy.  It supports SG and also
requires the whole length to be a multiple of 4 if the firmware is
broken.  The question is do they really intend this or do they want
each fragment to terminate on a 4-byte boundary.
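
The two readings of that requirement lead to different amounts of padding. A small sketch of the length arithmetic only (it says nothing about what the starfire hardware actually needs; round_up4, pad_for_total and pad_for_each are hypothetical helpers):

```c
#include <stddef.h>

/* Round len up to the next multiple of 4. */
static size_t round_up4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Padding needed if only the total length must be a multiple of 4. */
static size_t pad_for_total(const size_t *frag_len, size_t nfrags)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nfrags; i++)
		total += frag_len[i];
	return round_up4(total) - total;
}

/* Padding needed if every fragment must end on a 4-byte boundary,
 * assuming each fragment starts aligned. */
static size_t pad_for_each(const size_t *frag_len, size_t nfrags)
{
	size_t pad = 0;
	size_t i;

	for (i = 0; i < nfrags; i++)
		pad += round_up4(frag_len[i]) - frag_len[i];
	return pad;
}
```

With fragments of 14, 5 and 6 bytes, the first interpretation needs 3 bytes of padding at the very end, while the second needs 7 bytes spread over all three fragments, which cannot be done by padding the tail alone.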

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RFC 3/7] NetLabel: CIPSOv4 engine

2006-06-22 Thread David Miller

From: [EMAIL PROTECTED]
Date: Wed, 21 Jun 2006 15:42:38 -0400

 Add support for the Commercial IP Security Option (CIPSO) to the
 IPv4 network stack.  CIPSO has become a de-facto standard for
 trusted/labeled networking amongst existing Trusted Operating
 Systems such as Trusted Solaris, HP-UX CMW, etc.  This
 implementation is designed to be used with the NetLabel subsystem to
 provide explicit packet labeling to LSM developers.

The thing that concerns me most about CIPSO is that even once users
migrate to a more SELINUX native approach from this CIPSO stuff, the
CIPSO code, its bloat, and its maintenance burden will remain.

It's easy to put stuff in; it's impossible to take stuff out, even
once it's largely unused by even its original target audience.

And that's what I see happening here.

This is why, to be perfectly honest with you, I'd much rather
something like this stay out-of-tree and people are strongly
encouraged to use the more native stuff under Linux.

Re: No interfaces under /proc/sys/net/ipv4/conf/

2006-06-22 Thread Hasso Tepper

Hasso Tepper:
 I can think of workarounds for most of the problems (although it breaks a
 hell of a lot of software here), but how am I supposed to configure ipv6
 settings for interfaces which have to obtain a global ipv6 address via
 autoconf so that it will work even if the cable is not plugged in? I did it
 via /etc/sysctl.conf, but now ... the machine boots with no link => no
 link-local address => no /proc/sys/net/ipv6/conf/interface =>
 configuration fails.

Just realized (via practical experience) that same question applies to 
interfaces configured via dhcp.


regards,

-- 
Hasso Tepper

Re: [0/5] GSO: Generic Segmentation Offload

2006-06-22 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Thu, 22 Jun 2006 18:12:11 +1000

 This is a repost of the GSO patches.  The main change is the fix to a bug
 in the way dev-gso_skb is freed.  This series requires the dev_deactivate
 patch that I just posted.

Applied, thanks a lot Herbert.

Re: [Bugme-new] [Bug 6730] New: pptp connection hang on heavy network load.

2006-06-22 Thread Andrew Morton

On Thu, 22 Jun 2006 03:06:00 -0700
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=6730
 
Summary: pptp connection hang on heavy network load.
 Kernel Version: 2.6.17
 Status: NEW
   Severity: blocking
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: 2.6.17
 Distribution: debian
 Hardware Environment: NVidia nForce2 GF6600 forcedeth driver.
 Software Environment: latest debian sid dist-upgrade on 22 june 2006
 Problem Description: when I upload many files (about 500 files, nearly 1 Gb
 in total; *.deb from the last and previous upgrade, for example) to a server
 via ssh/scp, my vpn pptp connection hangs until I restart the ppp connection.
 This problem is not limited to the latest kernel. I don't remember when it
 began to hang. It is very annoying and makes it difficult to do my daily job.
 If I do not generate such heavy vpn network traffic, I can work for weeks
 without reboots or any problems.
 
 Steps to reproduce:
 Start copying a large amount of data over a pptp connection from a Linux
 client machine, e.g. via ssh/scp.
 

We thought we'd fixed this :(

Re: Suspending 802.11 drivers

2006-06-22 Thread Luis R. Rodriguez


On 6/21/06, Stefan Rompf [EMAIL PROTECTED] wrote:

 On Wednesday 21 June 2006 17:08, Luis R. Rodriguez wrote:

  Since d80211 is already being patched for sysfs how about we use sysfs
  (and kobjects) to maintain the state at suspend() and resume(). This
  would allow userspace tools like supplicant running in the background
  to pick up from sysfs where it left off and for our drivers to save
  where we left off.

 Forgive me that I'm so insistent on this question, but this is important: What
 state that goes beyond the data settable with wireless ioctls/iwconfig (that
 is kept anyway) needs to be saved by the stack? Last association info is
 worthless, the association can be restored using the ESSID/BSSID/channel set
 with iwconfig or by wpa_supplicant. Important is that userspace is notified
 about the connection loss. Is there _any_ other information not recreatable
 from iwconfig settings that needs to be kept?

 Stefan


Stefan, this is an excellent and valid question. Let me elaborate:
it's exactly those settings you mention that I'm suggesting get saved to
sysfs by the driver and later get picked up by userspace. There are,
however, other settings which could also get saved when suspending:
for example, settings which would otherwise get set by the current
private wireless ioctls. There are really too many to describe here;
each device has its own set of private attributes.

More on this in another e-mail I'm about to send to netdev.

 Luis

Re: ipv6 source address selection in addrconf.c (2.6.17)

2006-06-22 Thread Lukasz Stelmach

YOSHIFUJI Hideaki / 吉藤英明 wrote:
 In article [EMAIL PROTECTED] (at Thu, 22 Jun 2006 00:57:56 +0200), Lukasz Stelmach [EMAIL PROTECTED] says:
 
 Lukasz Stelmach wrote:
 Lukasz Stelmach wrote:
[]

 fd24:6f44:46bd:face:EUI64 fd24:6f44:46bd:face:RANDOM
 and
 2002:531f:d667:face:EUI64 2002:531f:d667:face::RANDOM

 there seems to be no way to prefer 2002:: over fc00:: in rule 7, and it will be
 selected as long as it is before 2002:: on the list. I can see here that an
 implicit assumption has been made that an interface either is multihomed or
 private. The seventh rule should not IMHO break the whole process of
 selection but rather mark as selectable all private (random) addresses. And
 it should rather be done before rule 6.
 
 Hmm? We do not have such intention.
 In above case, when you connect to 2001:200:0:8002:203:47ff:fea5:3085,
 either 2002:531f:d667:face:EUI64 or 2002:531f:d667:face::RANDOM
 should be selected (depending on if use_tempaddr = 2),
 by the longest matching rule (Rule 8).
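
The longest-matching rule mentioned above compares how many leading bits a candidate source address shares with the destination. A minimal user-space sketch of that comparison, assuming addresses are given as 16 raw bytes (common_prefix_len is a hypothetical helper, not the function used in addrconf.c):

```c
#include <stdint.h>

/* Number of leading bits shared by two 128-bit addresses. */
static int common_prefix_len(const uint8_t a[16], const uint8_t b[16])
{
	int i;

	for (i = 0; i < 16; i++) {
		uint8_t diff = a[i] ^ b[i];
		int bits = 0;

		if (diff == 0)
			continue;	/* whole byte matches */
		/* count matching high-order bits in the first differing byte */
		while (!(diff & 0x80)) {
			diff = (uint8_t)(diff << 1);
			bits++;
		}
		return i * 8 + bits;
	}
	return 128;	/* addresses are identical */
}
```

For the destination 2001:200:0:8002:203:47ff:fea5:3085, an address under 2002:531f:d667:face::/64 shares 14 leading bits while one under fd24:6f44:46bd:face::/64 shares none, so this rule on its own favours the 2002:: addresses.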

I've chewed the code line by line and it tastes like it should work the way you
say... OK, I see the problem. I've used ifconfig, which doesn't show the
deprecated flag or the valid/preferred times which, combined with privacy,
*seem* to cause some problems. I don't know yet whether it is a problem of proper
intervals in radvd.conf or whether there is still a bug in the kernel. I'll let you
know when I find out.

OK. That's enough for now. Let me get back to the real work ;-)


Best regards.
-- 
It was a pleasure.    The fourth common calamity, [...]
Łukasz                No longer Catholic but thieving.  (c)PP



signature.asc
Description: OpenPGP digital signature

[PATCH 0/32] TIPC updates

2006-06-22 Thread Per Liden

Here's a resend of the latest TIPC updates. I apologize for not properly 
submitting the patches for review the first time around.

This patch set includes several minor bugfixes, most of them ported over
from an older TIPC 1.5.x branch maintained on sourceforge (that branch is
being phased out).

Patches can be pulled from:

git://tipc.cslab.ericsson.net/pub/git/tipc.git

Summary:

 include/net/tipc/tipc_bearer.h |   12 ++
 net/tipc/bcast.c               |   79 ---
 net/tipc/bcast.h               |    2
 net/tipc/bearer.c              |   70 +++--
 net/tipc/cluster.c             |   22 ++--
 net/tipc/config.c              |   85 +++-
 net/tipc/core.c                |    7 +
 net/tipc/core.h                |   21 +++-
 net/tipc/discover.c            |   13 +-
 net/tipc/eth_media.c           |   29 +++--
 net/tipc/link.c                |  217 +++-
 net/tipc/name_distr.c          |   30 --
 net/tipc/name_table.c          |  203 -
 net/tipc/node.c                |   78 --
 net/tipc/node.h                |    2
 net/tipc/node_subscr.c         |   15 +--
 net/tipc/port.c                |   41
 net/tipc/ref.c                 |   31 +-
 net/tipc/socket.c              |  100 +++---
 net/tipc/subscr.c              |   18 ++-
 net/tipc/zone.c                |   19 ++--
 21 files changed, 661 insertions(+), 433 deletions(-)

Allan Stephens:
  [TIPC] Prevent name table corruption if no room for new publication
  [TIPC] Use correct upper bound when validating network zone number.
  [TIPC] Corrected potential misuse of tipc_media_addr structure.
  [TIPC] Allow ports to receive multicast messages through native API.
  [TIPC] Links now validate destination node specified by incoming messages.
  [TIPC] Multicast link failure now resets all links to nacking node.
  [TIPC] Allow compilation when CONFIG_TIPC_DEBUG is not set.
  [TIPC] Fixed privilege checking typo in dest_name_check().
  [TIPC] Fix misleading comment in buf_discard() routine.
  [TIPC] Added support for MODULE_VERSION capability.
  [TIPC] Validate entire interface name when locating bearer to enable.
  [TIPC] Non-operation-affecting corrections to comments & function definitions.
  [TIPC] Fixed connect() to detect a dest address that is missing or too short.
  [TIPC] Implied connect now saves dest name for retrieval as ancillary data.
  [TIPC] Can now return destination name of form {0,x,y} via ancillary data.
  [TIPC] Connected send now checks socket state when retrying congested send.
  [TIPC] Stream socket send indicates partial success if data partially sent.
  [TIPC] Improved performance of error checking during socket creation.
  [TIPC] recvmsg() now returns TIPC ancillary data using correct level (SOL_TIPC)
  [TIPC] Simplify code for returning partial success of stream send request.
  [TIPC] Optimized argument validation done by connect().
  [TIPC] Withdrawing all names from nameless port now returns success, not error
  [TIPC] Added missing warning for out-of-memory condition
  [TIPC] Fixed memory leak in tipc_link_send() when destination is unreachable
  [TIPC] Disallow config operations that aren't supported in certain modes.
  [TIPC] First phase of assert() cleanup
  [TIPC] Enhanced & cleaned up system messages; fixed 2 obscure memory leaks.
  [TIPC] Fixed link switchover bugs
  [TIPC] Get rid of dynamically allocated arrays in broadcast code.
  [TIPC] Fix incorrect correction to discovery timer frequency computation.

Eric Sesterhenn:
  [TIPC] Fix for NULL pointer dereference

Jon Maloy:
  [TIPC] Improved tolerance to promiscuous mode interface

/Per

Re: Memory corruption in 8390.c ?

2006-06-22 Thread Alan Cox

On Thu, 2006-06-22 at 01:34 -0700, David Miller wrote:
 From: Herbert Xu [EMAIL PROTECTED]
 Date: Thu, 22 Jun 2006 18:30:37 +1000

  On Thu, Jun 22, 2006 at 01:26:09AM -0700, David Miller wrote:

   Want me to let this cook in 2.6.18 for a while before sending
   it off to -stable?

  You know I'm never one to push anything quickly so absolutely yes :)

 Ok, applied to net-2.6.18 for now :)

The 8390 change (corrected version) also makes 8390.c faster so should
be applied anyway, and the orinoco one fixes some code that isn't even
needed and someone forgot to remove long ago. Otherwise the skb_padto
behaviour change with the newer skb style makes a lot more sense I
agree.


Re: Memory corruption in 8390.c ?

2006-06-22 Thread Herbert Xu

Alan Cox [EMAIL PROTECTED] wrote:
>
> The 8390 change (corrected version) also makes 8390.c faster so should
> be applied anyway, and the orinoco one fixes some code that isn't even
> needed and someone forgot to remove long ago. Otherwise the skb_padto

Yeah I agree totally.  However, I haven't actually seen the fixed 8390
version being posted yet or at least not to netdev :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: Memory corruption in 8390.c ?

2006-06-22 Thread Arjan van de Ven

On Thu, 2006-06-22 at 12:34 +0100, Alan Cox wrote:
> On Thu, 2006-06-22 at 01:34 -0700, David Miller wrote:
> > From: Herbert Xu [EMAIL PROTECTED]
> > Date: Thu, 22 Jun 2006 18:30:37 +1000
> >
> > > On Thu, Jun 22, 2006 at 01:26:09AM -0700, David Miller wrote:
> > >
> > > > Want me to let this cook in 2.6.18 for a while before sending
> > > > it off to -stable?
> > >
> > > You know I'm never one to push anything quickly so absolutely yes :)
> >
> > Ok, applied to net-2.6.18 for now :)
>
> The 8390 change (corrected version) also makes 8390.c faster so should
> be applied anyway,

8390 is such a race monster that a few cycles matter a lot! :-)


Re: Memory corruption in 8390.c ?

2006-06-22 Thread Erik Mouw

On Thu, Jun 22, 2006 at 01:33:36PM +0200, Arjan van de Ven wrote:
> On Thu, 2006-06-22 at 12:34 +0100, Alan Cox wrote:
> > The 8390 change (corrected version) also makes 8390.c faster so should
> > be applied anyway,
>
> 8390 is such a race monster that a few cycles matter a lot! :-)

It sure is. Back in the old days I could saturate a 10 Mbit ethernet
segment using a Western Digital 8003 (the 8 bit ISA card) in a 386DX40
(running Linux 1.0, 1.2, and 1.3).


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

[RFC] sysfs + configfs on 802.11 wireless drivers

2006-06-22 Thread Luis R. Rodriguez


Rebel fleet of wireless developers,

Here are some changes which I think our current wireless stacks could
use to help clean up WEs and private ioctls, and to give userspace a
cleaner framework to interact with our wireless drivers.

Kernel level:

(1) Use the new *configfs* for all user-specific attributes
(2) Use *sysfs* read-only kobjects for device-specific attributes like
values which can be saved for suspend() and collected for resume().
IEEE 802.11 capabilities, features (for example radiotap), and what is
currently settable/gettable from private ioctl realm along with its
restrictions can also be exported via sysfs.
(3) On resume() talk to userspace via netlink so that it reads our
sysfs attributes and reconfigures us through configfs

Userspace applications can then:

(1) Interact with configfs for configuring wireless devices, including
what used to be set by private ioctls
(2) Retrieve attributes saved from sysfs and set them onto configfs
after resume(). Sysfs will also tell us this card's capabilities,
features and private data along with their respective restrictions we
can work with so -- userspace can modify the available options which
can be gettable/settable.
(3) Respond to netlink communication from driver after resume() to set
data through configfs

---

I know we recently moved WE to netlink but I figured celebrating the
happy marriage of Mr. sysfs and Mrs. configfs on 2.6.16 by giving them
offspring would be nice and more appropriate. Here's an example
run-through of how this would work:

(1) A wireless device comes up and spits out device-specific default
settings on sysfs

(2) If a user wants to change essid, channel, power-save-mode (this is
not suspend()), rate, and so forth, userspace writes the settings into
configfs; these would in turn get updated on sysfs by the driver.

(3) Should the device go into suspend() the driver can then update its
necessary attributes on sysfs required to recover from suspend() which
may not have been updated yet (whatever they may be)

(4) At resume() we could just have our driver read our sysfs
attributes and try to set all of them back exactly how they were
before but to reduce bloat on our drivers and since our state is
already exported we could just have userspace do it for us so... we
use netlink to communicate to userspace to go ahead and ask it to
resume() us. The advantages of this would be that userspace would
always handle the assoc/disassoc and WPA in a consistent manner and,
as mentioned above, that driver bloat is reduced.

(5) At resume() userspace reads sysfs and sets us back up through configfs
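
The run-through above can be modelled in a few lines of plain C. This is
only a userspace simulation under stated assumptions, not driver code:
the struct and function names (wl_config, wl_device, wl_configfs_store,
wl_suspend, wl_userspace_resume) are hypothetical, the "exported" member
merely stands in for the sysfs snapshot, and the netlink notification is
reduced to a direct function call.

```c
#include <string.h>

/* Hypothetical user-settable attributes (what configfs would hold). */
struct wl_config {
	char essid[33];
	int  channel;
	int  rate_kbps;
};

/* Hypothetical device model: live state plus a read-only exported copy. */
struct wl_device {
	struct wl_config live;     /* what the hardware is programmed with */
	struct wl_config exported; /* stands in for the sysfs snapshot */
};

/* Step (2): userspace writes settings; the driver mirrors them to sysfs. */
static void wl_configfs_store(struct wl_device *dev, const struct wl_config *cfg)
{
	dev->live = *cfg;
	dev->exported = dev->live;
}

/* Step (3): refresh the exported snapshot, then lose the hardware state. */
static void wl_suspend(struct wl_device *dev)
{
	dev->exported = dev->live;
	memset(&dev->live, 0, sizeof(dev->live));
}

/*
 * Steps (4)-(5): the driver only asks userspace to act (netlink in the
 * proposal); userspace reads the snapshot and writes it back.
 */
static void wl_userspace_resume(struct wl_device *dev)
{
	struct wl_config saved = dev->exported;  /* "read sysfs" */

	wl_configfs_store(dev, &saved);          /* "write configfs" */
}
```

The point of the split is that the driver itself never replays its
settings; it only keeps the exported copy current.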

Comments are appreciated, if this is something that seems desirable I
can start cranking up some code.

 Luis

Re: Memory corruption in 8390.c ?

2006-06-22 Thread Alan Cox

On Thu, 2006-06-22 at 13:33 +0200, Arjan van de Ven wrote:
> > The 8390 change (corrected version) also makes 8390.c faster so should
> > be applied anyway,
>
> 8390 is such a race monster that a few cycles matter a lot! :-)

There are generic 8390 clones for 100Mbit. I'm not suggesting it's a good
idea but people did it.

Alan


Re: Memory corruption in 8390.c ?

2006-06-22 Thread Alan Cox

On Thu, 2006-06-22 at 21:29 +1000, Herbert Xu wrote:
> Alan Cox [EMAIL PROTECTED] wrote:
> >
> > The 8390 change (corrected version) also makes 8390.c faster so should
> > be applied anyway, and the orinoco one fixes some code that isn't even
> > needed and someone forgot to remove long ago. Otherwise the skb_padto
>
> Yeah I agree totally.  However, I haven't actually seen the fixed 8390
> version being posted yet or at least not to netdev :)

Ah the resounding clang of a subtle hint ;)

Signed-off-by: Alan Cox [EMAIL PROTECTED]

- Return 8390.c to the old way of handling short packets (which is also
faster)

- Remove the skb_padto from orinoco. This got left in when the padding bad 
write patch was added and is actually not needed. This is fixing a merge
error way back when.

- Wavelan can also use the stack based buffer trick if you want
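
For readers who want the "stack based buffer trick" in isolation, here is
a minimal userspace sketch of the idea. The helper name frame_for_tx and
its signature are made up for illustration; only ETH_ZLEN (60, the
minimum Ethernet frame length without FCS) is taken from the kernel
headers.

```c
#include <stddef.h>
#include <string.h>

#define ETH_ZLEN 60 /* minimum Ethernet frame length, FCS not counted */

/*
 * Hypothetical helper: return a pointer to the bytes that should be
 * handed to the hardware.  Short frames are copied into the
 * caller-supplied scratch buffer, which is zero-filled first; frames
 * that are already long enough are returned untouched.
 * *out_len receives the length to transmit.
 */
static const unsigned char *frame_for_tx(const unsigned char *data, size_t len,
					 unsigned char scratch[ETH_ZLEN],
					 size_t *out_len)
{
	if (len < ETH_ZLEN) {
		memset(scratch, 0, ETH_ZLEN);
		memcpy(scratch, data, len);
		*out_len = ETH_ZLEN;
		return scratch;
	}
	*out_len = len;
	return data;
}
```

Because the scratch buffer lives in the caller's stack frame, nothing is
allocated and the original packet buffer is never modified.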



diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.17/drivers/net/8390.c linux-2.6.17/drivers/net/8390.c
--- linux.vanilla-2.6.17/drivers/net/8390.c 2006-06-19 17:17:32.0 +0100
+++ linux-2.6.17/drivers/net/8390.c 2006-06-21 21:23:12.0 +0100
@@ -275,12 +275,14 @@
struct ei_device *ei_local = (struct ei_device *) netdev_priv(dev);
int send_length = skb->len, output_page;
unsigned long flags;
+   char buf[ETH_ZLEN];
+   char *data = skb->data;
 
if (skb->len < ETH_ZLEN) {
-   skb = skb_padto(skb, ETH_ZLEN);
-   if (skb == NULL)
-   return 0;
+   memset(buf, 0, ETH_ZLEN);   /* more efficient than doing just the needed bits */
+   memcpy(buf, data, skb->len);
send_length = ETH_ZLEN;
+   data = buf;
}
 
/* Mask interrupts from the ethercard. 
@@ -347,7 +349,7 @@
 * trigger the send later, upon receiving a Tx done interrupt.
 */
 
-   ei_block_output(dev, send_length, skb->data, output_page);
+   ei_block_output(dev, send_length, data, output_page);

if (! ei_local->txing) 
{
diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.17/drivers/net/wireless/orinoco.c linux-2.6.17/drivers/net/wireless/orinoco.c
--- linux.vanilla-2.6.17/drivers/net/wireless/orinoco.c 2006-06-19 17:29:48.0 +0100
+++ linux-2.6.17/drivers/net/wireless/orinoco.c 2006-06-21 18:19:02.0 +0100
@@ -491,11 +491,8 @@
}
 
/* Length of the packet body */
-   /* FIXME: what if the skb is smaller than this? */
+   /* A shorter data_len will be padded by hermes_bap_pwrite_pad */
len = max_t(int, ALIGN(skb->len, 2), ETH_ZLEN);
-   skb = skb_padto(skb, len);
-   if (skb == NULL)
-   goto fail;
len -= ETH_HLEN;
 
eh = (struct ethhdr *)skb->data;
diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.17/drivers/net/wireless/wavelan.c linux-2.6.17/drivers/net/wireless/wavelan.c
--- linux.vanilla-2.6.17/drivers/net/wireless/wavelan.c 2006-06-19 17:29:48.0 +0100
+++ linux-2.6.17/drivers/net/wireless/wavelan.c 2006-06-21 18:32:47.0 +0100
@@ -2903,6 +2903,7 @@
 {
net_local *lp = (net_local *) dev->priv;
unsigned long flags;
+   char data[ETH_ZLEN];
 
 #ifdef DEBUG_TX_TRACE
printk(KERN_DEBUG "%s: ->wavelan_packet_xmit(0x%X)\n", dev->name,
@@ -2937,15 +2938,16 @@
 * able to detect collisions, therefore in theory we don't really
 * need to pad. Jean II */
if (skb->len < ETH_ZLEN) {
-   skb = skb_padto(skb, ETH_ZLEN);
-   if (skb == NULL)
-   return 0;
+   memset(data, 0, ETH_ZLEN);
+   memcpy(data, skb->data, skb->len);
+   /* Write packet on the card */
+   if(wv_packet_write(dev, data, ETH_ZLEN))
+   return 1;   /* We failed */
}
-
-   /* Write packet on the card */
-   if(wv_packet_write(dev, skb->data, skb->len))
+   else if(wv_packet_write(dev, skb->data, skb->len))
return 1;   /* We failed */
 
+
dev_kfree_skb(skb);
 
 #ifdef DEBUG_TX_TRACE


Re: [RFC 3/7] NetLabel: CIPSOv4 engine

2006-06-22 Thread Paul Moore

On Thursday 22 June 2006 5:12 am, David Miller wrote:
> From: [EMAIL PROTECTED]
> Date: Wed, 21 Jun 2006 15:42:38 -0400
>
> > Add support for the Commercial IP Security Option (CIPSO) to the
> > IPv4 network stack.  CIPSO has become a de-facto standard for
> > trusted/labeled networking amongst existing Trusted Operating
> > Systems such as Trusted Solaris, HP-UX CMW, etc.  This
> > implementation is designed to be used with the NetLabel subsystem to
> > provide explicit packet labeling to LSM developers.
>
> The thing that concerns me most about CIPSO is that even once users
> migrate to a more SELINUX native approach from this CIPSO stuff, the
> CIPSO code, its bloat, and its maintenance burden will remain.
>
> It's easy to put stuff in, it's impossible to take stuff out even
> once it's largely unused by even its original target audience.
>
> And that's what I see happening here.
>
> This is why, to be perfectly honest with you, I'd much rather
> something like this stay out-of-tree and people are strongly
> encouraged to use the more native stuff under Linux.

Well, not exactly the response I was hoping for, but let me plead my case one 
more time :)

Traditional MLS CIPSO is a niche protocol; I won't try to argue that point.
I also won't dispute that the NetLabel patch is late to the party: the
IPsec/XFRM labeling approach has already been accepted as the SELinux
packet labeling mechanism.  However, the XFRM labeling mechanism is not
currently supported by any OS other than Linux/SELinux.  I have spoken with
users that need CIPSO to interoperate with their other trusted systems; the
XFRM approach is simply not a viable solution for them.  I strongly believe
that failure to support an interoperable packet labeling mechanism on Linux
will seriously restrict Linux's deployment in trusted networks.

It's all about compatibility and enabling Linux to be used in places it can't 
be used now.  True, other OS vendors might support the SELinux/IPsec packet 
labeling approach, but I see very little in the way of motivation for them to 
do the work.

If it makes you feel any better I do intend to support the Selopt approach (or 
at least something very similar) for CIPSO as envisioned by James Morris for 
the SELinux networking hooks of long ago.  This will allow CIPSO to carry the 
full SELinux context making it a more SELINUX native approach than 
traditional MLS CIPSO.  I just wanted to keep this initial patch set as small 
as possible (you can see how well that worked out) ... :)

-- 
paul moore
linux security @ hp

[PATCH 32/32] [TIPC] Fix incorrect correction to discovery timer frequency computation.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/discover.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/tipc/discover.c b/net/tipc/discover.c
index ee9b448..2b84412 100644
--- a/net/tipc/discover.c
+++ b/net/tipc/discover.c
@@ -2,7 +2,7 @@
  * net/tipc/discover.c
  * 
  * Copyright (c) 2003-2006, Ericsson AB
- * Copyright (c) 2005, Wind River Systems
+ * Copyright (c) 2005-2006, Wind River Systems
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -267,8 +267,8 @@ static void disc_timeout(struct link_req
/* leave timer interval as is if already at a normal rate */
} else {
req->timer_intv *= 2;
-   if (req->timer_intv > TIPC_LINK_REQ_SLOW)
-   req->timer_intv = TIPC_LINK_REQ_SLOW;
+   if (req->timer_intv > TIPC_LINK_REQ_FAST)
+   req->timer_intv = TIPC_LINK_REQ_FAST;
if ((req->timer_intv == TIPC_LINK_REQ_FAST) && 
(req->bearer->nodes.count))
req->timer_intv = TIPC_LINK_REQ_SLOW;
-- 
1.4.0
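
To see what the corrected computation does, the interval update can be
lifted into a standalone function. This is a sketch under assumptions:
the two constants below are placeholder millisecond values rather than
numbers quoted from this patch, the name next_disc_interval is invented,
and the "already at a normal rate" test is inferred from the comment in
the hunk context.

```c
/* Assumed values in milliseconds; check the TIPC source for the real ones. */
#define LINK_REQ_FAST   2000u
#define LINK_REQ_SLOW 600000u

/*
 * Hypothetical standalone version of the interval update:
 * intv is the current timer interval, have_nodes is non-zero once the
 * bearer has discovered at least one node.
 */
static unsigned int next_disc_interval(unsigned int intv, int have_nodes)
{
	/* Already at a steady rate: leave the interval alone (assumed test). */
	if (intv == LINK_REQ_FAST || intv == LINK_REQ_SLOW)
		return intv;

	/* Back off exponentially, but never beyond the fast rate. */
	intv *= 2;
	if (intv > LINK_REQ_FAST)
		intv = LINK_REQ_FAST;

	/* Once the fast rate is reached and peers exist, drop to the slow rate. */
	if (intv == LINK_REQ_FAST && have_nodes)
		intv = LINK_REQ_SLOW;
	return intv;
}
```

With the cap at the fast rate, a doubled interval that overshoots is
pulled back to exactly the fast value, so the following equality test
can match.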


[PATCH 24/32] [TIPC] Withdrawing all names from nameless port now returns success, not error

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/port.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/net/tipc/port.c b/net/tipc/port.c
index 360920b..899e08e 100644
--- a/net/tipc/port.c
+++ b/net/tipc/port.c
@@ -1171,8 +1171,6 @@ int tipc_withdraw(u32 ref, unsigned int 
p_ptr = tipc_port_lock(ref);
if (!p_ptr)
return -EINVAL;
-   if (!p_ptr->publ.published)
-   goto exit;
if (!seq) {
list_for_each_entry_safe(publ, tpubl, 
 &p_ptr->publications, pport_list) {
@@ -1199,7 +1197,6 @@ int tipc_withdraw(u32 ref, unsigned int 
}
if (list_empty(&p_ptr->publications))
p_ptr->publ.published = 0;
-exit:
tipc_port_unlock(p_ptr);
return res;
 }
-- 
1.4.0


[PATCH 6/32] [TIPC] Links now validate destination node specified by incoming messages.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

This fix prevents link flopping and name table inconsistency problems arising
when a node is assigned a different Z.C.N value than it used previously.
(Changing the Z.C.N value causes other nodes to have two link endpoints
sending to the same MAC address using two different destination Z.C.N values,
requiring the receiving node to filter out the unwanted messages.)
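
The filtering rule described above boils down to one predicate. The
sketch below uses a made-up, simplified header struct (rx_msg) instead of
the real TIPC message accessors, and assumes that short-header messages
carry no destination node and therefore cannot be checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a received message header. */
struct rx_msg {
	bool     is_short;   /* assumed: short headers have no destination node */
	uint32_t dest_node;  /* Z.C.N address the sender was aiming at */
};

/* Accept a message only if it is short or addressed to our own node. */
static bool rx_msg_is_for_us(const struct rx_msg *msg, uint32_t own_addr)
{
	if (!msg->is_short && msg->dest_node != own_addr)
		return false;
	return true;
}
```

A message sent to the node's previous Z.C.N value arrives on the same MAC
address but fails this test, so it is dropped before it can disturb link
state.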

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/link.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 784b24b..955b87d 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1720,6 +1720,11 @@ #endif
link_recv_non_seq(buf);
continue;
}
+   
+   if (unlikely(!msg_short(msg) &&
+(msg_destnode(msg) != tipc_own_addr)))
+   goto cont;
+   
n_ptr = tipc_node_find(msg_prevnode(msg));
if (unlikely(!n_ptr))
goto cont;
-- 
1.4.0


[PATCH 7/32] [TIPC] Multicast link failure now resets all links to nacking node.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

This fix prevents node from crashing.

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/bcast.c |   32 +++---
 net/tipc/link.c  |  124 +-
 2 files changed, 128 insertions(+), 28 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 2c4ecbe..00691b7 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -49,13 +49,19 @@ #include "bearer.h"
 #include "name_table.h"
 #include "bcast.h"
 
-
 #define MAX_PKT_DEFAULT_MCAST 1500 /* bcast link max packet size (fixed) */
 
 #define BCLINK_WIN_DEFAULT 20  /* bcast link window size (default) */
 
 #define BCLINK_LOG_BUF_SIZE 0
 
+/*
+ * Loss rate for incoming broadcast frames; used to test retransmission code.
+ * Set to N to cause every N'th frame to be discarded; 0 = don't discard any.
+ */
+ 
+#define TIPC_BCAST_LOSS_RATE 0
+
 /**
  * struct bcbearer_pair - a pair of bearers used by broadcast link
  * @primary: pointer to primary bearer
@@ -165,21 +171,18 @@ static int bclink_ack_allowed(u32 n)
  * @after: sequence number of last packet to *not* retransmit
  * @to: sequence number of last packet to retransmit
  * 
- * Called with 'node' locked, bc_lock unlocked
+ * Called with bc_lock locked
  */
 
 static void bclink_retransmit_pkt(u32 after, u32 to)
 {
struct sk_buff *buf;
 
-   spin_lock_bh(&bc_lock);
buf = bcl->first_out;
while (buf && less_eq(buf_seqno(buf), after)) {
buf = buf->next;
}
-   if (buf != NULL)
-   tipc_link_retransmit(bcl, buf, mod(to - after));
-   spin_unlock_bh(&bc_lock);  
+   tipc_link_retransmit(bcl, buf, mod(to - after));
 }
 
 /** 
@@ -399,7 +402,10 @@ int tipc_bclink_send_msg(struct sk_buff 
  */
 
 void tipc_bclink_recv_pkt(struct sk_buff *buf)
-{
+{
+#if (TIPC_BCAST_LOSS_RATE)
+   static int rx_count = 0;
+#endif
struct tipc_msg *msg = buf_msg(buf);
struct node* node = tipc_node_find(msg_prevnode(msg));
u32 next_in;
@@ -420,9 +426,13 @@ void tipc_bclink_recv_pkt(struct sk_buff
tipc_node_lock(node);
tipc_bclink_acknowledge(node, msg_bcast_ack(msg));
tipc_node_unlock(node);
+   spin_lock_bh(&bc_lock);
bcl->stats.recv_nacks++;
+   bcl->owner->next = node;   /* remember requestor */
bclink_retransmit_pkt(msg_bcgap_after(msg),
  msg_bcgap_to(msg));
+   bcl->owner->next = NULL;
+   spin_unlock_bh(&bc_lock);  
} else {
tipc_bclink_peek_nack(msg_destnode(msg),
  msg_bcast_tag(msg),
@@ -433,6 +443,14 @@ void tipc_bclink_recv_pkt(struct sk_buff
return;
}
 
+#if (TIPC_BCAST_LOSS_RATE)
+   if (++rx_count == TIPC_BCAST_LOSS_RATE) {
+   rx_count = 0;
+   buf_discard(buf);
+   return;
+   }
+#endif
+
tipc_node_lock(node);
 receive:
deferred = node->bclink.deferred_head;
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 955b87d..ba7d3f1 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1604,40 +1604,121 @@ void tipc_link_push_queue(struct link *l
tipc_bearer_schedule(l_ptr->b_ptr, l_ptr);
 }
 
+static void link_reset_all(unsigned long addr)
+{
+   struct node *n_ptr;
+   char addr_string[16];
+   u32 i;
+
+   read_lock_bh(&tipc_net_lock);
+   n_ptr = tipc_node_find((u32)addr);
+   if (!n_ptr) {
+   read_unlock_bh(&tipc_net_lock);
+   return; /* node no longer exists */
+   }
+
+   tipc_node_lock(n_ptr);
+
+   warn("Resetting all links to %s\n", 
+addr_string_fill(addr_string, n_ptr->addr));
+
+   for (i = 0; i < MAX_BEARERS; i++) {
+   if (n_ptr->links[i]) {
+   link_print(n_ptr->links[i], TIPC_OUTPUT, 
+  "Resetting link\n");
+   tipc_link_reset(n_ptr->links[i]);
+   }
+   }
+
+   tipc_node_unlock(n_ptr);
+   read_unlock_bh(&tipc_net_lock);
+}
+
+static void link_retransmit_failure(struct link *l_ptr, struct sk_buff *buf)
+{
+   struct tipc_msg *msg = buf_msg(buf);
+
+   warn("Retransmission failure on link %s\n", l_ptr->name);
+   tipc_msg_print(TIPC_OUTPUT, msg, "RETR-FAIL");
+
+   if (l_ptr->addr) {
+
+   /* Handle failure on standard link */
+
+   link_print(l_ptr, TIPC_OUTPUT, "Resetting link\n");
+   tipc_link_reset(l_ptr);
+
+   } else {
+
+   /* Handle failure on broadcast link */
+
+   struct node *n_ptr;
+   char addr_string[16];
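
The TIPC_BCAST_LOSS_RATE hook added to bcast.c in this patch is a simple
fault-injection counter. A standalone sketch of the same pattern is shown
here; the macro and function names (BCAST_LOSS_RATE, should_drop_frame)
are illustrative stand-ins, and the rate of 3 is chosen only to make the
behaviour visible.

```c
/* Drop every 3rd frame; set to 0 to compile the hook out entirely. */
#define BCAST_LOSS_RATE 3

/* Returns 1 if the caller should discard the frame it just received. */
static int should_drop_frame(void)
{
#if (BCAST_LOSS_RATE)
	static int rx_count = 0;  /* frames seen since the last drop */

	if (++rx_count == BCAST_LOSS_RATE) {
		rx_count = 0;
		return 1;
	}
#endif
	return 0;
}
```

With the rate set to 0 the preprocessor removes both the counter and the
test, so the normal receive path pays nothing for the hook.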
+
+

[PATCH 20/32] [TIPC] Improved performance of error checking during socket creation.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |9 +++--
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 8cefacb..a1f2210 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -169,12 +169,6 @@ static int tipc_create(struct socket *so
struct sock *sk;
 u32 ref;
 
-   if ((sock->type != SOCK_STREAM) && 
-   (sock->type != SOCK_SEQPACKET) &&
-   (sock->type != SOCK_DGRAM) &&
-   (sock->type != SOCK_RDM))
-   return -EPROTOTYPE;
-
if (unlikely(protocol != 0))
return -EPROTONOSUPPORT;
 
@@ -199,6 +193,9 @@ static int tipc_create(struct socket *so
sock->ops = &msg_ops;
sock->state = SS_READY;
break;
+   default:
+   tipc_deleteport(ref);
+   return -EPROTOTYPE;
}
 
sk = sk_alloc(AF_TIPC, GFP_KERNEL, &tipc_proto, 1);
-- 
1.4.0
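
The shape of this change, dropping the up-front chain of comparisons and
letting the existing switch reject unknown types, can be shown with a
small standalone sketch. Everything here is hypothetical (the sock_kind
enum, struct port and create_socket); the only thing mirrored from the
patch is that the default branch must undo the allocation made before
the switch, as tipc_deleteport() does above.

```c
#include <errno.h>

/* Hypothetical socket kinds standing in for the four supported types. */
enum sock_kind { KIND_STREAM, KIND_SEQPACKET, KIND_DGRAM, KIND_RDM };

/* Hypothetical resource that is allocated before the type is examined. */
struct port { int allocated; };

static int create_socket(int type, int protocol, struct port *p)
{
	if (protocol != 0)
		return -EPROTONOSUPPORT;

	p->allocated = 1;          /* stands in for creating the port */

	switch (type) {
	case KIND_STREAM:
	case KIND_SEQPACKET:
		/* connection-oriented setup would go here */
		break;
	case KIND_DGRAM:
	case KIND_RDM:
		/* connectionless setup would go here */
		break;
	default:
		p->allocated = 0;  /* release what was allocated above */
		return -EPROTOTYPE;
	}
	return 0;
}
```

Valid types now cost one switch dispatch instead of up to four
comparisons followed by the same switch; the price is the extra cleanup
on the (rare) error path.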


[PATCH 30/32] [TIPC] Fixed link switchover bugs

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Incorporates several related fixes:
- switchover now occurs when switching from an active link to a standby link
- failure of a standby link no longer initiates switchover
- links now display correct # of received packets following reactivation

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/link.c |   30 --
 net/tipc/node.c |7 +--
 net/tipc/node.h |2 ++
 3 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index d7668b8..d646580 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -690,6 +690,7 @@ void tipc_link_reset(struct link *l_ptr)
struct sk_buff *buf;
u32 prev_state = l_ptr->state;
u32 checkpoint = l_ptr->next_in_no;
+   int was_active_link = tipc_link_is_active(l_ptr);

msg_set_session(l_ptr->pmsg, msg_session(l_ptr->pmsg) + 1);
 
@@ -711,7 +712,7 @@ #if 0
tipc_printf(TIPC_CONS, "\nReset link %s\n", l_ptr->name);
dbg_link_dump();
 #endif
-   if (tipc_node_has_active_links(l_ptr->owner) &&
+   if (was_active_link && tipc_node_has_active_links(l_ptr->owner) &&
l_ptr->owner->permit_changeover) {
l_ptr->reset_checkpoint = checkpoint;
l_ptr->exp_msg_count = START_CHANGEOVER;
@@ -754,7 +755,7 @@ #endif
 
 static void link_activate(struct link *l_ptr)
 {
-   l_ptr->next_in_no = 1;
+   l_ptr->next_in_no = l_ptr->stats.recv_info = 1;
tipc_node_link_up(l_ptr->owner, l_ptr);
tipc_bearer_add_dest(l_ptr->b_ptr, l_ptr->addr);
link_send_event(tipc_cfg_link_event, l_ptr, 1);
@@ -2303,12 +2304,18 @@ void tipc_link_tunnel(struct link *l_ptr
u32 length = msg_size(msg);
 
tunnel = l_ptr->owner->active_links[selector & 1];
-   if (!tipc_link_is_up(tunnel))
+   if (!tipc_link_is_up(tunnel)) {
+   warn("Link changeover error, "
+"tunnel link no longer available\n");
return;
+   }
msg_set_size(tunnel_hdr, length + INT_H_SIZE);
buf = buf_acquire(length + INT_H_SIZE);
-   if (!buf)
+   if (!buf) {
+   warn("Link changeover error, "
+"unable to send tunnel msg\n");
return;
+   }
memcpy(buf->data, (unchar *)tunnel_hdr, INT_H_SIZE);
memcpy(buf->data + INT_H_SIZE, (unchar *)msg, length);
dbg("%c->%c:", l_ptr->b_ptr->net_plane, tunnel->b_ptr->net_plane);
@@ -2328,19 +2335,23 @@ void tipc_link_changeover(struct link *l
u32 msgcount = l_ptr->out_queue_size;
struct sk_buff *crs = l_ptr->first_out;
struct link *tunnel = l_ptr->owner->active_links[0];
-   int split_bundles = tipc_node_has_redundant_links(l_ptr->owner);
struct tipc_msg tunnel_hdr;
+   int split_bundles;
 
if (!tunnel)
return;
 
-   if (!l_ptr->owner->permit_changeover)
+   if (!l_ptr->owner->permit_changeover) {
+   warn("Link changeover error, "
+"peer did not permit changeover\n");
return;
+   }
 
msg_init(&tunnel_hdr, CHANGEOVER_PROTOCOL,
 ORIGINAL_MSG, TIPC_OK, INT_H_SIZE, l_ptr->addr);
msg_set_bearer_id(&tunnel_hdr, l_ptr->peer_bearer_id);
msg_set_msgcnt(&tunnel_hdr, msgcount);
+   dbg("Link changeover requires %u tunnel messages\n", msgcount);
 
if (!l_ptr->first_out) {
struct sk_buff *buf;
@@ -2360,6 +2371,9 @@ void tipc_link_changeover(struct link *l
return;
}
 
+   split_bundles = (l_ptr->owner->active_links[0] != 
+l_ptr->owner->active_links[1]);
+
while (crs) {
struct tipc_msg *msg = buf_msg(crs);
 
@@ -2497,11 +2511,13 @@ static int link_recv_changeover_msg(stru
 dest_link->name);
tipc_link_reset(dest_link);
dest_link->exp_msg_count = msg_count;
+   dbg("Expecting %u tunnelled messages\n", msg_count);
if (!msg_count)
goto exit;
} else if (dest_link->exp_msg_count == START_CHANGEOVER) {
msg_dbg(tunnel_msg, "BLK/FIRST/REC");
dest_link->exp_msg_count = msg_count;
+   dbg("Expecting %u tunnelled messages\n", msg_count);
if (!msg_count)
goto exit;
}
@@ -2509,6 +2525,8 @@ static int link_recv_changeover_msg(stru
/* Receive original message */
 
if (dest_link->exp_msg_count == 0) {
+   warn("Link switchover error, "
+"got too many tunnelled messages\n");
msg_dbg(tunnel_msg, "OVERDUE/DROP/REC");
dbg_print_link(dest_link, "LINK:");
goto exit;
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 5f09754..ce9678e 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -125,6 +125,8 @@ void

[PATCH 8/32] [TIPC] Allow compilation when CONFIG_TIPC_DEBUG is not set.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/core.h |   19 ++-
 1 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/net/tipc/core.h b/net/tipc/core.h
index 1f2e8b2..d1edb7a 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -2,7 +2,7 @@
  * net/tipc/core.h: Include file for TIPC global declarations
  * 
  * Copyright (c) 2005-2006, Ericsson AB
- * Copyright (c) 2005, Wind River Systems
+ * Copyright (c) 2005-2006, Wind River Systems
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -111,10 +111,6 @@ #endif
 
 #else
 
-#ifndef DBG_OUTPUT
-#define DBG_OUTPUT NULL
-#endif
-
 /*
  * TIPC debug support not included:
  * - system messages are printed to system console
@@ -129,6 +125,19 @@ #define dbg(fmt, arg...) do {} while (0)
 #define msg_dbg(msg,txt) do {} while (0)
 #define dump(fmt,arg...) do {} while (0)
 
+
+/* 
+ * TIPC_OUTPUT is defined to be the system console, while DBG_OUTPUT is
+ * the null print buffer.  Thes ensures that any system or debug messages 
+ * that are generated without using the above macros are handled correctly.
+ */
+
+#undef  TIPC_OUTPUT
+#define TIPC_OUTPUT TIPC_CONS
+
+#undef  DBG_OUTPUT
+#define DBG_OUTPUT NULL
+
 #endif   
 
 
-- 
1.4.0


[PATCH 22/32] [TIPC] Simplify code for returning partial success of stream send request.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index abecf2d..6d4d2b0 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -969,7 +969,7 @@ static int recv_stream(struct kiocb *ioc
 restart:
if (unlikely((skb_queue_len(&sock->sk->sk_receive_queue) == 0) &&
 (flags & MSG_DONTWAIT))) {
-   res = (sz_copied == 0) ? -EWOULDBLOCK : 0;
+   res = -EWOULDBLOCK;
goto exit;
}
 
@@ -1060,7 +1060,7 @@ restart:
 
 exit:
up(&tsock->sem);
-   return res ? res : sz_copied;
+   return sz_copied ? sz_copied : res;
 }
 
 /**
-- 
1.4.0
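
The effect of moving the decision to the exit path can be checked with a
tiny standalone function. The name stream_result is made up; the
expression is the one the patch installs at the exit label.

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring the new exit path: any bytes already
 * copied take precedence over a later error or would-block condition.
 */
static long stream_result(long sz_copied, long res)
{
	return sz_copied ? sz_copied : res;
}
```

Because partial progress now wins at the single exit point, the earlier
non-blocking check no longer needs its own special case and can always
set -EWOULDBLOCK.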


[PATCH 3/32] [TIPC] Use correct upper bound when validating network zone number.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/core.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 3d0a8ee..31c7dd5 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -198,7 +198,7 @@ static int __init tipc_init(void)
tipc_max_publications = 1;
tipc_max_subscriptions = 2000;
tipc_max_ports = delimit(CONFIG_TIPC_PORTS, 127, 65536);
-   tipc_max_zones = delimit(CONFIG_TIPC_ZONES, 1, 511);
+   tipc_max_zones = delimit(CONFIG_TIPC_ZONES, 1, 255);
tipc_max_clusters = delimit(CONFIG_TIPC_CLUSTERS, 1, 1);
tipc_max_nodes = delimit(CONFIG_TIPC_NODES, 8, 2047);
tipc_max_slaves = delimit(CONFIG_TIPC_SLAVE_NODES, 0, 2047);
-- 
1.4.0
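
As a rough illustration of why 255 is the natural ceiling, the sketch
below clamps a configured value the way delimit() is used above and
derives the limit from an assumed 8-bit zone field in the Z.C.N address.
Both the field width (ZONE_BITS) and the helper body (delimit_sketch) are
assumptions made for this example, not text taken from the TIPC sources.

```c
#define ZONE_BITS 8                        /* assumed width of the zone field */
#define ZONE_MAX  ((1 << ZONE_BITS) - 1)   /* largest value the field can hold */

/* Hypothetical clamp with the same argument order as delimit(). */
static int delimit_sketch(int val, int min, int max)
{
	if (val > max)
		return max;
	if (val < min)
		return min;
	return val;
}
```

Under that assumption a configured value of 511 could never be encoded
in an address, so clamping to ZONE_MAX keeps the setting within what the
address format can represent.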


[PATCH 16/32] [TIPC] Implied connect now saves dest name for retrieval as ancillary data.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 758b2d2..98550b9 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -470,6 +470,10 @@ static int send_msg(struct kiocb *iocb, 
if ((tsock->p->published) ||
((sock->type == SOCK_STREAM) && (total_len != 0)))
return -EOPNOTSUPP;
+   if (dest->addrtype == TIPC_ADDR_NAME) {
+   tsock->p->conn_type = dest->addr.name.name.type;
+   tsock->p->conn_instance = dest->addr.name.name.instance;
+   }
}
 
if (down_interruptible(&tsock->sem))
@@ -1269,10 +1273,6 @@ static int connect(struct socket *sock, 
   msg = buf_msg(buf);
res = auto_connect(sock, tsock, msg);
if (!res) {
-  if (dst->addrtype == TIPC_ADDR_NAME) {
-  tsock->p->conn_type = dst->addr.name.name.type;
-  tsock->p->conn_instance = dst->addr.name.name.instance;
-  }
   if (!msg_data_sz(msg))
   advance_queue(tsock);
   }
-- 
1.4.0


[PATCH 14/32] [TIPC] Non-operation-affecting corrections to comments & function definitions.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |   12 +++-
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index eaf4d69..0923213 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -437,7 +437,7 @@ static int dest_name_check(struct sockad
  * @iocb: (unused)
  * @sock: socket structure
  * @m: message to send
- * @total_len: (unused)
+ * @total_len: length of message
  * 
  * Message must have an destination specified explicitly.
  * Used for SOCK_RDM and SOCK_DGRAM messages, 
@@ -538,7 +538,7 @@ exit:
  * @iocb: (unused)
  * @sock: socket structure
  * @m: message to send
- * @total_len: (unused)
+ * @total_len: length of message
  * 
  * Used for SOCK_SEQPACKET messages and SOCK_STREAM data.
  * 
@@ -1386,7 +1386,7 @@ exit:
 /**
  * shutdown - shutdown socket connection
  * @sock: socket structure
- * @how: direction to close (always treated as read + write)
+ * @how: direction to close (unused; always treated as read + write)
  *
  * Terminates connection (if necessary), then purges socket's receive queue.
  * 
@@ -1469,7 +1469,8 @@ restart:
  * Returns 0 on success, errno otherwise
  */
 
-static int setsockopt(struct socket *sock, int lvl, int opt, char *ov, int ol)
+static int setsockopt(struct socket *sock, 
+ int lvl, int opt, char __user *ov, int ol)
 {
struct tipc_sock *tsock = tipc_sk(sock->sk);
u32 value;
@@ -1525,7 +1526,8 @@ static int setsockopt(struct socket *soc
  * Returns 0 on success, errno otherwise
  */
 
-static int getsockopt(struct socket *sock, int lvl, int opt, char *ov, int *ol)
+static int getsockopt(struct socket *sock, 
+ int lvl, int opt, char __user *ov, int *ol)
 {
struct tipc_sock *tsock = tipc_sk(sock->sk);
 int len;
-- 
1.4.0


[PATCH 23/32] [TIPC] Optimized argument validation done by connect().

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |   17 +
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 6d4d2b0..32d7784 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -455,7 +455,8 @@ static int send_msg(struct kiocb *iocb, 
 
if (unlikely(!dest))
return -EDESTADDRREQ;
-   if (unlikely(dest->family != AF_TIPC))
+   if (unlikely((m->msg_namelen < sizeof(*dest)) ||
+(dest->family != AF_TIPC)))
return -EINVAL;
 
needs_conn = (sock->state != SS_READY);
@@ -1245,7 +1246,8 @@ static int connect(struct socket *sock, 
if (sock->state == SS_READY)
   return -EOPNOTSUPP;
 
-   /* MOVE THE REST OF THIS ERROR CHECKING TO send_msg()? */
+   /* Issue Posix-compliant error code if socket is in the wrong state */
+
if (sock->state == SS_LISTENING)
   return -EOPNOTSUPP;
if (sock->state == SS_CONNECTING)
@@ -1253,13 +1255,20 @@ static int connect(struct socket *sock, 
if (sock->state != SS_UNCONNECTED)
return -EISCONN;
 
-   if ((destlen < sizeof(*dst)) || (dst->family != AF_TIPC) ||
-   ((dst->addrtype != TIPC_ADDR_NAME) && (dst->addrtype != TIPC_ADDR_ID)))
+   /*
+* Reject connection attempt using multicast address
+*
+* Note: send_msg() validates the rest of the address fields,
+*   so there's no need to do it here
+*/
+
+   if (dst->addrtype == TIPC_ADDR_MCAST)
return -EINVAL;
 
/* Send a 'SYN-' to destination */
 
m.msg_name = dest;
+   m.msg_namelen = destlen;
if ((res = send_msg(NULL, sock, &m, 0)) < 0) {
   sock->state = SS_DISCONNECTING;
   return res;
-- 
1.4.0


[PATCH 18/32] [TIPC] Connected send now checks socket state when retrying congested send.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 361dc34..9c834fc 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -565,15 +565,15 @@ static int send_packet(struct kiocb *ioc
return -ERESTARTSYS;
 }
 
-if (unlikely(sock->state != SS_CONNECTED)) {
-if (sock->state == SS_DISCONNECTING)
-res = -EPIPE;   
-else
-res = -ENOTCONN;
-goto exit;
-}
-
 do {
+   if (unlikely(sock->state != SS_CONNECTED)) {
+   if (sock->state == SS_DISCONNECTING)
+   res = -EPIPE;   
+   else
+   res = -ENOTCONN;
+   goto exit;
+   }
+
 res = tipc_send(tsock->p->ref, m->msg_iovlen, m->msg_iov);
 if (likely(res != -ELINKCONG)) {
 exit:
-- 
1.4.0


[PATCH 4/32] [TIPC] Corrected potential misuse of tipc_media_addr structure.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 include/net/tipc/tipc_bearer.h |   12 ++--
 net/tipc/eth_media.c   |4 +++-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/net/tipc/tipc_bearer.h b/include/net/tipc/tipc_bearer.h
index 098607c..e07136d 100644
--- a/include/net/tipc/tipc_bearer.h
+++ b/include/net/tipc/tipc_bearer.h
@@ -49,10 +49,18 @@ #include <linux/spinlock.h>
 
 #define TIPC_MEDIA_TYPE_ETH 1
 
+/* 
+ * Destination address structure used by TIPC bearers when sending messages
+ * 
+ * IMPORTANT: The fields of this structure MUST be stored using the specified
+ * byte order indicated below, as the structure is exchanged between nodes
+ * as part of a link setup process.
+ */
+
 struct tipc_media_addr {
-   __u32  type;
+   __u32  type;        /* bearer type (network byte order) */
union {
-   __u8   eth_addr[6]; /* Ethernet bearer */ 
+   __u8   eth_addr[6]; /* 48 bit Ethernet addr (byte array) */ 
 #if 0
/* Prototypes for other possible bearer types */
 
diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
index b646619..3ecb100 100644
--- a/net/tipc/eth_media.c
+++ b/net/tipc/eth_media.c
@@ -254,7 +254,9 @@ int tipc_eth_media_start(void)
if (eth_started)
return -EINVAL;
 
-   memset(&bcast_addr, 0xff, sizeof(bcast_addr));
+   bcast_addr.type = htonl(TIPC_MEDIA_TYPE_ETH);
+   memset(&bcast_addr.dev_addr, 0xff, ETH_ALEN);
+
memset(eth_bearers, 0, sizeof(eth_bearers));
 
res = tipc_register_media(TIPC_MEDIA_TYPE_ETH, "eth",
-- 
1.4.0


[PATCH 17/32] [TIPC] Can now return destination name of form {0,x,y} via ancillary data.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 98550b9..361dc34 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -731,6 +731,7 @@ static int anc_data_recv(struct msghdr *
u32 anc_data[3];
u32 err;
u32 dest_type;
+   int has_name;
int res;
 
if (likely(m->msg_controllen == 0))
@@ -755,24 +756,27 @@ static int anc_data_recv(struct msghdr *
dest_type = msg ? msg_type(msg) : TIPC_DIRECT_MSG;
switch (dest_type) {
case TIPC_NAMED_MSG:
+   has_name = 1;
anc_data[0] = msg_nametype(msg);
anc_data[1] = msg_namelower(msg);
anc_data[2] = msg_namelower(msg);
break;
case TIPC_MCAST_MSG:
+   has_name = 1;
anc_data[0] = msg_nametype(msg);
anc_data[1] = msg_namelower(msg);
anc_data[2] = msg_nameupper(msg);
break;
case TIPC_CONN_MSG:
+   has_name = (tport->conn_type != 0);
anc_data[0] = tport->conn_type;
anc_data[1] = tport->conn_instance;
anc_data[2] = tport->conn_instance;
break;
default:
-   anc_data[0] = 0;
+   has_name = 0;
}
-   if (anc_data[0] &&
+   if (has_name &&
(res = put_cmsg(m, SOL_SOCKET, TIPC_DESTNAME, 12, anc_data)))
return res;
 
-- 
1.4.0


[PATCH 29/32] [TIPC] Enhanced and cleaned up system messages; fixed 2 obscure memory leaks.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/bcast.c  |2 +
 net/tipc/bcast.h  |2 +
 net/tipc/bearer.c |   70 +++--
 net/tipc/cluster.c|   22 +--
 net/tipc/config.c |2 +
 net/tipc/discover.c   |7 +
 net/tipc/link.c   |   39 +++
 net/tipc/name_distr.c |   10 ---
 net/tipc/name_table.c |6 ++--
 net/tipc/node.c   |   68 +---
 net/tipc/port.c   |   10 ---
 net/tipc/subscr.c |   18 ++---
 net/tipc/zone.c   |   19 -
 13 files changed, 149 insertions(+), 126 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 44645f5..1633ef2 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -785,7 +785,7 @@ int tipc_bclink_init(void)
bclink = kmalloc(sizeof(*bclink), GFP_ATOMIC);
if (!bcbearer || !bclink) {
  nomem:
-   warn("Memory squeeze; Failed to create multicast link\n");
+   warn("Multicast link creation failed, no memory\n");
kfree(bcbearer);
bcbearer = NULL;
kfree(bclink);
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 0e3be2a..b243d9d 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -180,7 +180,7 @@ static inline void tipc_port_list_add(st
if (!item->next) {
item->next = kmalloc(sizeof(*item), GFP_ATOMIC);
if (!item->next) {
-   warn("Memory squeeze: multicast destination port list is incomplete\n");
+   warn("Incomplete multicast delivery, no memory\n");
return;
}
item->next->next = NULL;
diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index e213a8e..4fa24b5 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -112,39 +112,42 @@ int  tipc_register_media(u32 media_type,
goto exit;
 
if (!media_name_valid(name)) {
-   warn("Media registration error: illegal name %s\n", name);
+   warn("Media %s rejected, illegal name\n", name);
goto exit;
}
if (!bcast_addr) {
-   warn("Media registration error: no broadcast address supplied\n");
+   warn("Media %s rejected, no broadcast address\n", name);
goto exit;
}
if ((bearer_priority < TIPC_MIN_LINK_PRI) &&
(bearer_priority > TIPC_MAX_LINK_PRI)) {
-   warn("Media registration error: priority %u\n", bearer_priority);
+   warn("Media %s rejected, illegal priority (%u)\n", name, 
+bearer_priority);
goto exit;
}
if ((link_tolerance < TIPC_MIN_LINK_TOL) || 
(link_tolerance > TIPC_MAX_LINK_TOL)) {
-   warn("Media registration error: tolerance %u\n", link_tolerance);
+   warn("Media %s rejected, illegal tolerance (%u)\n", name,
+link_tolerance);
goto exit;
}
 
media_id = media_count++;
if (media_id >= MAX_MEDIA) {
-   warn("Attempt to register more than %u media\n", MAX_MEDIA);
+   warn("Media %s rejected, media limit reached (%u)\n", name,
+MAX_MEDIA);
media_count--;
goto exit;
}
for (i = 0; i < media_id; i++) {
if (media_list[i].type_id == media_type) {
-   warn("Attempt to register second media with type %u\n", 
+   warn("Media %s rejected, duplicate type (%u)\n", name,
 media_type);
media_count--;
goto exit;
}
if (!strcmp(name, media_list[i].name)) {
-   warn("Attempt to re-register media name %s\n", name);
+   warn("Media %s rejected, duplicate name\n", name);
media_count--;
goto exit;
}
@@ -283,6 +286,9 @@ static struct bearer *bearer_find(const 
struct bearer *b_ptr;
u32 i;
 
+   if (tipc_mode != TIPC_NET_MODE)
+   return NULL;
+
for (i = 0, b_ptr = tipc_bearers; i < MAX_BEARERS; i++, b_ptr++) {
if (b_ptr->active && (!strcmp(b_ptr->publ.name, name)))
return b_ptr;
@@ -475,26 +481,33 @@ int tipc_enable_bearer(const char *name,
u32 i;
int res = -EINVAL;
 
-   if (tipc_mode != TIPC_NET_MODE)
+   if (tipc_mode != TIPC_NET_MODE) {
+   warn("Bearer %s rejected, not supported in standalone mode\n",
+name);
return -ENOPROTOOPT;
-
-   if (!bearer_name_validate(name, b_name) ||
-

[PATCH 12/32] [TIPC] Added support for MODULE_VERSION capability.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/core.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 31c7dd5..5003acb 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -2,7 +2,7 @@
  * net/tipc/core.c: TIPC module code
  *
  * Copyright (c) 2003-2006, Ericsson AB
- * Copyright (c) 2005, Wind River Systems
+ * Copyright (c) 2005-2006, Wind River Systems
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -57,7 +57,7 @@ void tipc_socket_stop(void);
 int  tipc_netlink_start(void);
 void tipc_netlink_stop(void);
 
-#define MOD_NAME "tipc_start: "
+#define TIPC_MOD_VER "1.6.1"
 
 #ifndef CONFIG_TIPC_ZONES
 #define CONFIG_TIPC_ZONES 3
@@ -224,6 +224,7 @@ module_exit(tipc_exit);
 
 MODULE_DESCRIPTION("TIPC: Transparent Inter Process Communication");
 MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(TIPC_MOD_VER);
 
 /* Native TIPC API for kernel-space applications (see tipc.h) */
 
-- 
1.4.0


[PATCH 25/32] [TIPC] Added missing warning for out-of-memory condition

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/port.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/tipc/port.c b/net/tipc/port.c
index 899e08e..99846a1 100644
--- a/net/tipc/port.c
+++ b/net/tipc/port.c
@@ -1061,6 +1061,7 @@ int tipc_createport(u32 user_ref, 
 
up_ptr = (struct user_port *)kmalloc(sizeof(*up_ptr), GFP_ATOMIC);
if (up_ptr == NULL) {
+   warn("Port creation failed, no memory\n");
return -ENOMEM;
}
ref = tipc_createport_raw(NULL, port_dispatcher, port_wakeup, importance);
-- 
1.4.0


[PATCH 21/32] [TIPC] recvmsg() now returns TIPC ancillary data using correct level (SOL_TIPC)

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index a1f2210..abecf2d 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -744,10 +744,10 @@ static int anc_data_recv(struct msghdr *
if (unlikely(err)) {
anc_data[0] = err;
anc_data[1] = msg_data_sz(msg);
-   if ((res = put_cmsg(m, SOL_SOCKET, TIPC_ERRINFO, 8, anc_data)))
+   if ((res = put_cmsg(m, SOL_TIPC, TIPC_ERRINFO, 8, anc_data)))
return res;
if (anc_data[1] &&
-   (res = put_cmsg(m, SOL_SOCKET, TIPC_RETDATA, anc_data[1], 
+   (res = put_cmsg(m, SOL_TIPC, TIPC_RETDATA, anc_data[1], 
msg_data(msg))))
return res;
}
@@ -778,7 +778,7 @@ static int anc_data_recv(struct msghdr *
has_name = 0;
}
if (has_name &&
-   (res = put_cmsg(m, SOL_SOCKET, TIPC_DESTNAME, 12, anc_data)))
+   (res = put_cmsg(m, SOL_TIPC, TIPC_DESTNAME, 12, anc_data)))
return res;
 
return 0;
-- 
1.4.0


[PATCH 13/32] [TIPC] Validate entire interface name when locating bearer to enable.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

This fix prevents a bearer from being enabled using the wrong interface.
For example, specifying eth:eth14 might enable eth:eth1 by mistake.
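
The mismatch described above can be reproduced in user space. The following is a minimal sketch, not the kernel code itself: the helper names are hypothetical, and IFNAMSIZ is redefined locally (assumed to be 16, as in the kernel headers) so the example builds on its own. The old test compares only as many bytes as the device name is long, so a device called eth1 matches a request for eth14; the new test compares the whole name.

```c
#include <string.h>

#ifndef IFNAMSIZ
#define IFNAMSIZ 16   /* assumption: same value as the kernel's IFNAMSIZ */
#endif

/*
 * Old-style test (hypothetical helper): only strlen(dev_name) bytes are
 * compared, so dev_name acts as a prefix pattern for the requested name.
 */
int old_name_matches(const char *dev_name, const char *requested)
{
    return memcmp(dev_name, requested, strlen(dev_name)) == 0;
}

/*
 * New-style test (hypothetical helper): the comparison runs up to the
 * terminating NUL (or IFNAMSIZ bytes), so only an exact name matches.
 */
int new_name_matches(const char *dev_name, const char *requested)
{
    return strncmp(dev_name, requested, IFNAMSIZ) == 0;
}
```

With these helpers, old_name_matches("eth1", "eth14") reports a match, while new_name_matches("eth1", "eth14") does not.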

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/eth_media.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
index 3ecb100..682da4a 100644
--- a/net/tipc/eth_media.c
+++ b/net/tipc/eth_media.c
@@ -2,7 +2,7 @@
  * net/tipc/eth_media.c: Ethernet bearer support for TIPC
  * 
  * Copyright (c) 2001-2006, Ericsson AB
- * Copyright (c) 2005, Wind River Systems
+ * Copyright (c) 2005-2006, Wind River Systems
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -127,8 +127,7 @@ static int enable_bearer(struct tipc_bea
 
/* Find device with specified name */
 
-   while (dev && dev->name &&
-  (memcmp(dev->name, driver_name, strlen(dev->name)))) {
+   while (dev && dev->name && strncmp(dev->name, driver_name, IFNAMSIZ)) { 
dev = dev->next;
}
if (!dev)
-- 
1.4.0

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/32] [TIPC] Improved tolerance to promiscuous mode interface

2006-06-22 Thread Per Liden

From: Jon Maloy [EMAIL PROTECTED]

Signed-off-by: Jon Maloy [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/eth_media.c |   20 +++-
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
index 7a25278..b646619 100644
--- a/net/tipc/eth_media.c
+++ b/net/tipc/eth_media.c
@@ -98,17 +98,19 @@ static int recv_msg(struct sk_buff *buf,
u32 size;
 
if (likely(eb_ptr->bearer)) {
-   size = msg_size((struct tipc_msg *)buf->data);
-   skb_trim(buf, size);
-   if (likely(buf->len == size)) {
-   buf->next = NULL;
-   tipc_recv_msg(buf, eb_ptr->bearer);
-   } else {
-   kfree_skb(buf);
+  if (likely(!dev->promiscuity) ||
+  !memcmp(buf->mac.raw,dev->dev_addr,ETH_ALEN) ||
+  !memcmp(buf->mac.raw,dev->broadcast,ETH_ALEN)) {
+   size = msg_size((struct tipc_msg *)buf->data);
+   skb_trim(buf, size);
+   if (likely(buf->len == size)) {
+   buf->next = NULL;
+   tipc_recv_msg(buf, eb_ptr->bearer);
+   return TIPC_OK;
+   }
}
-   } else {
-   kfree_skb(buf);
}
+   kfree_skb(buf);
return TIPC_OK;
 }
 
-- 
1.4.0


[PATCH 2/32] [TIPC] Prevent name table corruption if no room for new publication

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Now exits cleanly if attempt to allocate larger array of subsequences fails,
without losing track of pointer to existing array.
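
The pattern described above (allocate the larger array into a temporary, and only replace the existing pointer once the allocation has succeeded) can be sketched in user space as follows. This is an illustration under assumptions, not the TIPC code: struct seq_table and seq_table_grow() are hypothetical names, and malloc()/free() stand in for the kernel allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a name sequence holding a growable array. */
struct seq_table {
    int *items;            /* existing array */
    unsigned int alloc;    /* number of slots currently allocated */
};

/*
 * Doubles the array. Returns 0 on success, or -1 if allocation fails,
 * in which case the table still points at its original, intact array.
 */
int seq_table_grow(struct seq_table *t)
{
    int *bigger = malloc(2 * t->alloc * sizeof(*bigger));

    if (!bigger)
        return -1;         /* t->items is untouched, nothing is lost */

    memcpy(bigger, t->items, t->alloc * sizeof(*bigger));
    free(t->items);
    t->items = bigger;
    t->alloc *= 2;
    return 0;
}
```

The key design point is that the table's own pointer is never overwritten with the result of the allocation call, so a failed allocation cannot leak or lose the old array.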

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/name_table.c |   18 +-
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
index d129422..0511436 100644
--- a/net/tipc/name_table.c
+++ b/net/tipc/name_table.c
@@ -284,18 +284,18 @@ static struct publication *tipc_nameseq_
/* Ensure there is space for new sub-sequence */
 
if (nseq->first_free == nseq->alloc) {
-   struct sub_seq *sseqs = nseq->sseqs;
-   nseq->sseqs = tipc_subseq_alloc(nseq->alloc * 2);
-   if (nseq->sseqs != NULL) {
-   memcpy(nseq->sseqs, sseqs,
-  nseq->alloc * sizeof (struct sub_seq));
-   kfree(sseqs);
-   dbg("Allocated %u sseqs\n", nseq->alloc);
-   nseq->alloc *= 2;
-   } else {
+   struct sub_seq *sseqs = tipc_subseq_alloc(nseq->alloc * 2);
+
+   if (!sseqs) {
warn("Memory squeeze; failed to create sub-sequence\n");
return NULL;
}
+   dbg("Allocated %u more sseqs\n", nseq->alloc);
+   memcpy(sseqs, nseq->sseqs,
+  nseq->alloc * sizeof(struct sub_seq));
+   kfree(nseq->sseqs);
+   nseq->sseqs = sseqs;
+   nseq->alloc *= 2;
}
dbg("Have %u sseqs for type %u\n", nseq->alloc, type);
 
-- 
1.4.0


[PATCH 10/32] [TIPC] Fixed privilege checking typo in dest_name_check().

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

This patch was originated by Stephane Ouellette [EMAIL PROTECTED].
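
A minimal sketch of the kind of typo the subject line describes, assuming the mistake was a bitwise & where a logical && was intended (the helper names are hypothetical and the capability check is reduced to a plain flag). Because the masked value can only have bits 14 and 15 set, and the negated flag is 0 or 1, the bitwise form can never be non-zero, so the privilege check would never reject anything.

```c
/*
 * Assumed buggy form: (type & 0xC000) has no bit 0, and !is_admin is
 * either 0 or 1, so the bitwise AND of the two is always 0.
 */
int denied_bitwise(unsigned int type, int is_admin)
{
    return ((type & 0xC000) & (unsigned int)!is_admin) != 0;
}

/*
 * Intended form: deny when a privileged command type is requested
 * and the caller is not an administrator.
 */
int denied_logical(unsigned int type, int is_admin)
{
    return (type & 0xC000) && !is_admin;
}
```

For a privileged type such as 0xC000 and a non-admin caller, denied_bitwise() returns 0 while denied_logical() returns 1.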

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 648a734..eaf4d69 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -426,7 +426,7 @@ static int dest_name_check(struct sockad
 
 if (copy_from_user(&hdr, m->msg_iov[0].iov_base, sizeof(hdr)))
return -EFAULT;
-   if ((ntohs(hdr.tcm_type) & 0xC000) & (!capable(CAP_NET_ADMIN)))
+   if ((ntohs(hdr.tcm_type) & 0xC000) && (!capable(CAP_NET_ADMIN)))
return -EACCES;
 
return 0;
-- 
1.4.0


[PATCH 5/32] [TIPC] Allow ports to receive multicast messages through native API.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

This fix prevents a kernel panic if an application mistakenly sends a
multicast message to TIPC's topology service or configuration service.
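
As a rough illustration of the dispatch logic in this patch, the sketch below validates the message type before doing anything else and then treats multicast messages like named messages, except that the upper bound of the destination name sequence comes from the message instead of being copied from the lower bound. All types, constants and the function name here are hypothetical simplifications, and the ordering of the type codes (direct being the highest valid one) is an assumption.

```c
/* Hypothetical type codes; DIRECT_MSG is assumed to be the highest valid one. */
enum msg_type { CONN_MSG, MCAST_MSG, NAMED_MSG, DIRECT_MSG };

struct msg_hdr {               /* hypothetical, simplified message header */
    unsigned int type;
    unsigned int name_type;
    unsigned int name_lower;
    unsigned int name_upper;   /* meaningful for multicast only */
};

struct name_seq {
    unsigned int type, lower, upper;
};

/*
 * Returns -1 for an unsupported type (the caller should reject the message),
 * 0 if the message carries no destination name, 1 if *out was filled in.
 */
int dest_name_seq(const struct msg_hdr *h, struct name_seq *out)
{
    if (h->type > DIRECT_MSG)
        return -1;
    if (h->type != NAMED_MSG && h->type != MCAST_MSG)
        return 0;

    out->type = h->name_type;
    out->lower = h->name_lower;
    out->upper = (h->type == NAMED_MSG) ? h->name_lower : h->name_upper;
    return 1;
}
```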

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/port.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/net/tipc/port.c b/net/tipc/port.c
index 67e96cb..360920b 100644
--- a/net/tipc/port.c
+++ b/net/tipc/port.c
@@ -810,18 +810,20 @@ static void port_dispatcher_sigh(void *d
void *usr_handle;
int connected;
int published;
+   u32 message_type;
 
struct sk_buff *next = buf->next;
struct tipc_msg *msg = buf_msg(buf);
u32 dref = msg_destport(msg);

+   message_type = msg_type(msg);
+   if (message_type > TIPC_DIRECT_MSG)
+   goto reject;    /* Unsupported message type */
+
p_ptr = tipc_port_lock(dref);
-   if (!p_ptr) {
-   /* Port deleted while msg in queue */
-   tipc_reject_msg(buf, TIPC_ERR_NO_PORT);
-   buf = next;
-   continue;
-   }
+   if (!p_ptr)
+   goto reject;    /* Port deleted while msg in queue */
+
orig.ref = msg_origport(msg);
orig.node = msg_orignode(msg);
up_ptr = p_ptr->user_port;
@@ -832,7 +834,7 @@ static void port_dispatcher_sigh(void *d
if (unlikely(msg_errcode(msg)))
goto err;
 
-   switch (msg_type(msg)) {
+   switch (message_type) {

case TIPC_CONN_MSG:{
tipc_conn_msg_event cb = up_ptr->conn_msg_cb;
@@ -874,6 +876,7 @@ static void port_dispatcher_sigh(void *d
   orig);
break;
}
+   case TIPC_MCAST_MSG:
case TIPC_NAMED_MSG:{
tipc_named_msg_event cb = up_ptr->named_msg_cb;
 
@@ -886,7 +889,8 @@ static void port_dispatcher_sigh(void *d
goto reject;
dseq.type =  msg_nametype(msg);
dseq.lower = msg_nameinst(msg);
-   dseq.upper = dseq.lower;
+   dseq.upper = (message_type == TIPC_NAMED_MSG)
+   ? dseq.lower : msg_nameupper(msg);
skb_pull(buf, msg_hdr_sz(msg));
cb(usr_handle, dref, buf, msg_data(msg), 
   msg_data_sz(msg), msg_importance(msg),
@@ -899,7 +903,7 @@ static void port_dispatcher_sigh(void *d
buf = next;
continue;
 err:
-   switch (msg_type(msg)) {
+   switch (message_type) {

case TIPC_CONN_MSG:{
tipc_conn_shutdown_event cb = 
@@ -931,6 +935,7 @@ err:
   msg_data_sz(msg), msg_errcode(msg), orig);
break;
}
+   case TIPC_MCAST_MSG:
case TIPC_NAMED_MSG:{
tipc_named_msg_err_event cb = 
up_ptr->named_err_cb;
@@ -940,7 +945,8 @@ err:
break;
dseq.type =  msg_nametype(msg);
dseq.lower = msg_nameinst(msg);
-   dseq.upper = dseq.lower;
+   dseq.upper = (message_type == TIPC_NAMED_MSG)
+   ? dseq.lower : msg_nameupper(msg);
skb_pull(buf, msg_hdr_sz(msg));
cb(usr_handle, dref, buf, msg_data(msg), 
   msg_data_sz(msg), msg_errcode(msg), dseq);
-- 
1.4.0


[PATCH 15/32] [TIPC] Fixed connect() to detect a dest address that is missing or too short.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/socket.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 0923213..758b2d2 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1244,7 +1244,7 @@ static int connect(struct socket *sock, 
if (sock->state != SS_UNCONNECTED)
return -EISCONN;
 
-   if ((dst->family != AF_TIPC) ||
+   if ((destlen < sizeof(*dst)) || (dst->family != AF_TIPC) ||
((dst->addrtype != TIPC_ADDR_NAME) && (dst->addrtype != TIPC_ADDR_ID)))
return -EINVAL;
 
-- 
1.4.0


[PATCH 27/32] [TIPC] Disallow config operations that aren't supported in certain modes.

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

This change provides user-friendly feedback when TIPC is unable to perform
certain configuration operations that don't work properly in certain modes.
(In particular, any reconfiguration request that would temporarily take TIPC
from network mode to standalone mode, or from standalone mode to not running
mode, is disallowed.)
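
The approach can be sketched as a setter that refuses the change outright when the current mode does not allow it, instead of stopping and restarting the stack behind the user's back. This is a simplified user-space sketch: the enum, struct and function names are hypothetical, and an error is reported as a message string (NULL meaning success) purely for illustration.

```c
#include <stddef.h>

/* Hypothetical operating modes, loosely following the description above. */
enum run_mode { NOT_RUNNING, STANDALONE_MODE, NET_MODE };

struct cfg {
    enum run_mode mode;
    unsigned int max_ports;
};

/*
 * Returns NULL on success (including "no change requested"),
 * otherwise a human-readable reason why the request was refused.
 */
const char *set_max_ports(struct cfg *c, unsigned int value)
{
    if (value == c->max_ports)
        return NULL;                       /* nothing to do */
    if (value < 127 || value > 65535)
        return "max ports must be 127-65535";
    if (c->mode != NOT_RUNNING)
        return "cannot change max ports while TIPC is active";

    c->max_ports = value;
    return NULL;
}
```

Checking for "no change" first means that re-sending the current value always succeeds, even in a mode where a real change would be refused.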

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/config.c |   83 -
 1 files changed, 38 insertions(+), 45 deletions(-)

diff --git a/net/tipc/config.c b/net/tipc/config.c
index 48b5de2..41c8447 100644
--- a/net/tipc/config.c
+++ b/net/tipc/config.c
@@ -291,13 +291,22 @@ static struct sk_buff *cfg_set_own_addr(
if (!tipc_addr_node_valid(addr))
return tipc_cfg_reply_error_string(TIPC_CFG_INVALID_VALUE
" (node address)");
-   if (tipc_own_addr)
+   if (tipc_mode == TIPC_NET_MODE)
return tipc_cfg_reply_error_string(TIPC_CFG_NOT_SUPPORTED
" (cannot change node address once assigned)");
+   tipc_own_addr = addr;
+
+   /* 
+* Must release all spinlocks before calling start_net() because
+* Linux version of TIPC calls eth_media_start() which calls
+* register_netdevice_notifier() which may block!
+*
+* Temporarily releasing the lock should be harmless for non-Linux TIPC,
+* but Linux version of eth_media_start() should really be reworked
+* so that it can be called with spinlocks held.
+*/
 
spin_unlock_bh(&config_lock);
-   tipc_core_stop_net();
-   tipc_own_addr = addr;
tipc_core_start_net();
spin_lock_bh(&config_lock);
return tipc_cfg_reply_none();
@@ -350,50 +359,21 @@ static struct sk_buff *cfg_set_max_subsc
 
 static struct sk_buff *cfg_set_max_ports(void)
 {
-   int orig_mode;
u32 value;
 
if (!TLV_CHECK(req_tlv_area, req_tlv_space, TIPC_TLV_UNSIGNED))
return tipc_cfg_reply_error_string(TIPC_CFG_TLV_ERROR);
value = *(u32 *)TLV_DATA(req_tlv_area);
value = ntohl(value);
+   if (value == tipc_max_ports)
+   return tipc_cfg_reply_none();
if (value != delimit(value, 127, 65535))
return tipc_cfg_reply_error_string(TIPC_CFG_INVALID_VALUE
" (max ports must be 127-65535)");
-
-   if (value == tipc_max_ports)
-   return tipc_cfg_reply_none();
-
-   if (atomic_read(&tipc_user_count) > 2)
+   if (tipc_mode != TIPC_NOT_RUNNING)
return tipc_cfg_reply_error_string(TIPC_CFG_NOT_SUPPORTED
-   " (cannot change max ports while TIPC users exist)");
-
-   spin_unlock_bh(&config_lock);
-   orig_mode = tipc_get_mode();
-   if (orig_mode == TIPC_NET_MODE)
-   tipc_core_stop_net();
-   tipc_core_stop();
+" (cannot change max ports while TIPC is active)");
tipc_max_ports = value;
-   tipc_core_start();
-   if (orig_mode == TIPC_NET_MODE)
-   tipc_core_start_net();
-   spin_lock_bh(&config_lock);
-   return tipc_cfg_reply_none();
-}
-
-static struct sk_buff *set_net_max(int value, int *parameter)
-{
-   int orig_mode;
-
-   if (value != *parameter) {
-   orig_mode = tipc_get_mode();
-   if (orig_mode == TIPC_NET_MODE)
-   tipc_core_stop_net();
-   *parameter = value;
-   if (orig_mode == TIPC_NET_MODE)
-   tipc_core_start_net();
-   }
-
return tipc_cfg_reply_none();
 }
 
@@ -405,10 +385,16 @@ static struct sk_buff *cfg_set_max_zones
return tipc_cfg_reply_error_string(TIPC_CFG_TLV_ERROR);
value = *(u32 *)TLV_DATA(req_tlv_area);
value = ntohl(value);
+   if (value == tipc_max_zones)
+   return tipc_cfg_reply_none();
if (value != delimit(value, 1, 255))
return tipc_cfg_reply_error_string(TIPC_CFG_INVALID_VALUE
" (max zones must be 1-255)");
-   return set_net_max(value, &tipc_max_zones);
+   if (tipc_mode == TIPC_NET_MODE)
+   return tipc_cfg_reply_error_string(TIPC_CFG_NOT_SUPPORTED
+" (cannot change max zones once TIPC has joined a network)");
+   tipc_max_zones = value;
+   return tipc_cfg_reply_none();
 }
 
 static struct sk_buff *cfg_set_max_clusters(void)
@@ -419,8 +405,8 @@ static struct sk_buff *cfg_set_max_clust
return tipc_cfg_reply_error_string(TIPC_CFG_TLV_ERROR);
value = *(u32 *)TLV_DATA(req_tlv_area);
value = ntohl(value);
-   if (value != 1)
-   return

[PATCH 9/32] [TIPC] Fix for NULL pointer dereference

2006-06-22 Thread Per Liden

From: Eric Sesterhenn [EMAIL PROTECTED]

This fixes a bug spotted by the Coverity checker, bug id #366. If
(mod(seqno - prev) != 1) we set buf to NULL, dereference it in the for
statement, and set it to whatever value happens to be at address 0+next; if
that happens to be non-zero, we even stay in the loop. It seems that the
author intended to break there.
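
The control flow in question can be sketched in user space as follows. The struct and function names are hypothetical, and mod() is assumed here to be a 16-bit wrap-around mask. Without the break, the loop's update expression would evaluate buf->next with buf already set to NULL.

```c
#include <stddef.h>

struct buf {                 /* hypothetical stand-in for a queued buffer */
    unsigned int seqno;
    struct buf *next;
};

/*
 * Walks a chain of buffers and returns the one whose sequence number equals
 * gap_after, or NULL if the chain stops being consecutive (or ends) first.
 */
struct buf *find_gap_start(struct buf *buf, unsigned int prev,
                           unsigned int gap_after)
{
    for (; buf; buf = buf->next) {
        if (((buf->seqno - prev) & 0xffff) != 1) {
            buf = NULL;
            break;           /* leave before "buf = buf->next" runs on NULL */
        }
        if (buf->seqno == gap_after)
            break;
        prev = buf->seqno;
    }
    return buf;
}
```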

Signed-off-by: Eric Sesterhenn [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/bcast.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 00691b7..44645f5 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -349,8 +349,10 @@ static void tipc_bclink_peek_nack(u32 de
for (; buf; buf = buf->next) {
u32 seqno = buf_seqno(buf);
 
-   if (mod(seqno - prev) != 1)
+   if (mod(seqno - prev) != 1) {
buf = NULL;
+   break;
+   }
if (seqno == gap_after)
break;
prev = seqno;
-- 
1.4.0


[PATCH 26/32] [TIPC] Fixed memory leak in tipc_link_send() when destination is unreachable

2006-06-22 Thread Per Liden

From: Allan Stephens [EMAIL PROTECTED]

Signed-off-by: Allan Stephens [EMAIL PROTECTED]
Signed-off-by: Per Liden [EMAIL PROTECTED]
---
 net/tipc/link.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index ba7d3f1..ff40c91 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1135,9 +1135,13 @@ int tipc_link_send(struct sk_buff *buf, 
if (n_ptr) {
tipc_node_lock(n_ptr);
l_ptr = n_ptr->active_links[selector & 1];
-   dbg("tipc_link_send: found link %x for dest %x\n", l_ptr, dest);
if (l_ptr) {
+   dbg("tipc_link_send: found link %x for dest %x\n", l_ptr, dest);
res = tipc_link_send_buf(l_ptr, buf);
+   } else {
+   dbg("Attempt to send msg to unreachable node:\n");
+   msg_dbg(buf_msg(buf),);
+   buf_discard(buf);
}
tipc_node_unlock(n_ptr);
} else {
-- 
1.4.0

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [0/5] GSO: Generic Segmentation Offload

2006-06-22 Thread YOSHIFUJI Hideaki / 吉藤英明

Hello.

Yes, I generally like this idea.

In article [EMAIL PROTECTED] (at Thu, 22 Jun 2006 18:12:11 +1000), Herbert Xu 
[EMAIL PROTECTED] says:

> GSO like TSO is only effective if the MTU is significantly less than the
> maximum value of 64K.  So only the case where the MTU was set to 1500 is
> of interest.  There we can see that the throughput improved by 17.5%
> (3061.05Mb/s => 3598.17Mb/s).  The actual saving in transmission cost is
> in fact a lot more than that as the majority of the time here is spent on
> the RX side which still has to deal with 1500-byte packets.

Can you measure some with other sizes,
e.g. 4kByte, 8kByte, 9000Byte?

--yoshfuji

IPSec + large packets being corrupted

2006-06-22 Thread Chris Audley


I've been using the 2.6 kernel ipsec system for some time and have always
had to work around issues with large packets not traversing the VPN by
setting the LAN interface MTU size to something like 1400.

Because I always thought this was a hack and not a proper fix, I've spent
a few days trying to work out exactly why large packets aren't traversing
the VPN and have found something which may well be the cause. I really
don't know the kernel networking code that well so I was hoping that
someone can either verify that what I've found is really an issue, or
whether I'm doing something wrong.

This has been seen in the field with P4/e100+e1000 systems running 2.6.12
and in testing on Geode/dp8381x systems running 2.6.17, all using IPv4.
VPN is Racoon based, using x509 certs and ESP/AH (3DES/SHA1).


This is my understanding of how large packets get corrupted:

Large packet (eg. 1600 byte ping) received by VPN server A.
Packet encrypted and fragmented then sent from Server A to Server B.
Packet received by network subsystem on B and frag_list created.
ah_input() strips the AH header -- frag sizes are not changed!
esp_input() decrypts data
ip_fragment() uses existing frag_list sizes from before the AH
  header being stripped, and sends too much data (16 bytes extra). This
  breaks the checksum and packets get dropped by destination host.

By setting the MTU on the local interface, this breaks one of the
checks for using the pre-existing frag list in ip_fragment() (MTU is now
smaller than the largest frag size), so the packet fragments are
re-generated from scratch and the large packet gets through.

If I disable the valid frag_list check in ip_fragment(), again large
packets traverse the VPN with no problems at all since the fragments are
re-generated from scratch.
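
The fast-path test described above can be modelled roughly as follows. This is a simplified userspace sketch, not the kernel's ip_fragment(): the helper names and the flat array of fragment lengths are assumptions made for illustration. It shows why a smaller outgoing MTU forces the fragments to be rebuilt, and how trusting stale per-fragment sizes determines the number of bytes sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified model: frag_len[] holds the recorded payload length of each
 * received fragment, mtu_payload is the room for payload on the outgoing
 * link. Hypothetical helper, not kernel code.
 */
bool can_reuse_frag_list(const size_t *frag_len, size_t nfrags,
                         size_t mtu_payload)
{
    assert(frag_len != NULL);
    for (size_t i = 0; i < nfrags; i++) {
        /* A fragment larger than the outgoing MTU cannot be sent as-is. */
        if (frag_len[i] > mtu_payload)
            return false;
        /* Every fragment except the last must be a multiple of 8 bytes. */
        if (i + 1 < nfrags && (frag_len[i] & 7) != 0)
            return false;
    }
    return true;
}

/* Bytes that go out if the recorded fragment sizes are trusted unchanged. */
size_t bytes_sent(const size_t *frag_len, size_t nfrags)
{
    size_t total = 0;

    assert(frag_len != NULL);
    for (size_t i = 0; i < nfrags; i++)
        total += frag_len[i];
    return total;
}
```

With recorded sizes of 1480 and 144 bytes, the list is reusable when 1480 bytes of payload fit on the outgoing link, but not when only 1380 fit, which matches the observation that lowering the MTU makes the problem disappear.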

If my analysis of the above is correct, then my feeling is that either
ah_input() should re-calculate the fragment sizes, or some flag should be
set to tell ip_fragment() to use the slow method and recreate the
fragments.

Does this sound like a real problem, or have I missed something obvious?

Regards,
-- 
Chris Audley mailto:[EMAIL PROTECTED]









Re: [3/5] [NET]: Add software TSOv4

2006-06-22 Thread YOSHIFUJI Hideaki / 吉藤英明

In article [EMAIL PROTECTED] (at Thu, 22 Jun 2006 18:14:00 +1000), Herbert Xu 
[EMAIL PROTECTED] says:

> [NET]: Add software TSOv4
> 
> This patch adds the GSO implementation for IPv4 TCP.
> 
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

I'd appreciate if you code up IPv6 TCP as well. :-)

Regards,

--yoshfuji

Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Steve Wise

On Thu, 2006-06-22 at 08:53 -0500, Steve Wise wrote:
> On Thu, 2006-06-22 at 01:57 -0700, David Miller wrote:
> > From: Steve Wise [EMAIL PROTECTED]
> > Date: Wed, 21 Jun 2006 13:45:19 -0500
> >
> > > This patch implements a mechanism that allows interested clients to
> > > register for notification of certain network events.
> >
> > We have a generic network event notification facility called
> > netlink, please use it and extend it for your needs if necessary.
>
> I'll investigate this.
>
> Thanks,

The in-kernel Infiniband subsystem needs to know when certain events
happen, for example when the MAC address of a neighbour changes.  Any
RDMA devices that are using that neighbour need to be notified of the
change.  You are asking that I extend the netlink facility (if
necessary) to provide this functionality.

Are you suggesting, then, that the Infiniband subsystem should create an
in-kernel NETLINK socket and obtain these events (and the pertinent
information) via the socket?

I'm still learning about netlink, but my understanding to date is that
it's a way to pass events/commands between the kernel and user
applications.  It seems like overkill to use this mechanism for
kernel-to-kernel event notifications.  That's why I started with notifier
blocks and added a netevent_notifier mechanism.
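
For readers unfamiliar with the notifier-block pattern referred to here, the following is a minimal userspace model of it. It is only a sketch: the struct, function and event names are hypothetical and are not the API proposed in the patch. A subscriber registers a callback on a chain, and the publisher walks the chain for each event.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of a kernel-style notifier chain (hypothetical names). */
struct notifier {
    int (*call)(struct notifier *self, unsigned long event, void *data);
    struct notifier *next;
};

enum { EV_NEIGH_UPDATE = 1 };   /* illustrative event code */

static struct notifier *chain_head;

/* Add a subscriber at the head of the chain. */
void notifier_register(struct notifier *n)
{
    assert(n != NULL && n->call != NULL);
    n->next = chain_head;
    chain_head = n;
}

/* Deliver one event to every subscriber; returns how many were called. */
int notifier_call_chain(unsigned long event, void *data)
{
    int called = 0;

    for (struct notifier *n = chain_head; n != NULL; n = n->next) {
        n->call(n, event, data);
        called++;
    }
    return called;
}

/* Example subscriber: counts the neighbour updates it has seen. */
static int neigh_updates_seen;

int rdma_neigh_event(struct notifier *self, unsigned long event, void *data)
{
    (void)self;
    (void)data;
    if (event == EV_NEIGH_UPDATE)
        neigh_updates_seen++;
    return 0;
}
```

The appeal for in-kernel consumers is that delivery is a plain function call with no socket or message marshalling in between.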

Any help is greatly appreciated.  Sorry if I'm being dense...

Steve.


Binding a packet socket to interface down

2006-06-22 Thread Robert Iakobashvili


Hi,

When a packet socket (PF_PACKET) is bound with the bind() syscall to a
network interface that is down (no IFF_UP flag), packet_do_bind()
sets an error on the socket, but bind() does not fail.

When a datagram, stream or raw socket fails to bind to some local
ip-port/ip, bind() fails.

Is this behavior of bind() for packet sockets deliberate, or would it be
better to correct it so that bind() fails and returns an errno, e.g. ENODEV?
Thanks.

--
Sincerely,
--
Robert Iakobashvili, coroberti at gmail dot com
Navigare necesse est, vivere non est necesse.
--

Re: [PATCH 11/21] e1000: disable CRC stripping workaround

2006-06-22 Thread Auke Kok


Ben Greear wrote:
> Kok, Auke wrote:
> > CRC stripping is breaking SMBUS-connected BMC's. We disable this
> > feature to make it work. This fixes related bugs regarding SOL.
>
> Shouldn't you also have to subtract 4 bytes when setting the skb len
> in the receive logic?  Perhaps when setting the rx-bytes counter as well?

The hardware corrects for the size properly when we disable CRC stripping. The
end result is the same.


Cheers,

Auke

Re: [PATCH 11/21] e1000: disable CRC stripping workaround

2006-06-22 Thread Jesse Brandeburg


On 6/21/06, Ben Greear [EMAIL PROTECTED] wrote:

> Kok, Auke wrote:
> > CRC stripping is breaking SMBUS-connected BMC's. We disable this
> > feature to make it work. This fixes related bugs regarding SOL.
>
> Shouldn't you also have to subtract 4 bytes when setting the skb len
> in the receive logic?  Perhaps when setting the rx-bytes counter as well?


we thought about this, but most drivers don't strip the CRC, and we
couldn't find any tests including bridging that cared if the CRC was
there in the indicated packet.

If you can find me a failing case I'll fix it.  It was much simpler to
leave it out, especially when we add back in the multiple descriptor
receive code in the future (think about the case when subtracting the
CRC makes the last descriptor disappear)

Once again, let me know if you have info I don't :-)

Thanks for the review,
 Jesse

Re: [RFC PATCH 1/2] Hardware button support for Wireless cards: radiobtn

2006-06-22 Thread Jiri Benc

On Sat, 17 Jun 2006 17:05:55 +0200, Ivo van Doorn wrote:
> With this approach more buttons can be registered,
> it includes the optional field to report an update of the key status
> to the driver that registered it, and it supports for non-polling keys.

I think this is not specific to networking anymore, so it should go to
lkml. Please be sure to Cc: input devices maintainer, Dmitry Torokhov.

Regarding rfkill button, I talked about that with Vojtech Pavlik (Cc:ed)
and he suggests this solution:

- driver is responsible for turning on/off radio when the input device
  is not opened;
- when something opens the input device, it receives input events and 
  gets responsible to turn on/off the radio (by ioctl or putting the 
  network interfaces up/down).

This is of course not possible for all hardware, but it gives the most
flexibility while keeping the possibility to switch off the radio without
userspace support.

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs

Re: Binding a packet socket to interface down

2006-06-22 Thread Evgeniy Polyakov

On Thu, Jun 22, 2006 at 06:32:29PM +0300, Robert Iakobashvili ([EMAIL PROTECTED]) wrote:
> Hi,
>
> When a packet socket (PF_PACKET) is bound with the bind() syscall to a
> network interface that is down (no IFF_UP flag), packet_do_bind()
> sets an error on the socket, but bind() does not fail.
>
> When a datagram, stream or raw socket fails to bind to some local
> ip-port/ip, bind() fails.
>
> Is this behavior of bind() for packet sockets deliberate, or would it be
> better to correct it so that bind() fails and returns an errno, e.g. ENODEV?
> Thanks.

The man page says that a packet socket does not handle any errors.
And actually the packet socket does bind to the device, but you cannot read
data. When the device is turned on, the packet socket should start
to function (packet_notifier() -> NETDEV_UP).
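
A userspace program that wants to notice this situation could, as a rough sketch, query the pending socket error after a successful bind(). The helper names below are hypothetical, and whether the error stored by packet_do_bind() is visible this way is an assumption based on the behaviour described in this thread, not something verified here.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>

/* Fill in the link-layer address used to bind a packet socket to one interface. */
void fill_packet_addr(struct sockaddr_ll *sll, int ifindex)
{
    assert(sll != NULL);
    memset(sll, 0, sizeof(*sll));
    sll->sll_family = AF_PACKET;
    sll->sll_protocol = htons(ETH_P_ALL);
    sll->sll_ifindex = ifindex;
}

/*
 * Fetch and clear the pending error on a socket.
 * Returns 0 if none is pending, a positive errno value if one is,
 * or -1 if getsockopt() itself failed.
 */
int pending_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```

A caller would pass the filled address to bind() and then check pending_socket_error(fd); creating the packet socket itself normally requires CAP_NET_RAW.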


-- 
Evgeniy Polyakov

Re: [PATCH 11/21] e1000: disable CRC stripping workaround

2006-06-22 Thread Ben Greear


Jesse Brandeburg wrote:
> On 6/21/06, Ben Greear [EMAIL PROTECTED] wrote:
>
> > Kok, Auke wrote:
> > > CRC stripping is breaking SMBUS-connected BMC's. We disable this
> > > feature to make it work. This fixes related bugs regarding SOL.
> >
> > Shouldn't you also have to subtract 4 bytes when setting the skb len
> > in the receive logic?  Perhaps when setting the rx-bytes counter as well?
>
> we thought about this, but most drivers don't strip the CRC, and we
> couldn't find any tests including bridging that cared if the CRC was
> there in the indicated packet.
>
> If you can find me a failing case I'll fix it.  It was much simpler to
> leave it out, especially when we add back in the multiple descriptor
> receive code in the future (think about the case when subtracting the
> CRC makes the last descriptor disappear)
>
> Once again, let me know if you have info I don't :-)


It should only be a problem if skb->len includes the extra 4 bytes for 
the crc.  Then, if I transmit that skb to another interface, I am afraid 
that the crc will be seen as data in the packet.  In the 2.6.13 days, 
the e1000 did not strip the CRC, but it subtracted 4 before it did the 
skb_put.  So, the crc was correctly stripped/ignored.  The e100 
functioned similarly I believe.


If you skb_put the extra 4 bytes, I believe this will break my 
(proprietary) app because on transmit it will append the extra 4 crc 
bytes, but that isn't your problem..and I can work around it.  If the 
receiving NIC can handle pkts 4 bytes bigger than normal, it will 
probably still receive the packet w/out problem, but in truth, the frame 
will not be exactly correct.


When you did your bridging tests, did you sniff the packets on the far 
side of the bridge to see if they were the right size?


Thanks,
Ben



> Thanks for the review,
>  Jesse




Re: [PATCH 11/21] e1000: disable CRC stripping workaround

2006-06-22 Thread Lennert Buytenhek

On Thu, Jun 22, 2006 at 08:39:10AM -0700, Jesse Brandeburg wrote:

> > > CRC stripping is breaking SMBUS-connected BMC's. We disable this
> > > feature to make it work. This fixes related bugs regarding SOL.
> >
> > Shouldn't you also have to subtract 4 bytes when setting the skb len
> > in the receive logic?  Perhaps when setting the rx-bytes counter as well?
>
> we thought about this, but most drivers don't strip the CRC,

Really?


> and we couldn't find any tests including bridging that cared if the
> CRC was there in the indicated packet.

Bridging definitely cares -- some years ago there was a case where
8139too NICs would pass packets up the stack with 4 bytes of FCS, and
that caused frames received on 8139too interfaces not to be forwarded
to other interfaces because on TX, the frame would be too long.

Maybe e1000 is okay with sending oversized frames, but other NIC
drivers might not be.
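
The two concerns raised in this thread (what length the driver reports, and whether a frame still fits on the egress interface) can be sketched as follows. This is a simplified model with hypothetical helper names, not e1000 or bridge code; the only hard facts used are the 4-byte Ethernet FCS and the 14-byte Ethernet header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ETH_HEADER_LEN 14  /* destination MAC, source MAC, EtherType */
#define FCS_LEN         4  /* Ethernet frame check sequence (CRC) */

/*
 * Length a driver would hand to the stack for a received frame.
 * hw_len is the length reported by the NIC; if the hardware did not
 * strip the CRC, the trailing FCS_LEN bytes are not packet data.
 */
size_t rx_data_len(size_t hw_len, bool hw_strips_crc)
{
    if (hw_strips_crc)
        return hw_len;
    assert(hw_len >= FCS_LEN);
    return hw_len - FCS_LEN;
}

/* True if a frame of frame_len bytes (header included) fits the egress MTU. */
bool fits_egress_mtu(size_t frame_len, size_t mtu)
{
    return frame_len <= mtu + ETH_HEADER_LEN;
}
```

A full-size frame reported as 1518 bytes with the FCS still attached fits a 1500-byte MTU only after the 4 bytes are subtracted, which is the forwarding failure described for 8139too above.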

(Did you test without bridge-netfilter enabled?  bridge-nf might trim
incoming IP packets even in the bridging case.)


cheers,
Lennert

Re: [PATCH 11/21] e1000: disable CRC stripping workaround

2006-06-22 Thread Jesse Brandeburg


On 6/22/06, Ben Greear [EMAIL PROTECTED] wrote:

> Jesse Brandeburg wrote:
> > On 6/21/06, Ben Greear [EMAIL PROTECTED] wrote:
> >
> > > Kok, Auke wrote:
> > > > CRC stripping is breaking SMBUS-connected BMC's. We disable this
> > > > feature to make it work. This fixes related bugs regarding SOL.
> > >
> > > Shouldn't you also have to subtract 4 bytes when setting the skb len
> > > in the receive logic?  Perhaps when setting the rx-bytes counter as well?
> >
> > we thought about this, but most drivers don't strip the CRC, and we
> > couldn't find any tests including bridging that cared if the CRC was
> > there in the indicated packet.
> >
> > If you can find me a failing case I'll fix it.  It was much simpler to
> > leave it out, especially when we add back in the multiple descriptor
> > receive code in the future (think about the case when subtracting the
> > CRC makes the last descriptor disappear)
> >
> > Once again, let me know if you have info I don't :-)
>
> It should only be a problem if skb->len includes the extra 4 bytes for
> the crc.  Then, if I transmit that skb to another interface, I am afraid
> that the crc will be seen as data in the packet.  In the 2.6.13 days,
> the e1000 did not strip the CRC, but it subtracted 4 before it did the
> skb_put.  So, the crc was correctly stripped/ignored.  The e100
> functioned similarly I believe.


currently the e100 driver in 2.6.X strips the CRC in hardware.


> If you skb_put the extra 4 bytes, I believe this will break my
> (proprietary) app because on transmit it will append the extra 4 crc
> bytes, but that isn't your problem..and I can work around it.  If the
> receiving NIC can handle pkts 4 bytes bigger than normal, it will
> probably still receive the packet w/out problem, but in truth, the frame
> will not be exactly correct.
>
> When you did your bridging tests, did you sniff the packets on the far
> side of the bridge to see if they were the right size?


Hm, probably not. We touch-tested bridging (probably with TCP) and
have completed several internal testing passes to make sure it worked,
but I don't think we went so far as to sniff the traffic at the
other end of the bridge.  I'll look into it.

Jesse

Re: [PATCH, RFT] bcm43xx: AccessPoint mode

2006-06-22 Thread Jiri Benc

On Mon, 19 Jun 2006 11:07:34 +0200, Michael Buesch wrote:
> Well, it does not work 100%, but at least it's very promising.
> We are able to create a bssid and correctly send beacon frames out.

Great work! I was even able to ping. (Tried only open system
authentication for now, it seems it works quite well.)

> Please give it a testrun.
> Final note about hostapd:
> hostapd snapshot 0.5-2006-06-10 seems to work in the sense
> that it is able to bring up the device.
> hostapd snapshot 0.5-2006-06-11 seems to fail.

0.5-2006-06-19 works with the patch.

> Important notes from Alexander Tsvyashchenko's initial mail follow:
> [...]
> Although my previous patch to hostapd to make it interoperable with
> bcm43xx & dscape has been merged already in their CVS version, due to
> the subsequent changes in dscape stack current hostapd is again
> incompatible :-( So, to test this patch, the patch to hostapd should be
> applied.

Or, if you don't want to patch hostapd (untested, but should work):

iwpriv wlan0 param 1046 1
ip link set wmgmt0 name wmaster0ap
hostapd /path/to/hostapd.conf
iwpriv wlan0 param 1046 0

> I used hostapd snapshot 0.5-2006-06-10, patch for it is attached.
> The patch is very hacky and requires tricky way to bring everything up,
> but as dscape stack is changed quite constantly, I just do not want to
> waste time fixing it in proper way only to find a week later that
> dscape handling of master interface was changed completely once more and
> everything is broken again ;-)

Hopefully we will convert the whole hostapd-stack communication to
netlink in some near future ;-)

> 2) Insert modules (80211, rate_control and bcm43xx-d80211)

modprobe bcm43xx-d80211 is enough, other modules will load
automatically.

> 4) ifconfig wlan0 up (this should be done by hostapd actually, but
> its operation with current dscape stack seems to be broken)

hostapd tries to open (put to 'up' state) wmgmt0 earlier than wlan0,
which is not possible. It should open wlan0 first; even more, opening of
wmgmt0 is not necessary as it will be opened automatically when wlan0 is
opened.

> 6) iwconfig wlan0 essid your-SSID-name (this also should not be
> required, but current combination of hostapd + dscape doesn't seem to
> generate config_interface callback when setting beacon, so this is
> required just to force call of config_interface).

The stack currently has very limited support for cards with beacon
templates. ieee80211_beacon_get function is not designed in a way it is
used in bcm43xx. Although this seems to be easy to fix now, we will run
into other problems later (with TIM elements mainly).

I need to look how PS mode works in bcm chipsets to find a correct
solution for this. Do you have any ideas?

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs

Re: [RFC 2/7] NetLabel: core network changes

2006-06-22 Thread Steve Grubb

On Thursday 22 June 2006 05:00, David Miller wrote:
> >  #define NETLINK_GENERIC  16
> > +#define NETLINK_NETLABEL 17  /* Network packet labeling */
> >  
> >  #define MAX_LINKS 32 
>
> Please use generic netlink.

Since this is a security interface, shouldn't it be its own protocol so that 
SE Linux can control commands being sent? Paul's patches do include a netlink 
table in security/selinux/nlmsgtab.c. But I do not see any hooks to control 
generic netlink messages. (There seems to be several protocols that SE Linux 
is not controlling.) I could see that someone in secadm role should be able 
to issue these commands, but someone at sysadm or auditadm would not.

If moving this over to generic is a must, then I think SE Linux will have to 
clip into generic to control its packet flow.

-Steve

Re: [RFC 3/7] NetLabel: CIPSOv4 engine

2006-06-22 Thread Ryan Pratt

Paul Moore wrote:
> On Thursday 22 June 2006 5:12 am, David Miller wrote:
> > From: [EMAIL PROTECTED]
> > Date: Wed, 21 Jun 2006 15:42:38 -0400
> >
> > The thing that concerns me most about CIPSO is that even once users
> > migrate to a more SELINUX native approach from this CIPSO stuff, the
> > CIPSO code, its bloat, and its maintenance burden will remain.
> >
> > It's easy to put stuff in, it's impossible to take stuff out even
> > once it's largely unused by even its original target audience.
> >
> > And that's what I see happening here.
> >
> > This is why, to be perfectly honest with you, I'd much rather
> > something like this stay out-of-tree and people are strongly
> > encouraged to use the more native stuff under Linux.
>
> Well, not exactly the response I was hoping for, but let me plead my case one
> more time :)
>
> Traditional MLS CIPSO is a niche protocol, I won't try to argue that point,
> and I also won't try to argue that the NetLabel patch is late to the party,
> the IPsec/XFRM labeling approach has already been accepted as the SELinux
> packet labeling mechanism.  However, the XFRM labeling mechanism is not
> currently supported by any OS other than Linux/SELinux.  I have spoken with
> users that need CIPSO to interoperate with their other trusted systems, the
> XFRM approach is simply not a viable solution for them.  I strongly believe
> that failure to support an interoperable packet labeling mechanism on Linux
> will seriously restrict Linux's deployment in trusted networks.

The PitBull product uses the CIPSO/RIPSO labeling protocol in order to 
do interop packet labeling with other trusted systems and for passing 
labels between our own systems.  Because it is the standard, it is the 
protocol that government agencies use to do packet labeling across 
networks.  Not having CIPSO in the mainline would mean that government 
agencies would either a) only use SELinux from a distro that supports 
the CIPSO patch (by maintaining it in their kernel themselves), if such 
a distro exists, b) have to patch the kernels themselves (unlikely), or 
c) not use SELinux at all.

Also, the port of PitBull to Linux that I'm working on is currently 
using the netlabel patch to handle the CIPSO/RIPSO labeling.  Since the 
actual protocol for reading and writing out the IPSec option is 
independent from the security enforcement module, it makes a lot of sense 
to have a generic handler in the kernel that LSM modules can use.  So, 
in short, it makes my life a lot easier to have all that work already 
done :)

--
Ryan Pratt
Chief Solaris Engineer
Innovative Security Systems, Inc.
(dba Argus Systems Group)
1809 Woodfield Dr.
Savoy IL 61874
(217) 355-6308
www.argus-systems.com


Re: [RFC] sysfs + configfs on 802.11 wireless drivers

2006-06-22 Thread Michael Wu

On Thursday 22 June 2006 05:12, Luis R. Rodriguez wrote:
> (3) On resume() talk to userspace via netlink to read our sysfs and
> configfs us

I think that's fairly overkill. I really do not like the idea of requiring any 
more userspace involvement in the suspend process than needed. At least for 
ADM8211, all the data needed to restore the card properly is already stored 
somewhere for userspace (iwconfig) to query, so I don't understand how this 
is supposed to reduce bloat. All the data that is needed should already be 
stored in the driver to support configuring the interface before taking the 
interface up.

-Michael Wu



[1/1] Kevent subsystem.

2006-06-22 Thread Evgeniy Polyakov

Hello.

Kevent subsystem incorporates several AIO/kqueue design notes and ideas.
Kevent can be used both for edge and level notifications. It supports
socket notifications, network AIO (aio_send(), aio_recv() and 
aio_sendfile()), inode notifications (create/remove),
generic poll()/select() notifications and timer notifications.

It was tested against FreeBSD kqueue and Linux epoll and showed a
noticeable performance win.

Network asynchronous IO operations were tested against Linux synchronous 
socket code and showed a noticeable performance win.

Patch against linux-2.6.17-git tree attached (gzipped).
I would like to hear some comments about the overall design,
implementation and plans about its usefulness for the generic kernel.

Design notes, patches, userspace application and performance tests can be
found at the project's homepages.

1. Kevent subsystem.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=kevent

2. Network AIO.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=naio

3. LWN article about kevent.
http://lwn.net/Articles/172844/

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

Thank you.

-- 
Evgeniy Polyakov


kevent-2.6.17-git.diff.gz
Description: application/gunzip

Re: [PATCH] Export accept queue len of a TCP listening socket via rx_queue

2006-06-22 Thread Sridhar Samudrala

On Thu, 2006-06-22 at 10:50 +1000, Herbert Xu wrote:
> Sridhar Samudrala [EMAIL PROTECTED] wrote:
> >
> > > What about using the same fields (rqueue/wqueue) as you did for /proc?
> >
> > I meant extending tcp_info structure to add new fields. I think the user
> > space also uses this structure.
>
> What about putting it into inet_idiag_msg.idiag_[rw]queue instead?

OK. I was under the mistaken assumption that [rw]queue fields are exported
via tcp_info. This makes it pretty simple to support netlink users also. 
Here is the updated patch.

Thanks
Sridhar

While debugging a TCP server hang issue, we noticed that currently there is 
no way for a user to get the acceptq backlog value for a TCP listen socket.

All the standard networking utilities that display socket info like netstat,
ss and /proc/net/tcp have 2 fields called rx_queue and tx_queue. These
fields do not mean much for listening sockets. This patch uses one of these
unused fields(rx_queue) to export the accept queue len for listening sockets.

Signed-off-by: Sridhar Samudrala [EMAIL PROTECTED]

diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c
index c148c10..b56399c 100644
--- a/net/ipv4/tcp_diag.c
+++ b/net/ipv4/tcp_diag.c
@@ -26,7 +26,10 @@ static void tcp_diag_get_info(struct soc
 	const struct tcp_sock *tp = tcp_sk(sk);
 	struct tcp_info *info = _info;
 
-	r->idiag_rqueue = tp->rcv_nxt - tp->copied_seq;
+	if (sk->sk_state == TCP_LISTEN)
+		r->idiag_rqueue = sk->sk_ack_backlog;
+	else
+		r->idiag_rqueue = tp->rcv_nxt - tp->copied_seq;
 	r->idiag_wqueue = tp->write_seq - tp->snd_una;
 	if (info != NULL)
 		tcp_get_info(sk, info);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 25ecc6e..4c6ef47 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1726,7 +1726,8 @@ static void get_tcp4_sock(struct sock *s
 	sprintf(tmpbuf, "%4d: %08X:%04X %08X:%04X %02X %08X:%08X %02X:%08lX "
 			"%08X %5d %8d %lu %d %p %u %u %u %u %d",
 		i, src, srcp, dest, destp, sp->sk_state,
-		tp->write_seq - tp->snd_una, tp->rcv_nxt - tp->copied_seq,
+		tp->write_seq - tp->snd_una,
+		(sp->sk_state == TCP_LISTEN) ? sp->sk_ack_backlog : (tp->rcv_nxt - tp->copied_seq),
 		timer_active,
 		jiffies_to_clock_t(timer_expires - jiffies),
 		icsk->icsk_retransmits,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index a50eb30..b36d5b2 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1469,7 +1469,8 @@ static void get_tcp6_sock(struct seq_fil
 		   dest->s6_addr32[0], dest->s6_addr32[1],
 		   dest->s6_addr32[2], dest->s6_addr32[3], destp,
 		   sp->sk_state, 
-		   tp->write_seq-tp->snd_una, tp->rcv_nxt-tp->copied_seq,
+		   tp->write_seq-tp->snd_una,
+		   (sp->sk_state == TCP_LISTEN) ? sp->sk_ack_backlog : (tp->rcv_nxt - tp->copied_seq),
 		   timer_active,
 		   jiffies_to_clock_t(timer_expires - jiffies),
 		   icsk->icsk_retransmits,
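
From userspace, the effect of this patch could be consumed roughly as sketched below: parse the state and queue columns of a /proc/net/tcp entry and, for a listening socket, treat rx_queue as the accept queue length. The struct and function names are hypothetical; the column order (tx_queue before rx_queue) follows the sprintf() call in the diff above, and 0x0A is the kernel's numeric value for TCP_LISTEN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TCP_STATE_LISTEN 0x0A  /* "st" column value for listening sockets */

struct tcp_queues {
    unsigned int state;
    unsigned int tx_queue;
    unsigned int rx_queue;
};

/* Parse the state and queue columns of one /proc/net/tcp entry. */
bool parse_proc_tcp_line(const char *line, struct tcp_queues *q)
{
    assert(line != NULL && q != NULL);
    /* sl: local_addr:port rem_addr:port st tx_queue:rx_queue ... */
    return sscanf(line, "%*d: %*x:%*x %*x:%*x %x %x:%x",
                  &q->state, &q->tx_queue, &q->rx_queue) == 3;
}

/* With the patch applied, rx_queue of a listener is its accept backlog. */
bool accept_backlog(const struct tcp_queues *q, unsigned int *backlog)
{
    assert(q != NULL && backlog != NULL);
    if (q->state != TCP_STATE_LISTEN)
        return false;
    *backlog = q->rx_queue;
    return true;
}
```

For sockets in any other state, rx_queue keeps its existing meaning (bytes received but not yet copied to the application), so the helper declines to interpret it.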


Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-06-22 Thread jamal

On Tue, 2006-20-06 at 17:09 +0200, Patrick McHardy wrote:
> jamal wrote:
>
> Depend on bandwidth is not the right term. All of TBF, HTB and HFSC
> provide bandwidth per time, but with TBF and HTB the relation between
> the amount of bandwidth is linear to the amount of time, with HFSC
> it is only linear on a larger scale since it uses service curves,
> which are represented as two linear pieces. So you have bandwidth b1
> for time t1, bandwidth b2 after that until eternity. By scaling the
> clock rate you alter after how much time b2 kicks in, which affects
> the guaranteed delays. The end result should be that both bandwidth
> and delay scale up or down proportionally, but I'm not sure that this
> is what HFSC would do in all cases (on small scale). But it should
> be easy to answer with a bit more time for visualizing it.
>

Ok, this makes things a little trickier though, no? 
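
The two-piece service curve described in the quoted text (bandwidth b1 for time t1, bandwidth b2 afterwards) can be written down as a small function. This is an illustrative sketch with hypothetical parameter names, not the HFSC implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Two-piece linear service curve: rate m1 (bytes/s) for the first
 * d_us microseconds, rate m2 (bytes/s) afterwards. Returns the amount
 * of service in bytes guaranteed by time t_us.
 */
uint64_t service_curve(uint64_t t_us, uint64_t m1, uint64_t d_us, uint64_t m2)
{
    /* Guard the multiplications below against 64-bit overflow. */
    assert(t_us == 0 || (m1 <= UINT64_MAX / t_us && m2 <= UINT64_MAX / t_us));

    if (t_us <= d_us)
        return m1 * t_us / 1000000;
    return m1 * d_us / 1000000 + m2 * (t_us - d_us) / 1000000;
}
```

Scaling the clock changes the point d_us at which the second slope takes over, which is why the guaranteed delay is affected and not only the long-term rate.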

> The thing I'm not sure about is whether this wouldn't be handled better
> by userspace,

If you do it in user space you will need a daemon of some form; this is
my preference but it seems a lot of people hate daemons - the standard
claim is it is counter-usability. Such people are forgiving if you build
the daemon into the kernel as a thread. (Perhaps the netcarrier that
Stefan Rompf has added could be extended to handle this.)
Note, if you wanna do it right as well you will factor in other things
like some wireless technologies which change their throughput
capability over a period of time. (A lot of these guys try to have their
own hardware level schedulers to compensate for this.)

> if the link layer speed changes you might not want
> proportional scaling but prefer to still give a fixed amount of that
> bandwidth to some class, for example VoIP traffic. Do we have netlink
> notifications for link speed changes?

Not there at the moment - but we do emit events for other link layer
stuff like carrier on/off - so adding this should be trivial and a
reasonable thing to have; with a caveat: it will be link-layer specific;
so whoever ends up adding it will have to be careful to make sure it is not
hard-coded to be specific to ethernet-like netdevices. It could probably
be reported together with link state as a TLV like ETHER_SPEED_CHANGED
which probably carries the old speed and new speed
and maybe even the reason why it changed.

cheers,
jamal


Re: [PATCH 7/32] [TIPC] Multicast link failure now resets all links to nacking node.

2006-06-22 Thread James Morris

On Thu, 22 Jun 2006, Per Liden wrote:

> +static void link_reset_all(unsigned long addr)
> +{
> +	struct node *n_ptr;
> +	char addr_string[16];
> +	u32 i;
> +
> +	read_lock_bh(&tipc_net_lock);
> +	n_ptr = tipc_node_find((u32)addr);
> +	if (!n_ptr) {
> +		read_unlock_bh(&tipc_net_lock);
> +		return;	/* node no longer exists */
> +	}
> +
> +	tipc_node_lock(n_ptr);

You already have bh's disabled here, and tipc_node_lock() also disables 
them.

Not sure if it's really worth worrying about but if so, you could perhaps 
implement tipc_node_lock_bh() and tipc_node_lock().



- James
-- 
James Morris
[EMAIL PROTECTED]

Re: [1/1] Kevent subsystem.

2006-06-22 Thread James Morris

On Thu, 22 Jun 2006, Evgeniy Polyakov wrote:

> Patch against linux-2.6.17-git tree attached (gzipped).
> I would like to hear some comments about the overall design,
> implementation and plans about its usefulness for generic kernel.

Please send patches as in-line ascii text, along with documentation.


If they're too big, split them up logically into smaller pieces.


- James
-- 
James Morris
[EMAIL PROTECTED]

Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-22 Thread jamal

On Tue, 2006-20-06 at 18:51 +0200, Patrick McHardy wrote:
 jamal wrote:

  The issue really is whether Linux should be interested in the
  throughput it is told about or the goodput (also known as effective
  throughput) the service provider offers. Two different issues by
  definition. 

 In the case of PPPoE non-work-conserving qdiscs are already used
 to manage a link that is non-local with knowledge of its
 bandwidth, 

I think that is a different issue though: if you are managing a
point-to-point link then you will be working under the assumption of
throughput, not goodput.

If you had knowledge of the goodput you should use that as a working
assumption; i think in practice that approach is valuable.
My argument is against trying to make complex changes to compensate
the scheduler for such changes. Therefore i am not feeling sorry
for the poor guy who has to go and tell their PPP device the bandwidth is
only 1Mbps when their ISP is claiming it is 2Mbps.

The ADSL case i have seen thus far is you trying to manage something
because a BRAS 3-4 hops down the path uses ATM. To use my earlier
example, the argument is no different than saying 3-4 hops downlink
there is a wireless link which is 20% lossy. Armed with knowledge
like that you can tell something to the scheduler to resolve things.
The daemon in user space for example could be sending bandwidth
measuring probes and telling the kernel of the new goodput.

 contrary to a local link that would be best managed
 in work-conserving mode. And I think for better accuracy it is
 necessary to manage effective throughput, especially if you're
 interested in guaranteed delays.
 

Indeed - but fixing the scheduler to achieve such management is not
the first choice (would be fine if it is generic and non-intrusive)

 Yes, Linux can't tell if your service provider is lying to you.
 
 I wouldn't call it lying as long as they don't say 1.5mbps IP
 layer throughput. 
  
  
  It is a scam for sure.
  By definition of what throughput is - you are telling the truth; just
  not the whole truth. Most users think in terms of goodput and not
  throughput. 
  i.e you are not telling the whole truth by not saying it is 1.5Mbps ATM
  throughput. Typically not an issue until somebody finds that by leaving
  out ATM you meant throughput and not goodput. 
 
 
 I think that point can be used to argue that Linux should
 be able to manage effective throughput :)
 

I think you have convinced me this is valuable; I even suggest probes
above to discover goodput ;-). I hope i have convinced you how rude it
would be to make extensive changes to compensate for goodput ;->

  I am saying that #2 is the choice to go with hence my assertion earlier,
  it should be fine to tell the scheduler all it has is 1Mbps and nobody
  gets hurt. #1 if i could do it with minimal intrusion and still get to
  use it when i have 802.11g. 
  
  Not sure i made sense.
 
 HFSC is actually capable of handling this quite well. If you use it
 in work-conserving mode (and the card doesn't do (much) internal
 queueing) it will get clocked by successful transmissions. Using
 link-sharing classes you can define proportions for use of available
 bandwidth, possibly with upper limits. No hacks required :)
 

HFSC sounds very interesting - I should go and study it a little more.
My understanding is though that it is a bit of a CPU pig, true?

 Anyway, this again goes more in the direction of handling link speed
 changes.
 

The more we discuss this, the more i think they are the same thing ;->


  ip dev add compensate_header 100 bytes
 
 Something like that, but it's a bit more complicated.
 For ATM we need some mapping:
 [0-48]  -> 53
 [49-96] -> 106
 ...
 
 for Ethernet we need:
 [0-60] -> 64
 [60-n] -> n + 4
 

an upper bound check against MTU would be reasonable. 

 We could do something like this (feel free to imagine nicer names):
 

The name should reflect that the table exists to compensate for
goodput.

 ATM:
 table = {
   .step = 53,
   .map = {
   [0..48] = 53,
   [49..96] = 106,
   ...
   }
 };
 
 Requiring a table of size 32 for typical MTUs.
 
 Ethernet:
 
 table = {
   .step = 60,
   .map = {
   [0..60] = 60,
   [...] = 0,
   },
   .fixed_overhead = 4,
 };
 
 static inline unsigned int
 skb_wire_len(struct sk_buff *skb, struct net_device *dev)
 {
   unsigned int idx, len;
 
   if (dev->lengthtable == NULL)
   return skb->len;
   idx = skb->len / dev->lengthtable->step;
   len = dev->lengthtable->map[idx];
   return dev->lengthtable->fixed_overhead + len ? len : skb->len;
 }
 
 Unfortunately I can't think of a way to handle the ATM case without
 a division .. or iteration.
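
For reference, the two size mappings listed earlier in this message can be written directly with one division, without a table. The following is a user-space sketch under stated assumptions: the function names and constants are illustrative, and it ignores AAL5 trailers and encapsulation headers, which a real ATM calculation would have to add before rounding up to whole cells. Note also that in the quoted `skb_wire_len()` the `?:` operator binds more loosely than `+`, so the final expression would need parentheses around the conditional to add the fixed overhead as intended.

```c
#include <assert.h>

/* Illustrative constants, not taken from any kernel header. */
#define MODEL_ATM_CELL_PAYLOAD	48u	/* payload bytes per ATM cell */
#define MODEL_ATM_CELL_SIZE	53u	/* payload plus 5-byte cell header */
#define MODEL_ETH_MIN_FRAME	60u	/* minimum frame length before FCS */
#define MODEL_ETH_FCS_LEN	4u	/* frame check sequence */

/* [0-48] -> 53, [49-96] -> 106, ...: round up to whole cells. */
static unsigned int model_atm_wire_len(unsigned int len)
{
	unsigned int cells = (len + MODEL_ATM_CELL_PAYLOAD - 1) /
			     MODEL_ATM_CELL_PAYLOAD;

	if (cells == 0)
		cells = 1;	/* even an empty packet occupies one cell */
	assert(cells <= 0xffffffffu / MODEL_ATM_CELL_SIZE);
	return cells * MODEL_ATM_CELL_SIZE;
}

/* [0-60] -> 64, [60-n] -> n + 4: pad short frames, then add the FCS. */
static unsigned int model_eth_wire_len(unsigned int len)
{
	unsigned int padded = len < MODEL_ETH_MIN_FRAME ?
			      MODEL_ETH_MIN_FRAME : len;

	return padded + MODEL_ETH_FCS_LEN;
}
```

For a 1500-byte packet the ATM mapping yields 32 cells, which matches the table size of 32 mentioned for typical MTUs.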
 

I am not thinking straight right now but it does not sound like a big
change to me i.e within reason. Note, it may be valuable to think of
this as related to the speed changing daemon as i stated earlier.
Only in this

Re: [DOC]: generic netlink

2006-06-22 Thread jamal

On Tue, 2006-20-06 at 23:34 +0200, Thomas Graf wrote:

 Ask Mr. Mashimaro has become my replacement for 8ball. Renaming
 it would lead to a serious loss of coolness ;-)
 

;-> Blame Dave for that ;->
I think if you put it in some website, I will just add a url to point to
it. Shailabh has sent me an extension to the example, but i think it is
still not encompassing.

b) Describe some details on how user space <-> kernel works
probably using libnl??
   
   I'll take care of that.
  
  What's the plan? To add to this doc or a separate doc?
 
 The status is that the code is there including userspace tools
 to query the controller. 

I have a patch for the controller for iproute2 that i would like to
submit as well - but that is separate from this i think.

 Documentation is written as part of
 the API reference (coming up with -pre6), no architectural notes
 yet though. I think it's best to keep it separated and refer to
 it both ways.
 

So you mean just refer to the one in the kernel headers?

cheers,
jamal


Re: [NET]: Prevent multiple qdisc runs

2006-06-22 Thread jamal

On Wed, 2006-21-06 at 09:52 +1000, Herbert Xu wrote:

 Well my gut feeling is that multiple qdisc_run's on the same dev can't
 be good for performance.  The reason is that SMP is only good when the
 CPUs work on different tasks.  If you get two or more CPUs to work on
 qdisc_run at the same time they can still only supply one skb to the
 device at any time.  What's worse is that they will now have to fight
 over the two spin locks involved which means that their cache lines
 will bounce back and forth.

1) If the CPUs collide all the time, it is without a doubt
a bad thing (you can tell from the tx_collision stats).
 
2) If, on the other hand, a CPU gets the txlock the moment it enters
that path in the softirq, then there is benefit in not serializing
at the level you have done with that patch - you are enlarging the
granularity of the serialization so much that the CPU won't even
get the opportunity to try and grab the tx lock because it finds the
qdisc is already running.

Your gut feeling is for #1 and my worry is for #2 ;->
I actually think your change is obviously valuable for scenarios where
the bus is slower and therefore transmits take longer - my feeling is it
may not be beneficial for fast buses like PCI-E or high speed PCI/X
where the likelihood of a tx collision is lower.
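
The coarse serialization being debated here can be pictured as a per-device "running" flag taken with an atomic test-and-set. The sketch below is a user-space model using C11 atomics, not the patch under discussion; `struct model_dev` and the `model_qdisc_run_*` helpers are hypothetical names. A CPU that finds the flag set backs off immediately and never reaches the tx lock, which is the behaviour worried about in case 2.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a device with a "qdisc is running" flag. */
struct model_dev {
	atomic_flag qdisc_running;	/* set while one CPU runs the qdisc */
	atomic_uint runs;		/* attempts that got to run */
	atomic_uint skipped;		/* attempts that backed off */
};

static void model_dev_init(struct model_dev *d)
{
	assert(d);
	atomic_flag_clear(&d->qdisc_running);
	atomic_init(&d->runs, 0);
	atomic_init(&d->skipped, 0);
}

/* Returns true if the caller may run the qdisc; false if another
 * CPU is already doing so and the caller should simply return. */
static bool model_qdisc_run_begin(struct model_dev *d)
{
	if (atomic_flag_test_and_set(&d->qdisc_running)) {
		atomic_fetch_add(&d->skipped, 1);
		return false;
	}
	atomic_fetch_add(&d->runs, 1);
	return true;
}

/* Must be called by the CPU that got true from model_qdisc_run_begin(). */
static void model_qdisc_run_end(struct model_dev *d)
{
	atomic_flag_clear(&d->qdisc_running);
}
```

Comparing the `skipped` counter under load on a slow bus and on a fast one is one way to reason about which of the two cases dominates.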

The other reason I mentioned earlier as justification to leave the
granularity at the level where it was is for good qos clocking. i.e
to allow incoming packets to be used to clock the tx path - otherwise
you will be dependent on HZ for your egress rate accuracy. I am not sure
if this latter point made sense - I could elaborate.

The experiment needed to prove things is not hard: one needs to get a 2
or 4way machine and create a funneling effect to one NIC. 
For forwarding, the best setup will be to have 3 NICs. Packets coming in
on 2 NICs are forwarded to a third. The incoming-packet NICs are tied to
different CPUs. In a 4way, the outgoing NIC is also tied to its own CPU.
You then pummel the two incoming CPUs with pktgen or otherwise
at something like 1Mpps (which is higher than the wire rate the third
NIC can handle).


cheers,
jamal


Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread jamal

On Thu, 2006-22-06 at 10:27 -0500, Steve Wise wrote:

 
 The in-kernel Infiniband subsystem needs to know when certain events
 happen.  For example, if the mac address of a neighbour changes.  Any
 rdma devices that are using said neighbour need to be notified of the
 change.  You are asking that I extend the netlink facility (if
 necessary) to provide this functionality.  
 

No - what these 2 gents are saying is that these events and infrastructure
already exist. If there are some events that don't, then you need to extend
what already exists. Your patch was a serious reinvention of the wheel
(and in the case of the neighbor code looking very wrong).
As an example, search for NETDEV_CHANGEADDR,NETDEV_CHANGEMTU etc.
Actually you are probably making this too complicated. Listen to events
in user space and tell infiniband from user space.

cheers,
jamal


Re: [1/1] Kevent subsystem.

2006-06-22 Thread Robert Iakobashvili


Evgeniy,

On 6/22/06, Evgeniy Polyakov [EMAIL PROTECTED] wrote:


Kevent subsystem incorporates several AIO/kqueue design notes and ideas.
Kevent can be used both for edge and level notifications. It supports
socket notifications, network AIO (aio_send(), aio_recv() and
aio_sendfile()), inode notifications (create/remove),
generic poll()/select() notifications and timer notifications.


Great job!
Smooth integration with userland asynch POSIX frameworks
(e.g. ACE POSIX_Proactor) may require syscalls (or their emulation)
with POSIX interface:

* POSIX_API
* aio_read
* aio_write
* aio_suspend
* aio_error
* aio_return
* aio_cancel

where aio_suspend is very important.
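
To illustrate why aio_suspend matters to such frameworks, here is a minimal sketch of the standard POSIX calls from the list above, run against an ordinary temporary file. It uses the existing POSIX AIO interface, not kevent; the `model_*` helper names are hypothetical, and older glibc versions may need `-lrt` at link time.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Queue one read, sleep in aio_suspend() until it completes, and
 * return the byte count (or -1 with errno set). */
static ssize_t model_aio_read_wait(int fd, void *buf, size_t len, off_t offset)
{
	struct aiocb cb;
	const struct aiocb *const list[1] = { &cb };
	int err;

	assert(buf != NULL);
	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_buf = buf;
	cb.aio_nbytes = len;
	cb.aio_offset = offset;
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;	/* no completion signal */

	if (aio_read(&cb) != 0)
		return -1;

	/* aio_suspend() may return early (for example on EINTR), so the
	 * completion state is always re-checked with aio_error(). */
	while ((err = aio_error(&cb)) == EINPROGRESS)
		(void)aio_suspend(list, 1, NULL);

	if (err != 0) {
		(void)aio_return(&cb);	/* retire the request */
		errno = err;
		return -1;
	}
	return aio_return(&cb);
}

/* Writes "hello" to a temporary file and reads it back asynchronously. */
static ssize_t model_aio_demo(void)
{
	char buf[8] = { 0 };
	ssize_t n;
	FILE *f = tmpfile();

	if (f == NULL)
		return -1;
	if (fputs("hello", f) == EOF || fflush(f) != 0) {
		fclose(f);
		return -1;
	}
	n = model_aio_read_wait(fileno(f), buf, 5, 0);
	if (n == 5 && memcmp(buf, "hello", 5) != 0)
		n = -1;
	fclose(f);
	return n;
}
```

A proactor-style framework would pass many control blocks to one aio_suspend call and dispatch whichever completes first; the single-request form here only shows the call sequence.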
--
Sincerely,
--
Robert Iakobashvili, coroberti at gmail dot com
Navigare necesse est, vivere non est necesse.
--

Re: [RFC 2/7] NetLabel: core network changes

2006-06-22 Thread James Morris

On Thu, 22 Jun 2006, Steve Grubb wrote:

 On Thursday 22 June 2006 05:00, David Miller wrote:
    #define NETLINK_GENERIC  16
   +#define NETLINK_NETLABEL 17  /* Network packet labeling */
    
    #define MAX_LINKS 32 
 
  Please use generic netlink.
 
 Since this is a security interface, shouldn't it be its own protocol so that 
 SE Linux can control commands being sent? Paul's patches do include a netlink 
 table in security/selinux/nlmsgtab.c. But I do not see any hooks to control 
 generic netlink messages. (There seem to be several protocols that SE Linux 
 is not controlling.) I could see that someone in secadm role should be able 
 to issue these commands, but someone at sysadm or auditadm would not.
 
 If moving this over to generic is a must, then I think SE Linux will have to 
 clip into generic to control its packet flow.

SELinux will mediate them as 'generic' netlink.

Fine-grained SELinux support for generic netlink is todo.



-- 
James Morris
[EMAIL PROTECTED]

Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Steve Wise

On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
 On Thu, 2006-22-06 at 10:27 -0500, Steve Wise wrote:
 
  
  The in-kernel Infiniband subsystem needs to know when certain events
  happen.  For example, if the mac address of a neighbour changes.  Any
  rdma devices that are using said neighbour need to be notified of the
  change.  You are asking that I extend the netlink facility (if
  necessary) to provide this functionality.  
  
 
 No - what these 2 gents are saying is that these events and infrastructure
 already exist. If there are some events that don't, then you need to extend
 what already exists. Your patch was a serious reinvention of the wheel
 (and in the case of the neighbor code looking very wrong).

ok.

 As an example, search for NETDEV_CHANGEADDR,NETDEV_CHANGEMTU etc.
 Actually you are probably making this too complicated. 

NETDEV_CHANGEADDR uses a notifier block, and the network subsystem calls
call_netdevice_notifiers() when it sets an addr.  And any kernel module
can register for these events.  That's the model I used to create the
netevent_notifier mechanism in the patch I posted.
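
The notifier-block model described above can be sketched in user space as a priority-ordered list of callbacks. This is a simplified model, not the kernel's notifier implementation or the posted patch: all `model_*` names, the return codes and the event numbers are hypothetical, and there is no locking.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NOTIFY_DONE	0	/* keep walking the chain */
#define MODEL_NOTIFY_STOP	1	/* stop after this callback */

struct model_notifier_block {
	int (*notifier_call)(struct model_notifier_block *nb,
			     unsigned long event, void *data);
	struct model_notifier_block *next;
	int priority;
};

/* Insert nb so that higher-priority blocks are called first. */
static void model_notifier_register(struct model_notifier_block **head,
				    struct model_notifier_block *nb)
{
	assert(nb && nb->notifier_call);
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Deliver one event to every registered block, in order. */
static int model_notifier_call_chain(struct model_notifier_block *head,
				     unsigned long event, void *data)
{
	int ret = MODEL_NOTIFY_DONE;

	for (; head; head = head->next) {
		ret = head->notifier_call(head, event, data);
		if (ret == MODEL_NOTIFY_STOP)
			break;
	}
	return ret;
}

/* Hypothetical network events and an in-kernel style subscriber. */
enum { MODEL_NETEVENT_NEIGH_UPDATE = 1, MODEL_NETEVENT_REDIRECT = 2 };

static unsigned long model_last_event;

static int model_rdma_event(struct model_notifier_block *nb,
			    unsigned long event, void *data)
{
	(void)nb;
	(void)data;
	model_last_event = event;	/* a real consumer would act on it */
	return MODEL_NOTIFY_DONE;
}
```

The point of the model is that the publisher and the subscriber stay in the same address space: an event is delivered by a direct function call rather than by a round trip through user space.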

I could add the new events to this netdevice notifier, but these aren't
really net device events.  They're network events.  

 Listen to events
 in user space and tell infiniband from user space.
 

I can indeed extend the rtnetlink stuff to add the events in question
(neighbour mac addr change, route redirect, etc). In fact, there is
similar functionality under the CONFIG_ARPD option to support a user
space arp daemon.  It's not quite the same, and it doesn't cover redirect
and routing events, just neighbour events.

But in the case of the RDMA subsystem, the consumer of these events is
in the kernel.  Why is it better to propagate events all the way up to
user space, then send the event back down into the Infiniband kernel
subsystem?  That seems very inefficient.  

Steve.


[PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()

2006-06-22 Thread Krzysztof Halasa

David Boggs noticed that register_hdlc_device() no longer needs
to call dev_alloc_name() as it's called by register_netdev().
register_hdlc_device() is currently equivalent to register_netdev().

hdlc_setup() is now EXPORTed as per David's request.

Signed-off-by: Krzysztof Halasa [EMAIL PROTECTED]

--- a/include/linux/hdlc.h
+++ b/include/linux/hdlc.h
@@ -188,7 +188,7 @@ int hdlc_x25_ioctl(struct net_device *de
 int hdlc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd);
 
 /* Must be used by hardware driver on module startup/exit */
-int register_hdlc_device(struct net_device *dev);
+#define register_hdlc_device(dev)  register_netdev(dev)
 void unregister_hdlc_device(struct net_device *dev);
 
 struct net_device *alloc_hdlcdev(void *priv);
--- a/drivers/net/wan/hdlc_generic.c
+++ b/drivers/net/wan/hdlc_generic.c
@@ -259,7 +259,7 @@ int hdlc_ioctl(struct net_device *dev, s
}
 }
 
-static void hdlc_setup(struct net_device *dev)
+void hdlc_setup(struct net_device *dev)
 {
hdlc_device *hdlc = dev_to_hdlc(dev);
 
@@ -288,26 +288,6 @@ struct net_device *alloc_hdlcdev(void *p
return dev;
 }
 
-int register_hdlc_device(struct net_device *dev)
-{
-   int result = dev_alloc_name(dev, "hdlc%d");
-   if (result < 0)
-   return result;
-
-   result = register_netdev(dev);
-   if (result != 0)
-   return -EIO;
-
-#if 0
-   if (netif_carrier_ok(dev))
-   netif_carrier_off(dev); /* no carrier until DCD goes up */
-#endif
-
-   return 0;
-}
-
-
-
 void unregister_hdlc_device(struct net_device *dev)
 {
rtnl_lock();
@@ -326,8 +306,8 @@ EXPORT_SYMBOL(hdlc_open);
 EXPORT_SYMBOL(hdlc_close);
 EXPORT_SYMBOL(hdlc_set_carrier);
 EXPORT_SYMBOL(hdlc_ioctl);
+EXPORT_SYMBOL(hdlc_setup);
 EXPORT_SYMBOL(alloc_hdlcdev);
-EXPORT_SYMBOL(register_hdlc_device);
 EXPORT_SYMBOL(unregister_hdlc_device);
 
 static struct packet_type hdlc_packet_type = {

[PATCH] WAN: ioremap() failure checks in drivers

2006-06-22 Thread Krzysztof Halasa

Eric Sesterhenn found that pci200syn initialization lacks return
statement in ioremap() error path (coverity bug id #195). It looks
like more WAN drivers have problems with ioremap().

Signed-off-by: Krzysztof Halasa [EMAIL PROTECTED]

--- a/drivers/net/wan/c101.c
+++ b/drivers/net/wan/c101.c
@@ -326,21 +326,21 @@ static int __init c101_run(unsigned long
if (request_irq(irq, sca_intr, 0, devname, card)) {
printk(KERN_ERR "c101: could not allocate IRQ\n");
c101_destroy_card(card);
-   return(-EBUSY);
+   return -EBUSY;
}
card->irq = irq;
 
if (!request_mem_region(winbase, C101_MAPPED_RAM_SIZE, devname)) {
printk(KERN_ERR "c101: could not request RAM window\n");
c101_destroy_card(card);
-   return(-EBUSY);
+   return -EBUSY;
}
card->phy_winbase = winbase;
card->win0base = ioremap(winbase, C101_MAPPED_RAM_SIZE);
if (!card->win0base) {
printk(KERN_ERR "c101: could not map I/O address\n");
c101_destroy_card(card);
-   return -EBUSY;
+   return -EFAULT;
}
 
card->tx_ring_buffers = TX_RING_BUFFERS;
--- a/drivers/net/wan/n2.c
+++ b/drivers/net/wan/n2.c
@@ -387,6 +387,11 @@ static int __init n2_run(unsigned long i
}
card->phy_winbase = winbase;
card->winbase = ioremap(winbase, USE_WINDOWSIZE);
+   if (!card->winbase) {
+   printk(KERN_ERR "n2: ioremap() failed\n");
+   n2_destroy_card(card);
+   return -EFAULT;
+   }
 
outb(0, io + N2_PCR);
outb(winbase >> 12, io + N2_BAR);
--- a/drivers/net/wan/pci200syn.c
+++ b/drivers/net/wan/pci200syn.c
@@ -358,6 +358,7 @@ #endif
card->rambase == NULL) {
printk(KERN_ERR "pci200syn: ioremap() failed\n");
pci200_pci_remove_one(pdev);
+   return -EFAULT;
}
 
/* Reset PLX */
--- a/drivers/net/wan/wanxl.c
+++ b/drivers/net/wan/wanxl.c
@@ -634,7 +634,13 @@ #endif
 
/* set up PLX mapping */
plx_phy = pci_resource_start(pdev, 0);
+
card->plx = ioremap_nocache(plx_phy, 0x70);
+   if (!card->plx) {
+   printk(KERN_ERR "wanxl: ioremap() failed\n");
+   wanxl_pci_remove_one(pdev);
+   return -EFAULT;
+   }
 
 #if RESET_WHILE_LOADING
wanxl_reset(card);
@@ -700,6 +706,12 @@ #endif
}
 
mem = ioremap_nocache(mem_phy, PDM_OFFSET + sizeof(firmware));
+   if (!mem) {
+   printk(KERN_ERR "wanxl: ioremap() failed\n");
+   wanxl_pci_remove_one(pdev);
+   return -EFAULT;
+   }
+
for (i = 0; i < sizeof(firmware); i += 4)
writel(htonl(*(u32*)(firmware + i)), mem + PDM_OFFSET + i);
 

RE: [PATCH 7/32] [TIPC] Multicast link failure now resets all links to nacking node.

2006-06-22 Thread Stephens, Allan

Nice observation, James.  As a relative newcomer to the official Linux
kernel development world, I'm impressed that non-TIPC folks are looking
at TIPC changes closely enough to see things like this!

Per, I'll leave it to you to decide if you want to address James's
concern.  But be aware that the link_reset_all() routine is only called
to handle emergency situations when TIPC's multicast link has run into
serious problems and is trying to recover.  Most systems will never
follow this path, so the cost of the unnecessary
local_bh_disable()/local_bh_enable() pairing shouldn't have any real
impact on the overall performance of TIPC.

Regards,
Al Stephens

 -----Original Message-----
 From: James Morris [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, June 22, 2006 2:51 PM
 To: Per Liden
 Cc: David Miller; netdev@vger.kernel.org; Stephens, Allan
 Subject: Re: [PATCH 7/32] [TIPC] Multicast link failure now 
 resets all links to nacking node.
 
 On Thu, 22 Jun 2006, Per Liden wrote:
 
  +static void link_reset_all(unsigned long addr) {
  +   struct node *n_ptr;
  +   char addr_string[16];
  +   u32 i;
  +
  +   read_lock_bh(&tipc_net_lock);
  +   n_ptr = tipc_node_find((u32)addr);
  +   if (!n_ptr) {
  +   read_unlock_bh(&tipc_net_lock);
  +   return; /* node no longer exists */
  +   }
  +
  +   tipc_node_lock(n_ptr);
 
 You already have bh's disabled here, and tipc_node_lock() 
 also disables them.
 
 Not sure if it's really worth worrying about but if so, you 
 could perhaps implement tipc_node_lock_bh() and tipc_node_lock().
 
 
 
 - James
 -- 
 James Morris
 [EMAIL PROTECTED]
 