Re: [PATCH] [IPVS] Shrink ip_vs_*.c includes

2006-02-08 Thread Roberto Nibali

The correct way to go about this is to go through each included header
file and check if any of its symbols are used in the source file.

Or if this is too tedious just leave it alone.


Hi Herbert, 

thanks for your feedback. 

Dave, 


please discard this patch for now.

Ratz,

Unfortunately this seems like it is going to be more tedious than 
we first thought. I would guess writing some sort of tool to analyse

symbols and headers is the way to go. Else it seems more or less
impossible to clean up headers, even on a small scale.


I've thought about the linux/modules.h header (ignoring the fact 
mentioned by Herbert because of its small chance of happening) and my 
suggestion was/is to move it to ip_vs.h. I'll ping Arnaldo regarding 
this cleanup, since he's done it before and certainly has some scripts 
to automate this.


Thanks for the review and discussion,
Roberto Nibali, ratz
--
echo 
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [IPVS] Shrink ip_vs_*.c includes

2006-02-08 Thread Horms
On Tue, Feb 07, 2006 at 09:07:34PM -0800, David S. Miller wrote:
 From: Horms [EMAIL PROTECTED]
 Date: Wed, 8 Feb 2006 12:09:29 +0900
 
  Unfortunately this seems like it is going to be more tedious than 
  we first thought. I would guess writing some sort of tool to analyse
  symbols and headers is the way to go. Else it seems more or less
  impossible to clean up headers, even on a small scale.
 
 It's doable on a small scale, you just have to approach the problem
 from the other direction.  Ie. pick a header file and audit the use of
 that specific header file across the tree.
 
 Folks have done this with headers like linux/sched.h and friends in
 the past.

Point taken. 

I imagine a tool could digest this reasonably efficiently.
Its just doing it by hand which is somewhat labourious.

 And it's worthwhile because anything that minimises kernel rebuild
 when touching a header file helps streamline development.

Agreed

-- 
Horms
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Wednesday 08 February 2006 01:44, David S. Miller wrote:
 From: Ben Greear [EMAIL PROTECTED]
 Date: Tue, 07 Feb 2006 16:39:52 -0800
 
  Rick Jones wrote:
   In the realm of straw ideas, how often are netdevs added and removed, 
   and would leaving a tombstone behind consume too much memory?
  
  In certain cases...say, with vlans, you could very often create and
  destroy net devices.  I think that giving up and leaking the memory
  is not a good idea.
 
 I think he's suggesting another thing.  Reattach the skb-dev
 to some dummy device that always persists, when a device goes
 down.

One possible way would be to combine the tombstone with a global 
generation number.

tombstone would be just the netdevice memory itself not freed, but
possibly reused.

Instead of just having a pointer to struct netdevice in the skb have 
a pointer and a generation number. Each use checks the generation
number and if it doesn't match the net devices throw away the
skb because its device is gone. This check is cheap on SMP because 
it's fully read only and can happen with shared cache lines.

Then never free struct net_devices, but only reuse them with
higher generation numbers.

The use could be some central place like dev_queue_xmit()
and perhaps some places in netfilter. I don't think every
reference of skb-dev would need to check it because most could
probably tolerate stale devices

Disadvantages: 
- When the device is freed but not reused yet it must be 
in a state that it can be safely accessed by all users
[should be easy]
- If there is some workload that only needs temporarily a lot
of devices then the system can never recover to use less for them.
Probably not a big issue.
- It would need a few bytes in the skb (and possibly other data structures
referencing the device) to store the generation count.
- The generation number check would add a few cycles, but I guess it's much 
cheaper
than the cache misses from the reference counts.
- If the generation count wraps then skbs could be sent to the wrong devices.
With 32bit probably not a big issue.

-Andi

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH,2.4,SECURITY,NET] orinoco: CVE-2005-3180: Information leakage due to incorrect padding

2006-02-08 Thread Horms
 [PATCH] Better fixup for the orinoco driver
 
 The latest kernel added a pretty ugly fix for the orinoco etherleak bug
 which contains bogus skb-len checks already done by the caller and causes
 copies of all odd sized frames (which are quite common)
 
 While the skb-len check should be ripped out the other fix is harder to do
 properly so I'm proposing for this the -mm tree only until next 2.6.x so
 that it gets tested.
 
 Instead of copying buffers around blindly this code implements a padding
 aware version of the hermes buffer writing function which does padding as
 the buffer is loaded and thus more cleanly and without bogus 1.5K copies.
 
 Signed-off-by: Alan Cox [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

The above is a patch included in 2.6.16 as a fix for CVE-2005-3180.  It to
be applicable to 2.4.  I have made a backport below, with the only
semi-significant change being including the ALIGN macro in orinoco.c, as it
doesn't exist in 2.4.

As yet untested

Signed-off-by: Horms [EMAIL PROTECTED]

index 0c06b14..b99edd3 100644
--- a/drivers/net/wireless/hermes.c
+++ b/drivers/net/wireless/hermes.c
@@ -448,6 +448,43 @@ int hermes_bap_pwrite(hermes_t *hw, int 
return err;
 }
 
+/* Write a block of data to the chip's buffer with padding if
+ * neccessary, via the BAP. Synchronization/serialization is the
+ * caller's problem. len must be even.
+ *
+ * Returns:  0 on internal failure (errno), 0 on success,  0 on error from 
firmware
+ */
+int hermes_bap_pwrite_pad(hermes_t *hw, int bap, const void *buf, unsigned 
data_len, unsigned len,
+ u16 id, u16 offset)
+{
+   int dreg = bap ? HERMES_DATA1 : HERMES_DATA0;
+   int err = 0;
+
+   if (len  0 || len % 2 || data_len  len)
+   return -EINVAL;
+
+   err = hermes_bap_seek(hw, bap, id, offset);
+   if (err)
+   goto out;
+
+   /* Transfer all the complete words of data */
+   hermes_write_words(hw, dreg, buf, data_len/2);
+   /* If there is an odd byte left over pad and transfer it */
+   if (data_len  1) {
+   u8 end[2];
+   end[1] = 0;
+   end[0] = ((unsigned char *)buf)[data_len - 1];
+   hermes_write_words(hw, dreg, end, 1);
+   data_len ++;
+   }
+   /* Now send zeros for the padding */
+   if (data_len  len)
+   hermes_clear_words(hw, dreg, (len - data_len) / 2);
+   /* Complete */
+ out:
+   return err;
+}
+
 /* Read a Length-Type-Value record from the card.
  *
  * If length is NULL, we ignore the length read from the card, and
@@ -534,6 +571,7 @@ EXPORT_SYMBOL(hermes_allocate);
 
 EXPORT_SYMBOL(hermes_bap_pread);
 EXPORT_SYMBOL(hermes_bap_pwrite);
+EXPORT_SYMBOL(hermes_bap_pwrite_pad);
 EXPORT_SYMBOL(hermes_read_ltv);
 EXPORT_SYMBOL(hermes_write_ltv);
 
index 5c01d0d..5a7e587 100644
--- a/drivers/net/wireless/hermes.h
+++ b/drivers/net/wireless/hermes.h
@@ -319,6 +319,8 @@ int hermes_bap_pread(hermes_t *hw, int b
   u16 id, u16 offset);
 int hermes_bap_pwrite(hermes_t *hw, int bap, const void *buf, unsigned len,
u16 id, u16 offset);
+int hermes_bap_pwrite_pad(hermes_t *hw, int bap, const void *buf,
+   unsigned data_len, unsigned len, u16 id, u16 offset);
 int hermes_read_ltv(hermes_t *hw, int bap, u16 rid, unsigned buflen,
u16 *length, void *buf);
 int hermes_write_ltv(hermes_t *hw, int bap, u16 rid,
index 5b5ca26..ec4003f 100644
--- a/drivers/net/wireless/orinoco.c
+++ b/drivers/net/wireless/orinoco.c
@@ -2312,6 +2312,8 @@ orinoco_stat_gather(struct net_device *d
}
 }
 
+#define ALIGN(x,a) (((x)+(a)-1)~((a)-1))
+
 static int
 orinoco_xmit(struct sk_buff *skb, struct net_device *dev)
 {
@@ -2407,14 +2409,22 @@ orinoco_xmit(struct sk_buff *skb, struct
stats-tx_errors++;
goto fail;
}
+   /* Actual xfer length - allow for padding */
+   len = ALIGN(data_len, 2);
+   if (len  ETH_ZLEN - ETH_HLEN)
+   len = ETH_ZLEN - ETH_HLEN;
} else { /* IEEE 802.3 frame */
data_len = len + ETH_HLEN;
data_off = HERMES_802_3_OFFSET;
p = skb-data;
+   /* Actual xfer length - round up for odd length packets */
+   len = ALIGN(data_len, 2);
+   if (len  ETH_ZLEN)
+   len = ETH_ZLEN;
}
 
-   /* Round up for odd length packets */
-   err = hermes_bap_pwrite(hw, USER_BAP, p, RUP_EVEN(data_len), txfid, 
data_off);
+   err = hermes_bap_pwrite_pad(hw, USER_BAP, p, data_len, len,
+   txfid, data_off);
if (err) {
printk(KERN_ERR %s: Error %d writing packet to BAP\n,
   dev-name, err);
-
To unsubscribe from this list: send 

Re: [PATCH] Packet socket: directly access the mmapped ring buffer

2006-02-08 Thread Evgeniy Polyakov
On Wed, Feb 08, 2006 at 02:26:32PM +0900, FUJITA Tomonori ([EMAIL PROTECTED]) 
wrote:
 Mike Christie and I've developed the SCSI Userspace target
 framework. Target LLDs (for Fibre channel, iSCSI HBAs, etc) pass SCSI
 commands to SCSI commands to the user-space daemon. The daemon
 executes the commands and sends the results back to the LLDs.
 
 Please refer scsi-ml for further details.
 
 http://thread.gmane.org/gmane.linux.scsi/22409
 
 We need efficient kernel and user-space communication interface and
 used netlink. Jeff Garzik suggested the packet socket mmap'd ring
 buffer.
 
 The mmap'd ring buffer is really nice, but we want to access directly
 the ring buffer withough going through the networking stack to avoid
 memory allocation and overhead.

Depending on calling context, it can be not sufficient to leave sofirqs
enabled in that function.
 
 Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
 Signed-off-by: Mike Christie [EMAIL PROTECTED]
 ---
 
  include/net/af_packet.h |6 ++
  net/packet/af_packet.c  |   17 +
  2 files changed, 23 insertions(+), 0 deletions(-)
  create mode 100644 include/net/af_packet.h
 
 c627f3a1da6e5e7e9e46d58401adcf168ea45787
 diff --git a/include/net/af_packet.h b/include/net/af_packet.h
 new file mode 100644
 index 000..5a75e07
 --- /dev/null
 +++ b/include/net/af_packet.h
 @@ -0,0 +1,6 @@
 +#ifndef __LINUX_NET_AFPACKET_H
 +#define __LINUX_NET_AFPACKET_H
 +
 +extern struct tpacket_hdr *packet_socket_frame(struct sock *sk);
 +
 +#endif
 diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
 index 9db7dbd..b5fbd74 100644
 --- a/net/packet/af_packet.c
 +++ b/net/packet/af_packet.c
 @@ -562,6 +562,23 @@ drop:
  }
  
  #ifdef CONFIG_PACKET_MMAP
 +struct tpacket_hdr *packet_socket_frame(struct sock *sk)
 +{
 + struct packet_sock *po;
 + struct tpacket_hdr *h;
 +
 + po = pkt_sk(sk);
 + spin_lock(sk-sk_receive_queue.lock);
 + h = (struct tpacket_hdr *) packet_lookup_frame(po, po-head);
 + if (h-tp_status)
 + h = ERR_PTR(-ENOBUFS);
 + else
 + po-head = po-head != po-frame_max ? po-head+1 : 0;
 + spin_unlock(sk-sk_receive_queue.lock);
 + return h;
 +}
 +EXPORT_SYMBOL_GPL(packet_socket_frame);
 +
  static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, struct 
 packet_type *pt, struct net_device *orig_dev)
  {
   struct sock *sk;
 -- 
 1.1.3

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Wednesday 08 February 2006 11:34, Eric Dumazet wrote:

 1) Instead of storing a 2-uple {pointer,generation} (and using 12 or 16 bytes 
 on 64 bits platforms), we could just use a 32 bit quantity 
 [(ifindex8)+(gen_number)]

That would add an 2^24 netdevice limit. Someone will sooner or later
run into this.

Also I'm not sure it's worth it just to save 4 bytes. It would cost more
TLB misses too.

 2) Not easy to not free the devices because their size is variable (see 
 alloc_netdev()) depending on private data for the driver.

you mean reuse?

I guess when you have lots of devices they are very likely to be all
the same size (e.g. all vlan or all PPP) So it would be reasonable to only 
reuse a device with the same size.

Only drawback is that it might be a long search for the right size
if it's at the other end of a long list and there isn't a hash table
or similar.  But I guess that case could be ignored at first because it should 
be 
very unlikely and if it really should be a problem for somebody then some
suitable data structure could be added.

-Andi

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Strange IPsec freeze/partial fix

2006-02-08 Thread Olaf Kirch
On Wed, Feb 08, 2006 at 07:46:48AM +1100, Herbert Xu wrote:
 I suggest that we simply bail out always.  If the dst decides to die
 on us later on, the packet will be dropped anyway.  So there is no
 great urgency to retry here.  Once we have the proper resolution
 queueing, we can then do the retry again.

Yes, that is simpler, and should work as well.

 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Acked-by: Olaf Kirch [EMAIL PROTECTED]

Olaf
-- 
Olaf Kirch   |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [IPVS] Shrink ip_vs_*.c includes

2006-02-08 Thread Arnaldo Carvalho de Melo
On 2/8/06, Roberto Nibali [EMAIL PROTECTED] wrote:

 I've thought about the linux/modules.h header (ignoring the fact
 mentioned by Herbert because of its small chance of happening) and my
 suggestion was/is to move it to ip_vs.h. I'll ping Arnaldo regarding
 this cleanup, since he's done it before and certainly has some scripts
 to automate this.

Well, I use a graphviz based tool to see the picture, i.e. the
header hierarchy,
like this one:

http://oops.ghostprotocols.net:81/acme/tcp.h.ps

The tools are called hviz and ghviz and are available at:

http://www.kernel.org/pub/linux/kernel/people/acme/

I've been thinking about using sparse to do a include linker, i.e. a tool
that would find all the symbols provided by .h files and the symbols
required by .h and .c files, that would tell what headers are needed
and what headers currently being #included aren't needed and should
be dropped, but have other priorities in my TODO list right now :-\

- Arnaldo
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] fix a severe bug in netlink

2006-02-08 Thread Alexey Kuznetsov
Hello!

netlink overrun was broken while improvement of netlink.
Destination socket is used in the place where it was meant to be source socket,
so that now overrun is never sent to user netlink sockets, when it should be,
and it even can be set on kernel socket, which results in complete deadlock
of rtnetlink.

Suggested fix is to restore status quo passing source socket as additional
argument to netlink_attachskb().

A little explanation: overrun is set on a socket, when it failed
to receive some message and sender of this messages does not or even
have no way to handle this error. This happens in two cases:
1. when kernel sends something. Kernel never retransmits and cannot
   wait for buffer space. 
2. when user sends a broadcast and the message was not delivered
   to some recipients. 


Signed-off-by: Alexey Kuznetsov [EMAIL PROTECTED]

diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 6a2ccf7..c256ebe 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -160,7 +160,8 @@ extern int netlink_unregister_notifier(s
 
 /* finegrained unicast helpers: */
 struct sock *netlink_getsockbyfilp(struct file *filp);
-int netlink_attachskb(struct sock *sk, struct sk_buff *skb, int nonblock, long 
timeo);
+int netlink_attachskb(struct sock *sk, struct sk_buff *skb, int nonblock,
+   long timeo, struct sock *ssk);
 void netlink_detachskb(struct sock *sk, struct sk_buff *skb);
 int netlink_sendskb(struct sock *sk, struct sk_buff *skb, int protocol);
 
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 59302fc..fd2e26b 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1018,7 +1018,8 @@ retry:
goto out;
}
 
-   ret = netlink_attachskb(sock, nc, 0, 
MAX_SCHEDULE_TIMEOUT);
+   ret = netlink_attachskb(sock, nc, 0,
+   MAX_SCHEDULE_TIMEOUT, NULL);
if (ret == 1)
goto retry;
if (ret) {
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 2101b45..6b9772d 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -702,7 +702,8 @@ struct sock *netlink_getsockbyfilp(struc
  * 0: continue
  * 1: repeat lookup - reference dropped while waiting for socket memory.
  */
-int netlink_attachskb(struct sock *sk, struct sk_buff *skb, int nonblock, long 
timeo)
+int netlink_attachskb(struct sock *sk, struct sk_buff *skb, int nonblock,
+   long timeo, struct sock *ssk)
 {
struct netlink_sock *nlk;
 
@@ -712,7 +713,7 @@ int netlink_attachskb(struct sock *sk, s
test_bit(0, nlk-state)) {
DECLARE_WAITQUEUE(wait, current);
if (!timeo) {
-   if (!nlk-pid)
+   if (!ssk || nlk_sk(ssk)-pid == 0)
netlink_overrun(sk);
sock_put(sk);
kfree_skb(skb);
@@ -797,7 +798,7 @@ retry:
kfree_skb(skb);
return PTR_ERR(sk);
}
-   err = netlink_attachskb(sk, skb, nonblock, timeo);
+   err = netlink_attachskb(sk, skb, nonblock, timeo, ssk);
if (err == 1)
goto retry;
if (err)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] NET : No need to update last_rx in loopback driver

2006-02-08 Thread Eric Dumazet
loopback driver carefully uses per_cpu storage for statistics but updates 
loopback_dev.last_rx


As last_rx is only used by bond driver to detect link activity, and loopback 
dev wont be used as a bond slave, can we avoid this write access that slowdown 
SMP platforms, because of cache ping pongs ?


If this patch is not correct, I suspect we should update trans_start in 
loopback driver since the condition in the bond driver is bidirectional :


if (((jiffies - slave-dev-trans_start) = delta_in_ticks) 
((jiffies - slave-dev-last_rx) = delta_in_ticks)) {

Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
--- a/drivers/net/loopback.c2006-02-07 12:10:55.0 +0100
+++ b/drivers/net/loopback.c2006-02-08 16:49:08.0 +0100
@@ -147,7 +147,6 @@
return 0;
}
 #endif
-   dev-last_rx = jiffies;
 
lb_stats = per_cpu(loopback_stats, get_cpu());
lb_stats-rx_bytes += skb-len;


[PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Alexey Kuznetsov
Hello!

When a netlink message is not related to a netlink socket,
it is issued by kernel socket with pid 0. Netlink pid has nothing
to do with current-pid. I called it incorrectly, if it was named port,
the confusion would be avoided.

Jamal, please, review. Did you have reasons to do this?

Signed-off-by: Alexey Kuznetsov [EMAIL PROTECTED]

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 8700379..eca2976 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -455,7 +455,7 @@ void rtmsg_ifinfo(int type, struct net_d
if (!skb)
return;
 
-   if (rtnetlink_fill_ifinfo(skb, dev, type, current-pid, 0, change, 0)  
0) {
+   if (rtnetlink_fill_ifinfo(skb, dev, type, 0, 0, change, 0)  0) {
kfree_skb(skb);
return;
}
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 95b9d81..3ffa60d 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1135,7 +1135,7 @@ static void rtmsg_ifa(int event, struct 
 
if (!skb)
netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, ENOBUFS);
-   else if (inet_fill_ifaddr(skb, ifa, current-pid, 0, event, 0)  0) {
+   else if (inet_fill_ifaddr(skb, ifa, 0, 0, event, 0)  0) {
kfree_skb(skb);
netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, EINVAL);
} else {
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index ef4724d..0f4145b 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1045,7 +1045,7 @@ fib_convert_rtentry(int cmd, struct nlms
}
 
nl-nlmsg_flags = NLM_F_REQUEST;
-   nl-nlmsg_pid = current-pid;
+   nl-nlmsg_pid = 0;
nl-nlmsg_seq = 0;
nl-nlmsg_len = NLMSG_LENGTH(sizeof(*rtm));
if (cmd == SIOCDELRT) {
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO) feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs

2006-02-08 Thread Ravinandan Arakali
Hi,
Just wondering if anybody got a chance to review the below patch.
This version(as per Rick's comment on v1 patch) includes support
for TCP timestamps.

Thanks,
Ravi

-Original Message-
From: Ravinandan Arakali [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 25, 2006 11:53 AM
To: [EMAIL PROTECTED]; netdev@vger.kernel.org
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO)
feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs


Hi,
Below is a patch for the Large Receive Offload feature.
Please review and let us know your comments.

LRO algorithm was described in an OLS 2005 presentation, located at
ftp.s2io.com
user: linuxdocs
password: HALdocs

The same ftp site has Programming Manual for Xframe-I ASIC.
LRO feature is supported on Neterion Xframe-I, Xframe-II and
Xframe-Express 10GbE NICs.

Brief description:
The Large Receive Offload(LRO) feature is a stateless offload
that is complementary to TSO feature but on the receive path.
The idea is to combine and collapse(upto 64K maximum) in the
driver, in-sequence TCP packets belonging to the same session.
It is mainly designed to improve 1500 mtu receive performance,
since Jumbo frame performance is already close to 10GbE line
rate. Some performance numbers are attached below.

Implementation details:
1. Handle packet chains from multiple sessions(current default
MAX_LRO_SESSSIONS=32).
2. Examine each packet for eligiblity to aggregate. A packet is
considered eligible if it meets all the below criteria.
  a. It is a TCP/IP packet and L2 type is not LLC or SNAP.
  b. The packet has no checksum errors(L3 and L4).
  c. There are no IP options. The only TCP option supported is timestamps.
  d. Search and locate the LRO object corresponding to this
 socket and ensure packet is in TCP sequence.
  e. It's not a special packet(SYN, FIN, RST, URG, PSH etc. flags are not
set).
  f. TCP payload is non-zero(It's not a pure ACK).
  g. It's not an IP-fragmented packet.
3. If a packet is found eligible, the LRO object is updated with
   information such as next sequence number expected, current length
   of aggregated packet and so on. If not eligible or max packets
   reached, update IP and TCP headers of first packet in the chain
   and pass it up to stack.
4. The frag_list in skb structure is used to chain packets into one
   large packet.

Kernel changes required: None

Performance results:
Main focus of the initial testing was on 1500 mtu receiver, since this
is a bottleneck not covered by the existing stateless offloads.

There are couple disclaimers about the performance results below:
1. Your mileage will vary We initially concentrated on couple pci-x
2.0 platforms that are powerful enough to push 10 GbE NIC and do not
have bottlenecks other than cpu%;  testing on other platforms is still
in progress. On some lower end systems we are seeing lower gains.

2. Current LRO implementation is still (for the most part) software based,
and therefore performance potential of the feature is far from being
realized.
Full hw implementation of LRO is expected in the next version of Xframe
ASIC.

Performance delta(with MTU=1500) going from LRO disabled to enabled:
IBM 2-way Xeon (x366) : 3.5 to 7.1 Gbps
2-way Opteron : 4.5 to 6.1 Gbps

Signed-off-by: Ravinandan Arakali [EMAIL PROTECTED]
---

diff -urpN old/drivers/net/s2io.c new_ts/drivers/net/s2io.c
--- old/drivers/net/s2io.c  2006-01-19 04:31:05.0 -0800
+++ new_ts/drivers/net/s2io.c   2006-01-24 08:56:25.0 -0800
@@ -57,6 +57,9 @@
 #include linux/ethtool.h
 #include linux/workqueue.h
 #include linux/if_vlan.h
+#include linux/ip.h
+#include linux/tcp.h
+#include net/tcp.h

 #include asm/system.h
 #include asm/uaccess.h
@@ -66,7 +69,7 @@
 #include s2io.h
 #include s2io-regs.h

-#define DRV_VERSION Version 2.0.9.4
+#define DRV_VERSION 2.0.11.2

 /* S2io Driver name  version. */
 static char s2io_driver_name[] = Neterion;
@@ -168,6 +171,11 @@ static char ethtool_stats_keys[][ETH_GST
{\n DRIVER STATISTICS},
{single_bit_ecc_errs},
{double_bit_ecc_errs},
+   (lro_aggregated_pkts),
+   (lro_flush_both_count),
+   (lro_out_of_sequence_pkts),
+   (lro_flush_due_to_max_pkts),
+   (lro_avg_aggr_pkts),
 };

 #define S2IO_STAT_LEN sizeof(ethtool_stats_keys)/ ETH_GSTRING_LEN
@@ -317,6 +325,12 @@ static unsigned int indicate_max_pkts;
 static unsigned int rxsync_frequency = 3;
 /* Interrupt type. Values can be 0(INTA), 1(MSI), 2(MSI_X) */
 static unsigned int intr_type = 0;
+/* Large receive offload feature */
+static unsigned int lro = 0;
+/* Max pkts to be aggregated by LRO at one time. If not specified,
+ * aggregation happens until we hit max IP pkt size(64K)
+ */
+static unsigned int lro_max_pkts = 0x;

 /*
  * S2IO device table.
@@ -1476,6 +1490,19 @@ static int init_nic(struct s2io_nic *nic
writel((u32) 

Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Hasso Tepper
jamal wrote:
 On Wed, 2006-08-02 at 18:27 +0300, Alexey Kuznetsov wrote:
  When a netlink message is not related to a netlink socket,
  it is issued by kernel socket with pid 0. Netlink pid has nothing
  to do with current-pid. I called it incorrectly, if it was named
  port, the confusion would be avoided.

This confusion was the main reason I rewrote rtnetlink.7 manpage. I 
received 0 comments till now, though ... does it mean it's so good? ;) 
Can you, Alexey, comment at least pid part?

http://www.mail-archive.com/netdev%40vger.kernel.org/msg06166.html

  Jamal, please, review. Did you have reasons to do this?

 The reason was driven by some apps such as quagga/zebra which
 get confused when they see pid of 0 for things _they_ added.
 Essentially there was lack of consistency, at times the app that made
 the kernel change has its pid appear on the resulting netlink message
 and at others it was 0 or the large (negative) number when you had
 more than 1 socket within the same process.

 CCing Hasso Tepper and more details of the original fix are
 here:
 http://lists.quagga.net/pipermail/quagga-dev/2005-June/003507.html

No, that was different issue and isn't related with issue Alexey poiting 
to. The issue I complained and you fixed it, Jamal, was that IPv6 related 
netlink messages had always pid 0 even if they were issued by 
application.

What Alexey pointing to is the change you did earlier - set pid in the 
messages not related to netlink sockets - ie. changes initiated by user 
using ioctls for example.

-- 
Hasso Tepper
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO) feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs

2006-02-08 Thread Jeff Garzik

Ravinandan Arakali wrote:

Hi,
Just wondering if anybody got a chance to review the below patch.
This version(as per Rick's comment on v1 patch) includes support
for TCP timestamps.


It's been merged in the 'lro' branch of netdev-2.6.git for a little 
while now.  Once it gets additional review (and hopefully testing), I am 
OK with it going upstream.


Jeff



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Stephen Hemminger
On Tue, 07 Feb 2006 16:26:01 -0800 (PST)
David S. Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Tue, 7 Feb 2006 16:19:42 -0800
 
  Also, isn't a lot of the problem reduced if network devices
  are affinitied?
 
 Not for routing/firewalling, we touch the destination device's
 counters on input softing of the source device.

IMHO converting skb-dev to skb-devindex and using ifindex sounds best.
It gets rid of the need to refcount as much but keeps the safety from
buggy protocols.  Ipv6 could probably use ifindex as well.

-- 
Stephen Hemminger [EMAIL PROTECTED]
OSDL http://developer.osdl.org/~shemminger
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 address autoconfiguration does not work after device down/up cycle

2006-02-08 Thread Juha-Matti Tapio
On Wed, Feb 08, 2006 at 05:17:24PM +0200, Kristian Slavov wrote:
 During NETDEV_DOWN we clear IF_READY, and we don't set it back in 
 NETDEV_UP. While starting to perform DAD on the link-local address, we 
 notice that the device is not in IF_READY, and we abort autoconfiguration 
 process (which would eventually send router solicitations).
 The following patch seems to get the work done. Patch is against 2.6.15.3.

Ack. This fixes the problems with sungem that I reported a few days ago.
Thank you. 


signature.asc
Description: Digital signature


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Ben Greear

Stephen Hemminger wrote:

On Tue, 07 Feb 2006 16:26:01 -0800 (PST)
David S. Miller [EMAIL PROTECTED] wrote:



From: Stephen Hemminger [EMAIL PROTECTED]
Date: Tue, 7 Feb 2006 16:19:42 -0800



Also, isn't a lot of the problem reduced if network devices
are affinitied?


Not for routing/firewalling, we touch the destination device's
counters on input softing of the source device.



IMHO converting skb-dev to skb-devindex and using ifindex sounds best.
It gets rid of the need to refcount as much but keeps the safety from
buggy protocols.  Ipv6 could probably use ifindex as well.


If we do this, can we keep a skb-dev pointer and assign it lazily
(sort of like we do with the timestamp?)  That way, we can hopefully
optimize to not bump the refcount in the hot path, but older protocols
can easily be made to work as they have been...

If there is any ifidx - skb lookups in the hot path, that is liable
to kill us since we'll have to take a lock and grovel through
the netdevice hash table?

Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Ben Greear

Stephen Hemminger wrote:

On Wed, 08 Feb 2006 11:24:24 -0800
Ben Greear [EMAIL PROTECTED] wrote:



Stephen Hemminger wrote:


On Tue, 07 Feb 2006 16:26:01 -0800 (PST)
David S. Miller [EMAIL PROTECTED] wrote:




From: Stephen Hemminger [EMAIL PROTECTED]
Date: Tue, 7 Feb 2006 16:19:42 -0800




Also, isn't a lot of the problem reduced if network devices
are affinitied?


Not for routing/firewalling, we touch the destination device's
counters on input softing of the source device.



IMHO converting skb-dev to skb-devindex and using ifindex sounds best.
It gets rid of the need to refcount as much but keeps the safety from
buggy protocols.  Ipv6 could probably use ifindex as well.


If we do this, can we keep a skb-dev pointer and assign it lazily
(sort of like we do with the timestamp?)  That way, we can hopefully
optimize to not bump the refcount in the hot path, but older protocols
can easily be made to work as they have been...



No, just fix the protocols.


Well imagine that the protocol *does* need the netdev.  If we have
to look it up once, might as well store the result in case the next
user of the skb also needs it.  It's a net gain of 4 bytes in
the skb to preserve the skb pointer..seems like a reasonable tradeoff
to me.

Eventually, might be able to 'fix' all the protocols..but surely that
won't happen over-night.


If there is any ifidx - skb lookups in the hot path, that is liable
to kill us since we'll have to take a lock and grovel through
the netdevice hash table?


The hash table locking could be per chain. or lockless.


Even a per chain lock would be worse than the atomic inc/dec
of the refcount, wouldn't it?

Now, a lockless table..that would be nice.  That should be a win
regardless of any skb work.  Sending packets to a packet-socket
uses a dev_get for every packet, for instance...

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KERNEL: assertion (!sk-sk_forward_alloc) failed

2006-02-08 Thread Jesse Brandeburg
this should be on netdev (cc'd), i included some of the thread here.

On 2/7/06, Yoseph Basri [EMAIL PROTECTED] wrote:
 Hello kernel maillist,

 I'm new member maillist.

 Currently, I receive the warning log from my kernel.

 Since update to
 Linux  2.6.14.3 #1 SMP Fri Nov 25 20:20:05 SGT 2005 i686 GNU/Linux from 2.4,

 I am getting the warning log:

 kernel: KERNEL: assertion (!sk-sk_forward_alloc) failed at
 net/core/stream.c (279)
 kernel: KERNEL: assertion (!sk-sk_forward_alloc) failed at
 net/ipv4/af_inet.c (148)

 Any information about this issue, and how to solve this problem ?

 Another server that i rolled back from 2.6.14 to 2.4 kernel and has no
 warning again. is this bug from 2.6 kernel?

 Thanks for your info.

On 2/7/06, Yoseph Basri [EMAIL PROTECTED] wrote:
 Hi Boris,

 I think so, here is from the dmeg info:

 Intel(R) PRO/1000 Network Driver - version 6.0.60-k2
 Copyright (c) 1999-2005 Intel Corporation.
 ACPI: PCI Interrupt :06:08.0[A] - GSI 29 (level, low) - IRQ 16
 e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
 ACPI: PCI Interrupt :06:08.1[B] - GSI 30 (level, low) - IRQ 17
 e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
 e100: Intel(R) PRO/100 Network Driver, 3.4.14-k2-NAPI
 e100: Copyright(c) 1999-2005 Intel Corporation

 any info regarding this warning?

 YB

 On 2/7/06, Boris B. Zhmurov [EMAIL PROTECTED] wrote:
  Hello, Yoseph Basri.
 
  On 07.02.2006 12:46 you said the following:
 
   I am getting the warning log:
  
   kernel: KERNEL: assertion (!sk-sk_forward_alloc) failed at
   net/core/stream.c (279)
   kernel: KERNEL: assertion (!sk-sk_forward_alloc) failed at
   net/ipv4/af_inet.c (148)
 
 
  Do you have Intel Pro 1000 network card? Same here...

whats the relevance of e1000?

I though Herbert had fixed these, and it looks like half the patches
got into 2.6.14.3, but not the fix to the fix committed on 9-6 (not in
2.6.14.* at all)

Jesse
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Hasso Tepper
jamal wrote:
 Ok, thanks for the reminder Hasso.
 so essentially at the moment the pid that will show up (if
 quagga added the v6 route) will be that of quagga, correct?

No. Quote from Alexey:

Netlink pid has nothing to do with current-pid. I called it 
incorrectly, if it was named port, the confusion would be avoided.

And quote from netlink.7 I wrote:

nl_pid is the unicast address of netlink socket.  It's always 0 if the 
destination is in the kernel.  For a  userspace  process,  nl_pid  is 
usually  the  PID  of  the process owning the destination socket. 
However, nl_pid identifies a netlink socket, not a process.  If a process 
owns several netlink sockets, then nl_pid can only be equal to the 
process ID for at most one socket.

There are two ways to  assign  nl_pid to  a  netlink  socket.  If the 
application sets nl_pid before calling bind(2), then it is up to the 
application to make sure that nl_pid is unique.  If the application sets 
it to 0, the kernel takes care of assigning it.  The kernel assigns the 
process ID  to  the  first  netlink socket the process opens and assigns 
a unique nl_pid to every netlink socket that the process subsequently 
creates.

Once again link to the archive to the message I posted with new netlink.7 
no one bother to comment yet ;).

http://www.mail-archive.com/netdev%40vger.kernel.org/msg06166.html


-- 
Hasso Tepper
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Which driver for SysKonnect SK-9Exxx PCIe NIC?

2006-02-08 Thread Stephen Hemminger
On Wed, 08 Feb 2006 12:20:22 -0800
Ben Greear [EMAIL PROTECTED] wrote:

 I have a dual-port SysKonnect NIC.  In 'lspci' I see one entry
 (was expecting two..but maybe that is a separate issue).  The
 chipset appears to be SK-9Exx 10/100/1000Base-T adapter (rev 12).
 
 skge does not seem to support this NIC (in 2.6.15).  Is there
 another driver that does?
 
 Thanks,
 Ben
 

sky2 from 2.6.16 


-- 
Stephen Hemminger [EMAIL PROTECTED]
OSDL http://developer.osdl.org/~shemminger
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: network delays, mysterious push packets

2006-02-08 Thread Stephen Hemminger
On Wed, 08 Feb 2006 12:18:23 -0800
David Carlton [EMAIL PROTECTED] wrote:

 On Mon, 06 Feb 2006 15:50:28 -0800 (PST), David S. Miller [EMAIL 
 PROTECTED] said:
 
  From: David Carlton [EMAIL PROTECTED]
  Date: Mon, 06 Feb 2006 14:38:10 -0800
 
  I'm working on an application that we're trying to switch from a 2.4
  kernel to a 2.6 kernel.  (I believe we're using 2.6.9.)  One part of
  the program periodically sends out chunks of data (whose size is just
  over 1MB) via tcp.
 
  Please reproduce with something more current and report to the correct
  mailing list (netdev@vger.kernel.org).
 
 Thanks for the suggestion.  The problem went away with 2.6.15 (I
 believe uname -a called it 2.6.15.3), which is nice.
 
 I'll try to see if I can figure out which patch caused the problem (it
 may or may not be easy for us to switch to that kernel); I don't
 suppose you have any idea what patch caused the problem?  (The other
 endpoint is that the problem is still present in 2.6.10, where for the
 latter we're using whatever the latest 2.6.10 kernel is that's
 available for Fedora Core 2.)  Also, is
 http://linux.bkbits.net:8080/linux-2.6/hist/net/ipv4/ the right
 place to look for revision history?  Recent commits don't have a
 change message, which makes me think I'm looking in the wrong place.
 
 I put up a tcpdump at
 http://www.bactrian.org/~carlton/sender.tcpdump, if anybody's
 curious.  The problems start with packet 7367; the problematic
 transmission is from 64.12.82.96:33826 to 64.12.82.49:34483 (which
 starts at packet 6482).  The tcpdump was taken on the sender side, and
 some packets are missing or corrupted in the dump (for reasons that I
 assume have nothing to do with the problem at hand.)
 

I bet the new appropriate byte count (abc) logic is fixing your app.
You can get old behavior with sysctl.ipv4.tcp_abc=0.

If you want to find a patch that made things better/worse 
(and it is post git). Look into git bisect


-- 
Stephen Hemminger [EMAIL PROTECTED]
OSDL http://developer.osdl.org/~shemminger
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread jamal
On Wed, 2006-08-02 at 22:50 +0200, Hasso Tepper wrote:
 jamal wrote:
  Ok, thanks for the reminder Hasso.
  so essentially at the moment the pid that will show up (if
  quagga added the v6 route) will be that of quagga, correct?
 
 No. Quote from Alexey:
 

I sense you missed my question. I understand the variants of PIDs that
can occur(real, 0 or ones that need you to do a getname()), the question
is:
At the moment if a route (v6 or v4) was added by quagga and i had a
socket that was listening in a different process - what pid will i see
(in my user space app)? Is it of the quagga process or is it 0?

cheers,
jamal

PS:-I will read your man-page - I have it printed, just havent got
around to it yet.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Hasso Tepper
jamal wrote:
 On Wed, 2006-08-02 at 22:50 +0200, Hasso Tepper wrote:
  jamal wrote:
   Ok, thanks for the reminder Hasso.
   so essentially at the moment the pid that will show up (if
   quagga added the v6 route) will be that of quagga, correct?
 
  No. Quote from Alexey:

 I sense you missed my question. I understand the variants of PIDs that
 can occur(real, 0 or ones that need you to do a getname()), the
 question is:
 At the moment if a route (v6 or v4) was added by quagga and i had a
 socket that was listening in a different process - what pid will i see
 (in my user space app)? Is it of the quagga process or is it 0?

No, it's 0xefff for most of Quagga users. The reason behind is that 
Quagga uses the first netlink socket it opens only for listening 
multicast groups. For sending messages to the kernel it opens second 
socket. Quagga lets kernel assign nl_pid's to the socket (by setting it 
to 0 before calling bind()). The first socket gets nl_pid equal to 
process pid of quagga (actually it's zebra daemon of quagga suite ;) and 
the second gets (0 - 4097) - see  
net/netlink/af_netlink.c:netlink_autobind() for details.

As far as I understand it - the only connection between process pid and 
nl_pid member in sockaddr_nl structure is that Alexey happened to use pid 
of the process opening netlink socket as base of algorithm assigning 
unique unicast address to this socket if application doesn't set it 
itself (ie. it's 0 before calling bind()).


-- 
Hasso Tepper
Elion Enterprises Ltd.
WAN administrator
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread jamal
On Wed, 2006-08-02 at 23:41 +0200, Hasso Tepper wrote:
 jamal wrote:

  At the moment if a route (v6 or v4) was added by quagga and i had a
  socket that was listening in a different process - what pid will i see
  (in my user space app)? Is it of the quagga process or is it 0?
 
 No, it's 0xefff for most of Quagga users. The reason behind is that 
 Quagga uses the first netlink socket it opens only for listening 
 multicast groups. For sending messages to the kernel it opens second 
 socket. Quagga lets kernel assign nl_pid's to the socket (by setting it 
 to 0 before calling bind()). The first socket gets nl_pid equal to 
 process pid of quagga (actually it's zebra daemon of quagga suite ;) and 
 the second gets (0 - 4097) -

Right - this is what i referred to as the negative number. if you add
a third socket it will go below 0xefff etc
  
 net/netlink/af_netlink.c:netlink_autobind() for details.
 
 As far as I understand it - the only connection between process pid and 
 nl_pid member in sockaddr_nl structure is that Alexey happened to use pid 
 of the process opening netlink socket as base of algorithm assigning 
 unique unicast address to this socket if application doesn't set it 
 itself (ie. it's 0 before calling bind()).

So the question is what would be the address/nl_pid of something
issued by an ioctl (refer to my earlier email to Alexey).

cheers,
jamal 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] check connect(2) status for IPv6 UDP socket

2006-02-08 Thread David S. Miller
From: Nicolas DICHTEL [EMAIL PROTECTED]
Date: Mon, 06 Feb 2006 12:00:30 +0100

 in the same way of this patch, why dst_entry are stored for
 RAW socket ? In case of specific IPSec rules for ICMPv6,
 xfrm state can be different for the same destination.
 Attached, a proposed patch.

We cache the flow we used to store that dst into the socket,
and we'll verify that on the next sendmsg() call so it's OK.

See the checks done in ip6_dst_lookup() when we have a cached
route attached to the socket.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Hasso Tepper
jamal wrote:
 So the question is what would be the address/nl_pid of something
 issued by an ioctl (refer to my earlier email to Alexey).

It's the kernel who creates this message and puts it to the netlink 
domain, so I'd say 0.


-- 
Hasso Tepper
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : No need to update last_rx in loopback driver

2006-02-08 Thread David S. Miller
From: Eric Dumazet [EMAIL PROTECTED]
Date: Wed, 08 Feb 2006 16:06:31 +0100

 loopback driver carefully uses per_cpu storage for statistics but updates 
 loopback_dev.last_rx

This has been discussed before, this is an attribute every
driver must keep uptodate.  Things like bonding use it,
for example.

Yes, loopback isn't usable with bonding, but it's a bad
precedence to set that a useful metric is set for some
devices and not for others.

Find a cleaner way to fix this SMP cacheline pingpong,
thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Wednesday 08 February 2006 20:12, Stephen Hemminger wrote:
 On Tue, 07 Feb 2006 16:26:01 -0800 (PST)
 David S. Miller [EMAIL PROTECTED] wrote:
 
  From: Stephen Hemminger [EMAIL PROTECTED]
  Date: Tue, 7 Feb 2006 16:19:42 -0800
  
   Also, isn't a lot of the problem reduced if network devices
   are affinitied?
  
  Not for routing/firewalling, we touch the destination device's
  counters on input softing of the source device.
 
 IMHO converting skb-dev to skb-devindex and using ifindex sounds best.
 It gets rid of the need to refcount as much but keeps the safety from
 buggy protocols.  Ipv6 could probably use ifindex as well.

But are you sure the hash table used for that would be strong enough to handle
the load?

And what happens when a ifindex is reused? Then packets could end up
on the wrong interface. That might be a security issue.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2.6.16-rc1] S2io: Large Receive Offload (LRO) feature(v2) for Neterion (s2io) 10GbE Xframe PCI-X and PCI-E NICs

2006-02-08 Thread David S. Miller
From: Jeff Garzik [EMAIL PROTECTED]
Date: Wed, 08 Feb 2006 13:02:15 -0500

 Ravinandan Arakali wrote:
  Hi,
  Just wondering if anybody got a chance to review the below patch.
  This version(as per Rick's comment on v1 patch) includes support
  for TCP timestamps.
 
 It's been merged in the 'lro' branch of netdev-2.6.git for a little 
 while now.  Once it gets additional review (and hopefully testing), I am 
 OK with it going upstream.

Now that the timestamp support is in there I'm fine with these
changes.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Alexey Dobriyan
Signed-off-by: Alexey Dobriyan [EMAIL PROTECTED]
---

 ppp is the biggest offender.

 drivers/net/ppp_async.c   |   34 ++--
 drivers/net/ppp_generic.c |  128 +++---
 drivers/net/ppp_synctty.c |   26 -
 drivers/net/pppoe.c   |2
 4 files changed, 95 insertions(+), 95 deletions(-)

--- a/drivers/net/ppp_async.c
+++ b/drivers/net/ppp_async.c
@@ -159,7 +159,7 @@ ppp_asynctty_open(struct tty_struct *tty
 
err = -ENOMEM;
ap = kmalloc(sizeof(*ap), GFP_KERNEL);
-   if (ap == 0)
+   if (!ap)
goto out;
 
/* initialize the asyncppp structure */
@@ -215,7 +215,7 @@ ppp_asynctty_close(struct tty_struct *tt
ap = tty-disc_data;
tty-disc_data = NULL;
write_unlock_irq(disc_data_lock);
-   if (ap == 0)
+   if (!ap)
return;
 
/*
@@ -230,10 +230,10 @@ ppp_asynctty_close(struct tty_struct *tt
tasklet_kill(ap-tsk);
 
ppp_unregister_channel(ap-chan);
-   if (ap-rpkt != 0)
+   if (ap-rpkt)
kfree_skb(ap-rpkt);
skb_queue_purge(ap-rqueue);
-   if (ap-tpkt != 0)
+   if (ap-tpkt)
kfree_skb(ap-tpkt);
kfree(ap);
 }
@@ -285,13 +285,13 @@ ppp_asynctty_ioctl(struct tty_struct *tt
int err, val;
int __user *p = (int __user *)arg;
 
-   if (ap == 0)
+   if (!ap)
return -ENXIO;
err = -EFAULT;
switch (cmd) {
case PPPIOCGCHAN:
err = -ENXIO;
-   if (ap == 0)
+   if (!ap)
break;
err = -EFAULT;
if (put_user(ppp_channel_index(ap-chan), p))
@@ -301,7 +301,7 @@ ppp_asynctty_ioctl(struct tty_struct *tt
 
case PPPIOCGUNIT:
err = -ENXIO;
-   if (ap == 0)
+   if (!ap)
break;
err = -EFAULT;
if (put_user(ppp_unit_number(ap-chan), p))
@@ -354,7 +354,7 @@ ppp_asynctty_receive(struct tty_struct *
struct asyncppp *ap = ap_get(tty);
unsigned long flags;
 
-   if (ap == 0)
+   if (!ap)
return;
spin_lock_irqsave(ap-recv_lock, flags);
ppp_async_input(ap, buf, cflags, count);
@@ -373,7 +373,7 @@ ppp_asynctty_wakeup(struct tty_struct *t
struct asyncppp *ap = ap_get(tty);
 
clear_bit(TTY_DO_WRITE_WAKEUP, tty-flags);
-   if (ap == 0)
+   if (!ap)
return;
set_bit(XMIT_WAKEUP, ap-xmit_flags);
tasklet_schedule(ap-tsk);
@@ -688,7 +688,7 @@ ppp_async_push(struct asyncppp *ap)
tty_stuffed = 1;
continue;
}
-   if (ap-optr = ap-olim  ap-tpkt != 0) {
+   if (ap-optr = ap-olim  ap-tpkt) {
if (ppp_async_encode(ap)) {
/* finished processing ap-tpkt */
clear_bit(XMIT_FULL, ap-xmit_flags);
@@ -708,7 +708,7 @@ ppp_async_push(struct asyncppp *ap)
clear_bit(XMIT_BUSY, ap-xmit_flags);
/* any more work to do? if not, exit the loop */
if (!(test_bit(XMIT_WAKEUP, ap-xmit_flags)
- || (!tty_stuffed  ap-tpkt != 0)))
+ || (!tty_stuffed  ap-tpkt)))
break;
/* more work to do, see if we can do it now */
if (test_and_set_bit(XMIT_BUSY, ap-xmit_flags))
@@ -719,7 +719,7 @@ ppp_async_push(struct asyncppp *ap)
 
 flush:
clear_bit(XMIT_BUSY, ap-xmit_flags);
-   if (ap-tpkt != 0) {
+   if (ap-tpkt) {
kfree_skb(ap-tpkt);
ap-tpkt = NULL;
clear_bit(XMIT_FULL, ap-xmit_flags);
@@ -852,7 +852,7 @@ ppp_async_input(struct asyncppp *ap, con
s = 0;
for (i = 0; i  count; ++i) {
c = buf[i];
-   if (flags != 0  flags[i] != 0)
+   if (flags  flags[i] != 0)
continue;
s |= (c  0x80)? SC_RCV_B7_1: SC_RCV_B7_0;
c = ((c  4) ^ c)  0xf;
@@ -869,7 +869,7 @@ ppp_async_input(struct asyncppp *ap, con
n = scan_ordinary(ap, buf, count);
 
f = 0;
-   if (flags != 0  (ap-state  SC_TOSS) == 0) {
+   if (flags  (ap-state  SC_TOSS) == 0) {
/* check the flags to see if any char had an error */
for (j = 0; j  n; ++j)
if ((f = flags[j]) != 0)
@@ -882,9 +882,9 @@ ppp_async_input(struct asyncppp *ap, con
} else if (n  0  (ap-state  SC_TOSS) == 0) {
/* stuff the chars in the skb */
skb = ap-rpkt;
-   if (skb == 0) {
+   if 

Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread James Carlson
Alexey Dobriyan writes:
 - if (ap == 0)
 + if (!ap)

And the solution is to treat it as a boolean instead?!  I'm not sure
which is more ugly.

Why wouldn't explicit comparison against NULL be the preferred fix?

-- 
James Carlson 42.703N 71.076W [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KERNEL: assertion (!sk-sk_forward_alloc) failed

2006-02-08 Thread David S. Miller
From: Jesse Brandeburg [EMAIL PROTECTED]
Date: Wed, 8 Feb 2006 12:07:14 -0800

 this should be on netdev (cc'd), i included some of the thread here.
 ...
 I though Herbert had fixed these, and it looks like half the patches
 got into 2.6.14.3, but not the fix to the fix committed on 9-6 (not in
 2.6.14.* at all)

What are the changeset IDs so I can fix this?

Thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Herbert Xu
James Carlson [EMAIL PROTECTED] wrote:
 Alexey Dobriyan writes:
 - if (ap == 0)
 + if (!ap)
 
 And the solution is to treat it as a boolean instead?!  I'm not sure
 which is more ugly.

Treating it as a boolean looks good to me.  It's better than the existing
code because it shuts sparse up.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Roland Dreier
James And the solution is to treat it as a boolean instead?!  I'm
James not sure which is more ugly.

James Why wouldn't explicit comparison against NULL be the
James preferred fix?

if (ptr) and if (!ptr) are the preferred idiom for testing whether
a pointer is NULL.  What is gained by writing if (ptr != NULL) ?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] forcedeth: Add support for MSI/MSIX

2006-02-08 Thread Ayaz Abdulla
The ethernet controller is integrated into the chipset. Therefore, no 
mismatch between MSI/MSIX chipset support and device support should exist.


Ayaz


Roland Dreier wrote:

Roland Is forcedeth ever used with a discrete ethernet
Roland controller?  I thought that nforce NICs are always
Roland intergrated in the chipset, and hence the entire scope for
Roland MSI problems is within a single chip.  However, I don't
Roland know if there are devices that report MSI capability but
Roland where it doesn't work (that's a question for nvidia).

Just to be clear -- I meant nforce devices that report MSI capability
but where it doesn't work.

I know for a fact that there are many many devices in the world that
report an MSI capability but where it doesn't work ;)

 - R.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread David S. Miller
From: Andi Kleen [EMAIL PROTECTED]
Date: Wed, 8 Feb 2006 23:20:38 +0100

 But are you sure the hash table used for that would be strong enough
 to handle the load?

With RCU locking I think it is.

 And what happens when a ifindex is reused? Then packets could end up
 on the wrong interface. That might be a security issue.

It is an interesting issue, but I don't think it's a real problem.

For example, if the interface is downed, the route gets invalidated
and that would prevent sending.  The packet would go to /dev/null
because of what we do in net/core/dst.c/dst_ifdown().
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Alexey Kuznetsov
Hello!

 What about the dilemma of when there are no netlink sockets
 involved? ;- 
 i.e what is the semantics when there is no netlink socket to map them
 to, such as in the case of ioctl?

0, which is legal address of kernel socket.

In this case it means that kernel did the operation.
Moreover, it is legal (not good, but legal) to use 0 even when it was
an explicit responce to a netlink request. Kernel did the work, eventually.


 a) if user-app using netlink modifies something, the event will report
 the nl_pid mapped to its pid or other non-zero value depending on
 the number of netlink sockets mapped to that process/user-app

It is pid of netlink socket requesting the operation.
If this pid is negative, it will be negative number.

The idea was that someone who heard the message, could ask the author
for an additional information via netlink. It is not used, of course,
but however only real netlink addresses should be used, otherwise
you could decide that the change was made by an innocent party,
which bound to this number by some reason.


 b) if the same user-app used ioctl instead the reported nl_pid is 0
 in the event

Exactly.

Alexey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread David Stevens
[EMAIL PROTECTED] wrote on 02/08/2006 02:19:20 PM:

 James Carlson [EMAIL PROTECTED] wrote:
  Alexey Dobriyan writes:
  - if (ap == 0)
  + if (!ap)
  
  And the solution is to treat it as a boolean instead?!  I'm not sure
  which is more ugly.
 
 Treating it as a boolean looks good to me.  It's better than the 
existing
 code because it shuts sparse up.

Why would sparse complain about this? 0 is a well-defined
pointer value (the only value guaranteed to be by the language).
From KR ANSI C edition, section 5.4:

Pointers and integers are not interchangeable. Zero is the sole
exception: the constant zero may be assigned to a pointer, and a pointer
may be compared with the constant zero.

+-DLS

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread David S. Miller
From: David Stevens [EMAIL PROTECTED]
Date: Wed, 8 Feb 2006 14:45:08 -0800

 Why would sparse complain about this? 0 is a well-defined
 pointer value (the only value guaranteed to be by the language).

Because sparse goes beyond the standards and tries to
catch cases that usually end up being bugs.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Paul Mackerras
James Carlson writes:

 Alexey Dobriyan writes:
  -   if (ap == 0)
  +   if (!ap)
 
 And the solution is to treat it as a boolean instead?!  I'm not sure
 which is more ugly.
 
 Why wouldn't explicit comparison against NULL be the preferred fix?

I just think this whole you shouldn't compare a pointer to 0 thing
is completely silly, and patches like this have about the same level
of usefulness as patches that change color to colour or vice
versa.

However, the head penguin seems to have some bee in his bonnet about 0
not being a pointer value (despite what the C standard says), so
whatever.  *shrug*

Paul.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Paul Mackerras
David S. Miller writes:

 Because sparse goes beyond the standards and tries to
 catch cases that usually end up being bugs.

When has a pointer comparison with an explicit 0 ever caused a bug?

Paul.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Rick Jones

David S. Miller wrote:

From: David Stevens [EMAIL PROTECTED]
Date: Wed, 8 Feb 2006 14:45:08 -0800



   Why would sparse complain about this? 0 is a well-defined
pointer value (the only value guaranteed to be by the language).



Because sparse goes beyond the standards and tries to
catch cases that usually end up being bugs.


So, how might it tell when someone is mistakenly using a pointer as a boolean if 
it takes a pass on if(foo) or if(!foo)  instead of looking for if(foo == NULL) 
or if(foo != NULL)?(or I suppose if (NULL == foo) and if (NULL != foo))


it would seem that the compars with NULL, when read by pseudorandom folks would 
be more likely to reinforce that foo is a pointer rather than a boolean/flag.


rick jones
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Alexey Dobriyan
On Wed, Feb 08, 2006 at 02:45:08PM -0800, David Stevens wrote:
 [EMAIL PROTECTED] wrote on 02/08/2006 02:19:20 PM:
 
  James Carlson [EMAIL PROTECTED] wrote:
   Alexey Dobriyan writes:
   - if (ap == 0)
   + if (!ap)
  
   And the solution is to treat it as a boolean instead?!  I'm not sure
   which is more ugly.
 
  Treating it as a boolean looks good to me.  It's better than the
 existing
  code because it shuts sparse up.

 Why would sparse complain about this? 0 is a well-defined
 pointer value (the only value guaranteed to be by the language).
 From KR ANSI C edition, section 5.4:

 Pointers and integers are not interchangeable. Zero is the sole
 exception: the constant zero may be assigned to a pointer, and a pointer
 may be compared with the constant zero.

O is an integer, NULL is a pointer. You compare integers with integers
and pointers with pointers. You don't compare integers with pointers
because they are fundamentally different objects. The former counts
things, the latter points to them.

The fact that they can be represented by the same bit patterns is
irrelevant. The fact that KR, C89, C99 make

p = malloc();
if (p == NULL)

legal C is irrelevant.

Because

p = malloc();
if (!p)
err;

is idiomatic C, I've used this form.

full-disclosure
Oh, and for the record: current sparse from Linus doesn't warn about
this. Slightly modified sparse warns. Bugs which were uncovered by
more or less trivial and slightly broken sparse patch [1] are:

[PATCH] dscc4: fix dscc4_init_dummy_skb check
[PATCH] opl3sa2: fix adding second dma channel
[PATCH] gusclassic: fix adding second dma channel
[PATCH] ipw2200: fix -eeprom[EEPROM_VERSION] check
[PATCH] ixj: fix writing silence check

ppp-NULL.patch is the biggest part of
187 files changed, 494 insertions(+), 499 deletions(-)
so I've flushed it first.

[1]:

--- a/evaluate.c
+++ b/evaluate.c
@@ -934,6 +934,10 @@ static struct symbol *evaluate_compare(s
/* Pointer types? */
if (is_ptr_type(ltype) || is_ptr_type(rtype)) {
// FIXME! Check the types for compatibility
+   if (is_ptr_type(ltype)  !is_ptr_type(rtype))
+   bad_expr_type(expr);
+   if (!is_ptr_type(ltype)  is_ptr_type(rtype))
+   bad_expr_type(expr);
expr-op = modify_for_unsigned(expr-op);
goto OK;
}

/full-disclosure

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread David Stevens
From: Stephen Hemminger [EMAIL PROTECTED]

 IMHO converting skb-dev to skb-devindex and using ifindex sounds best.
 It gets rid of the need to refcount as much but keeps the safety from
 buggy protocols.  Ipv6 could probably use ifindex as well.

A couple years ago, we identified a performance problem in
IPv6 because it had the index and not the dev in a per-packet piece
of code. The result was spending a lot of time searching the device
list just to increment a (required) per-device MIB counter.
I don't know if that's still there or not, but of course
those sorts of implications should be considered, too.

+-DLS

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Hasso Tepper
jamal wrote:
 a) if user-app using netlink modifies something, the event will report
 the nl_pid mapped to its pid or other non-zero value depending on
 the number of netlink sockets mapped to that process/user-app

It will always be nl_pid of the socket. And don't forget that this pid and 
negative numbers stuff is valid only if application doesn't set nl_pid. 
It's completely legal for application to do that before calling bind() 
and if there's no conflict (no already open netlink socket with this 
nl_pid), everything is OK.


-- 
Hasso Tepper
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Thursday 09 February 2006 00:07, David Stevens wrote:
 From: Stephen Hemminger [EMAIL PROTECTED]
 
  IMHO converting skb-dev to skb-devindex and using ifindex sounds best.
  It gets rid of the need to refcount as much but keeps the safety from
  buggy protocols.  Ipv6 could probably use ifindex as well.
 
 A couple years ago, we identified a performance problem in
 IPv6 because it had the index and not the dev in a per-packet piece
 of code. The result was spending a lot of time searching the device
 list just to increment a (required) per-device MIB counter.
 I don't know if that's still there or not, but of course
 those sorts of implications should be considered, too.

It's using a hash these days.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Herbert Xu
On Wed, Feb 08, 2006 at 02:47:12PM -0800, David S. Miller wrote:
 From: David Stevens [EMAIL PROTECTED]
 Date: Wed, 8 Feb 2006 14:45:08 -0800
 
  Why would sparse complain about this? 0 is a well-defined
  pointer value (the only value guaranteed to be by the language).
 
 Because sparse goes beyond the standards and tries to
 catch cases that usually end up being bugs.

Well to be frank the only significant NULL/0 bug that I know can't
be caught by converting 0 to NULL anyway.  The bug only exists on
architectures where the representations of null-pointers of types
are different.  On those architectures, if you have something like

int func(char *fmt, ...);

...

func(foo, NULL);

Then this may break if func actually expected a int * while NULL
could be something different altogether.  The only safe way to
write this is to have

func(foo, (int *)NULL);

or

func(foo, (int *)0);

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 address autoconfiguration does not work after device down/up cycle

2006-02-08 Thread YOSHIFUJI Hideaki / 吉藤英明
Hello.

In article [EMAIL PROTECTED] (at Wed, 08 Feb 2006 17:17:24 +0200), Kristian 
Slavov [EMAIL PROTECTED] says:

 During NETDEV_DOWN we clear IF_READY, and we don't set it back in 
 NETDEV_UP. While starting to perform DAD on the link-local address, we 
 notice that the device is not in IF_READY, and we abort autoconfiguration 
 process (which would eventually send router solicitations).

Good spot.  Please drop {}, otherwise, I agree.  Thank you.

--yoshfuji
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 address autoconfiguration does not work after device down/up cycle

2006-02-08 Thread David S. Miller
From: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Date: Thu, 09 Feb 2006 08:44:36 +0900 (JST)

 Hello.
 
 In article [EMAIL PROTECTED] (at Wed, 08 Feb 2006 17:17:24 +0200), Kristian 
 Slavov [EMAIL PROTECTED] says:
 
  During NETDEV_DOWN we clear IF_READY, and we don't set it back in 
  NETDEV_UP. While starting to perform DAD on the link-local address, we 
  notice that the device is not in IF_READY, and we abort autoconfiguration 
  process (which would eventually send router solicitations).
 
 Good spot.  Please drop {}, otherwise, I agree.  Thank you.

I'll take care of fixing that when I put in the fix.
Thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Herbert Xu
Alexey Dobriyan [EMAIL PROTECTED] wrote:
 
 Oh, and for the record: current sparse from Linus doesn't warn about
 this. Slightly modified sparse warns. Bugs which were uncovered by
 more or less trivial and slightly broken sparse patch [1] are:
 
[PATCH] dscc4: fix dscc4_init_dummy_skb check
[PATCH] opl3sa2: fix adding second dma channel
[PATCH] gusclassic: fix adding second dma channel
[PATCH] ipw2200: fix -eeprom[EEPROM_VERSION] check
[PATCH] ixj: fix writing silence check

The first two can also be caught with gcc -pedantic.  In fact, catching
the last two should be doable there as well.  The difference between
gcc -pedantic and sparse is that it doesn't warn about obviously correct
cases like p != 0 or p = 0.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread David S. Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Thu, 09 Feb 2006 10:49:37 +1100

 The difference between gcc -pedantic and sparse is that it doesn't
 warn about obviously correct cases like p != 0 or p = 0.

So obviously correct that you left out an equals sign in the
second case :-)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Randy.Dunlap
On Wed, 8 Feb 2006, David S. Miller wrote:

 From: Herbert Xu [EMAIL PROTECTED]
 Date: Thu, 09 Feb 2006 10:49:37 +1100

  The difference between gcc -pedantic and sparse is that it doesn't
  warn about obviously correct cases like p != 0 or p = 0.

 So obviously correct that you left out an equals sign in the
 second case :-)

You should thank him.  8;)

-- 
~Randy
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IPv6 address autoconfiguration does not work after device down/up cycle

2006-02-08 Thread YOSHIFUJI Hideaki / 吉藤英明
In article [EMAIL PROTECTED] (at Wed, 08 Feb 2006 15:59:31 -0800 (PST)), 
David S. Miller [EMAIL PROTECTED] says:

  Good spot.  Please drop {}, otherwise, I agree.  Thank you.
 
 I'll take care of fixing that when I put in the fix.

Okay, thanks.

Acked-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]

--yoshfuji
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ppp: don't use 0 in pointer context

2006-02-08 Thread Herbert Xu
On Wed, Feb 08, 2006 at 03:58:59PM -0800, David S. Miller wrote:
 From: Herbert Xu [EMAIL PROTECTED]
 Date: Thu, 09 Feb 2006 10:49:37 +1100
 
  The difference between gcc -pedantic and sparse is that it doesn't
  warn about obviously correct cases like p != 0 or p = 0.
 
 So obviously correct that you left out an equals sign in the
 second case :-)

Actually I intended that to be an assignment as this is something that
has generated many sparse patches.  Now if this was a comparison
mistakenly written as an assignment, gcc would pick it up anyway:

a.c:3: warning: suggest parentheses around assignment used as truth value

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread jamal
On Thu, 2006-09-02 at 01:46 +0300, Alexey Kuznetsov wrote:
 Hello!
 
  What about the dilemma of when there are no netlink sockets
  involved? ;- 
  i.e what is the semantics when there is no netlink socket to map them
  to, such as in the case of ioctl?
 
 0, which is legal address of kernel socket.
 
 In this case it means that kernel did the operation.

Ok, that settles it then - so I have no qualms with the patch.
All along i was looking at the nl_pid as a representation of
who ordered the operation not who did the operation
I actually have found such knowledge useful in user space.
It is a good thing Hasso is writing the manpages because he
has good grasp of the whole thing.

Alexey - if you have time take a look at the sendmsg() thing i 
referenced earlier.

cheers,
jamal


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-08 Thread Jesse Brandeburg

On Tue, 7 Feb 2006, David S. Miller wrote:

From: Jesse Brandeburg [EMAIL PROTECTED]
Date: Tue, 7 Feb 2006 17:41:28 -0800 (Pacific Standard Time)

 so we generally call dev_alloc_skb to get the receive buffers to give to
 our hardware.  When we use multiple receive buffers what is the right way
 to allocate memory to give buffers to the hardware and then later, to
 chain the descriptors together to make the packet?  Using skb's is the
 common way as far as I've understood it.

 Your input on the correct way to do these things is greatly appreciated.

Allocate a single SKB and fill in the skb_shared_info() page/offset/len
pairs, making sure to take proper references to the pages you add.
Coalesce when possible.


Given that there is code in the kernel in
net/ipv4/ip_fragment.c: ip_frag_reasm

/* If the first fragment is fragmented itself, we split
 * it to two chunks: the first with data and paged part
 * and the second, holding only fragments. */

checks for a frag_list and in response, *creates* the frag 
list inside a frag list, when does that code get used?


Basically it appears that code has been there for a long time (since 2.4) 
and both e1000 and other drivers use it.  If the change to 
skb_copy_datagram_iovec is reverted, this method of chaining skbs 
continues to work.  I don't see any clear reason why it shouldn't work.


Jesse
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread Alexey Kuznetsov
Hello!

 BTW, Alexey - if you have a chance can you look at the breakage of
 sendmsg() in relation to multicast that exists today?

If it is is about groups  31, which cannot be mapped to nl_groups,
it is possible just to add setsockopt(), setting dst_group. It is just
to complete the API.

Do something use this? I do not think so.


 (libnetlink assumes its existence). I can explain more if what i said
 didnt make sense.

Please...

Alexey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] illegal use of pid in rtnetlink

2006-02-08 Thread David S. Miller
From: jamal [EMAIL PROTECTED]
Date: Wed, 08 Feb 2006 19:07:25 -0500

 On Thu, 2006-09-02 at 01:46 +0300, Alexey Kuznetsov wrote:
  Hello!
  
   What about the dilemma of when there are no netlink sockets
   involved? ;- 
   i.e what is the semantics when there is no netlink socket to map them
   to, such as in the case of ioctl?
  
  0, which is legal address of kernel socket.
  
  In this case it means that kernel did the operation.
 
 Ok, that settles it then - so I have no qualms with the patch.
 All along i was looking at the nl_pid as a representation of
 who ordered the operation not who did the operation
 I actually have found such knowledge useful in user space.
 It is a good thing Hasso is writing the manpages because he
 has good grasp of the whole thing.

I'll put in the patch either tonight or tomorrow.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] drivers/net/wireless: correct reported ssid lengths

2006-02-08 Thread John W. Linville
On Mon, Jan 30, 2006 at 02:28:43PM -0800, Jean Tourrilhes wrote:
 On Mon, Jan 30, 2006 at 01:28:44PM -0500, Dan Williams wrote:
  
  I'll post a revert patch for the patch I originally sent so that we go
  back to the original behavior.
 
   Sorry, I may have overreacted. I think that's a worthwhile
 change, let's just plan it properly.

Is this resolved?  Does Dan still need to send a patch?

John
-- 
John W. Linville
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] check connect(2) status for IPv6 UDP socket

2006-02-08 Thread Herbert Xu
David S. Miller [EMAIL PROTECTED] wrote:
 From: Nicolas DICHTEL [EMAIL PROTECTED]
 Date: Mon, 06 Feb 2006 12:00:30 +0100
 
 in the same way of this patch, why dst_entry are stored for
 RAW socket ? In case of specific IPSec rules for ICMPv6,
 xfrm state can be different for the same destination.
 Attached, a proposed patch.
 
 We cache the flow we used to store that dst into the socket,
 and we'll verify that on the next sendmsg() call so it's OK.
 
 See the checks done in ip6_dst_lookup() when we have a cached
 route attached to the socket.

I think he's saying that the checks in ip6_dst_lookup is not enough
for IPsec because it only checks the destination address and oif
instead of all the addresses/protocol/ports.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KERNEL: assertion (!sk-sk_forward_alloc) failed

2006-02-08 Thread Jesse Brandeburg
On 2/8/06, David S. Miller [EMAIL PROTECTED] wrote:
 From: Jesse Brandeburg [EMAIL PROTECTED]
 Date: Wed, 8 Feb 2006 12:07:14 -0800

  this should be on netdev (cc'd), i included some of the thread here.
  ...
  I though Herbert had fixed these, and it looks like half the patches
  got into 2.6.14.3, but not the fix to the fix committed on 9-6 (not in
  2.6.14.* at all)

 What are the changeset IDs so I can fix this?

I think the commit id that is missing from 2.6.14.X is
fb5f5e6e0cebd574be737334671d1aa8f170d5f3

but here is the web link if i gave the wrong info
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fb5f5e6e0cebd574be737334671d1aa8f170d5f3
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fw: Assert: CPU #..., mangle/filter comefrom(....) = ...

2006-02-08 Thread Andrew Morton


Begin forwarded message:

Date: Thu, 09 Feb 2006 06:10:49 +0100
From: Knut Petersen [EMAIL PROTECTED]
To: linux-kernel@vger.kernel.org
Subject: Assert: CPU #..., mangle/filter comefrom() = ...


As there probably is a reason to printk assertions:

[   28.616455] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   28.617112] ASSERT: CPU #0, mangle comefrom(f75ae890) = 1
[   28.617115] ASSERT: CPU #0, filter comefrom(f79ca090) = 2
[   28.934256] ASSERT: CPU #0, mangle comefrom(f7051890) = 1
[   28.934260] ASSERT: CPU #0, filter comefrom(f7aa9090) = 2
[   32.766440] ASSERT: CPU #0, mangle comefrom(f709c090) = 1
[   32.766443] ASSERT: CPU #0, filter comefrom(f71f3090) = 2

cu,
 Knut
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html