Re: [PATCH 2.6.20-rc3]: 8139cp: Don't blindly enable interrupts

2007-01-18 Thread Chris Lalancette
Jeff Garzik wrote:

 Chris Lalancette wrote:

 Francois Romieu wrote:

 Chris Lalancette [EMAIL PROTECTED] :
 [...]
  

 Thanks for the comments.  While the patch you sent will help,
 there are
 still other places that will have problems.  For example, in
 netpoll_send_skb,
 we call local_irq_save(flags), then call dev-hard_start_xmit(),
 and then call
 local_irq_restore(flags).  This is a similar situation to what I
 described
 above; we will re-enable interrupts in cp_start_xmit(), when
 netpoll_send_skb
 doesn't expect that, and will probably run into issues.
 Is there a problem with changing cp_start_xmit to use the
 spin_lock_irqsave(), besides the extra instructions it needs?
   

 No. Given the history of locking in netpoll and the content of
 Documentation/networking/netdevices.txt, asking Herbert which rule(s)
 the code is supposed to follow seemed safer to me.

 You can forget my patch.

 Please resend your patch inlined to Jeff as described in
 http://linux.yyz.us/patch-format.html.

  

 Francois,
  Great.  Resending mail, shortening subject to  65 characters and
 inlining the patch.

 Thanks,
 Chris Lalancette

 Similar to this commit:

 http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d15e9c4d9a75702b30e00cdf95c71c88e3f3f51e


 It's not safe in cp_start_xmit to blindly call spin_lock_irq and then
 spin_unlock_irq, since it may very well be the case that cp_start_xmit
 was called with interrupts already disabled (I came across this bug in
 the context of netdump in RedHat kernels, but the same issue holds, for
 example, in netconsole). Therefore, replace all instances of
 spin_lock_irq and spin_unlock_irq with spin_lock_irqsave and
 spin_unlock_irqrestore, respectively, in cp_start_xmit(). I tested this
 against a fully-virtualized Xen guest using netdump, which happens to
 use the 8139cp driver to talk to the emulated hardware. I don't have a
 real piece of 8139cp hardware to test on, so someone else will have to
 do that.

 Signed-off-by: Chris Lalancette [EMAIL PROTECTED]


 applied.

 In the future, please remove the quoted emails stuff, and anything
 else that does not belong in the kernel changelog.  It must be
 hand-edited out, before using git-am to merge your patch into the
 kernel tree.

 Jeff



Jeff,
 Ah, I see.  Noted.  Thanks.

Chris Lalancette
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH]: 8139cp: Don't blindly enable interrupts in cp_start_xmit

2007-01-16 Thread Chris Lalancette
Francois Romieu wrote:
 Chris Lalancette [EMAIL PROTECTED] :
 [...]
 
 Similar to this commit:

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d15e9c4d9a75702b30e00cdf95c71c88e3f3f51e

It's not safe in cp_start_xmit to blindly call spin_lock_irq and then
spin_unlock_irq, since it may very well be the case that cp_start_xmit
was called with interrupts already disabled (I came across this bug in
the context of netdump in RedHat kernels, but the same issue holds, for
example, in netconsole).  Therefore, replace all instances of spin_lock_irq
and spin_unlock_irq with spin_lock_irqsave and spin_unlock_irqrestore,
respectively, in cp_start_xmit().  I tested this against a fully-virtualized
Xen guest, which happens to use the 8139cp driver to talk to the emulated
hardware.  I don't have a real piece of 8139cp hardware to test on, so
someone else will have to do that.
 
 
 (message reformated to fit in 80 columns, please fix your mailer)
 
 As I understand http://lkml.org/lkml/2006/12/12/239, something like the
 patch below should had been sent instead. Herbert, ack/nak ?
 

Francois,
 Thanks for the comments.  While the patch you sent will help, there are
still other places that will have problems.  For example, in netpoll_send_skb,
we call local_irq_save(flags), then call dev-hard_start_xmit(), and then call
local_irq_restore(flags).  This is a similar situation to what I described
above; we will re-enable interrupts in cp_start_xmit(), when netpoll_send_skb
doesn't expect that, and will probably run into issues.
 Is there a problem with changing cp_start_xmit to use the
spin_lock_irqsave(), besides the extra instructions it needs?

Thanks,
Chris Lalancette
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2.6.20-rc3]: 8139cp: Don't blindly enable interrupts

2007-01-16 Thread Chris Lalancette
Francois Romieu wrote:

Chris Lalancette [EMAIL PROTECTED] :
[...]
  

 Thanks for the comments.  While the patch you sent will help, there are
still other places that will have problems.  For example, in netpoll_send_skb,
we call local_irq_save(flags), then call dev-hard_start_xmit(), and then call
local_irq_restore(flags).  This is a similar situation to what I described
above; we will re-enable interrupts in cp_start_xmit(), when netpoll_send_skb
doesn't expect that, and will probably run into issues.
 Is there a problem with changing cp_start_xmit to use the
spin_lock_irqsave(), besides the extra instructions it needs?



No. Given the history of locking in netpoll and the content of
Documentation/networking/netdevices.txt, asking Herbert which rule(s)
the code is supposed to follow seemed safer to me.

You can forget my patch.

Please resend your patch inlined to Jeff as described in
http://linux.yyz.us/patch-format.html.

  

Francois,
 Great.  Resending mail, shortening subject to  65 characters and
inlining the patch.

Thanks,
Chris Lalancette

Similar to this commit:

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d15e9c4d9a75702b30e00cdf95c71c88e3f3f51e

It's not safe in cp_start_xmit to blindly call spin_lock_irq and then
spin_unlock_irq, since it may very well be the case that cp_start_xmit
was called with interrupts already disabled (I came across this bug in
the context of netdump in RedHat kernels, but the same issue holds, for
example, in netconsole). Therefore, replace all instances of
spin_lock_irq and spin_unlock_irq with spin_lock_irqsave and
spin_unlock_irqrestore, respectively, in cp_start_xmit(). I tested this
against a fully-virtualized Xen guest using netdump, which happens to
use the 8139cp driver to talk to the emulated hardware. I don't have a
real piece of 8139cp hardware to test on, so someone else will have to
do that.

Signed-off-by: Chris Lalancette [EMAIL PROTECTED]

diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c
index e2cb19b..6f93a76 100644
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -765,17 +765,18 @@ static int cp_start_xmit (struct sk_buff *skb, struct 
net_device *dev)
struct cp_private *cp = netdev_priv(dev);
unsigned entry;
u32 eor, flags;
+   unsigned long intr_flags;
 #if CP_VLAN_TAG_USED
u32 vlan_tag = 0;
 #endif
int mss = 0;
 
-   spin_lock_irq(cp-lock);
+   spin_lock_irqsave(cp-lock, intr_flags);
 
/* This is a hard error, log it. */
if (TX_BUFFS_AVAIL(cp) = (skb_shinfo(skb)-nr_frags + 1)) {
netif_stop_queue(dev);
-   spin_unlock_irq(cp-lock);
+   spin_unlock_irqrestore(cp-lock, intr_flags);
printk(KERN_ERR PFX %s: BUG! Tx Ring full when queue awake!\n,
   dev-name);
return 1;
@@ -908,7 +909,7 @@ static int cp_start_xmit (struct sk_buff *skb, struct 
net_device *dev)
if (TX_BUFFS_AVAIL(cp) = (MAX_SKB_FRAGS + 1))
netif_stop_queue(dev);
 
-   spin_unlock_irq(cp-lock);
+   spin_unlock_irqrestore(cp-lock, intr_flags);
 
cpw8(TxPoll, NormalTxPoll);
dev-trans_start = jiffies;


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH]: 8139cp: Don't blindly enable interrupts in cp_start_xmit

2007-01-15 Thread Chris Lalancette
(trying again, this time to the correct maintainer)

All,
 Similar to this commit:

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d15e9c4d9a75702b30e00cdf95c71c88e3f3f51e

It's not safe in cp_start_xmit to blindly call spin_lock_irq and then 
spin_unlock_irq, since it may very well be the case that cp_start_xmit was 
called with interrupts already disabled (I came across this bug in the context 
of netdump in RedHat kernels, but the same issue holds, for example, in 
netconsole).  Therefore, replace all instances of spin_lock_irq and 
spin_unlock_irq with spin_lock_irqsave and spin_unlock_irqrestore, 
respectively, in cp_start_xmit().  I tested this against a fully-virtualized 
Xen guest, which happens to use the 8139cp driver to talk to the emulated 
hardware.  I don't have a real piece of 8139cp hardware to test on, so someone 
else will have to do that.

Signed-off-by: Chris Lalancette [EMAIL PROTECTED]
diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c
index e2cb19b..6f93a76 100644
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -765,17 +765,18 @@ static int cp_start_xmit (struct sk_buff *skb, struct net_device *dev)
 	struct cp_private *cp = netdev_priv(dev);
 	unsigned entry;
 	u32 eor, flags;
+	unsigned long intr_flags;
 #if CP_VLAN_TAG_USED
 	u32 vlan_tag = 0;
 #endif
 	int mss = 0;
 
-	spin_lock_irq(cp-lock);
+	spin_lock_irqsave(cp-lock, intr_flags);
 
 	/* This is a hard error, log it. */
 	if (TX_BUFFS_AVAIL(cp) = (skb_shinfo(skb)-nr_frags + 1)) {
 		netif_stop_queue(dev);
-		spin_unlock_irq(cp-lock);
+		spin_unlock_irqrestore(cp-lock, intr_flags);
 		printk(KERN_ERR PFX %s: BUG! Tx Ring full when queue awake!\n,
 		   dev-name);
 		return 1;
@@ -908,7 +909,7 @@ static int cp_start_xmit (struct sk_buff *skb, struct net_device *dev)
 	if (TX_BUFFS_AVAIL(cp) = (MAX_SKB_FRAGS + 1))
 		netif_stop_queue(dev);
 
-	spin_unlock_irq(cp-lock);
+	spin_unlock_irqrestore(cp-lock, intr_flags);
 
 	cpw8(TxPoll, NormalTxPoll);
 	dev-trans_start = jiffies;




[PATCH]: 8139cp: Don't blindly enable interrupts in cp_start_xmit

2007-01-08 Thread Chris Lalancette
All,
 Similar to this commit:

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d15e9c4d9a75702b30e00cdf95c71c88e3f3f51e

It's not safe in cp_start_xmit to blindly call spin_lock_irq and then 
spin_unlock_irq, since it may very well be the case that cp_start_xmit was 
called with interrupts already disabled (I came across this bug in the context 
of netdump in RedHat kernels, but the same issue holds, for example, in 
netconsole).  Therefore, replace all instances of spin_lock_irq and 
spin_unlock_irq with spin_lock_irqsave and spin_unlock_irqrestore, 
respectively, in cp_start_xmit().  I tested this against a fully-virtualized 
Xen guest, which happens to use the 8139cp driver to talk to the emulated 
hardware.  I don't have a real piece of 8139cp hardware to test on, so someone 
else will have to do that.

Signed-off-by: Chris Lalancette [EMAIL PROTECTED]


diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c
index e2cb19b..6f93a76 100644
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -765,17 +765,18 @@ static int cp_start_xmit (struct sk_buff *skb, struct net_device *dev)
 	struct cp_private *cp = netdev_priv(dev);
 	unsigned entry;
 	u32 eor, flags;
+	unsigned long intr_flags;
 #if CP_VLAN_TAG_USED
 	u32 vlan_tag = 0;
 #endif
 	int mss = 0;
 
-	spin_lock_irq(cp-lock);
+	spin_lock_irqsave(cp-lock, intr_flags);
 
 	/* This is a hard error, log it. */
 	if (TX_BUFFS_AVAIL(cp) = (skb_shinfo(skb)-nr_frags + 1)) {
 		netif_stop_queue(dev);
-		spin_unlock_irq(cp-lock);
+		spin_unlock_irqrestore(cp-lock, intr_flags);
 		printk(KERN_ERR PFX %s: BUG! Tx Ring full when queue awake!\n,
 		   dev-name);
 		return 1;
@@ -908,7 +909,7 @@ static int cp_start_xmit (struct sk_buff *skb, struct net_device *dev)
 	if (TX_BUFFS_AVAIL(cp) = (MAX_SKB_FRAGS + 1))
 		netif_stop_queue(dev);
 
-	spin_unlock_irq(cp-lock);
+	spin_unlock_irqrestore(cp-lock, intr_flags);
 
 	cpw8(TxPoll, NormalTxPoll);
 	dev-trans_start = jiffies;




Re: [PATCH]: Fix netpoll arp_reply for multiple routers

2006-12-07 Thread Chris Lalancette
Chris Lalancette wrote:
 All,
  Attached is a patch to fix arp_reply when you have multiple possible 
 routers doing ARP requests to you.  Currently arp_reply always replies to the 
 MAC address that it was given when it starts.  However, if you have multiple 
 routers that can ARP request you, you might respond to the wrong router; in 
 which case the other router will eventually drop you from it's ARP cache and 
 stop delivering packets to you.  The fix is to always reply to the router 
 that asked you, not the one you have hard-coded.
  I do not have the setup to test this; the customer that does have the 
 setup has tested the patch and reports that it fixes the problems they were 
 having.
 
 
 Signed-off-by: Chris Lalancette [EMAIL PROTECTED]
 
 
 
 
 diff --git a/net/core/netpoll.c b/net/core/netpoll.c
 index 3c58846..5833b21 100644
 --- a/net/core/netpoll.c
 +++ b/net/core/netpoll.c
 @@ -331,6 +331,7 @@ static void arp_reply(struct sk_buff *sk
   unsigned char *arp_ptr;
   int size, type = ARPOP_REPLY, ptype = ETH_P_ARP;
   __be32 sip, tip;
 + unsigned char *sha;
   struct sk_buff *send_skb;
   struct netpoll *np = NULL;
  
 @@ -357,9 +358,14 @@ static void arp_reply(struct sk_buff *sk
   arp-ar_op != htons(ARPOP_REQUEST))
   return;
  
 - arp_ptr = (unsigned char *)(arp+1) + skb-dev-addr_len;
 + arp_ptr = (unsigned char *)(arp+1);
 + /* save the location of the src hw addr */
 + sha = arp_ptr;
 + arp_ptr += skb-dev-addr_len;
   memcpy(sip, arp_ptr, 4);
 - arp_ptr += 4 + skb-dev-addr_len;
 + arp_ptr += 4;
 + /* if we actually cared about dst hw addr, it would get copied here */
 + arp_ptr += skb-dev-addr_len;
   memcpy(tip, arp_ptr, 4);
  
   /* Should we ignore arp? */
 @@ -382,7 +388,7 @@ static void arp_reply(struct sk_buff *sk
  
   if (np-dev-hard_header 
   np-dev-hard_header(send_skb, skb-dev, ptype,
 -  np-remote_mac, np-local_mac,
 +  sha, np-local_mac,
send_skb-len)  0) {
   kfree_skb(send_skb);
   return;
 @@ -406,7 +412,7 @@ static void arp_reply(struct sk_buff *sk
   arp_ptr += np-dev-addr_len;
   memcpy(arp_ptr, tip, 4);
   arp_ptr += 4;
 - memcpy(arp_ptr, np-remote_mac, np-dev-addr_len);
 + memcpy(arp_ptr, sha, np-dev-addr_len);
   arp_ptr += np-dev-addr_len;
   memcpy(arp_ptr, sip, 4);
  

All,
 Sorry for the confusion...this was already picked up by akpm on linux-net. 
 Please don't take the patch.

Thanks,
Chris Lalancette
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH]: Fix netpoll arp_reply for multiple routers

2006-12-06 Thread Chris Lalancette
All,
 Attached is a patch to fix arp_reply when you have multiple possible 
routers doing ARP requests to you.  Currently arp_reply always replies to the 
MAC address that it was given when it starts.  However, if you have multiple 
routers that can ARP request you, you might respond to the wrong router; in 
which case the other router will eventually drop you from it's ARP cache and 
stop delivering packets to you.  The fix is to always reply to the router that 
asked you, not the one you have hard-coded.
 I do not have the setup to test this; the customer that does have the 
setup has tested the patch and reports that it fixes the problems they were 
having.


Signed-off-by: Chris Lalancette [EMAIL PROTECTED]
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 3c58846..5833b21 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -331,6 +331,7 @@ static void arp_reply(struct sk_buff *sk
 	unsigned char *arp_ptr;
 	int size, type = ARPOP_REPLY, ptype = ETH_P_ARP;
 	__be32 sip, tip;
+	unsigned char *sha;
 	struct sk_buff *send_skb;
 	struct netpoll *np = NULL;
 
@@ -357,9 +358,14 @@ static void arp_reply(struct sk_buff *sk
 	arp-ar_op != htons(ARPOP_REQUEST))
 		return;
 
-	arp_ptr = (unsigned char *)(arp+1) + skb-dev-addr_len;
+	arp_ptr = (unsigned char *)(arp+1);
+	/* save the location of the src hw addr */
+	sha = arp_ptr;
+	arp_ptr += skb-dev-addr_len;
 	memcpy(sip, arp_ptr, 4);
-	arp_ptr += 4 + skb-dev-addr_len;
+	arp_ptr += 4;
+	/* if we actually cared about dst hw addr, it would get copied here */
+	arp_ptr += skb-dev-addr_len;
 	memcpy(tip, arp_ptr, 4);
 
 	/* Should we ignore arp? */
@@ -382,7 +388,7 @@ static void arp_reply(struct sk_buff *sk
 
 	if (np-dev-hard_header 
 	np-dev-hard_header(send_skb, skb-dev, ptype,
- np-remote_mac, np-local_mac,
+ sha, np-local_mac,
  send_skb-len)  0) {
 		kfree_skb(send_skb);
 		return;
@@ -406,7 +412,7 @@ static void arp_reply(struct sk_buff *sk
 	arp_ptr += np-dev-addr_len;
 	memcpy(arp_ptr, tip, 4);
 	arp_ptr += 4;
-	memcpy(arp_ptr, np-remote_mac, np-dev-addr_len);
+	memcpy(arp_ptr, sha, np-dev-addr_len);
 	arp_ptr += np-dev-addr_len;
 	memcpy(arp_ptr, sip, 4);
 


Re: Zero checksum in netconsole/netdump packets

2006-11-07 Thread Chris Lalancette

David Miller wrote:

From: Chris Lalancette [EMAIL PROTECTED]
Date: Mon, 06 Nov 2006 18:40:59 -0500


 Assuming that this is just an oversight, attached is a simple
 patch to compute the UDP checksum in netpoll_send_udp.


If the resulting checksum is zero, you should set it to
all 1's, like the real UDP code does.


David,
Ah, thanks.  Forgot about that.  I re-spun the patch with the change 
(attached).  I also moved the UDP checksum calculation up to where the rest of 
the UDP header setup is, to make it more consistent.

Thanks again for the comments!

Signed-off-by: Chris Lalancette [EMAIL PROTECTED]
--- linux-2.6/net/core/netpoll.c.orig	2006-11-06 18:16:58.0 -0500
+++ linux-2.6/net/core/netpoll.c	2006-11-07 08:16:29.0 -0500
@@ -340,6 +340,12 @@ void netpoll_send_udp(struct netpoll *np
 	udph-dest = htons(np-remote_port);
 	udph-len = htons(udp_len);
 	udph-check = 0;
+	udph-check = csum_tcpudp_magic(htonl(np-local_ip),
+	htonl(np-remote_ip),
+	udp_len, IPPROTO_UDP,
+	csum_partial((unsigned char *)udph, udp_len, 0));
+	if (udph-check == 0)
+		udph-check = -1;
 
 	skb-nh.iph = iph = (struct iphdr *)skb_push(skb, sizeof(*iph));
 


Zero checksum in netconsole/netdump packets

2006-11-06 Thread Chris Lalancette
Hello,
  I was reading some tcpdump's of netdump traffic today, and I realized 
that all of the packets that go from the crashing machine to the netdump server 
have a zero checksum.  Looking at the code, it looks like netconsole/netdump 
use the function netpoll_send_udp to send out the packets.  However, in 
netdump_send_udp, the checksum is set to 0, and never seems to be computed.  Is 
this intentional, or just an oversight?  I would think that we would always 
want to compute the UDP checksum, but there might be something I am 
overlooking.  Incidentally, it seems like the only user of netpoll_send_udp is 
netconsole (and netdump in RedHat kernels).
 Assuming that this is just an oversight, attached is a simple patch to 
compute the UDP checksum in netpoll_send_udp.

Signed-off-by: Chris Lalancette [EMAIL PROTECTED]
--- linux-2.6/net/core/netpoll.c.orig	2006-11-06 18:16:58.0 -0500
+++ linux-2.6/net/core/netpoll.c	2006-11-06 18:31:20.0 -0500
@@ -356,6 +356,10 @@ void netpoll_send_udp(struct netpoll *np
 	put_unaligned(htonl(np-remote_ip), (iph-daddr));
 	iph-check= ip_fast_csum((unsigned char *)iph, iph-ihl);
 
+	udph-check = csum_tcpudp_magic(iph-saddr, iph-daddr, udp_len,
+	IPPROTO_UDP,
+	csum_partial((unsigned char *)udph, udp_len, 0));
+
 	eth = (struct ethhdr *) skb_push(skb, ETH_HLEN);
 	skb-mac.raw = skb-data;
 	skb-protocol = eth-h_proto = htons(ETH_P_IP);