date:20070604

Re: b44: regression in 2.6.22 (resend)

2007-06-04 Thread Thomas Gleixner

On Sun, 2007-06-03 at 18:26 +0200, Maximilian Engelhardt wrote:
  Is there any other strange behavior of the high res enabled kernel than
  the b44 problem ?
 
 I didn't notice anything in the past (as I wrote). But today I did some tests 
 for an updated version of the p54 mac80211 wlan driver and I noticed exactly 
 the same problem:
 
 when booting with highres=off everything is fine.
 But when I boot an highres enabled kernel and I do the iperf-test with the 
 p54 
 driver, my systems becomes unresponsive during the test. It seems to be 
 exactly the same problem I have with the b44 driver.
 So this might not be a bug in the b44 code but a bug somewhere in the linux 
 networking code.
 
 I did the test with an 2.6.22-rc3-git4 kernel and the p54 driver built 
 external as module. 

Can you look at iperf to figure out, whether it does some weird timer
stuff (high frequency interval timer or such) ? Either check the code or
strace it.

tglx


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] TCP : use LIMIT_NETDEBUG in tcp_retransmit_timer()

2007-06-04 Thread Eric Dumazet

LIMIT_NETDEBUG allows the admin to disable some warning messages (echo 0 
/proc/sys/net/core/warnings).


The TCP: Treason uncloaked! message can use this facility.

Signed-off-by: Eric Dumazet [EMAIL PROTECTED]

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index e613401..e9b151b 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -292,9 +292,9 @@ static void tcp_retransmit_timer(struct 
 * we cannot allow such beasts to hang infinitely.
 */
 #ifdef TCP_DEBUG
-   if (net_ratelimit()) {
+   if (1) {
struct inet_sock *inet = inet_sk(sk);
-   printk(KERN_DEBUG TCP: Treason uncloaked! Peer 
%u.%u.%u.%u:%u/%u shrinks window %u:%u. Repaired.\n,
+   LIMIT_NETDEBUG(KERN_DEBUG TCP: Treason uncloaked! Peer 
%u.%u.%u.%u:%u/%u shrinks window %u:%u. Repaired.\n,
   NIPQUAD(inet-daddr), ntohs(inet-dport),
   inet-num, tp-snd_una, tp-snd_nxt);
}

[PATCH] sundance: PHY address form 0, only for device ID 0x0200 (IP100A)

2007-06-04 Thread Jesse Huang

From: Jesse Huang [EMAIL PROTECTED]

Change Logs:
Search PHY address form 0, only for device ID 0x0200 (IP100A). Other device are 
from PHY address 1.

Signed-off-by: Jesse Huang [EMAIL PROTECTED]
---

 drivers/net/sundance.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

5724a72722dfc9cafbb8f273cb82dbf577bd9ad0
diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
index e1f912d..914ab29 100644
--- a/drivers/net/sundance.c
+++ b/drivers/net/sundance.c
@@ -562,7 +562,9 @@ #endif
 * It seems some phys doesn't deal well with address 0 being accessed
 * first, so leave address zero to the end of the loop (32  31).
 */
-   for (phy = 1; phy = 32  phy_idx  MII_CNT; phy++) {
+   if(sundance_pci_tbl[np-chip_id].device == 0x0200) phy = 0;
+   else phy = 1;
+   for (; phy = 32  phy_idx  MII_CNT; phy++) {
int phyx = phy  0x1f;
int mii_status = mdio_read(dev, phyx, MII_BMSR);
if (mii_status != 0xmii_status != 0x) {
-- 
1.3.GIT



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-06-04 Thread Milton Miller



On Jun 1, 2007, at 3:45 PM, David Acker wrote:
Ok, I took a stab at coding and testing these ideas.  Below is a patch 
against 2.6.22-rc3.

Let me know what you think.


I think you got most of the ideas.   As Auke noted, your coding style 
is showing again.   And your mailer again munged whitespace (fixed by 
s/^spacespace/space/ s/^$/space/).



  Testing shows that we can hit any of the following scenarios:

Find a buffer not complete with rx-el and rx-s0 set.
I read back the status and see if the receiver is still running.



This means the hardware reached what we think is end of list, good.


Find a buffer not complete with rx-el not set and rx-s0 set.
	I check the next buffer and if it is complete, I free the skb and 
return 0.	
This is the case when hardware read the descriptor between el being 
removed and size being set.



If the next buffer is not  complete, I read the receiver's status to 
see if he is still running.




The polling of still running will save us waiting for irq or next poll 
when hardware read el before we cleared it, at the cost of an ioread 
when there is simply no packet ready and we hit a fill mark.



Find a buffer that is complete with rx-el not set and rx-s0 set.
	It appears that hardware can read the rfd's el-bit, then software can 
clear the rfd el-bit and set the rfd size, and then hardware can come 
in and read the size.


Yes, since the size is after the EL flag in the descriptor, this can 
happen since the pci read is not atomic.


I am reading the status back, although I don't think that I have to in 
this instance.


Actually, you are reading it when the rfd still has EL set.  Since the 
cpu will never encounter that case, the if condition is never 
satisfied.


How about creating a state unknown, for when we think we should check 
the device if its running.
If we are in this state and then encounter a received packet without s0 
set, we can set it back

to running.   We set it when we rx a packet with s0 set.

We then move both io_status reads to the caller.


I am testing a version of this code patched against 2.6.18.4 on my PXA 
255 based system.  I will let you all know how it goes.


I'm assuming this is why the cleanup of the receiver start to always 
start on rx_to_clean got dropped again. :-)


Also, I would like a few sentences in the Driver Operation section IV 
Receive big comment.  Something like


In order to keep updates to the RFD link field from colliding with 
hardware writes to mark packets complete, we use the feature that 
hardware will not write to a size 0 descriptor and mark the previous 
packet as end-of-list (EL).   After updating the link, we remove EL and 
only then restore the size such that hardware may use the 
previous-to-end RFD.


at the end of the first paragraph, and insert software before no 
locking is required in the second.




On the ARM, their is a race condition between software allocating a 
new receive
buffer and hardware writing into a buffer.  The two race on touching 
the last
Receive Frame Descriptor (RFD).  It has its el-bit set and its next 
link equal
to 0.  When hardware encounters this buffer it attempts to write data 
to it
and then update Status Word bits and Actual Count in the RFD.  At the 
same time
software may try to clear the el-bit and set the link address to a new 
buffer.

Since the entire RFD is once cache-line, the two write operations can
collide.  This can lead to the receive unit stalling or freed receive 
buffers

getting written to.

The fix is to set the el-bit on and the size to 0 on the next to last 
buffer
in the chain.  When the hardware encounters this buffer it stops and 
does

not write to it at all.  The hardware issues an RNR interrupt with the
receive unit in the No Resources state.  When software allocates 
buffers,

it can update the tail of the list when either it knows the hardware
has stopped or the previous to the new one to mark marked.
Once it has a new next to last buffer prepared, it can clear the el-bit
and set the size on the previous one.  The race on this buffer is safe
since the link already points to a valid next buffer.  We keep flags
in our software descriptor to note if the el bit is set and if the size
was 0.  When we clear the RFD's el bit and set its size, we also clear
the el flag but we leave the size was 0 bit set.  This was we can find
this buffer again later.

If the hardware sees the el-bit cleared without the size set, it will
move on to the next buffer and skip this one.  If it sees
the size set but the el-bit still set, it will complete that buffer
and then RNR interrupt and wait.


Signed-off-by: David Acker [EMAIL PROTECTED]

---

--- linux-2.6.22-rc3/drivers/net/e100.c.orig	2007-06-01 
16:13:16.0 -0400
+++ linux-2.6.22-rc3/drivers/net/e100.c	2007-06-01 16:20:36.0 
-0400

@@ -281,10 +281,17 @@ struct csr {
 };

 enum scb_status {
+   rus_no_res   = 0x08,
rus_ready= 0x10,
rus_mask =

Re: [PATCH] fix race in AF_UNIX

2007-06-04 Thread Miklos Szeredi

 A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
 send()+close() on the peer, causing recv() to return zero, even though
 the sent data should be received.
 
 This happens if the send() and the close() is performed between
 skb_dequeue() and checking sk-sk_shutdown in unix_stream_recvmsg():
 
 process A  skb_dequeue() returns NULL, there's no data in the socket queue
 process B  new data is inserted onto the queue by unix_stream_sendmsg()
 process B  sk-sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
 process A  sk-sk_shutdown is checked, unix_release_sock() returns zero

This is only part of the story.  It turns out, there are other races
involving the garbage collector, that can throw away perfectly good
packets with AF_UNIX sockets in them.

The problems arise when a socket goes from installed to in-flight or
vica versa during garbage collection.  Since gc is done with a
spinlock held, this only shows up on SMP.

The following patch fixes it for me, but it's possibly the wrong
approach.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux-2.6.22-rc2/net/unix/garbage.c
===
--- linux-2.6.22-rc2.orig/net/unix/garbage.c2007-06-03 23:58:11.0 
+0200
+++ linux-2.6.22-rc2/net/unix/garbage.c 2007-06-04 11:39:42.0 +0200
@@ -90,6 +90,7 @@
 static struct sock *gc_current = GC_HEAD; /* stack of objects to mark */
 
 atomic_t unix_tot_inflight = ATOMIC_INIT(0);
+DECLARE_RWSEM(unix_gc_sem);
 
 
 static struct sock *unix_get_socket(struct file *filp)
@@ -169,7 +170,7 @@ static void maybe_unmark_and_push(struct
 
 void unix_gc(void)
 {
-   static DEFINE_MUTEX(unix_gc_sem);
+   static DEFINE_MUTEX(unix_gc_local_lock);
int i;
struct sock *s;
struct sk_buff_head hitlist;
@@ -179,9 +180,22 @@ void unix_gc(void)
 *  Avoid a recursive GC.
 */
 
-   if (!mutex_trylock(unix_gc_sem))
+   if (!mutex_trylock(unix_gc_local_lock))
return;
 
+
+   /*
+* unix_gc_sem protects against sockets going from in-flight to
+* installed
+*
+* Can't sleep on this, because skb_recv_datagram could be
+* waiting for a packet that is to be sent by the thread which
+* invoked the gc
+*/
+   if (!down_write_trylock(unix_gc_sem)) {
+   mutex_unlock(unix_gc_local_lock);
+   return;
+   }
spin_lock(unix_table_lock);
 
forall_unix_sockets(i, s)
@@ -207,8 +221,6 @@ void unix_gc(void)
 
forall_unix_sockets(i, s)
{
-   int open_count = 0;
-
/*
 *  If all instances of the descriptor are not
 *  in flight we are in use.
@@ -218,10 +230,20 @@ void unix_gc(void)
 *  In this case (see unix_create1()) we set artificial
 *  negative inflight counter to close race window.
 *  It is trick of course and dirty one.
+*
+*  Get the inflight counter first, then the open
+*  counter.  This avoids problems if racing with
+*  sendmsg
+*
+*  If just created socket is not yet attached to
+*  a file descriptor, assume open_count of 1
 */
+   int inflight_count = atomic_read(unix_sk(s)-inflight);
+   int open_count = 1;
+
if (s-sk_socket  s-sk_socket-file)
open_count = file_count(s-sk_socket-file);
-   if (open_count  atomic_read(unix_sk(s)-inflight))
+   if (open_count  inflight_count)
maybe_unmark_and_push(s);
}
 
@@ -302,11 +324,12 @@ void unix_gc(void)
u-gc_tree = GC_ORPHAN;
}
spin_unlock(unix_table_lock);
+   up_write(unix_gc_sem);
 
/*
 *  Here we are. Hitlist is filled. Die.
 */
 
__skb_queue_purge(hitlist);
-   mutex_unlock(unix_gc_sem);
+   mutex_unlock(unix_gc_local_lock);
 }
Index: linux-2.6.22-rc2/include/net/af_unix.h
===
--- linux-2.6.22-rc2.orig/include/net/af_unix.h 2007-04-26 05:08:32.0 
+0200
+++ linux-2.6.22-rc2/include/net/af_unix.h  2007-06-04 09:13:56.0 
+0200
@@ -14,6 +14,7 @@ extern void unix_gc(void);
 
 extern struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
 extern spinlock_t unix_table_lock;
+extern struct rw_semaphore unix_gc_sem;
 
 extern atomic_t unix_tot_inflight;
 
Index: linux-2.6.22-rc2/net/unix/af_unix.c
===
--- linux-2.6.22-rc2.orig/net/unix/af_unix.c2007-06-03 23:58:11.0 
+0200
+++ linux-2.6.22-rc2/net/unix/af_unix.c 2007-06-04 11:04:15.0 +0200
@@ -1572,6 +1572,7 @@ static int

Re: [PATCH] fix race in AF_UNIX

2007-06-04 Thread Miklos Szeredi

(resend, sorry, fscked up the address list)

 A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
 send()+close() on the peer, causing recv() to return zero, even though
 the sent data should be received.
 
 This happens if the send() and the close() is performed between
 skb_dequeue() and checking sk-sk_shutdown in unix_stream_recvmsg():
 
 process A  skb_dequeue() returns NULL, there's no data in the socket queue
 process B  new data is inserted onto the queue by unix_stream_sendmsg()
 process B  sk-sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
 process A  sk-sk_shutdown is checked, unix_release_sock() returns zero

This is only part of the story.  It turns out, there are other races
involving the garbage collector, that can throw away perfectly good
packets with AF_UNIX sockets in them.

The problems arise when a socket goes from installed to in-flight or
vica versa during garbage collection.  Since gc is done with a
spinlock held, this only shows up on SMP.

The following patch fixes it for me, but it's possibly the wrong
approach.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux-2.6.22-rc2/net/unix/garbage.c
===
--- linux-2.6.22-rc2.orig/net/unix/garbage.c2007-06-03 23:58:11.0 
+0200
+++ linux-2.6.22-rc2/net/unix/garbage.c 2007-06-04 11:39:42.0 +0200
@@ -90,6 +90,7 @@
 static struct sock *gc_current = GC_HEAD; /* stack of objects to mark */
 
 atomic_t unix_tot_inflight = ATOMIC_INIT(0);
+DECLARE_RWSEM(unix_gc_sem);
 
 
 static struct sock *unix_get_socket(struct file *filp)
@@ -169,7 +170,7 @@ static void maybe_unmark_and_push(struct
 
 void unix_gc(void)
 {
-   static DEFINE_MUTEX(unix_gc_sem);
+   static DEFINE_MUTEX(unix_gc_local_lock);
int i;
struct sock *s;
struct sk_buff_head hitlist;
@@ -179,9 +180,22 @@ void unix_gc(void)
 *  Avoid a recursive GC.
 */
 
-   if (!mutex_trylock(unix_gc_sem))
+   if (!mutex_trylock(unix_gc_local_lock))
return;
 
+
+   /*
+* unix_gc_sem protects against sockets going from in-flight to
+* installed
+*
+* Can't sleep on this, because skb_recv_datagram could be
+* waiting for a packet that is to be sent by the thread which
+* invoked the gc
+*/
+   if (!down_write_trylock(unix_gc_sem)) {
+   mutex_unlock(unix_gc_local_lock);
+   return;
+   }
spin_lock(unix_table_lock);
 
forall_unix_sockets(i, s)
@@ -207,8 +221,6 @@ void unix_gc(void)
 
forall_unix_sockets(i, s)
{
-   int open_count = 0;
-
/*
 *  If all instances of the descriptor are not
 *  in flight we are in use.
@@ -218,10 +230,20 @@ void unix_gc(void)
 *  In this case (see unix_create1()) we set artificial
 *  negative inflight counter to close race window.
 *  It is trick of course and dirty one.
+*
+*  Get the inflight counter first, then the open
+*  counter.  This avoids problems if racing with
+*  sendmsg
+*
+*  If just created socket is not yet attached to
+*  a file descriptor, assume open_count of 1
 */
+   int inflight_count = atomic_read(unix_sk(s)-inflight);
+   int open_count = 1;
+
if (s-sk_socket  s-sk_socket-file)
open_count = file_count(s-sk_socket-file);
-   if (open_count  atomic_read(unix_sk(s)-inflight))
+   if (open_count  inflight_count)
maybe_unmark_and_push(s);
}
 
@@ -302,11 +324,12 @@ void unix_gc(void)
u-gc_tree = GC_ORPHAN;
}
spin_unlock(unix_table_lock);
+   up_write(unix_gc_sem);
 
/*
 *  Here we are. Hitlist is filled. Die.
 */
 
__skb_queue_purge(hitlist);
-   mutex_unlock(unix_gc_sem);
+   mutex_unlock(unix_gc_local_lock);
 }
Index: linux-2.6.22-rc2/include/net/af_unix.h
===
--- linux-2.6.22-rc2.orig/include/net/af_unix.h 2007-04-26 05:08:32.0 
+0200
+++ linux-2.6.22-rc2/include/net/af_unix.h  2007-06-04 09:13:56.0 
+0200
@@ -14,6 +14,7 @@ extern void unix_gc(void);
 
 extern struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
 extern spinlock_t unix_table_lock;
+extern struct rw_semaphore unix_gc_sem;
 
 extern atomic_t unix_tot_inflight;
 
Index: linux-2.6.22-rc2/net/unix/af_unix.c
===
--- linux-2.6.22-rc2.orig/net/unix/af_unix.c2007-06-03 23:58:11.0 
+0200
+++ linux-2.6.22-rc2/net/unix/af_unix.c 2007-06-04 11:04:15.0

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-04 Thread Pavel Machek

Hi!

 From: Rafael J. Wysocki [EMAIL PROTECTED]
 
 Add suspend/resume support to the uli526x network driver (tested on x86_64,
 with 'Ethernet controller: ALi Corporation M5263 Ethernet Controller, rev 
 40').
 
 Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]

Looks ok to me.

 +#ifdef CONFIG_PM
 +
 +/*
 + *   Suspend the interface.
 + */
 +
 +static int uli526x_suspend(struct pci_dev *pdev, pm_message_t state)
 +{
 + struct net_device *dev = pci_get_drvdata(pdev);
 +
 + ULI526X_DBUG(0, uli526x_suspend, 0);
 +
 + if (dev  netdev_priv(dev)) {
 + if (netif_running(dev)) {
 + netif_device_detach(dev);
 + uli526x_down(dev);
 + }
 + pci_save_state(pdev);
 + pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
 + pci_disable_device(pdev);
 + pci_set_power_state(pdev, pci_choose_state(pdev, state));
 + }
 + return 0;
 +}
 +
 +/*
 + *   Resume the interface.
 + */
 +
 +static int uli526x_resume(struct pci_dev *pdev)
 +{
 + struct net_device *dev = pci_get_drvdata(pdev);
 + struct uli526x_board_info *db = netdev_priv(dev);
 + int err;
 +
 + ULI526X_DBUG(0, uli526x_resume, 0);
 +
 + if (dev  db) {
 + pci_set_power_state(pdev, PCI_D0);
 + err = pci_enable_device(pdev);
 + if (err) {
 + printk(KERN_WARNING %s: Could not enable device \n,
 + dev-name);
 + return err;
 + }
 + pci_restore_state(pdev);
 + pci_set_master(pdev);
 + if (netif_running(dev)) {
 + uli526x_up(dev);
 + netif_device_attach(dev);
 + }
 + }
 + return 0;
 +}
 +
#else 
#define *_resume NULL
#define *_suspend NULL

 +#endif /* CONFIG_PM */

 @@ -1689,6 +1775,10 @@ static struct pci_driver uli526x_driver 
   .id_table   = uli526x_pci_tbl,
   .probe  = uli526x_init_one,
   .remove = __devexit_p(uli526x_remove_one),
 +#ifdef CONFIG_PM
 + .suspend= uli526x_suspend,
 + .resume = uli526x_resume,
 +#endif

...so that this ifdef is not needed?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] cfg80211: fix signed macaddress in sysfs

2007-06-04 Thread Johannes Berg

On Mon, 2007-06-04 at 00:06 +0200, David Lamparter wrote:
 Fix signedness mixup making mac addresses show up strangely
 (like 00:11:22:33:44:ffaa) in /sys/class/ieee80211/*/macaddress.

Huh, my mistake, patch looks good. Do we care for this in .22 still?

Acked-by: Johannes Berg [EMAIL PROTECTED]


 Signed-off-by: David Lamparter [EMAIL PROTECTED]
 ---
 
  net/wireless/sysfs.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/net/wireless/sysfs.c b/net/wireless/sysfs.c
 index 7844be4..374d16d 100644
 --- a/net/wireless/sysfs.c
 +++ b/net/wireless/sysfs.c
 @@ -33,7 +33,7 @@ static ssize_t _show_permaddr(struct device *dev,
   struct device_attribute *attr,
   char *buf)
  {
 -   char *addr = dev_to_rdev(dev)-wiphy.perm_addr;
 +   unsigned char *addr = dev_to_rdev(dev)-wiphy.perm_addr;
  
 return sprintf(buf, %.2x:%.2x:%.2x:%.2x:%.2x:%.2x\n,
addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]);

This probably should also use MAC_FMT/MAC_ARG, hah. Oh well, I'll fix it
up later.

johannes


signature.asc
Description: This is a digitally signed message part

Re: [patch 5/7] CAN: Add virtual CAN netdevice driver

2007-06-04 Thread Patrick McHardy

Oliver Hartkopp wrote:
 1. The needed skb-sk is preserved for the rx path (the way 'up').


I was thinking about this, don't CAN frames include the identity
of the sender? That would be the easiest way to determine whether
the frame needs to be delivered.

 2. The loopback indication is done by using the unused skb-protocol in
 the tx path.


Actually its not unused, it needs to be set on TX frames once the
packet leaves the protocol specific code. Its used for example by
packet sockets (sk_run_filter), qdisc classification, neighbour
resolution, gso segmentation, ...

I don't get why you can't directly check the socket option on the
TX path.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] ps3: add wireless LAN support

2007-06-04 Thread MOKUNO Masakazu

Hi

Thank you for your comments!

It seems to be pretty preferable we make the driver wpa_supplicant
compatible, I'll plan to rework and resubmit the patch (to
linux-wireless)


On Fri, 01 Jun 2007 12:35:49 -0400
Dan Williams [EMAIL PROTECTED] wrote:
 There are a few basic
 issues, rehashed here:
 
 1) Need endian annotations and conversions.

OK, I'll fix. 

 2) How much WPA does the device really support?  If it only supports
 WPA[2]-PSK, I guess that's OK, but the driver _MUST_ return the WPA and
 RSN information elements from each BSS that provides them using the
 IWEVGENIE tag from the get scan handler.  A hack might be to construct a
 suitable synthetic IE depending on the contents of the 'security' field
 the firmware passes back.  But userspace needs to know that a particular
 BSS supports WPA in a standard manner.

It supports WEP and WPA[2]-PSKs(TKIP and AES) only. 

I'll consider incorporating wpa_supplicant stuff, ie to support IWEVGENIE
parameters.

 3) Should remove the WPA passphrase bits from the encode functions,
 these functions should operate only on raw key material.  Userspace
 tools do the hashing.

The hypervisor (firmware) has functionality of hashing WPA passphrase. 
So the driver code just passes it to hypervisor.  With that feature, end
users can easily set their passphase.  I thought it was convenient.
Anyway I can remove it to delegate to user space.

 4) I'd like an explanation of the scanning behavior too, looks fishy.

Well, it basically start to scan as requested.  But since scanning is
time consuming job and it seems to block normal messages, rate control
code you mentioned was inserted.
Background scan is performed for roaming as you thought.

 
 5) Should be using a list of scanned BSSs, should be aging and culling
 them, instead of a static array.

OK.

 
 6) The private ioctl needs to die.

These private ioctls were made so that we can setup the driver simply
with wireless-tools, like the answer to 3) above.
Anyway I'll try to incorporate wpa_supplicant support.

 
 7) Does the device work with wpa_supplicant's 'wext' driver?

I don't think it works with wext of wpa_supplicant, although I didn't test
it with wpa_supplicant well.


--
Masakazu MOKUNO

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] ehea: Receive SKB Aggregation

2007-06-04 Thread Christoph Raisch



Stephen Hemminger  wrote on 31.05.2007 18:37:03:


 
 
  +static int try_get_ip_tcp_hdr(struct ehea_cqe *cqe, struct sk_buff
*skb,
  +   struct iphdr **iph, struct tcphdr **tcph)
  +{
  +   int ip_len;
  +
  +   /* non tcp/udp packets */
  +   if (!cqe-header_length)
  +  return -1;
  +
  +   /* non tcp packet */
  +   *iph = (struct iphdr *)(skb-data);

 Why the indirection, copying of headers..
This interacts with the header split function in the hardware.

  +   if ((*iph)-protocol != IPPROTO_TCP)
  +  return -1;
  +
  +   ip_len = (u8)((*iph)-ihl);
  +   ip_len = 2;
  +   *tcph = (struct tcphdr *)(((u64)*iph) + ip_len);
  +
  +   return 0;
  +}
  +
 

 This code seems to be duplicating a lot  (but not all) of the TCP/IP
 input path validation checks. This is a security problem if nothing
else...

We should only do aggregation in the driver if this really is a TCP header,
otherwise things will get worse.
You're right, we should at least check that tcph is within the received
frame.

 Also, how do you prevent DoS attacks from hostile TCP senders that send
 huge number of back to back frames?

Actually a huge number of back to back frames is what we would want to
receive
at 10 gbit ;-)
How is it possible to figure out if this is what you want or just DoS?
It doesn't change anything compared to a non LRO driver, we process a
certain
maximum amount of frames before waiting for the next interrupt,
the packet filters/DoS should still see all traffic (which is above the
driver).
Any suggestions how to handle this better/different?


 --
 Stephen Hemminger

Gruss / Regards
Christoph Raisch

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] ehea: Receive SKB Aggregation

2007-06-04 Thread Christoph Raisch


Christoph Hellwig wrote on 31.05.2007 15:41:18:

 I'm still very unhappy with having all this in various drivers.  There's
 a lot of code that can be turned into generic library functions, and even
 more code that could be made generic with some amount of refactoring.

Yes, we'd also prefer to use a generic function, but we first would want to
get
some real world experience how our driver behaves with LRO to be even
able to
define requirements for such a generic function. A lot of this is tied into
pathlengths,
caching, and why does that help compared to a different TCP receive side
processing?
In a perfect world we shouldn't see a diffference if this is enabled or
not,
but measurements indicate something completely different at 10gbit.

Gruss / Regards
Christoph Raisch


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 5/7] CAN: Add virtual CAN netdevice driver

2007-06-04 Thread Urs Thuermann

Patrick McHardy [EMAIL PROTECTED] writes:

 I was thinking about this, don't CAN frames include the identity
 of the sender? That would be the easiest way to determine whether
 the frame needs to be delivered.

No, CAN is quite a broken networking technology (at least in my view,
coming from tcp/ip).  There is no station addresses and therefore no
sender or receiver addresses.  Everything is broadcast.

 Actually its not unused, it needs to be set on TX frames once the
 packet leaves the protocol specific code.

Oops, yes, this is something still missing in our code.  Needs to be
fixed.

 I don't get why you can't directly check the socket option on the
 TX path.

We have several types of sockets in the PF_CAN family, two of which
are GPL'ed and which are in the patch series.  These are CAN_RAW and
CAN_BCM.  The protocol implementations use can_send() in af_can.c to
send a CAN frame and indicate to can_send() in an int argument,
whether this frame should be looped back.  Only the raw protocol has a
socket option (setsockopt(2)) in struct raw_sock for this, bcm always
sets this to 1 to have the frame looped back.  There is no option in
struct bcm_sock for this.  In can_send() and in the driver we don't
know what type of socket skb-sk points to and can't check that
option.  Changing this would mean we have to add such an option in the
same position in all CAN socket types and set it to fixed values in
some of them (e.g. to 1 for bcm).  While it's doable, I wouldn't like
that very much.

Is there anything that prevents can_send() from using skb-pkt_type to
pass the loopback flag down to the driver?

urs
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 5/7] CAN: Add virtual CAN netdevice driver

2007-06-04 Thread Urs Thuermann

Oliver Hartkopp [EMAIL PROTECTED] writes:

 2. The loopback indication is done by using the unused skb-protocol in
 the tx path.

I don't think we should (mis-)use skb-protocol as a loopback flag.
As the name says, it's the protocol number and not a flag whether
loopback is to be done by the driver.  Even if it's not used otherwise
in tx path (haven't checked this), in my experience it is asking for
future trouble to use variables other than for it's intended purpose.

IMO it would be better to skb-pkt_type.  This is used to indicate
packet type to rcv functions registered by dev_add_pack().  It is set
by netdevice drivers to PACKET_{MULTICAST,BROADCAST,HOST,OTHER} for
received packets.  In the send path it is set to PACKET_OUTGOING on
the copy of the skbuff that is delivered to the sockets registered on
ptype_all (typically packet sockets from tcpdump or other sniffers).
AFAICS, pkt_type is not used otherwise in the send path.

We could set skb-pkt_type = PACKET_LOOPBACK to flag to the CAN
netdevice driver whether to loop back the packet.

Any objections?

urs
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[RFC] Failover-friendly TCP retrans mission

2007-06-04 Thread noboru . obata . ar

Hi all,

I would like to hear comments on how TCP retransmission can be
done better on failover-capable network devices, such as an
active-backup bonding device.


Premise
===

Please note first that I want to address physical failures by
the failover-capable network devices, which are increasingly
becoming important as Xen-based VM systems are getting popular.
Reducing a single-point-of-failure (physical device) is vital on
such VM systems.

And the failover here is not going to address overloaded or
congested networks here, which should be solved separately.


Background
==

When designing a TCP/IP based network system on failover-capable
network devices, people want to set timeouts hierarchically in
three layers, network device layer, TCP layer, and application
layer (bottom-up order), such that:

1. Network device layer detects a failure first and switch to a
   backup device (say, in 20sec).

2. TCP layer timeout  retransmission comes next, _hopefully_
   before the application layer timeout.

3. Application layer detects a network failure last (by, say,
   30sec timeout) and may trigger a system-level failover.

It should be noted that the timeouts for #1 and #2 are handled
independently and there is no relationship between them.

Also note that the actual timeout settings (20sec or 30sec in
this example) are often determined by systems requirement and so
setting them to certain safe values (if any) are usually not
possible.


Problem
===

If TCP retransmission misses the time frame between event #1 and
#3 in Background above (between 20 and 30sec since network
failure), a failure causes the system-level failover where the
network-device-level failover should be enough.

The problem in this hierarchical timeout scheme is that TCP
layer does not guarantee the next retransmission to occur in
certain period of time.  In the above example, people expect TCP
to retransmit a packet between 20 and 30sec since network
failure, but it may not happen.

Starting from RTO=0.5sec for example, retransmission will occur
at time 0.5, 1.5, 3.5, 7.5, 15.5, and 31.5 as indicated by 'o'
in the following diagram, but miss the time frame between time
20 and 30.

   time: 0 102030sec
 | | | |
  App. layer |-+-+-X  == system failover
   TCP layer oo-o---o--+o+-+o == expects retrans. b/w 20~30
Netdev layer |-+-X== network failover


Solution


It seems reasonable for me to solve this problem by capping
TCP_RTO_MAX, i.e., making TCP_RTO_MAX a sysctl variable and set
it to a small number.

In this example, setting to (10 - RTT)[sec] at most should work
because retransmission will take place between time 20 and 30.

My rationale follows.

* This solution is simple and so less error-prone.

* The solution does not violate RFC 2988 in maximum RTO value,
  because RFC 2988's requirement in maximum RTO value, at least
  60 seconds (in (2.5)), is OPTIONAL.

* The solution adds a system-wide setting, which is preferable
  to per-socket setting (by, say, setsockopt or something),
  because all application benefits from the solution.


Before posting patches, I would like to hear comments here.
Any comments or suggestions, to make TCP retransmission work
better on failover-capable network devices, are welcome.

Regards,

-- 
OBATA Noboru ([EMAIL PROTECTED])

Re: r8169: hard freezes on TX

2007-06-04 Thread Rolf Eike Beer

Rolf Eike Beer wrote:
 Francois Romieu wrote:
  Rolf Eike Beer [EMAIL PROTECTED] :
  [...]

 Ok, just tested. I used a file of 200MB and copied it to another host on
 the LAN. If I used our 100 MBit switch nothing happened. When I put a 10
 MBit hub in the middle it died at 77 MB.

I just had another freeze using your patches. After 512kB over smb it was 
dead.

Eike


signature.asc
Description: This is a digitally signed message part.

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-04 Thread Rafael J. Wysocki

Hi,

On Monday, 4 June 2007 13:11, Pavel Machek wrote:
 Hi!
 
  From: Rafael J. Wysocki [EMAIL PROTECTED]
  
  Add suspend/resume support to the uli526x network driver (tested on x86_64,
  with 'Ethernet controller: ALi Corporation M5263 Ethernet Controller, rev 
  40').
  
  Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]
 
 Looks ok to me.
 
  +#ifdef CONFIG_PM
  +
  +/*
  + * Suspend the interface.
  + */
  +
  +static int uli526x_suspend(struct pci_dev *pdev, pm_message_t state)
  +{
  +   struct net_device *dev = pci_get_drvdata(pdev);
  +
  +   ULI526X_DBUG(0, uli526x_suspend, 0);
  +
  +   if (dev  netdev_priv(dev)) {
  +   if (netif_running(dev)) {
  +   netif_device_detach(dev);
  +   uli526x_down(dev);
  +   }
  +   pci_save_state(pdev);
  +   pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
  +   pci_disable_device(pdev);
  +   pci_set_power_state(pdev, pci_choose_state(pdev, state));
  +   }
  +   return 0;
  +}
  +
  +/*
  + * Resume the interface.
  + */
  +
  +static int uli526x_resume(struct pci_dev *pdev)
  +{
  +   struct net_device *dev = pci_get_drvdata(pdev);
  +   struct uli526x_board_info *db = netdev_priv(dev);
  +   int err;
  +
  +   ULI526X_DBUG(0, uli526x_resume, 0);
  +
  +   if (dev  db) {
  +   pci_set_power_state(pdev, PCI_D0);
  +   err = pci_enable_device(pdev);
  +   if (err) {
  +   printk(KERN_WARNING %s: Could not enable device \n,
  +   dev-name);
  +   return err;
  +   }
  +   pci_restore_state(pdev);
  +   pci_set_master(pdev);
  +   if (netif_running(dev)) {
  +   uli526x_up(dev);
  +   netif_device_attach(dev);
  +   }
  +   }
  +   return 0;
  +}
  +
 #else 
 #define *_resume NULL
 #define *_suspend NULL

  +#endif /* CONFIG_PM */
 
  @@ -1689,6 +1775,10 @@ static struct pci_driver uli526x_driver 
  .id_table   = uli526x_pci_tbl,
  .probe  = uli526x_init_one,
  .remove = __devexit_p(uli526x_remove_one),
  +#ifdef CONFIG_PM
  +   .suspend= uli526x_suspend,
  +   .resume = uli526x_resume,
  +#endif
 
 ...so that this ifdef is not needed?

OK, why not.

Greetings,
Rafael


-- 
Premature optimization is the root of all evil. - Donald Knuth
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC] Failover-friendly TCP retransmission

2007-06-04 Thread Andi Kleen

[EMAIL PROTECTED] writes:

 Please note first that I want to address physical failures by
 the failover-capable network devices, which are increasingly
 becoming important as Xen-based VM systems are getting popular.
 Reducing a single-point-of-failure (physical device) is vital on
 such VM systems.

Just you typically still have lots of other single points of failures in 
a single system, some of them quite less reliable than your typical
NIC. But at least it gives impressive demos when pulling ethernet cables @)

 1. Network device layer detects a failure first and switch to a
backup device (say, in 20sec).
 
 2. TCP layer timeout  retransmission comes next, _hopefully_
before the application layer timeout.
 
 3. Application layer detects a network failure last (by, say,
30sec timeout) and may trigger a system-level failover.
 
 It should be noted that the timeouts for #1 and #2 are handled
 independently and there is no relationship between them.

 If TCP retransmission misses the time frame between event #1 and
 #3 in Background above (between 20 and 30sec since network
 failure), a failure causes the system-level failover where the
 network-device-level failover should be enough.

You should probably make sure that the device ends up returning the
right NET_XMIT_* code for such drops to TCP, in particular
NET_XMIT_DROP. This might require slight driver interface
changes. Also right now it only affects the congestion window, I think, 
it  might be reasonable to let it affect the timer backoff too.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Strange soft lockup detected message (looks like spin_lock bug in pcnet32)

2007-06-04 Thread Lennart Sorensen

On Mon, May 07, 2007 at 01:45:11PM -0400, Lennart Sorensen wrote:
 Hmm, I thought I saw it on two systems already, but I should go try that
 again.

Hmm, still haven't figured this out.  I just saw this one this morning:

BUG: soft lockup detected on CPU#0!
 [c0103fc4] dump_stack+0x24/0x30
 [c013d36e] softlockup_tick+0x7e/0xc0
 [c011eb23] update_process_times+0x33/0x80
 [c01062c9] timer_interrupt+0x39/0x80
 [c013d6fd] handle_IRQ_event+0x3d/0x70
 [c013da59] __do_IRQ+0xa9/0x150
 [c0104e55] do_IRQ+0x25/0x60
 [c010313a] common_interrupt+0x1a/0x20
 [c013d6d8] handle_IRQ_event+0x18/0x70
 [c013da59] __do_IRQ+0xa9/0x150
 [c0104e55] do_IRQ+0x25/0x60
 [c010313a] common_interrupt+0x1a/0x20
 [c0119cda] __do_softirq+0x3a/0xa0
 [c0119d6d] do_softirq+0x2d/0x30
 [c0119fb7] irq_exit+0x37/0x40
 [c0104e5a] do_IRQ+0x2a/0x60
 [c010313a] common_interrupt+0x1a/0x20
 [c013dcee] setup_irq+0xce/0x1e0
 [c013de97] request_irq+0x97/0xb0
 [d0851f9d] pcnet32_open+0x4d/0x3d0 [pcnet32]
 [c023a4f9] dev_open+0x39/0x80
 [c0238cea] dev_change_flags+0xfa/0x130
 [c027eb9f] devinet_ioctl+0x4ff/0x6f0
 [c022dab1] sock_ioctl+0xf1/0x1f0
 [c017413c] do_ioctl+0x2c/0x80
 [c01741e2] vfs_ioctl+0x52/0x2f0
 [c01744ef] sys_ioctl+0x6f/0x80
 [c0102ef7] syscall_call+0x7/0xb
 [b7f41d04] 0xb7f41d04

And it is happening on multiple systems.  I am starting to wonder if it
is a bug in the soft lockup detection.  Maybe it really isn't locked up
but just momentarily appears to be.  I will try turning off the soft
lockup detection and see what happens.

--
Len Sorensen
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [TCP]: Honour sk_bound_dev_if in tcp_v4_send_ack

2007-06-04 Thread Patrick McHardy

Patrick McHardy wrote:
 [TCP]: Honour sk_bound_dev_if in tcp_v4_send_ack
 
 A time_wait socket inherits sk_bound_dev_if from the original socket,
 but it is not used when sending ACK packets using ip_send_reply.
 
 Fix by passing the oif to ip_send_reply in struct ip_reply_arg and
 use it for output routing.


The patch was missing a NULL pointer check in tcp_v4_send_ack for
a different code path where twsk == NULL. Fixed version attached.

[TCP]: Honour sk_bound_dev_if in tcp_v4_send_ack

A time_wait socket inherits sk_bound_dev_if from the original socket,
but it is not used when sending ACK packets using ip_send_reply.

Fix by passing the oif to ip_send_reply in struct ip_reply_arg and
use it for output routing.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 9f66116df33984809a5cc0ea66a358db684d4346
tree a0b163d17181f1d1fafb4b7517726b5bb414b579
parent ea1601c50bf23af25094511e2a9ce1b755ab9669
author Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 17:10:41 +0200
committer Patrick McHardy [EMAIL PROTECTED] Mon, 04 Jun 2007 17:13:17 +0200

 include/net/ip.h |1 +
 net/ipv4/ip_output.c |4 +++-
 net/ipv4/tcp_ipv4.c  |2 ++
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index bb207db..abf2820 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -143,6 +143,7 @@ struct ip_reply_arg {
__wsum  csum;
int csumoffset; /* u16 offset of csum in iov[0].iov_base */
/* -1 if not needed */ 
+   int bound_dev_if;
 }; 
 
 void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg 
*arg,
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index d6427d9..34ea454 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1352,7 +1352,8 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, 
struct ip_reply_arg *ar
}
 
{
-   struct flowi fl = { .nl_u = { .ip4_u =
+   struct flowi fl = { .oif = arg-bound_dev_if,
+   .nl_u = { .ip4_u =
  { .daddr = daddr,
.saddr = rt-rt_spec_dst,
.tos = RT_TOS(ip_hdr(skb)-tos) 
} },
@@ -1376,6 +1377,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, 
struct ip_reply_arg *ar
inet-tos = ip_hdr(skb)-tos;
sk-sk_priority = skb-priority;
sk-sk_protocol = ip_hdr(skb)-protocol;
+   sk-sk_bound_dev_if = arg-bound_dev_if;
ip_append_data(sk, ip_reply_glue_bits, arg-iov-iov_base, len, 0,
   ipc, rt, MSG_DONTWAIT);
if ((skb = skb_peek(sk-sk_write_queue)) != NULL) {
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 47c6105..97e294e 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -705,6 +705,8 @@ static void tcp_v4_send_ack(struct tcp_timewait_sock *twsk,
  ip_hdr(skb)-saddr, /* XXX */
  arg.iov[0].iov_len, IPPROTO_TCP, 0);
arg.csumoffset = offsetof(struct tcphdr, check) / 2;
+   if (twsk)
+   arg.bound_dev_if = twsk-tw_sk.tw_bound_dev_if;
 
ip_send_reply(tcp_socket-sk, skb, arg, arg.iov[0].iov_len);

RE: Definition and usage of NETIF_F_HW_SUM?

2007-06-04 Thread Ron Mercer

I was out of town last week and did not have a chance to respond.  Yes,
qla3xxx is (before Stephen's fix) broken on IPV6.  I will review the
changes and post a patch if necessary. 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Stephen Hemminger
 Sent: Tuesday, May 29, 2007 4:46 PM
 To: Michael Chan
 Cc: Herbert Xu; netdev
 Subject: Re: Definition and usage of NETIF_F_HW_SUM?

 On Tue, 29 May 2007 17:10:52 -0700
 Michael Chan [EMAIL PROTECTED] wrote:

  On Wed, 2007-05-30 at 07:36 +1000, Herbert Xu wrote:

   I just checked e1000 and it's correct as it does use the 
 csum_offset
   when doing TX offload.  However, you're definitely right that bnx2
   seems to be broken.

A few devices take a offset, starting point, and 
 insertion point. This looks like
the correct model. But no upper layer protocols other 
 than IPV4/IPV6 can do checksum
offload at present, so it seems moot.

   I could easily whip up a patch to get GRE to use it for a start :)

IMHO the correct solution would be to get rid if 
 NETIF_F_HW_SUM and make a new flag
NETIF_F_IPV6_SUM. Devices that can checksum both could 
 do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM.

   We should definitely keep NETIF_F_HW_SUM for sane 
 hardware such as the
   e1000.  Unfortunately we may just have to invent IPV6_SUM 
 for the broken
   ones.

   Ccing Michael to see if the bnx2 chip can actually do offset-based
   checksum offload.

  bnx2 and tg3 cannot do offset-based checksumming because 
 the hardware
  doesn't have room in the buffer descriptors to specify the 
 offsets.  So
  regrettably, the NETIF_F_HW_SUM flag has been misused in 
 these drivers.
  A new NETIF_F_IPV6_SUM flag will be very useful for us.

 Look furthur many drivers are just plain broken and use F_HW_SUM
 and can't even do IPV6 properly.  I'll fix

 The worst code award goes to: qla3xxx.c
 which is broken on IPV6 and goes to trouble of computing all the
 offsets and they are already there in skb...

 -- 
 Stephen Hemminger [EMAIL PROTECTED]
 -
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] sundance: PHY address form 0, only for device ID 0x0200 (IP100A)

2007-06-04 Thread Randy Dunlap

On Mon, 04 Jun 2007 17:04:39 -0400 Jesse Huang wrote:

 From: Jesse Huang [EMAIL PROTECTED]
 
 Change Logs:
 Search PHY address form 0, only for device ID 0x0200 (IP100A). Other device 
 are from PHY address 1.
 
 Signed-off-by: Jesse Huang [EMAIL PROTECTED]
 ---
 
  drivers/net/sundance.c |4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)
 
 5724a72722dfc9cafbb8f273cb82dbf577bd9ad0
 diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
 index e1f912d..914ab29 100644
 --- a/drivers/net/sundance.c
 +++ b/drivers/net/sundance.c
 @@ -562,7 +562,9 @@ #endif
* It seems some phys doesn't deal well with address 0 being accessed
* first, so leave address zero to the end of the loop (32  31).
*/
 - for (phy = 1; phy = 32  phy_idx  MII_CNT; phy++) {
 + if(sundance_pci_tbl[np-chip_id].device == 0x0200) phy = 0;

kernel style:
if (sundance_pci_tbl[np-chip_id].device == 0x0200)
phy = 0;
else
phy = 1;

 + else phy = 1;
 + for (; phy = 32  phy_idx  MII_CNT; phy++) {
   int phyx = phy  0x1f;
   int mii_status = mdio_read(dev, phyx, MII_BMSR);
   if (mii_status != 0xmii_status != 0x) {
 -- 


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: b44: regression in 2.6.22 (resend)

2007-06-04 Thread Stephen Hemminger

On Mon, 04 Jun 2007 08:39:48 +0200
Thomas Gleixner [EMAIL PROTECTED] wrote:

 On Sun, 2007-06-03 at 18:26 +0200, Maximilian Engelhardt wrote:
   Is there any other strange behavior of the high res enabled kernel than
   the b44 problem ?
  
  I didn't notice anything in the past (as I wrote). But today I did some 
  tests 
  for an updated version of the p54 mac80211 wlan driver and I noticed 
  exactly 
  the same problem:
  
  when booting with highres=off everything is fine.
  But when I boot an highres enabled kernel and I do the iperf-test with the 
  p54 
  driver, my systems becomes unresponsive during the test. It seems to be 
  exactly the same problem I have with the b44 driver.
  So this might not be a bug in the b44 code but a bug somewhere in the linux 
  networking code.
  
  I did the test with an 2.6.22-rc3-git4 kernel and the p54 driver built 
  external as module. 
 
 Can you look at iperf to figure out, whether it does some weird timer
 stuff (high frequency interval timer or such) ? Either check the code or
 strace it.
 
   tglx
 
 


It is the receiver doing a tight loop doing gettimeofday/recv calls.


sendto(-1227715616, 0xc, 3085438964, 0, {...}, 3067249832) = 0
gettimeofday({1180973726, 981615}, NULL) = 0
gettimeofday({1180973726, 981751}, NULL) = 0
futex(0x8055c64, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055c90, FUTEX_WAKE, 1) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 982754}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 983790}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 984355}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 984706}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 985111}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 985499}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 986088}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 986436}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 986916}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 987397}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 987872}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 988440}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 988823}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 989314}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 990029}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 990890}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 991803}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 992616}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 993105}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 993585}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 0) 
= 8192
gettimeofday({1180973726, 994014}, NULL) = 0
...

recv(4, 45678901234567890123456789012345..., 8192, 0) = 1448
gettimeofday({1180973757, 172437}, NULL) = 0
recv(4, 23456789012345678901234567890123..., 8192, 0) = 1448
gettimeofday({1180973757, 172576}, NULL) = 0
recv(4, 01234567890123456789012345678901..., 8192, 0) = 1632
gettimeofday({1180973757, 172752}, NULL) = 0
recv(4, , 8192, 0)= 0
gettimeofday({1180973757, 172797}, NULL) = 0
gettimeofday({1180973757, 172817}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
close(4) = 0
futex(0x8055d04, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055d30, FUTEX_WAKE, 1) = 0
_exit(0)

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 5/7] CAN: Add virtual CAN netdevice driver

2007-06-04 Thread Oliver Hartkopp

Urs Thuermann wrote:
 Oliver Hartkopp [EMAIL PROTECTED] writes:

   
 2. The loopback indication is done by using the unused skb-protocol in
 the tx path.
 

 I don't think we should (mis-)use skb-protocol as a loopback flag.
   

Yep! After reading the comments from Patrick and Urs i definitely agree
to their concerns.

 IMO it would be better to skb-pkt_type.  This is used to indicate
 packet type to rcv functions registered by dev_add_pack().  It is set
 by netdevice drivers to PACKET_{MULTICAST,BROADCAST,HOST,OTHER} for
 received packets.  In the send path it is set to PACKET_OUTGOING on
 the copy of the skbuff that is delivered to the sockets registered on
 ptype_all (typically packet sockets from tcpdump or other sniffers).
 AFAICS, pkt_type is not used otherwise in the send path.

 We could set skb-pkt_type = PACKET_LOOPBACK to flag to the CAN
 netdevice driver whether to loop back the packet.
   

I think, it goes into the right direction to use skb-pkt_type. The flag
should really be somewhere inside the skb as all back references into
the sk would become sticky in the implementation.

But regarding the use of skb-pkt_type i would suggest to take a closer
look on the definitions in include/linux/if_packet.h and how the
pkt_type is to be used inside the kernel. In my opinion we should use ...

* TX-Path:
PACKET_OTHERHOST: send the CAN frame without loopback
PACKET_BROADCAST : send the CAN frame with loopback (aka local broadcast)

See an example of this approach in drivers/net/arcnet/rfc1051.c :
http://www.linux-m32r.org/lxr/http/source/drivers/net/arcnet/rfc1051.c?a=i386#L99

* RX-Path:
PACKET_HOST : just an incoming CAN frame for this host

Any comments? ACKs?

Best regards,
Oliver





-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: b44: regression in 2.6.22 (resend)

2007-06-04 Thread Thomas Gleixner

On Mon, 2007-06-04 at 09:09 -0700, Stephen Hemminger wrote:
   I did the test with an 2.6.22-rc3-git4 kernel and the p54 driver built 
   external as module. 
  
  Can you look at iperf to figure out, whether it does some weird timer
  stuff (high frequency interval timer or such) ? Either check the code or
  strace it.

 It is the receiver doing a tight loop doing gettimeofday/recv calls.
 
 
 sendto(-1227715616, 0xc, 3085438964, 0, {...}, 3067249832) = 0
 gettimeofday({1180973726, 981615}, NULL) = 0
 gettimeofday({1180973726, 981751}, NULL) = 0
 futex(0x8055c64, 0x5 /* FUTEX_??? */, 1) = 1
 futex(0x8055c90, FUTEX_WAKE, 1) = 0
 recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 
 0) = 8192
 gettimeofday({1180973726, 982754}, NULL) = 0
 recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 
 0) = 8192
 gettimeofday({1180973726, 983790}, NULL) = 0

Well, gettimeofday() is not affected by the highres code, but

 nanosleep({0, 0}, NULL) = 0
 nanosleep({0, 0}, NULL) = 0

is. The nanosleep call with a relative timeout of 0 returns immediately
with highres enabled, while it sleeps at least until the next tick
arrives when highres is off. Are there more of those stupid sleeps in
the code ?

tglx


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/1] myri10ge: limit the number of recoveries

2007-06-04 Thread Brice Goglin

Jeff Garzik wrote:
 Brice Goglin wrote:
 Limit the number of recoveries from a NIC hw watchdog reset to 1 by
 default.
 It enables detection of defective NICs immediately since these memory
 parity
 errors are expected to happen very rarely (less than once per
 century*NIC).
 However, a defective NIC (very rare, fortunately) can see such an error
 quite often, ie. every few minutes under high load.

 Make the limit tunable to allow people with mission critical
 installations
 to crank up the tunable and recover an INTMAX number of times while
 waiting
 for a downtime window to replace the NIC. The performance won't be
 optimal,
 but at least, it will still work.

 Signed-off-by: Brice Goglin [EMAIL PROTECTED]
 ---
  drivers/net/myri10ge/myri10ge.c |   15 +--
  1 file changed, 13 insertions(+), 2 deletions(-)

 NAK.

Ok...
Then please apply the following patch which limits the number of
recovery to 1 without making it tunable. It will at least enable
detection of bad NICs.

Brice


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/1] myri10ge: limit the number of recoveries

2007-06-04 Thread Brice Goglin

Limit the number of recoveries from a NIC hw watchdog reset to 1 by default.
It enables detection of defective NICs immediately since these memory parity
errors are expected to happen very rarely (less than once per century*NIC).

Signed-off-by: Brice Goglin [EMAIL PROTECTED]
---
 drivers/net/myri10ge/myri10ge.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Index: linux-rc/drivers/net/myri10ge/myri10ge.c
===
--- linux-rc.orig/drivers/net/myri10ge/myri10ge.c   2007-05-30 
20:58:22.0 +0200
+++ linux-rc/drivers/net/myri10ge/myri10ge.c2007-06-04 19:01:54.0 
+0200
@@ -279,6 +279,8 @@
 module_param(myri10ge_fill_thresh, int, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(myri10ge_fill_thresh, Number of empty rx slots allowed\n);
 
+static int myri10ge_reset_recover = 1;
+
 static int myri10ge_wcfifo = 0;
 module_param(myri10ge_wcfifo, int, S_IRUGO);
 MODULE_PARM_DESC(myri10ge_wcfifo, Enable WC Fifo when WC is enabled\n);
@@ -2730,8 +2732,14 @@
 * For now, just report it */
reboot = myri10ge_read_reboot(mgp);
printk(KERN_ERR
-  myri10ge: %s: NIC rebooted (0x%x), resetting\n,
-  mgp-dev-name, reboot);
+  myri10ge: %s: NIC rebooted (0x%x),%s resetting\n,
+  mgp-dev-name, reboot,
+  myri10ge_reset_recover ?   :  not);
+   if (myri10ge_reset_recover == 0)
+   return;
+
+   myri10ge_reset_recover--;
+
/*
 * A rebooted nic will come back with config space as
 * it was after power was applied to PCIe bus.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Stephen Hemminger

On Mon, 04 Jun 2007 18:35:58 +0200
Thomas Gleixner [EMAIL PROTECTED] wrote:

 On Mon, 2007-06-04 at 09:09 -0700, Stephen Hemminger wrote:
I did the test with an 2.6.22-rc3-git4 kernel and the p54 driver built 
external as module. 
   
   Can you look at iperf to figure out, whether it does some weird timer
   stuff (high frequency interval timer or such) ? Either check the code or
   strace it.
 
  It is the receiver doing a tight loop doing gettimeofday/recv calls.
  
  
  sendto(-1227715616, 0xc, 3085438964, 0, {...}, 3067249832) = 0
  gettimeofday({1180973726, 981615}, NULL) = 0
  gettimeofday({1180973726, 981751}, NULL) = 0
  futex(0x8055c64, 0x5 /* FUTEX_??? */, 1) = 1
  futex(0x8055c90, FUTEX_WAKE, 1) = 0
  recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 
  0) = 8192
  gettimeofday({1180973726, 982754}, NULL) = 0
  recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 8192, 
  0) = 8192
  gettimeofday({1180973726, 983790}, NULL) = 0
 
 Well, gettimeofday() is not affected by the highres code, but
 
  nanosleep({0, 0}, NULL) = 0
  nanosleep({0, 0}, NULL) = 0
 
 is. The nanosleep call with a relative timeout of 0 returns immediately
 with highres enabled, while it sleeps at least until the next tick
 arrives when highres is off. Are there more of those stupid sleeps in
 the code ?

GLIBC pthread_mutex does it, YES it is a problem!
Looks like the old behavior is required for ABI compatibility.

iperf server has several threads. One thread is using pthread_mutex_lock
to wait for the other thread.  It looks like pthread_mutex_lock is using
nanosleep as yield().

Multi-thread strace shows how this could kill performance.
These are with old 2.6.20 kernel that doesn't have the problem:

sendto(-1210930208, 0xc, 3085438964, 0, {...}, 3084035240) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(5001), 
sin_addr=inet_addr(0.0.0.0)}, 16)   = 0
listen(3, 5)= 0
futex(0x8055c64, 0x5 /* FUTEX_??? */, 1) = 1
accept(3, {sa_family=AF_INET, sin_port=htons(49973), 
sin_addr=inet_addr(192.168.0.14)}, [16]) = 4
getsockname(4, {sa_family=AF_INET, sin_port=htons(5001), 
sin_addr=inet_addr(192.168.0.12)}, [16]) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 24, 0) = 
24
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xb6528000
mprotect(0xb6528000, 4096, PROT_NONE)   = 0
clone(child_stack=0xb6d284a4, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED,
 parent_tidptr=0xb6d28bd8, {entry_number:6, base_addr:0xb6d28b90, 
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, 
seg_not_present:0, useable:1}, child_tidptr=0xb6d28bd8) = 5622
accept(3, 0x8058b98, [128]) = ? ERESTARTSYS (To be restarted)

==

sendto(-1219322912, 0xc, 3085438964, 0, {...}, 3075642536) = 0
futex(0x8055d04, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055d30, FUTEX_WAKE, 1) = 1
futex(0x8055c64, FUTEX_WAIT, 1, NULL) = 0
futex(0x8055d04, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055d30, FUTEX_WAKE, 1) = 1
futex(0x8055c90, FUTEX_WAKE, 1) = 0
getsockopt(3, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xb7fb
write(1, ..., 61) = 61
write(1, Server listening on TCP port 500..., 34) = 34
write(1, TCP window size: 85.3 KByte (def..., 38) = 38
write(1, ..., 61) = 61
nanosleep({0, 0}, NULL) = 0
futex(0x8055d04, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055d30, FUTEX_WAKE, 1) = 1
futex(0x8055c64, FUTEX_WAIT, 3, NULL) = 0
futex(0x8055c90, FUTEX_WAIT, 2, NULL) = -1 EAGAIN (Resource temporarily 
unavailable)
futex(0x8055d04, 0x5 /* FUTEX_??? */, 1)   = 1
futex(0x8055d30, FUTEX_WAKE, 1)  = 0
futex(0x8055c90, FUTEX_WAKE, 1)   = 0
write(1, [  4] local 192.168.0.12 port 50..., 74)  = 
74
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL)  = 0
nanosleep({0, 0}, NULL)   = 0
nanosleep({0, 0}, NULL) = 0

...

nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
write(1, [  4]  0.0-30.2 sec336 MByte..., 50) = 50
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
futex(0x8055d04, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055d30, FUTEX_WAKE, 1) = 0
futex(0x8055c64, FUTEX_WAIT,

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Thomas Gleixner

On Mon, 2007-06-04 at 09:59 -0700, Stephen Hemminger wrote:
   gettimeofday({1180973726, 982754}, NULL) = 0
   recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 
   8192, 0) = 8192
   gettimeofday({1180973726, 983790}, NULL) = 0
  
  Well, gettimeofday() is not affected by the highres code, but
  
   nanosleep({0, 0}, NULL) = 0
   nanosleep({0, 0}, NULL) = 0
  
  is. The nanosleep call with a relative timeout of 0 returns immediately
  with highres enabled, while it sleeps at least until the next tick
  arrives when highres is off. Are there more of those stupid sleeps in
  the code ?
 
 GLIBC pthread_mutex does it, YES it is a problem!
 Looks like the old behavior is required for ABI compatibility.

 iperf server has several threads. One thread is using pthread_mutex_lock
 to wait for the other thread.  It looks like pthread_mutex_lock is using
 nanosleep as yield().

I doubt that. This is in the iperf code itself.

void thread_rest ( void ) {
#if defined( HAVE_THREAD )
#if defined( HAVE_POSIX_THREAD )
// TODO add checks for sched_yield or pthread_yield and call that
// if available
usleep( 0 );

--

It results in a nanosleep({0,0}, NULL)

tglx


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Stephen Hemminger

On Mon, 04 Jun 2007 19:32:48 +0200
Thomas Gleixner [EMAIL PROTECTED] wrote:

 On Mon, 2007-06-04 at 09:59 -0700, Stephen Hemminger wrote:
gettimeofday({1180973726, 982754}, NULL) = 0
recv(4, \0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364..., 
8192, 0) = 8192
gettimeofday({1180973726, 983790}, NULL) = 0
   
   Well, gettimeofday() is not affected by the highres code, but
   
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
   
   is. The nanosleep call with a relative timeout of 0 returns immediately
   with highres enabled, while it sleeps at least until the next tick
   arrives when highres is off. Are there more of those stupid sleeps in
   the code ?
  
  GLIBC pthread_mutex does it, YES it is a problem!
  Looks like the old behavior is required for ABI compatibility.
 
  iperf server has several threads. One thread is using pthread_mutex_lock
  to wait for the other thread.  It looks like pthread_mutex_lock is using
  nanosleep as yield().
 
 I doubt that. This is in the iperf code itself.
 
 void thread_rest ( void ) {
 #if defined( HAVE_THREAD )
 #if defined( HAVE_POSIX_THREAD )
 // TODO add checks for sched_yield or pthread_yield and call that
 // if available
 usleep( 0 );
 
 --
 
 It results in a nanosleep({0,0}, NULL)
 
   tglx
 

Yes, the following patch makes iperf work better than ever.
But are other broken applications going to have same problem.
Sounds like the old who runs first fork() problems.

--- iperf-2.0.2/compat/Thread.c.orig2005-05-03 08:15:51.0 -0700
+++ iperf-2.0.2/compat/Thread.c 2007-06-04 10:54:21.0 -0700
@@ -405,9 +405,13 @@
 void thread_rest ( void ) {
 #if defined( HAVE_THREAD )
 #if defined( HAVE_POSIX_THREAD )
-// TODO add checks for sched_yield or pthread_yield and call that
-// if available
+
+#if defined( _POSIX_PRIORITY_SCHEDULING )
+sched_yield();
+#else
 usleep( 0 );
+#endif
+
 #else // Win32
 SwitchToThread( );
 #endif




-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 5/7] CAN: Add virtual CAN netdevice driver

2007-06-04 Thread Oliver Hartkopp

Oliver Hartkopp wrote:
 I think, it goes into the right direction to use skb-pkt_type. The flag
 should really be somewhere inside the skb as all back references into
 the sk would become sticky in the implementation.

 But regarding the use of skb-pkt_type i would suggest to take a closer
 look on the definitions in include/linux/if_packet.h and how the
 pkt_type is to be used inside the kernel. In my opinion we should use ...

 * TX-Path:
 PACKET_OTHERHOST: send the CAN frame without loopback
 PACKET_BROADCAST : send the CAN frame with loopback (aka local broadcast)

 See an example of this approach in drivers/net/arcnet/rfc1051.c :
 http://www.linux-m32r.org/lxr/http/source/drivers/net/arcnet/rfc1051.c?a=i386#L99

 * RX-Path:
 PACKET_HOST : just an incoming CAN frame for this host

 Any comments? ACKs?

 Best regards,
 Oliver


   

The updated changes would look like this. Of course it has been tested
and is working fine :-)

Regards,
Oliver

Index: net/can/af_can.c
===
--- net/can/af_can.c(revision 325)
+++ net/can/af_can.c(working copy)
@@ -257,7 +257,6 @@
  */
 int can_send(struct sk_buff *skb, int loop)
 {
-struct sock **tx_sk = (struct sock **)skb-cb;
 int err;
 
 if (skb-dev-type != ARPHRD_CAN) {
@@ -265,30 +264,43 @@
 return -EPERM;
 }
 
+skb-protocol = htons(ETH_P_CAN);
+
 if (loop) {
 /* local loopback of sent CAN frames (default) */
 
-/* indication for the CAN driver: do loopback */
-*tx_sk = skb-sk;
+/*
+ * This packet is not only sent on the CAN bus but
+ * also broadcasted to local subscribers on this host.
+ */
+skb-pkt_type = PACKET_BROADCAST;
 
 /*
- * The reference to the originating sock may be also required
+ * The reference to the originating sock may be required
  * by the receiving socket to indicate (and ignore) his own
- * sent data. Example: can_raw sockopt CAN_RAW_RECV_OWN_MSGS
+ * sent data. Example: can_raw sockopt CAN_RAW_RECV_OWN_MSGS.
+ * Therefore we have to ensure that skb-sk remains the
+ * reference to the originating sock by restoring skb-sk
+ * after each skb_clone() or skb_orphan() usage.
+ * skb-sk is usually unused and unset in the rx path.
  */
 
 /* interface not capabable to do the loopback itself? */
 if (!(skb-dev-flags  IFF_LOOPBACK)) {
+struct sock *srcsk = skb-sk;
 struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);
 
 /* perform the local loopback here */
-newskb-protocol  = htons(ETH_P_CAN);
 newskb-ip_summed = CHECKSUM_UNNECESSARY;
+newskb-sk = srcsk;
 netif_rx(newskb);
 }
 } else {
-/* indication for the CAN driver: no loopback required */
-*tx_sk = NULL;
+/*
+ * Indication for the CAN driver: No loopback required!
+ * This packet is only transmitted to the CAN bus.
+ */
+skb-pkt_type = PACKET_OTHERHOST;
 }
 
 if (!(skb-dev-flags  IFF_UP))
@@ -581,10 +593,12 @@
 
 static inline void deliver(struct sk_buff *skb, struct receiver *r)
 {
+struct sock *srcsk = skb-sk;
 struct sk_buff *clone = skb_clone(skb, GFP_ATOMIC);
 
 DBG(skbuff %p cloned to %p\n, skb, clone);
 if (clone) {
+clone-sk = srcsk;
 r-func(clone, r-data);
 r-matches++;
 }
Index: net/can/raw.c
===
--- net/can/raw.c(revision 325)
+++ net/can/raw.c(working copy)
@@ -154,7 +154,7 @@
 
 if (!ro-recv_own_msgs) {
 /* check the received tx sock reference */
-if (*(struct sock **)skb-cb == sk) {
+if (skb-sk == sk) {
 DBG(trashed own tx msg\n);
 kfree_skb(skb);
 return;
Index: drivers/net/can/vcan.c
===
--- drivers/net/can/vcan.c(revision 325)
+++ drivers/net/can/vcan.c(working copy)
@@ -133,6 +133,7 @@
 skb-protocol  = htons(ETH_P_CAN);
 skb-dev   = dev;
 skb-ip_summed = CHECKSUM_UNNECESSARY;
+skb-pkt_type  = PACKET_HOST;
 
 DBG(received skbuff on interface %d\n, dev-ifindex);
 
@@ -149,8 +150,8 @@
 stats-tx_packets++;
 stats-tx_bytes += skb-len;
 
-/* tx socket reference pointer: Loopback required if not NULL */
-loop = *(struct sock **)skb-cb != NULL;
+/* indication for CAN netdevice drivers that loopback is required */
+loop = (skb-pkt_type == PACKET_BROADCAST);
 
 if (!loopback) {
 /* no loopback handling available inside this driver */
@@ -170,6 +171,8 @@
 /* perform standard loopback handling for CAN network interfaces */
 
 if (loop) {
+struct sock *srcsk = skb-sk;
+
 if (atomic_read(skb-users) != 1) {
 struct sk_buff

2.6.22-rc3-mm1 - pppd hanging in netdev_run_todo while holding mutex

2007-06-04 Thread Valdis . Kletnieks

On Wed, 30 May 2007 23:58:23 PDT, Andrew Morton said:
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc3/2.6.22-rc3-mm1/

Under 22-rc2-mm1, if my VPN connection got reset, ppp0 just quietly went away.

Under 22-rc3-mm1, it seems to end up wedged and waiting for references to
go away:

Jun  4 09:23:01 turing-police kernel: [90089.270707] unregister_netdevice: 
waiting for ppp0 to become free. Usage count = 8
Jun  4 09:23:11 turing-police kernel: [90099.396121] unregister_netdevice: 
waiting for ppp0 to become free. Usage count = 8
Jun  4 09:23:21 turing-police kernel: [90109.520574] unregister_netdevice: 
waiting for ppp0 to become free. Usage count = 8
Jun  4 09:23:32 turing-police kernel: [90119.653129] unregister_netdevice: 
waiting for ppp0 to become free. Usage count = 8

'echo t  /proc/sysrq_trigger' shows pppd hung up here:

Jun  4 10:52:57 turing-police kernel: [95478.047892] pppd  D 
000105ad3830  4968  3815  1 (NOTLB)
Jun  4 10:52:57 turing-police kernel: [95478.047902]  810008d5fd78 
0086  81000349
Jun  4 10:52:57 turing-police kernel: [95478.047911]  810008d5fd28 
810008d4a040 810003461820 810008d4a2b0
Jun  4 10:52:57 turing-police kernel: [95478.047920]  000105ad3733 
0202 00ff 80239795
Jun  4 10:52:57 turing-police kernel: [95478.047928] Call Trace:
Jun  4 10:52:57 turing-police kernel: [95478.047936]  [805207a2] 
schedule_timeout+0x8d/0xb4
Jun  4 10:52:57 turing-police kernel: [95478.047945]  [805207e2] 
schedule_timeout_uninterruptible+0x19/0x1b
Jun  4 10:52:57 turing-police kernel: [95478.047954]  [802397bb] 
msleep+0x14/0x1e
Jun  4 10:52:57 turing-police kernel: [95478.047963]  [8048aa4e] 
netdev_run_todo+0x12f/0x234 
Jun  4 10:52:57 turing-police kernel: [95478.047972]  [8049166f] 
rtnl_unlock+0x35/0x37
Jun  4 10:52:57 turing-police kernel: [95478.047981]  [804894a9] 
unregister_netdev+0x1e/0x23
Jun  4 10:52:57 turing-police kernel: [95478.047994]  [88a5f2c2] 
:ppp_generic:ppp_shutdown_interface+0x67/0xbb
Jun  4 10:52:57 turing-police kernel: [95478.048018]  [88a5f5b8] 
:ppp_generic:ppp_release+0x33/0x65
Jun  4 10:52:57 turing-police kernel: [95478.048028]  [8028d54a] 
__fput+0xac/0x176
Jun  4 10:52:57 turing-police kernel: [95478.048036]  [8028d628] 
fput+0x14/0x16
Jun  4 10:52:57 turing-police kernel: [95478.048045]  [8028a9c6] 
filp_close+0x66/0x71
Jun  4 10:52:57 turing-police kernel: [95478.048054]  [8028bd54] 
sys_close+0x98/0xd7
Jun  4 10:52:57 turing-police kernel: [95478.048062]  [8020a03c] 
tracesys+0xdc/0xe1
Jun  4 10:52:57 turing-police kernel: [95478.048073]  [2b45cd2429a0]

Which in itself wouldn't be so bad, except that it's holding a mutex and
lots of other stuff gets wedged up waiting for it (here's 1 of 6 processes
that was wedged this morning):

Jun  4 10:52:58 turing-police kernel: [95478.051129] ifconfig  D 
810005e19820  5800  9787  20510 (NOTLB)
Jun  4 10:52:58 turing-police kernel: [95478.051141]  81000868fd08 
0082 81000868fec8 0246
Jun  4 10:52:58 turing-police kernel: [95478.051150]  00010101 
810005e19820 810003fe0820 810005e19a90
Jun  4 10:52:58 turing-police kernel: [95478.051159]  0a3f26c0 
0006 81000868ff28 8028aacc
Jun  4 10:52:58 turing-police kernel: [95478.051167] Call Trace:
Jun  4 10:52:58 turing-police kernel: [95478.051176]  [80520bc4] 
__mutex_lock_slowpath+0x74/0xb6
Jun  4 10:52:58 turing-police kernel: [95478.051185]  [805209f3] 
mutex_lock+0xe/0x10
Jun  4 10:52:58 turing-police kernel: [95478.051193]  [8048a938] 
netdev_run_todo+0x19/0x234
Jun  4 10:52:58 turing-police kernel: [95478.051202]  [8049166f] 
rtnl_unlock+0x35/0x37
Jun  4 10:52:58 turing-police kernel: [95478.051210]  [8048a3f2] 
dev_ioctl+0x3e3/0x483
Jun  4 10:52:58 turing-police kernel: [95478.051218]  [8047df30] 
sock_ioctl+0x1ef/0x1fc
Jun  4 10:52:58 turing-police kernel: [95478.051227]  [802989be] 
do_ioctl+0x2a/0x77
Jun  4 10:52:58 turing-police kernel: [95478.051235]  [80298c52] 
vfs_ioctl+0x247/0x264
Jun  4 10:52:58 turing-police kernel: [95478.051243]  [80298cce] 
sys_ioctl+0x5f/0x85
Jun  4 10:52:58 turing-police kernel: [95478.051252]  [8020a03c] 
tracesys+0xdc/0xe1

(And of course, you can't shutdown cleanly, because /etc/init.d/network tries
to down other interfaces on the way out, and)

I'd bisect this, except I don't have a better way to replicate it than wait for
our VPN box to reset the connection after 24 hours of connect - basically means
I get 2 tries per weekend..)

An hour or so of digging through the -rc3-mm1 broken-out/ didn't find any
obvious-to-me culprits.  Any ideas/suggestions?


pgpgLKOKJ5mzu.pgp
Description: PGP signature

Re: [PATCH 3/5] NetXen: Add NETXEN prefix to macros

2007-06-04 Thread Andy Gospodarek

On Sun, Jun 03, 2007 at 11:50:29AM -0400, Jeff Garzik wrote:
 Mithlesh Thukral wrote:
 NetXen: Add NETXEN prefix to a macro
 This patch will add the NETXEN prefix to USER_START macro.
 
 Signed-off by: Wen Xiong [EMAIL PROTECTED]
 Signed-off by: Mithlesh Thukral [EMAIL PROTECTED]
 ---
 
  drivers/net/netxen/netxen_nic.h |4 ++--
  drivers/net/netxen/netxen_nic_ethtool.c |2 +-
  drivers/net/netxen/netxen_nic_hw.c  |4 ++--
  drivers/net/netxen/netxen_nic_init.c|4 ++--
  4 files changed, 7 insertions(+), 7 deletions(-)
 
 Your patch description is useless.  Clearly we know -what- it does, 
 simply by reading the patch.
 
 But it does not answer the simple question:  why?  why is this needed in 
 a bug fix Release Candidate series?
 

I can't see why this is needed in an RC branch, but the use of
generically named macros/variables has been problematic with this driver
on more 'obscure' arches.  

I submitted a bigger cleanup patch a in March that never got taken so
this is still a problem.  Here is that patch:

Signed-off-by: Andy Gospodarek [EMAIL PROTECTED]
---

 netxen_nic.h |   47 ---
 netxen_nic_ethtool.c |8 
 netxen_nic_hw.c  |   10 +-
 netxen_nic_init.c|   23 ---
 4 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h
index dd8ce35..d5f0c06 100644
--- a/drivers/net/netxen/netxen_nic.h
+++ b/drivers/net/netxen/netxen_nic.h
@@ -68,9 +68,10 @@
 #define _NETXEN_NIC_LINUX_SUBVERSION 3
 #define NETXEN_NIC_LINUX_VERSIONID  3.3.3
 
-#define NUM_FLASH_SECTORS (64)
-#define FLASH_SECTOR_SIZE (64 * 1024)
-#define FLASH_TOTAL_SIZE  (NUM_FLASH_SECTORS * FLASH_SECTOR_SIZE)
+#define NETXEN_NUM_FLASH_SECTORS (64)
+#define NETXEN_FLASH_SECTOR_SIZE (64 * 1024)
+#define NETXEN_FLASH_TOTAL_SIZE  (NETXEN_NUM_FLASH_SECTORS \
+   * NETXEN_FLASH_SECTOR_SIZE)
 
 #define PHAN_VENDOR_ID 0x4040
 
@@ -671,28 +672,28 @@ struct netxen_new_user_info {
 
 /* Flash memory map */
 typedef enum {
-   CRBINIT_START = 0,  /* Crbinit section */
-   BRDCFG_START = 0x4000,  /* board config */
-   INITCODE_START = 0x6000,/* pegtune code */
-   BOOTLD_START = 0x1, /* bootld */
-   IMAGE_START = 0x43000,  /* compressed image */
-   SECONDARY_START = 0x20, /* backup images */
-   PXE_START = 0x3E,   /* user defined region */
-   USER_START = 0x3E8000,  /* User defined region for new boards */
-   FIXED_START = 0x3F  /* backup of crbinit */
+   NETXEN_CRBINIT_START = 0,   /* Crbinit section */
+   NETXEN_BRDCFG_START = 0x4000,   /* board config */
+   NETXEN_INITCODE_START = 0x6000, /* pegtune code */
+   NETXEN_BOOTLD_START = 0x1,  /* bootld */
+   NETXEN_IMAGE_START = 0x43000,   /* compressed image */
+   NETXEN_SECONDARY_START = 0x20,  /* backup images */
+   NETXEN_PXE_START = 0x3E,/* user defined region */
+   NETXEN_USER_START = 0x3E8000,   /* User defined region for new boards */
+   NETXEN_FIXED_START = 0x3F   /* backup of crbinit */
 } netxen_flash_map_t;
 
-#define USER_START_OLD PXE_START   /* for backward compatibility */
-
-#define FLASH_START(CRBINIT_START)
-#define INIT_SECTOR(0)
-#define PRIMARY_START  (BOOTLD_START)
-#define FLASH_CRBINIT_SIZE (0x4000)
-#define FLASH_BRDCFG_SIZE  (sizeof(struct netxen_board_info))
-#define FLASH_USER_SIZE(sizeof(struct 
netxen_user_info)/sizeof(u32))
-#define FLASH_SECONDARY_SIZE   (USER_START-SECONDARY_START)
-#define NUM_PRIMARY_SECTORS(0x20)
-#define NUM_CONFIG_SECTORS (1)
+#define NETXEN_USER_START_OLD NETXEN_PXE_START /* for backward compatibility */
+
+#define NETXEN_FLASH_START (NETXEN_CRBINIT_START)
+#define NETXEN_INIT_SECTOR (0)
+#define NETXEN_PRIMARY_START   (NETXEN_BOOTLD_START)
+#define NETXEN_FLASH_CRBINIT_SIZE  (0x4000)
+#define NETXEN_FLASH_BRDCFG_SIZE   (sizeof(struct netxen_board_info))
+#define NETXEN_FLASH_USER_SIZE (sizeof(struct 
netxen_user_info)/sizeof(u32))
+#define NETXEN_FLASH_SECONDARY_SIZE
(NETXEN_USER_START-NETXEN_SECONDARY_START)
+#define NETXEN_NUM_PRIMARY_SECTORS (0x20)
+#define NETXEN_NUM_CONFIG_SECTORS  (1)
 #define PFX NetXen: 
 extern char netxen_nic_driver_name[];
 
diff --git a/drivers/net/netxen/netxen_nic_ethtool.c 
b/drivers/net/netxen/netxen_nic_ethtool.c
index ee1b5a2..4dfa76b 100644
--- a/drivers/net/netxen/netxen_nic_ethtool.c
+++ b/drivers/net/netxen/netxen_nic_ethtool.c
@@ -94,7 +94,7 @@ static const char netxen_nic_gstrings_test[][ETH_GSTRING_LEN] 
= {
 
 static int netxen_nic_get_eeprom_len(struct net_device *dev)
 {
-   return FLASH_TOTAL_SIZE;
+   return NETXEN_FLASH_TOTAL_SIZE;
 }
 
 static void
@@ -475,7 +475,7 @@ netxen_nic_set_eeprom(struct

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Thomas Gleixner

On Mon, 2007-06-04 at 10:51 -0700, Stephen Hemminger wrote:
  I doubt that. This is in the iperf code itself.
  
  void thread_rest ( void ) {
  #if defined( HAVE_THREAD )
  #if defined( HAVE_POSIX_THREAD )
  // TODO add checks for sched_yield or pthread_yield and call that
  // if available
  usleep( 0 );
  
  --
  
  It results in a nanosleep({0,0}, NULL)
  
  tglx
  
 
 Yes, the following patch makes iperf work better than ever.
 But are other broken applications going to have same problem.
 Sounds like the old who runs first fork() problems.

Not really. The fork() who runs first problem is nowhere specified.

usleep(0) is well defined:

 If the value of useconds is 0, then the call has no effect.

So the call into the kernel has been wrong for quite a time.

tglx


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-06-04 Thread Eric Paris

Some time ago this thread bounced back and forth and seemed to come to
rest with the patch below, I cleaned up the comments and put all the
ACKs it received in one place below, but so much time has passed I doubt
if they should still count for free.  I also rediffed the patch against
the latest miller tree.  Is the idea or patch in any way flawed or
unacceptable to people at the moment?

Anyone willing to step up an re-ack the patch to get it moving into the
tree?

-Eric

Currently we check for permission before deleting entries from SAD and
SPD, (see security_xfrm_policy_delete() security_xfrm_state_delete())
However we are not checking for authorization when flushing the SPD and
the SAD completely. It was perhaps missed in the original security hooks
patch.

This patch adds a security check when flushing entries from the SAD and
SPD.  It runs the entire database and checks each entry for a denial.
If the process attempting the flush is unable to remove all of the
entries a denial is logged the the flush function returns an error
without removing anything.

This is particularly useful when a process may need to create or delete
its own xfrm entries used for things like labeled networking but that
same process should not be able to delete other entries or flush the
entire database.

WAS Signed-off-by: Signed-off-by: Joy Latten[EMAIL PROTECTED] NOT NOW
WAS Acked-by: James Morris [EMAIL PROTECTED] NOT NOW
WAS Acked-by: Eric Paris [EMAIL PROTECTED] NOT NOW

---

 include/net/xfrm.h |6 ++--
 net/key/af_key.c   |   10 ++-
 net/xfrm/xfrm_policy.c |   63 +--
 net/xfrm/xfrm_state.c  |   46 --
 net/xfrm/xfrm_user.c   |9 +-
 5 files changed, 121 insertions(+), 13 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 90185e8..311f25a 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -964,7 +964,7 @@ struct xfrmk_spdinfo {
 
 extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq);
 extern int xfrm_state_delete(struct xfrm_state *x);
-extern void xfrm_state_flush(u8 proto, struct xfrm_audit *audit_info);
+extern int xfrm_state_flush(u8 proto, struct xfrm_audit *audit_info);
 extern void xfrm_sad_getinfo(struct xfrmk_sadinfo *si);
 extern void xfrm_spd_getinfo(struct xfrmk_spdinfo *si);
 extern int xfrm_replay_check(struct xfrm_state *x, __be32 seq);
@@ -1020,13 +1020,13 @@ struct xfrm_policy *xfrm_policy_bysel_ctx(u8 type, int 
dir,
  struct xfrm_sec_ctx *ctx, int delete,
  int *err);
 struct xfrm_policy *xfrm_policy_byid(u8, int dir, u32 id, int delete, int 
*err);
-void xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info);
+int xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info);
 u32 xfrm_get_acqseq(void);
 void xfrm_alloc_spi(struct xfrm_state *x, __be32 minspi, __be32 maxspi);
 struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto,
  xfrm_address_t *daddr, xfrm_address_t *saddr,
  int create, unsigned short family);
-extern void xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info);
+extern int xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info);
 extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy 
*pol);
 extern int xfrm_bundle_ok(struct xfrm_policy *pol, struct xfrm_dst *xdst,
  struct flowi *fl, int family, int strict);
diff --git a/net/key/af_key.c b/net/key/af_key.c
index d302dda..0f8304b 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1682,6 +1682,7 @@ static int pfkey_flush(struct sock *sk, struct sk_buff 
*skb, struct sadb_msg *hd
unsigned proto;
struct km_event c;
struct xfrm_audit audit_info;
+   int err;
 
proto = pfkey_satype2proto(hdr-sadb_msg_satype);
if (proto == 0)
@@ -1689,7 +1690,9 @@ static int pfkey_flush(struct sock *sk, struct sk_buff 
*skb, struct sadb_msg *hd
 
audit_info.loginuid = audit_get_loginuid(current-audit_context);
audit_info.secid = 0;
-   xfrm_state_flush(proto, audit_info);
+   err = xfrm_state_flush(proto, audit_info);
+   if (err)
+   return err;
c.data.proto = proto;
c.seq = hdr-sadb_msg_seq;
c.pid = hdr-sadb_msg_pid;
@@ -2683,10 +2686,13 @@ static int pfkey_spdflush(struct sock *sk, struct 
sk_buff *skb, struct sadb_msg
 {
struct km_event c;
struct xfrm_audit audit_info;
+   int err;
 
audit_info.loginuid = audit_get_loginuid(current-audit_context);
audit_info.secid = 0;
-   xfrm_policy_flush(XFRM_POLICY_TYPE_MAIN, audit_info);
+   err = xfrm_policy_flush(XFRM_POLICY_TYPE_MAIN, audit_info);
+   if (err)
+   return err;
c.data.type = XFRM_POLICY_TYPE_MAIN;
c.event = XFRM_MSG_FLUSHPOLICY;
c.pid = hdr-sadb_msg_pid;
diff --git

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Thomas Gleixner

On Mon, 2007-06-04 at 21:00 +0200, Thomas Gleixner wrote:
  
  Yes, the following patch makes iperf work better than ever.
  But are other broken applications going to have same problem.
  Sounds like the old who runs first fork() problems.
 
 Not really. The fork() who runs first problem is nowhere specified.
 
 usleep(0) is well defined:
 
  If the value of useconds is 0, then the call has no effect.
 
 So the call into the kernel has been wrong for quite a time.
 

Just for clarification: I'm not saying that we should break the (broken)
user space ABI. I'm going to work out a patch which prints out a warning
(limited number per boot) and emulating the old behavior by a call to
yield() along with an entry into (mis)feature-removal.txt.

tglx


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Ingo Molnar


* Stephen Hemminger [EMAIL PROTECTED] wrote:

 Yes, the following patch makes iperf work better than ever. But are 
 other broken applications going to have same problem. Sounds like the 
 old who runs first fork() problems.

this is the first such app and really, and even for this app: i've been 
frequently running iperf on -rt kernels for _years_ and never noticed 
how buggy its 'locking' code was, and that it would under some 
circumstances use up the whole CPU on high-res timers.

If this were a widespread problem then the right solution would be to 
preserve the old behavior. The child-runs-first thing was an unspecified 
detail and many apps grew to depend on how the kernel did it - and when 
we changed it all hell broke lose. Even today, when i switch off 
child-runs-first, Gnome would not start up because some of its startup 
threads are racy and hang :-/

I googled for usleep(0) quickly (on google.com/codesearch and on 
google.com) and it didnt seem to suggest that the problem is widespread. 
(3 hits on the code-search site, all were false positives.)

so ... if this situation were even just a little bit more serious i'd 
argue for working this around and implementing some API quirk. But right 
now i'm leaning towards just saying that the iperf fix exists and fixes 
the problem, and that we already have kernels out with the new behavior. 
Maybe we should add a once-per-boot printk to flesh out any other apps? 
I'd be surprised if there was more than iperf.

Ingo
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Maximilian Engelhardt

On Monday 04 June 2007, Ingo Molnar wrote:
 * Stephen Hemminger [EMAIL PROTECTED] wrote:
  Yes, the following patch makes iperf work better than ever. But are
  other broken applications going to have same problem. Sounds like the
  old who runs first fork() problems.

 this is the first such app and really, and even for this app: i've been
 frequently running iperf on -rt kernels for _years_ and never noticed
 how buggy its 'locking' code was, and that it would under some
 circumstances use up the whole CPU on high-res timers.

I must admit I don't know much about that topic, but there is one thing I 
don't understand. Why is iperf (even if it's buggy) able to use up the whole 
cpu? I didn't run it as root but as my normal user so it should have limited 
rights. Shouldn't the linux scheduler distribute cpu time among all running 
processes?

Maxi


signature.asc
Description: This is a digitally signed message part.

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-06-04 Thread James Morris

On Mon, 4 Jun 2007, Eric Paris wrote:

 Some time ago this thread bounced back and forth and seemed to come to
 rest with the patch below, I cleaned up the comments and put all the
 ACKs it received in one place below, but so much time has passed I doubt
 if they should still count for free.  I also rediffed the patch against
 the latest miller tree.  Is the idea or patch in any way flawed or
 unacceptable to people at the moment?
 
 Anyone willing to step up an re-ack the patch to get it moving into the
 tree?

Looks good to me.

Acked-by: James Morris [EMAIL PROTECTED]


-- 
James Morris
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Stephen Hemminger

On Mon, 4 Jun 2007 21:47:59 +0200
Maximilian Engelhardt [EMAIL PROTECTED] wrote:

 On Monday 04 June 2007, Ingo Molnar wrote:
  * Stephen Hemminger [EMAIL PROTECTED] wrote:
   Yes, the following patch makes iperf work better than ever. But are
   other broken applications going to have same problem. Sounds like the
   old who runs first fork() problems.
 
  this is the first such app and really, and even for this app: i've been
  frequently running iperf on -rt kernels for _years_ and never noticed
  how buggy its 'locking' code was, and that it would under some
  circumstances use up the whole CPU on high-res timers.
 
 I must admit I don't know much about that topic, but there is one thing I 
 don't understand. Why is iperf (even if it's buggy) able to use up the whole 
 cpu? I didn't run it as root but as my normal user so it should have limited 
 rights. Shouldn't the linux scheduler distribute cpu time among all running 
 processes?

In this case, there are two threads. One is receiving data and the other
is spinning checking on progress. If the spinning thread doesn't yield,
it will end up using it's whole quantum (10ms at 100hz), before the
scheduler lets the receiver run again. If the receiving thread doesn't
get to run then on a UP the performance stinks.

The problem only showed up laptop because most of my other systems are
SMP (or fake SMP/HT), and usually set HZ to 1000 not 100.

It kind of reminds me of the family road trip with the kids in the back
seat going Are we there yet?

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/4] b44 driver improvements

2007-06-04 Thread Stephen Hemminger

While researching why the wired networking was so slow
on my laptop (not a driver problem); spotted these small
changes to b44 driver. Nothing urgent, maybe 2.6.23
or later material.

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/4] b44: timer power saving

2007-06-04 Thread Stephen Hemminger

Make the PHY and statistic timer run on one second boundary
for powersaving.

On resume, the driver should check for link up immediately, to
get online faster (rather than waiting for the next second).

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


---
 drivers/net/b44.c |9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/drivers/net/b44.c 2007-06-04 12:31:27.0 -0700
+++ b/drivers/net/b44.c 2007-06-04 12:31:34.0 -0700
@@ -599,8 +599,7 @@ static void b44_timer(unsigned long __op
 
spin_unlock_irq(bp-lock);
 
-   bp-timer.expires = jiffies + HZ;
-   add_timer(bp-timer);
+   mod_timer(bp-timer, round_jiffies(jiffies + HZ));
 }
 
 static void b44_tx(struct b44 *bp)
@@ -2348,11 +2347,11 @@ static int b44_resume(struct pci_dev *pd
netif_device_attach(bp-dev);
spin_unlock_irq(bp-lock);
 
-   bp-timer.expires = jiffies + HZ;
-   add_timer(bp-timer);
-
b44_enable_ints(bp);
netif_wake_queue(dev);
+
+   mod_timer(bp-timer, jiffies + 1);
+
return 0;
 }
 

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/4] b44: packet offset is constant

2007-06-04 Thread Stephen Hemminger

The receive buffer offset is constant in this driver.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/b44.c |   29 -
 drivers/net/b44.h |2 --
 2 files changed, 12 insertions(+), 19 deletions(-)

Index: lifebook/drivers/net/b44.c
===
--- lifebook.orig/drivers/net/b44.c 2007-06-04 13:22:12.0 -0700
+++ lifebook/drivers/net/b44.c  2007-06-04 13:27:47.0 -0700
@@ -15,6 +15,7 @@
 #include linux/ethtool.h
 #include linux/mii.h
 #include linux/if_ether.h
+#include linux/if_vlan.h
 #include linux/etherdevice.h
 #include linux/pci.h
 #include linux/delay.h
@@ -68,7 +69,8 @@
  (BP)-tx_cons - (BP)-tx_prod - TX_RING_GAP(BP))
 #define NEXT_TX(N) (((N) + 1)  (B44_TX_RING_SIZE - 1))
 
-#define RX_PKT_BUF_SZ  (1536 + bp-rx_offset + 64)
+#define RX_PKT_OFFSET  30
+#define RX_PKT_BUF_SZ  (1536 + RX_PKT_OFFSET + 64)
 
 /* minimum number of free TX descriptors required to wake up TX process */
 #define B44_TX_WAKEUP_THRESH   (B44_TX_RING_SIZE / 4)
@@ -683,10 +685,9 @@
}
 
skb-dev = bp-dev;
-   skb_reserve(skb, bp-rx_offset);
+   rh = (struct rx_header *) skb-data;
+   skb_reserve(skb, RX_PKT_OFFSET);
 
-   rh = (struct rx_header *)
-   (skb-data - bp-rx_offset);
rh-len = 0;
rh-flags = 0;
 
@@ -696,13 +697,13 @@
if (src_map != NULL)
src_map-skb = NULL;
 
-   ctrl  = (DESC_CTRL_LEN  (RX_PKT_BUF_SZ - bp-rx_offset));
+   ctrl  = (DESC_CTRL_LEN  (RX_PKT_BUF_SZ - RX_PKT_OFFSET));
if (dest_idx == (B44_RX_RING_SIZE - 1))
ctrl |= DESC_CTRL_EOT;
 
dp = bp-rx_ring[dest_idx];
dp-ctrl = cpu_to_le32(ctrl);
-   dp-addr = cpu_to_le32((u32) mapping + bp-rx_offset + bp-dma_offset);
+   dp-addr = cpu_to_le32((u32) mapping + RX_PKT_OFFSET + bp-dma_offset);
 
if (bp-flags  B44_FLAG_RX_RING_HACK)
b44_sync_dma_desc_for_device(bp-pdev, bp-rx_ring_dma,
@@ -781,7 +782,7 @@
PCI_DMA_FROMDEVICE);
rh = (struct rx_header *) skb-data;
len = le16_to_cpu(rh-len);
-   if ((len  (RX_PKT_BUF_SZ - bp-rx_offset)) ||
+   if ((len  (RX_PKT_BUF_SZ - RX_PKT_OFFSET)) ||
(rh-flags  cpu_to_le16(RX_FLAG_ERRORS))) {
drop_it:
b44_recycle_rx(bp, cons, bp-rx_prod);
@@ -813,8 +814,8 @@
pci_unmap_single(bp-pdev, map,
 skb_size, PCI_DMA_FROMDEVICE);
/* Leave out rx_header */
-   skb_put(skb, len+bp-rx_offset);
-   skb_pull(skb,bp-rx_offset);
+   skb_put(skb, len + RX_PKT_OFFSET);
+   skb_pull(skb, RX_PKT_OFFSET);
} else {
struct sk_buff *copy_skb;
 
@@ -826,7 +827,7 @@
skb_reserve(copy_skb, 2);
skb_put(copy_skb, len);
/* DMA sync done above, copy just the actual packet */
-   skb_copy_from_linear_data_offset(skb, bp-rx_offset,
+   skb_copy_from_linear_data_offset(skb, RX_PKT_OFFSET,
 copy_skb-data, len);
skb = copy_skb;
}
@@ -1393,12 +1394,12 @@
bw32(bp, B44_TX_WMARK, 56); /* XXX magic */
if (reset_kind == B44_PARTIAL_RESET) {
bw32(bp, B44_DMARX_CTRL, (DMARX_CTRL_ENABLE |
- (bp-rx_offset  DMARX_CTRL_ROSHIFT)));
+ (RX_PKT_OFFSET  DMARX_CTRL_ROSHIFT)));
} else {
bw32(bp, B44_DMATX_CTRL, DMATX_CTRL_ENABLE);
bw32(bp, B44_DMATX_ADDR, bp-tx_ring_dma + bp-dma_offset);
bw32(bp, B44_DMARX_CTRL, (DMARX_CTRL_ENABLE |
- (bp-rx_offset  DMARX_CTRL_ROSHIFT)));
+ (RX_PKT_OFFSET  DMARX_CTRL_ROSHIFT)));
bw32(bp, B44_DMARX_ADDR, bp-rx_ring_dma + bp-dma_offset);
 
bw32(bp, B44_DMARX_PTR, bp-rx_pending);
@@ -2090,11 +2091,6 @@
 
bp-phy_addr = eeprom[90]  0x1f;
 
-   /* With this, plus the rx_header prepended to the data by the
-* hardware, we'll land the ethernet header on a 2-byte boundary.
-*/
-   bp-rx_offset = 30;
-
bp-imask = IMASK_DEF;
 
bp-core_unit = ssb_core_unit(bp);
Index: lifebook/drivers/net/b44.h
===
--- lifebook.orig/drivers/net/b44.h 2007-06-04 13:18:25.0 -0700
+++ lifebook/drivers/net/b44.h  2007-06-04 13:22:15.0 -0700
@@ -443,8 +443,6 @@
 #define B44_FLAG_TX_RING_HACK  0x4000

[PATCH 4/4] b44: use netdev_alloc_skb

2007-06-04 Thread Stephen Hemminger

Use netdev_alloc_skb rather than dev_alloc_skb when allocating
receive buffers.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/b44.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/net/b44.c 2007-06-04 12:49:11.0 -0700
+++ b/drivers/net/b44.c 2007-06-04 12:49:26.0 -0700
@@ -653,7 +653,7 @@ static int b44_alloc_rx_skb(struct b44 *
src_map = bp-rx_buffers[src_idx];
dest_idx = dest_idx_unmasked  (B44_RX_RING_SIZE - 1);
map = bp-rx_buffers[dest_idx];
-   skb = dev_alloc_skb(RX_PKT_BUF_SZ);
+   skb = netdev_alloc_skb(bp-dev, RX_PKT_BUF_SZ);
if (skb == NULL)
return -ENOMEM;
 
@@ -669,7 +669,7 @@ static int b44_alloc_rx_skb(struct b44 *
if (!dma_mapping_error(mapping))
pci_unmap_single(bp-pdev, mapping, 
RX_PKT_BUF_SZ,PCI_DMA_FROMDEVICE);
dev_kfree_skb_any(skb);
-   skb = __dev_alloc_skb(RX_PKT_BUF_SZ,GFP_DMA);
+   skb = __netdev_alloc_skb(bp-dev, RX_PKT_BUF_SZ, 
GFP_ATOMIC|GFP_DMA);
if (skb == NULL)
return -ENOMEM;
mapping = pci_map_single(bp-pdev, skb-data,
@@ -684,7 +684,6 @@ static int b44_alloc_rx_skb(struct b44 *
}
}
 
-   skb-dev = bp-dev;
rh = (struct rx_header *) skb-data;
skb_reserve(skb, RX_PKT_OFFSET);
 

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/4] b44: tx bounce sizing.

2007-06-04 Thread Stephen Hemminger

No need to grap full size MTU buffer for possibly small transmit
bounce buffers.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/b44.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/drivers/net/b44.c 2007-06-04 12:46:54.0 -0700
+++ b/drivers/net/b44.c 2007-06-04 12:48:21.0 -0700
@@ -69,7 +69,6 @@
 #define NEXT_TX(N) (((N) + 1)  (B44_TX_RING_SIZE - 1))
 
 #define RX_PKT_BUF_SZ  (1536 + bp-rx_offset + 64)
-#define TX_PKT_BUF_SZ  (B44_MAX_MTU + ETH_HLEN + 8)
 
 /* minimum number of free TX descriptors required to wake up TX process */
 #define B44_TX_WAKEUP_THRESH   (B44_TX_RING_SIZE / 4)
@@ -968,7 +967,6 @@ static void b44_tx_timeout(struct net_de
 static int b44_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
struct b44 *bp = netdev_priv(dev);
-   struct sk_buff *bounce_skb;
int rc = NETDEV_TX_OK;
dma_addr_t mapping;
u32 len, entry, ctrl;
@@ -986,12 +984,13 @@ static int b44_start_xmit(struct sk_buff
 
mapping = pci_map_single(bp-pdev, skb-data, len, PCI_DMA_TODEVICE);
if (dma_mapping_error(mapping) || mapping + len  DMA_30BIT_MASK) {
+   struct sk_buff *bounce_skb;
+
/* Chip can't handle DMA to/from 1GB, use bounce buffer */
if (!dma_mapping_error(mapping))
pci_unmap_single(bp-pdev, mapping, len, 
PCI_DMA_TODEVICE);
 
-   bounce_skb = __dev_alloc_skb(TX_PKT_BUF_SZ,
-GFP_ATOMIC|GFP_DMA);
+   bounce_skb = __dev_alloc_skb(len, GFP_ATOMIC | GFP_DMA);
if (!bounce_skb)
goto err_out;
 
@@ -1000,13 +999,12 @@ static int b44_start_xmit(struct sk_buff
if (dma_mapping_error(mapping) || mapping + len  
DMA_30BIT_MASK) {
if (!dma_mapping_error(mapping))
pci_unmap_single(bp-pdev, mapping,
-len, PCI_DMA_TODEVICE);
+len, PCI_DMA_TODEVICE);
dev_kfree_skb_any(bounce_skb);
goto err_out;
}
 
-   skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
- skb-len);
+   skb_copy_from_linear_data(skb, skb_put(bounce_skb, len), len);
dev_kfree_skb_any(skb);
skb = bounce_skb;
}

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Maximilian Engelhardt

On Monday 04 June 2007, Stephen Hemminger wrote:
 On Mon, 4 Jun 2007 21:47:59 +0200

 Maximilian Engelhardt [EMAIL PROTECTED] wrote:
  On Monday 04 June 2007, Ingo Molnar wrote:
   * Stephen Hemminger [EMAIL PROTECTED] wrote:
Yes, the following patch makes iperf work better than ever. But are
other broken applications going to have same problem. Sounds like the
old who runs first fork() problems.
  
   this is the first such app and really, and even for this app: i've been
   frequently running iperf on -rt kernels for _years_ and never noticed
   how buggy its 'locking' code was, and that it would under some
   circumstances use up the whole CPU on high-res timers.
 
  I must admit I don't know much about that topic, but there is one thing I
  don't understand. Why is iperf (even if it's buggy) able to use up the
  whole cpu? I didn't run it as root but as my normal user so it should
  have limited rights. Shouldn't the linux scheduler distribute cpu time
  among all running processes?

 In this case, there are two threads. One is receiving data and the other
 is spinning checking on progress. If the spinning thread doesn't yield,
 it will end up using it's whole quantum (10ms at 100hz), before the
 scheduler lets the receiver run again. If the receiving thread doesn't
 get to run then on a UP the performance stinks.

Ok, let's see if I got this right:
If there are other processes that want cpu time they will get it after the 
quantum for the iperf thread is used up. So cpu time will be distributed 
among other processes, but it takes some time until they get it and this 
increases latency.

 The problem only showed up laptop because most of my other systems are
 SMP (or fake SMP/HT), and usually set HZ to 1000 not 100.

Hm, on my laptop (Pentium M) I have configured CONFIG_HZ_300 and CONFIG_NO_HZ.
On my desktop PC (Athlon 2000+, also UP) I also have CONFIG_HZ_300 and 
CONFIG_NO_HZ but didn't notice the problem.

Maxi


signature.asc
Description: This is a digitally signed message part.

Re: r8169 in 2.6.18 silently corrupting data, 2.6.22-rc3 link not detected at all

2007-06-04 Thread Francois Romieu

Francois Romieu [EMAIL PROTECTED] :
[...]

Andrew, have you had any time to do some testing with an uniform 7200
bytes MTU through your LAN ?

You will find a backport of the r8169 driver for the 2.6.18 kernel at
http://www.fr.zoreil.com/people/francois/backport/r8169/20070604-00

Could you do a binary search of the patch which breaks the link detection ?
From a different report that I have received, the current testing branch of
the 8169 driver would improve things for MTU beyond 1500 bytes: nailing down
your broken link regression would be a welcome improvement.

If you are confortable with git, you can do a git bisect after pulling
the branch 'r8169-backport-test-20070604' at:
git://electric-eye.fr.zoreil.com/home/romieu/linux/linux-2.6-out

(please do the initial clone from elsewhere if you use git, thanks)

-- 
Ueimor
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-04 Thread Nigel Cunningham

Hi.

On Mon, 2007-06-04 at 15:49 +0200, Rafael J. Wysocki wrote:
 On Monday, 4 June 2007 13:11, Pavel Machek wrote:
   From: Rafael J. Wysocki [EMAIL PROTECTED]
   
   Add suspend/resume support to the uli526x network driver (tested on 
   x86_64,
   with 'Ethernet controller: ALi Corporation M5263 Ethernet Controller, rev 
   40').
   
   Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]
  
  Looks ok to me.
  
   +#ifdef CONFIG_PM
   +
   +/*
   + *   Suspend the interface.
   + */
   +
   +static int uli526x_suspend(struct pci_dev *pdev, pm_message_t state)
   +{
   + struct net_device *dev = pci_get_drvdata(pdev);
   +
   + ULI526X_DBUG(0, uli526x_suspend, 0);
   +
   + if (dev  netdev_priv(dev)) {
   + if (netif_running(dev)) {
   + netif_device_detach(dev);
   + uli526x_down(dev);
   + }
   + pci_save_state(pdev);
   + pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
   + pci_disable_device(pdev);
   + pci_set_power_state(pdev, pci_choose_state(pdev, state));
   + }
   + return 0;
   +}
   +
   +/*
   + *   Resume the interface.
   + */
   +
   +static int uli526x_resume(struct pci_dev *pdev)
   +{
   + struct net_device *dev = pci_get_drvdata(pdev);
   + struct uli526x_board_info *db = netdev_priv(dev);
   + int err;
   +
   + ULI526X_DBUG(0, uli526x_resume, 0);
   +
   + if (dev  db) {
   + pci_set_power_state(pdev, PCI_D0);
   + err = pci_enable_device(pdev);
   + if (err) {
   + printk(KERN_WARNING %s: Could not enable device \n,
   + dev-name);
   + return err;
   + }
   + pci_restore_state(pdev);
   + pci_set_master(pdev);
   + if (netif_running(dev)) {
   + uli526x_up(dev);
   + netif_device_attach(dev);
   + }
   + }
   + return 0;
   +}
   +
  #else 
  #define *_resume NULL
  #define *_suspend NULL
 
   +#endif /* CONFIG_PM */
  
   @@ -1689,6 +1775,10 @@ static struct pci_driver uli526x_driver 
 .id_table   = uli526x_pci_tbl,
 .probe  = uli526x_init_one,
 .remove = __devexit_p(uli526x_remove_one),
   +#ifdef CONFIG_PM
   + .suspend= uli526x_suspend,
   + .resume = uli526x_resume,
   +#endif
  
  ...so that this ifdef is not needed?
 
 OK, why not.

Because it's uglier and #ifdef is the established way of doing things?

Regards,

Nigel


signature.asc
Description: This is a digitally signed message part

Re: [PATCH 0/4] b44 driver improvements

2007-06-04 Thread Jeff Garzik

On Mon, Jun 04, 2007 at 01:25:36PM -0700, Stephen Hemminger wrote:
 While researching why the wired networking was so slow
 on my laptop (not a driver problem); spotted these small
 changes to b44 driver. Nothing urgent, maybe 2.6.23
 or later material.

Seems sane to me.  I'm travelling and will merge this at the beginning
of next week.  I agree netdev#upstream is appropriate, though the
b44_resume() fix may want upstreaming for 2.6.22.

Jeff



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-04 Thread Rafael J. Wysocki

Hi,

On Monday, 4 June 2007 23:16, Nigel Cunningham wrote:
 Hi.
 
 On Mon, 2007-06-04 at 15:49 +0200, Rafael J. Wysocki wrote:
  On Monday, 4 June 2007 13:11, Pavel Machek wrote:
From: Rafael J. Wysocki [EMAIL PROTECTED]

Add suspend/resume support to the uli526x network driver (tested on 
x86_64,
with 'Ethernet controller: ALi Corporation M5263 Ethernet Controller, 
rev 40').

Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]
   
   Looks ok to me.
   
+#ifdef CONFIG_PM
+
+/*
+ * Suspend the interface.
+ */
+
+static int uli526x_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+   struct net_device *dev = pci_get_drvdata(pdev);
+
+   ULI526X_DBUG(0, uli526x_suspend, 0);
+
+   if (dev  netdev_priv(dev)) {
+   if (netif_running(dev)) {
+   netif_device_detach(dev);
+   uli526x_down(dev);
+   }
+   pci_save_state(pdev);
+   pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
+   pci_disable_device(pdev);
+   pci_set_power_state(pdev, pci_choose_state(pdev, 
state));
+   }
+   return 0;
+}
+
+/*
+ * Resume the interface.
+ */
+
+static int uli526x_resume(struct pci_dev *pdev)
+{
+   struct net_device *dev = pci_get_drvdata(pdev);
+   struct uli526x_board_info *db = netdev_priv(dev);
+   int err;
+
+   ULI526X_DBUG(0, uli526x_resume, 0);
+
+   if (dev  db) {
+   pci_set_power_state(pdev, PCI_D0);
+   err = pci_enable_device(pdev);
+   if (err) {
+   printk(KERN_WARNING %s: Could not enable 
device \n,
+   dev-name);
+   return err;
+   }
+   pci_restore_state(pdev);
+   pci_set_master(pdev);
+   if (netif_running(dev)) {
+   uli526x_up(dev);
+   netif_device_attach(dev);
+   }
+   }
+   return 0;
+}
+
   #else 
   #define *_resume NULL
   #define *_suspend NULL
  
+#endif /* CONFIG_PM */
   
@@ -1689,6 +1775,10 @@ static struct pci_driver uli526x_driver 
.id_table   = uli526x_pci_tbl,
.probe  = uli526x_init_one,
.remove = __devexit_p(uli526x_remove_one),
+#ifdef CONFIG_PM
+   .suspend= uli526x_suspend,
+   .resume = uli526x_resume,
+#endif
   
   ...so that this ifdef is not needed?
  
  OK, why not.
 
 Because it's uglier and #ifdef is the established way of doing things?

Hmm, I have no strong opinion.  I slightly prefer the #ifdef, but well.

Perhaps I'll send it to Andrew 'as is' if no one else has any more comments. ;-)

Greetings,
Rafael


-- 
Premature optimization is the root of all evil. - Donald Knuth
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

skb-priority on arp packets

2007-06-04 Thread Marc Donner

hi @all,

is it possible to set the skb-priority on arp packets generated by the 
kernel?
I want to to set the 802.1p priority on arp and ip packets on an interface. On 
ip packets, this can be done by the iptables CLASSIFY target and the 
skb-priority mapping from the vlan implementation.

any ideas?

thx
Marc
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[RFC] NET: Multiple queue hardware support

2007-06-04 Thread PJ Waskiewicz

This patchset is an updated version of previous multiqueue network device
support patches.  The general approach of introducing a new API for multiqueue
network devices to register with the stack has remained.  The changes include
adding a round-robin qdisc, heavily based on sch_prio, which will allow
queueing to hardware with no OS-enforced queuing policy.  sch_prio still has
the multiqueue code in it, but has a Kconfig option to compile it out of the
qdisc.  This allows people with hardware containing scheduling policies to
use sch_rr (round-robin), and others without scheduling policies in hardware
to continue using sch_prio if they wish to have some notion of scheduling
priority.

The patches to iproute2 for tc will be sent separately, to support sch_rr.

I'm soliciting feedback for a 2.6.23 submission.  Thanks.

-- 
PJ Waskiewicz [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] NET: Multiqueue network device support.

2007-06-04 Thread PJ Waskiewicz

API added to support multiple hardware queues on an ethernet device.
Round-robin scheduler added (sch_rr) to provide a no-scheduling policy
qdisc for hardware with multiple queues.

Signed-off-by: Peter P Waskiewicz Jr [EMAIL PROTECTED]
---

 include/linux/etherdevice.h |3 
 include/linux/netdevice.h   |   62 +
 include/linux/pkt_sched.h   |   11 +
 include/linux/skbuff.h  |2 
 net/core/dev.c  |   27 ++
 net/core/skbuff.c   |3 
 net/ethernet/eth.c  |9 -
 net/sched/Kconfig   |   22 ++
 net/sched/Makefile  |1 
 net/sched/sch_generic.c |4 
 net/sched/sch_prio.c|   66 +-
 net/sched/sch_rr.c  |  516 +++
 12 files changed, 706 insertions(+), 20 deletions(-)

diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
index 071c67a..283e687 100644
--- a/include/linux/etherdevice.h
+++ b/include/linux/etherdevice.h
@@ -39,7 +39,8 @@ extern void   eth_header_cache_update(struct hh_cache 
*hh, struct net_device *dev
 extern int eth_header_cache(struct neighbour *neigh,
 struct hh_cache *hh);
 
-extern struct net_device *alloc_etherdev(int sizeof_priv);
+extern struct net_device *alloc_etherdev_mq(int sizeof_priv, int queue_count);
+#define alloc_etherdev(sizeof_priv) alloc_etherdev_mq(sizeof_priv, 1)
 static inline void eth_copy_and_sum (struct sk_buff *dest, 
 const unsigned char *src, 
 int len, int base)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f671cd2..376a0d2 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -108,6 +108,14 @@ struct wireless_dev;
 #define MAX_HEADER (LL_MAX_HEADER + 48)
 #endif
 
+struct net_device_subqueue
+{
+   /* Give a control state for each queue.  This struct may contain
+* per-queue locks in the future.
+*/
+   unsigned long   state;
+};
+
 /*
  * Network device statistics. Akin to the 2.0 ether stats but
  * with byte counters.
@@ -325,6 +333,7 @@ struct net_device
 #define NETIF_F_VLAN_CHALLENGED1024/* Device cannot handle VLAN 
packets */
 #define NETIF_F_GSO2048/* Enable software GSO. */
 #define NETIF_F_LLTX   4096/* LockLess TX */
+#define NETIF_F_MULTI_QUEUE16384   /* Has multiple TX/RX queues */
 
/* Segmentation offload features */
 #define NETIF_F_GSO_SHIFT  16
@@ -540,6 +549,10 @@ struct net_device
struct device   dev;
/* space for optional statistics and wireless sysfs groups */
struct attribute_group  *sysfs_groups[3];
+
+   /* The TX queue control structures */
+   struct net_device_subqueue  *egress_subqueue;
+   int egress_subqueue_count;
 };
 #define to_net_dev(d) container_of(d, struct net_device, dev)
 
@@ -702,6 +715,48 @@ static inline int netif_running(const struct net_device 
*dev)
return test_bit(__LINK_STATE_START, dev-state);
 }
 
+/*
+ * Routines to manage the subqueues on a device.  We only need start
+ * stop, and a check if it's stopped.  All other device management is
+ * done at the overall netdevice level.
+ * Also test the device if we're multiqueue.
+ */
+static inline void netif_start_subqueue(struct net_device *dev, u16 
queue_index)
+{
+   clear_bit(__LINK_STATE_XOFF, dev-egress_subqueue[queue_index].state);
+}
+
+static inline void netif_stop_subqueue(struct net_device *dev, u16 queue_index)
+{
+#ifdef CONFIG_NETPOLL_TRAP
+   if (netpoll_trap())
+   return;
+#endif
+   set_bit(__LINK_STATE_XOFF, dev-egress_subqueue[queue_index].state);
+}
+
+static inline int netif_subqueue_stopped(const struct net_device *dev,
+ u16 queue_index)
+{
+   return test_bit(__LINK_STATE_XOFF,
+   dev-egress_subqueue[queue_index].state);
+}
+
+static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
+{
+#ifdef CONFIG_NETPOLL_TRAP
+   if (netpoll_trap())
+   return;
+#endif
+   if (test_and_clear_bit(__LINK_STATE_XOFF,
+  dev-egress_subqueue[queue_index].state))
+   __netif_schedule(dev);
+}
+
+static inline int netif_is_multiqueue(const struct net_device *dev)
+{
+   return (!!(NETIF_F_MULTI_QUEUE  dev-features));
+}
 
 /* Use this variant when it is known for sure that it
  * is executing from interrupt context.
@@ -995,8 +1050,11 @@ static inline void netif_tx_disable(struct net_device 
*dev)
 extern voidether_setup(struct net_device *dev);
 
 /* Support for loadable net-drivers */
-extern struct net_device *alloc_netdev(int sizeof_priv, const char *name,
-  void (*setup)(struct net_device *));
+extern struct net_device

[RFC] iproute2: sch_rr support in tc

2007-06-04 Thread PJ Waskiewicz

This patch is to support the new sch_rr (round-robin) qdisc being proposed
in NET for multiqueue network device support in the Linux network stack.
It uses q_prio.c as the template, since the qdiscs are nearly identical,
outside of the -dequeue() routine.

I'm soliciting feedback for a 2.6.23 multiqueue submission.  Thanks.

-- 
PJ Waskiewicz [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] iproute2: Added support for RR qdisc (sch_rr)

2007-06-04 Thread PJ Waskiewicz

Add tc support for the sch_rr qdisc.  This qdisc supports multiple queues
on hardware.  The syntax for sch_rr is the same as sch_prio.

Signed-off-by: Peter P Waskiewicz Jr [EMAIL PROTECTED]
---

 include/linux/pkt_sched.h |   11 
 tc/Makefile   |1 
 tc/q_rr.c |  113 +
 3 files changed, 125 insertions(+), 0 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..907412b 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -22,6 +22,7 @@
 #define TC_PRIO_CONTROL7
 
 #define TC_PRIO_MAX15
+#define TC_RR_MAX  15
 
 /* Generic queue statistics, available for all the elements.
Particular schedulers may have also their private records.
@@ -90,6 +91,16 @@ struct tc_fifo_qopt
__u32   limit;  /* Queue length: bytes for bfifo, packets for pfifo */
 };
 
+/* RR section */
+#define TCQ_RR_BANDS   16
+#define TCQ_MIN_RR_BANDS 2
+
+struct tc_rr_qopt
+{
+   int bands;
+   __u8priomap[TC_RR_MAX + 1];
+};
+
 /* PRIO section */
 
 #define TCQ_PRIO_BANDS 16
diff --git a/tc/Makefile b/tc/Makefile
index 9d618ff..62e2697 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -9,6 +9,7 @@ TCMODULES += q_fifo.o
 TCMODULES += q_sfq.o
 TCMODULES += q_red.o
 TCMODULES += q_prio.o
+TCMODULES += q_rr.o
 TCMODULES += q_tbf.o
 TCMODULES += q_cbq.o
 TCMODULES += f_rsvp.o
diff --git a/tc/q_rr.c b/tc/q_rr.c
new file mode 100644
index 000..8eecac9
--- /dev/null
+++ b/tc/q_rr.c
@@ -0,0 +1,113 @@
+/*
+ * q_rr.c  RR.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Authors:PJ Waskiewicz, [EMAIL PROTECTED]
+ * Original Authors:   Alexey Kuznetsov, [EMAIL PROTECTED] (from PRIO)
+ *
+ * Changes:
+ *
+ * Ole Husgaard [EMAIL PROTECTED]: 990513: prio2band map was always reset.
+ * J Hadi Salim [EMAIL PROTECTED]: 990609: priomap fix.
+ */
+
+#include stdio.h
+#include stdlib.h
+#include unistd.h
+#include syslog.h
+#include fcntl.h
+#include sys/socket.h
+#include netinet/in.h
+#include arpa/inet.h
+#include string.h
+
+#include utils.h
+#include tc_util.h
+
+static void explain(void)
+{
+   fprintf(stderr, Usage: ... rr bands NUMBER priomap P1 P2...\n);
+}
+
+#define usage() return(-1)
+
+static int rr_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct 
nlmsghdr *n)
+{
+   int ok = 0;
+   int pmap_mode = 0;
+   int idx = 0;
+   struct tc_rr_qopt opt={3,{ 1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 
1 }};
+
+   while (argc  0) {
+   if (strcmp(*argv, bands) == 0) {
+   if (pmap_mode)
+   explain();
+   NEXT_ARG();
+   if (get_integer(opt.bands, *argv, 10)) {
+   fprintf(stderr, Illegal \bands\\n);
+   return -1;
+   }
+   ok++;
+   } else if (strcmp(*argv, priomap) == 0) {
+   if (pmap_mode) {
+   fprintf(stderr, Error: duplicate priomap\n);
+   return -1;
+   }
+   pmap_mode = 1;
+   } else if (strcmp(*argv, help) == 0) {
+   explain();
+   return -1;
+   } else {
+   unsigned band;
+   if (!pmap_mode) {
+   fprintf(stderr, What is \%s\?\n, *argv);
+   explain();
+   return -1;
+   }
+   if (get_unsigned(band, *argv, 10)) {
+   fprintf(stderr, Illegal \priomap\ 
element\n);
+   return -1;
+   }
+   if (band  opt.bands) {
+   fprintf(stderr, \priomap\ element is out of 
bands\n);
+   return -1;
+   }
+   if (idx  TC_RR_MAX) {
+   fprintf(stderr, \priomap\ index  
TC_RR_MAX=%u\n, TC_RR_MAX);
+   return -1;
+   }
+   opt.priomap[idx++] = band;
+   }
+   argc--; argv++;
+   }
+
+   addattr_l(n, 1024, TCA_OPTIONS, opt, sizeof(opt));
+   return 0;
+}
+
+int rr_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
+{
+   int i;
+   struct tc_rr_qopt *qopt;
+
+   if (opt == NULL)
+   return 0;
+
+   if

Re: r8169: hard freezes on TX

2007-06-04 Thread Francois Romieu

Rolf Eike Beer [EMAIL PROTECTED] :
[...]
 I just had another freeze using your patches. After 512kB over smb it was 
 dead.

In-kernel smb/cifs ?

-- 
Ueimor
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/4] b44 driver improvements

2007-06-04 Thread John W. Linville

On Mon, Jun 04, 2007 at 01:25:36PM -0700, Stephen Hemminger wrote:
 While researching why the wired networking was so slow
 on my laptop (not a driver problem); spotted these small
 changes to b44 driver. Nothing urgent, maybe 2.6.23
 or later material.

ACK to the lot

John
-- 
John W. Linville
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: skb-priority on arp packets

2007-06-04 Thread Waskiewicz Jr, Peter P

 is it possible to set the skb-priority on arp packets 
 generated by the kernel?
 I want to to set the 802.1p priority on arp and ip packets on 
 an interface. On ip packets, this can be done by the iptables 
 CLASSIFY target and the 
 skb-priority mapping from the vlan implementation.
 
 any ideas?

Currently unclassified ip packets have skb-priority set in ipsockglue,
by translating the IP TOS field into a Linux priority.  This is not the
same priority as 802.1p, rather, it's a Linux-based OS priority
classification for dequeuing priority in schedulers such as sch_prio.
802.1p lives in the VLAN tag, which is separate from skb-priority.  I'd
suggest reading http://lartc.org/howto/lartc.qdisc.classless.html to see
what exactly skb-priority is used for in sch_prio and pfifo_fast, and
how the ip layer determines a packet's priority.  The skb-priority
field is also used in other ways in other qdiscs I'm not completely
familiar with, but it's not directly related to priority of the packet.
I'd need to look closer at those qdiscs to be specific.

That being said, the only way I can think of manipulating skb-priority
on arp packets (and actually setting it) is modifying a qdisc to set
skb-priority if it matches a filter created using the tc command.

Cheers,
-PJ Waskiewicz
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: skb-priority on arp packets

2007-06-04 Thread Ben Greear


Waskiewicz Jr, Peter P wrote:
is it possible to set the skb-priority on arp packets 
generated by the kernel?
I want to to set the 802.1p priority on arp and ip packets on 
an interface. On ip packets, this can be done by the iptables 
CLASSIFY target and the 
skb-priority mapping from the vlan implementation.


any ideas?


Currently unclassified ip packets have skb-priority set in ipsockglue,
by translating the IP TOS field into a Linux priority.  This is not the
same priority as 802.1p, rather, it's a Linux-based OS priority
classification for dequeuing priority in schedulers such as sch_prio.
802.1p lives in the VLAN tag, which is separate from skb-priority.  I'd
suggest reading http://lartc.org/howto/lartc.qdisc.classless.html to see
what exactly skb-priority is used for in sch_prio and pfifo_fast, and
how the ip layer determines a packet's priority.  The skb-priority
field is also used in other ways in other qdiscs I'm not completely
familiar with, but it's not directly related to priority of the packet.
I'd need to look closer at those qdiscs to be specific.

That being said, the only way I can think of manipulating skb-priority
on arp packets (and actually setting it) is modifying a qdisc to set
skb-priority if it matches a filter created using the tc command.


If you do manage to get the skb-priority set, then you can use vconfig
to set up mapping between skb-priority and the VLAN priority field.

Thanks,
Ben



Cheers,
-PJ Waskiewicz
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-mm patch] e1000: #if 0 two functions

2007-06-04 Thread Adrian Bunk

e1000_{read,write}_pci_cfg() are no longer used.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 drivers/net/e1000/e1000_hw.h   |2 --
 drivers/net/e1000/e1000_main.c |4 
 2 files changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_hw.h.old   2007-06-04 
22:03:05.0 +0200
+++ linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_hw.h   2007-06-04 
22:03:14.0 +0200
@@ -421,8 +421,6 @@ void e1000_tbi_adjust_stats(struct e1000
 void e1000_get_bus_info(struct e1000_hw *hw);
 void e1000_pci_set_mwi(struct e1000_hw *hw);
 void e1000_pci_clear_mwi(struct e1000_hw *hw);
-void e1000_read_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t * value);
-void e1000_write_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t * value);
 int32_t e1000_read_pcie_cap_reg(struct e1000_hw *hw, uint32_t reg, uint16_t 
*value);
 void e1000_pcix_set_mmrbc(struct e1000_hw *hw, int mmrbc);
 int e1000_pcix_get_mmrbc(struct e1000_hw *hw);
--- linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_main.c.old 2007-06-04 
22:03:24.0 +0200
+++ linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_main.c 2007-06-04 
22:03:40.0 +0200
@@ -4888,6 +4888,8 @@ e1000_pci_clear_mwi(struct e1000_hw *hw)
pci_clear_mwi(adapter-pdev);
 }
 
+#if 0
+
 void
 e1000_read_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t *value)
 {
@@ -4904,6 +4906,8 @@ e1000_write_pci_cfg(struct e1000_hw *hw,
pci_write_config_word(adapter-pdev, reg, *value);
 }
 
+#endif  /*  0  */
+
 int
 e1000_pcix_get_mmrbc(struct e1000_hw *hw)
 {

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-04 Thread Pavel Machek

Hi!

+#endif /* CONFIG_PM */
   
@@ -1689,6 +1775,10 @@ static struct pci_driver uli526x_driver 
.id_table   = uli526x_pci_tbl,
.probe  = uli526x_init_one,
.remove = __devexit_p(uli526x_remove_one),
+#ifdef CONFIG_PM
+   .suspend= uli526x_suspend,
+   .resume = uli526x_resume,
+#endif
   
   ...so that this ifdef is not needed?
  
  OK, why not.
 
 Because it's uglier and #ifdef is the established way of doing things?

Actually the way I suggested is nicer, IIRC akpm invented it. It keeps
ifdefs localized around the block that _needs_ to be ifdefed.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-04 Thread Nigel Cunningham

Hi.

On Tue, 2007-06-05 at 00:41 +0200, Pavel Machek wrote:
 Hi!
 
 +#endif /* CONFIG_PM */

 @@ -1689,6 +1775,10 @@ static struct pci_driver uli526x_driver 
   .id_table   = uli526x_pci_tbl,
   .probe  = uli526x_init_one,
   .remove = __devexit_p(uli526x_remove_one),
 +#ifdef CONFIG_PM
 + .suspend= uli526x_suspend,
 + .resume = uli526x_resume,
 +#endif

...so that this ifdef is not needed?
   
   OK, why not.
  
  Because it's uglier and #ifdef is the established way of doing things?
 
 Actually the way I suggested is nicer, IIRC akpm invented it. It keeps
 ifdefs localized around the block that _needs_ to be ifdefed.

The localised point is true. I'll also admit that 'nicer'/'uglier' is a
matter of aesthetics and therefore personal opinion.

I guess that leaves the question, What's the precedent to follow? or
Is there a driver that's already got this new format?.

Regards,

Nigel


signature.asc
Description: This is a digitally signed message part

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-06-04 Thread James Morris

I've applied this patch to 

git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6.git#for-davem


Dave, feel free to pull from that branch.



- James
-- 
James Morris
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[3/5][BNX2]: Enable DMA on 5709.

2007-06-04 Thread Michael Chan

[BNX2]: Enable DMA on 5709.

Add missing code to enable DMA on 5709 A1.  The bit is a no-op on
A0 and therefore can be set on all 5709 chips.

Signed-off-by: Michael Chan [EMAIL PROTECTED]

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index daa62d9..c03282c 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -3810,6 +3810,11 @@ bnx2_init_chip(struct bnx2 *bp)
/* Initialize the receive filter. */
bnx2_set_rx_mode(bp-dev);
 
+   if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+   val = REG_RD(bp, BNX2_MISC_NEW_CORE_CTL);
+   val |= BNX2_MISC_NEW_CORE_CTL_DMA_ENABLE;
+   REG_WR(bp, BNX2_MISC_NEW_CORE_CTL, val);
+   }
rc = bnx2_fw_sync(bp, BNX2_DRV_MSG_DATA_WAIT2 | BNX2_DRV_MSG_CODE_RESET,
  0);
 
diff --git a/drivers/net/bnx2.h b/drivers/net/bnx2.h
index bd6288d..49a5de2 100644
--- a/drivers/net/bnx2.h
+++ b/drivers/net/bnx2.h
@@ -1373,6 +1373,7 @@ struct l2_fhdr {
 #define BNX2_MISC_NEW_CORE_CTL 0x08c8
 #define BNX2_MISC_NEW_CORE_CTL_LINK_HOLDOFF_SUCCESS (1L0)
 #define BNX2_MISC_NEW_CORE_CTL_LINK_HOLDOFF_REQ (1L1)
+#define BNX2_MISC_NEW_CORE_CTL_DMA_ENABLE   (1L16)
 #define BNX2_MISC_NEW_CORE_CTL_RESERVED_CMN (0x3fffL2)
 #define BNX2_MISC_NEW_CORE_CTL_RESERVED_TC  (0xL16)
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[1/5][BNX2]: Fix netdev watchdog on 5708.

2007-06-04 Thread Michael Chan

[BNX2]: Fix netdev watchdog on 5708.

There's a bug in the driver that only initializes half of the context
memory on the 5708.  Surprisingly, this works most of the time except
for some occasional netdev watchdogs when sending a lot of 64-byte
packets.  The fix is to add the missing code to initialize the 2nd
halves of all context memory.

Signed-off-by: Michael Chan [EMAIL PROTECTED]

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 88b33c6..5d69e5b 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -1811,6 +1811,7 @@ bnx2_init_context(struct bnx2 *bp)
vcid = 96;
while (vcid) {
u32 vcid_addr, pcid_addr, offset;
+   int i;
 
vcid--;
 
@@ -1831,16 +1832,20 @@ bnx2_init_context(struct bnx2 *bp)
pcid_addr = vcid_addr;
}
 
-   REG_WR(bp, BNX2_CTX_VIRT_ADDR, 0x00);
-   REG_WR(bp, BNX2_CTX_PAGE_TBL, pcid_addr);
+   for (i = 0; i  (CTX_SIZE / PHY_CTX_SIZE); i++) {
+   vcid_addr += (i  PHY_CTX_SHIFT);
+   pcid_addr += (i  PHY_CTX_SHIFT);
 
-   /* Zero out the context. */
-   for (offset = 0; offset  PHY_CTX_SIZE; offset += 4) {
-   CTX_WR(bp, 0x00, offset, 0);
-   }
+   REG_WR(bp, BNX2_CTX_VIRT_ADDR, 0x00);
+   REG_WR(bp, BNX2_CTX_PAGE_TBL, pcid_addr);
 
-   REG_WR(bp, BNX2_CTX_VIRT_ADDR, vcid_addr);
-   REG_WR(bp, BNX2_CTX_PAGE_TBL, pcid_addr);
+   /* Zero out the context. */
+   for (offset = 0; offset  PHY_CTX_SIZE; offset += 4)
+   CTX_WR(bp, 0x00, offset, 0);
+
+   REG_WR(bp, BNX2_CTX_VIRT_ADDR, vcid_addr);
+   REG_WR(bp, BNX2_CTX_PAGE_TBL, pcid_addr);
+   }
}
 }
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[5/5][BNX2]: Update version and reldate.

2007-06-04 Thread Michael Chan

[BNX2]: Update version and reldate.

Update to version 1.5.11.

Signed-off-by: Michael Chan [EMAIL PROTECTED]

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 58e6f49..a9b833b 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -54,8 +54,8 @@
 
 #define DRV_MODULE_NAMEbnx2
 #define PFX DRV_MODULE_NAME: 
-#define DRV_MODULE_VERSION 1.5.10
-#define DRV_MODULE_RELDATE May 1, 2007
+#define DRV_MODULE_VERSION 1.5.11
+#define DRV_MODULE_RELDATE June 4, 2007
 
 #define RUN_AT(x) (jiffies + (x))
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[4/5][BNX2]: Fix occasional counter corruption on 5708.

2007-06-04 Thread Michael Chan

[BNX2]: Fix occasional counter corruption on 5708.

The statistics block DMA on 5708 can be messed up occasionally on the
average of about once per hour.  If the user is reading the counters
within one second after the corruption, the counters will be all
messed up.  One second later, the counters will be ok again until the
next corruption occurs.

The workaround is to disable the periodic statistics DMA.  Instead,
we manually trigger the DMA once a second in bnx2_timer().  This
manual trigger of the DMA avoids the problem.

As a consequence, we can only allow 0 or 1 second settings for
ethtool -C statistics block.

Thanks to Jean-Daniel Pauget [EMAIL PROTECTED] and
CaT [EMAIL PROTECTED] for reporting this rare problem.

Signed-off-by: Michael Chan [EMAIL PROTECTED]

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index c03282c..5654cc7 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -3783,7 +3783,10 @@ bnx2_init_chip(struct bnx2 *bp)
REG_WR(bp, BNX2_HC_CMD_TICKS,
   (bp-cmd_ticks_int  16) | bp-cmd_ticks);
 
-   REG_WR(bp, BNX2_HC_STATS_TICKS, bp-stats_ticks  0x00);
+   if (CHIP_NUM(bp) == CHIP_NUM_5708)
+   REG_WR(bp, BNX2_HC_STATS_TICKS, 0);
+   else
+   REG_WR(bp, BNX2_HC_STATS_TICKS, bp-stats_ticks  0x00);
REG_WR(bp, BNX2_HC_STAT_COLLECT_TICKS, 0xbb8);  /* 3ms */
 
if (CHIP_ID(bp) == CHIP_ID_5706_A1)
@@ -4636,6 +4639,11 @@ bnx2_timer(unsigned long data)
 
bp-stats_blk-stat_FwRxDrop = REG_RD_IND(bp, BNX2_FW_RX_DROP_COUNT);
 
+   /* workaround occasional corrupted counters */
+   if (CHIP_NUM(bp) == CHIP_NUM_5708  bp-stats_ticks)
+   REG_WR(bp, BNX2_HC_COMMAND, bp-hc_cmd |
+   BNX2_HC_COMMAND_STATS_NOW);
+
if (bp-phy_flags  PHY_SERDES_FLAG) {
if (CHIP_NUM(bp) == CHIP_NUM_5706)
bnx2_5706_serdes_timer(bp);
@@ -5446,6 +5454,10 @@ bnx2_set_coalesce(struct net_device *dev, struct 
ethtool_coalesce *coal)
0xff;
 
bp-stats_ticks = coal-stats_block_coalesce_usecs;
+   if (CHIP_NUM(bp) == CHIP_NUM_5708) {
+   if (bp-stats_ticks != 0  bp-stats_ticks != USEC_PER_SEC)
+   bp-stats_ticks = USEC_PER_SEC;
+   }
if (bp-stats_ticks  0x00) bp-stats_ticks = 0x00;
bp-stats_ticks = 0x00;
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [1/5][BNX2]: Fix netdev watchdog on 5708.

2007-06-04 Thread Jeff Garzik

On Mon, Jun 04, 2007 at 05:00:03PM -0700, Michael Chan wrote:
 [BNX2]: Fix netdev watchdog on 5708.
 
 There's a bug in the driver that only initializes half of the context
 memory on the 5708.  Surprisingly, this works most of the time except
 for some occasional netdev watchdogs when sending a lot of 64-byte
 packets.  The fix is to add the missing code to initialize the 2nd
 halves of all context memory.
 
 Signed-off-by: Michael Chan [EMAIL PROTECTED]

ACK patch 1, and 3-5


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [2/5][BNX2]: Add missing wait in bnx2_init_5709_context().

2007-06-04 Thread Jeff Garzik

On Mon, Jun 04, 2007 at 05:00:35PM -0700, Michael Chan wrote:
 [BNX2]: Add missing wait in bnx2_init_5709_context().
 
 For correctness, we need to wait for the MEM_INIT bit to be cleared
 in the BNX2_CTX_COMMAND register before proceeding.
 
 Signed-off-by: Michael Chan [EMAIL PROTECTED]
 
 diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
 index 5d69e5b..daa62d9 100644
 --- a/drivers/net/bnx2.c
 +++ b/drivers/net/bnx2.c
 @@ -1778,6 +1778,12 @@ bnx2_init_5709_context(struct bnx2 *bp)
   val = BNX2_CTX_COMMAND_ENABLED | BNX2_CTX_COMMAND_MEM_INIT | (1  12);
   val |= (BCM_PAGE_BITS - 8)  16;
   REG_WR(bp, BNX2_CTX_COMMAND, val);
 + for (i = 0; i  10; i++) {
 + val = REG_RD(bp, BNX2_CTX_COMMAND);
 + if (!(val  BNX2_CTX_COMMAND_MEM_INIT))
 + break;
 + udelay(2);
 + }
   for (i = 0; i  bp-ctx_pages; i++) {
   int j;

If the bit does not clear after 10 loops, it seems to me you should
complain or somehow otherwise make note of the failure.

Jeff



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[2/5][BNX2]: Add missing wait in bnx2_init_5709_context().

2007-06-04 Thread Michael Chan

[BNX2]: Add missing wait in bnx2_init_5709_context().

For correctness, we need to wait for the MEM_INIT bit to be cleared
in the BNX2_CTX_COMMAND register before proceeding.

Signed-off-by: Michael Chan [EMAIL PROTECTED]

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 5d69e5b..daa62d9 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -1778,6 +1778,12 @@ bnx2_init_5709_context(struct bnx2 *bp)
val = BNX2_CTX_COMMAND_ENABLED | BNX2_CTX_COMMAND_MEM_INIT | (1  12);
val |= (BCM_PAGE_BITS - 8)  16;
REG_WR(bp, BNX2_CTX_COMMAND, val);
+   for (i = 0; i  10; i++) {
+   val = REG_RD(bp, BNX2_CTX_COMMAND);
+   if (!(val  BNX2_CTX_COMMAND_MEM_INIT))
+   break;
+   udelay(2);
+   }
for (i = 0; i  bp-ctx_pages; i++) {
int j;
 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-06-04 Thread David Miller

From: James Morris [EMAIL PROTECTED]
Date: Mon, 4 Jun 2007 19:13:10 -0400 (EDT)

 I've applied this patch to 

 git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6.git#for-davem

 Dave, feel free to pull from that branch.

Thanks James, I'll pull that in for my next round of
networking fixes.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [2/5 revised][BNX2]: Add missing wait in bnx2_init_5709_context().

2007-06-04 Thread Michael Chan

[BNX2]: Add missing wait in bnx2_init_5709_context().

For correctness, we need to wait for the MEM_INIT bit to be cleared
in the BNX2_CTX_COMMAND register before proceeding.

[Added return -EBUSY when the MEM_INIT bit doesn't clear, suggested
by Jeff Garzik.]

Signed-off-by: Michael Chan [EMAIL PROTECTED]

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 5d69e5b..0f861e2 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -1778,6 +1778,15 @@ bnx2_init_5709_context(struct bnx2 *bp)
val = BNX2_CTX_COMMAND_ENABLED | BNX2_CTX_COMMAND_MEM_INIT | (1  12);
val |= (BCM_PAGE_BITS - 8)  16;
REG_WR(bp, BNX2_CTX_COMMAND, val);
+   for (i = 0; i  10; i++) {
+   val = REG_RD(bp, BNX2_CTX_COMMAND);
+   if (!(val  BNX2_CTX_COMMAND_MEM_INIT))
+   break;
+   udelay(2);
+   }
+   if (val  BNX2_CTX_COMMAND_MEM_INIT)
+   return -EBUSY;
+
for (i = 0; i  bp-ctx_pages; i++) {
int j;
 
@@ -3696,9 +3705,11 @@ bnx2_init_chip(struct bnx2 *bp)
 
/* Initialize context mapping and zero out the quick contexts.  The
 * context block must have already been enabled. */
-   if (CHIP_NUM(bp) == CHIP_NUM_5709)
-   bnx2_init_5709_context(bp);
-   else
+   if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+   rc = bnx2_init_5709_context(bp);
+   if (rc)
+   return rc;
+   } else
bnx2_init_context(bp);
 
if ((rc = bnx2_init_cpus(bp)) != 0)


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: r8169 in 2.6.18 silently corrupting data, 2.6.22-rc3 link not detected at all

2007-06-04 Thread Andrew Paprocki


Francois,

Using the latest patch you provided in the previous e-mail under
2.6.22-rc3 I still see data corruption after I dropped the entire
network to MTU 1500. (The receive corruption is way more frequent than
under 2.6.18.) I'll attempt to find out what caused the link detection
to stop working. Do you have any idea about the data corruption? I can
easily put in trace anywhere in the driver if you'd like to see any
output. My latest build was without NAPI support, and running with or
without it didn't seem to affect the receive corruption.

Thanks, -Andrew

On 6/4/07, Francois Romieu [EMAIL PROTECTED] wrote:

Francois Romieu [EMAIL PROTECTED] :
[...]

Andrew, have you had any time to do some testing with an uniform 7200
bytes MTU through your LAN ?

You will find a backport of the r8169 driver for the 2.6.18 kernel at
http://www.fr.zoreil.com/people/francois/backport/r8169/20070604-00

Could you do a binary search of the patch which breaks the link detection ?
From a different report that I have received, the current testing branch of
the 8169 driver would improve things for MTU beyond 1500 bytes: nailing down
your broken link regression would be a welcome improvement.

If you are confortable with git, you can do a git bisect after pulling
the branch 'r8169-backport-test-20070604' at:
git://electric-eye.fr.zoreil.com/home/romieu/linux/linux-2.6-out

(please do the initial clone from elsewhere if you use git, thanks)

--
Ueimor



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/7] sky2 version 1.15 (88e8071) support

2007-06-04 Thread Stephen Hemminger

These changes are to enable the Yukon Extreme (88e8071)
chipset. This chip is similar to earlier chip but has a different
set of offloading operations and some other minor quirks.

Marvell has given me some evaluation boards with the 88e8071
chip set. The chip is available now, but haven't seen a systems
with the hardware yet. The support fot this is exprerimental
at this point, but haven't encountered any new problems.

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/7] sky2: avoid reserved regions on ethtool reg dump

2007-06-04 Thread Stephen Hemminger

On Yukon EX reading some of the undocumented places in the
memory space will cause a hang. Since they don't provide useful
information, just skip the reserved areas.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |   14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

--- a/drivers/net/sky2.c2007-06-04 08:45:25.0 -0700
+++ b/drivers/net/sky2.c2007-06-04 08:46:34.0 -0700
@@ -3330,7 +3330,7 @@ static int sky2_get_regs_len(struct net_
 
 /*
  * Returns copy of control register region
- * Note: access to the RAM address register set will cause timeouts.
+ * Note: ethtool_get_regs always provides full size (16k) buffer
  */
 static void sky2_get_regs(struct net_device *dev, struct ethtool_regs *regs,
  void *p)
@@ -3338,15 +3338,19 @@ static void sky2_get_regs(struct net_dev
const struct sky2_port *sky2 = netdev_priv(dev);
const void __iomem *io = sky2-hw-regs;
 
-   BUG_ON(regs-len  B3_RI_WTO_R1);
regs-version = 1;
memset(p, 0, regs-len);
 
memcpy_fromio(p, io, B3_RAM_ADDR);
 
-   memcpy_fromio(p + B3_RI_WTO_R1,
- io + B3_RI_WTO_R1,
- regs-len - B3_RI_WTO_R1);
+   /* skip diagnostic ram region */
+   memcpy_fromio(p + B3_RI_WTO_R1, io + B3_RI_WTO_R1, 0x2000 - 
B3_RI_WTO_R1);
+
+   /* copy GMAC registers */
+   memcpy_fromio(p + BASE_GMAC_1, io + BASE_GMAC_1, 0x1000);
+   if (sky2-hw-ports  1)
+   memcpy_fromio(p + BASE_GMAC_2, io + BASE_GMAC_2, 0x1000);
+
 }
 
 /* In order to do Jumbo packets on these chips, need to turn off the

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/7] sky2: rename BMU register

2007-06-04 Thread Stephen Hemminger

This register is more of a test and control register on Yukon2.
So rename it to Q_TEST and give some bit definitions.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |2 +-
 drivers/net/sky2.h |   29 +++--
 2 files changed, 12 insertions(+), 19 deletions(-)

--- a/drivers/net/sky2.c2007-06-04 08:46:41.0 -0700
+++ b/drivers/net/sky2.c2007-06-04 08:46:43.0 -0700
@@ -1140,7 +1140,7 @@ static int sky2_rx_start(struct sky2_por
if (hw-chip_id == CHIP_ID_YUKON_EC_U 
(hw-chip_rev == CHIP_REV_YU_EC_U_A1
 || hw-chip_rev == CHIP_REV_YU_EC_U_B0))
-   sky2_write32(hw, Q_ADDR(rxq, Q_F), F_M_RX_RAM_DIS);
+   sky2_write32(hw, Q_ADDR(rxq, Q_TEST), F_M_RX_RAM_DIS);
 
sky2_prefetch_init(hw, rxq, sky2-rx_le_map, RX_LE_SIZE - 1);
 
--- a/drivers/net/sky2.h2007-06-04 08:46:41.0 -0700
+++ b/drivers/net/sky2.h2007-06-04 08:46:43.0 -0700
@@ -592,23 +592,15 @@ enum {
 enum {
B8_Q_REGS = 0x0400, /* base of Queue registers */
Q_D = 0x00, /* 8*32 bit Current Descriptor */
-   Q_DA_L  = 0x20, /* 32 bit   Current Descriptor Address Low dWord */
-   Q_DA_H  = 0x24, /* 32 bit   Current Descriptor Address High dWord */
+   Q_VLAN  = 0x20, /* 16 bit   Current VLAN Tag */
+   Q_DONE  = 0x24, /* 16 bit   Done Index */
Q_AC_L  = 0x28, /* 32 bit   Current Address Counter Low dWord */
Q_AC_H  = 0x2c, /* 32 bit   Current Address Counter High dWord */
Q_BC= 0x30, /* 32 bit   Current Byte Counter */
Q_CSR   = 0x34, /* 32 bit   BMU Control/Status Register */
-   Q_F = 0x38, /* 32 bit   Flag Register */
-   Q_T1= 0x3c, /* 32 bit   Test Register 1 */
-   Q_T1_TR = 0x3c, /*  8 bit   Test Register 1 Transfer SM */
-   Q_T1_WR = 0x3d, /*  8 bit   Test Register 1 Write Descriptor SM */
-   Q_T1_RD = 0x3e, /*  8 bit   Test Register 1 Read Descriptor SM */
-   Q_T1_SV = 0x3f, /*  8 bit   Test Register 1 Supervisor SM */
-   Q_T2= 0x40, /* 32 bit   Test Register 2 */
-   Q_T3= 0x44, /* 32 bit   Test Register 3 */
+   Q_TEST  = 0x38, /* 32 bit   Test/Control Register */
 
 /* Yukon-2 */
-   Q_DONE  = 0x24, /* 16 bit   Done Index  (Yukon-2 only) 
*/
Q_WM= 0x40, /* 16 bit   FIFO Watermark */
Q_AL= 0x42, /*  8 bit   FIFO Alignment */
Q_RSP   = 0x44, /* 16 bit   FIFO Read Shadow Pointer */
@@ -622,15 +614,16 @@ enum {
 };
 #define Q_ADDR(reg, offs) (B8_Q_REGS + (reg) + (offs))
 
-/* Q_F 32 bit  Flag Register */
+/* Q_TEST  32 bit  Test Register */
 enum {
-   F_ALM_FULL  = 127, /* Rx FIFO: almost full */
-   F_EMPTY = 127, /* Tx FIFO: empty flag */
-   F_FIFO_EOF  = 126, /* Tag (EOF Flag) bit in FIFO */
-   F_WM_REACHED= 125, /* Watermark reached */
+   /* Transmit */
+   F_TX_CHK_AUTO_OFF = 131, /* Tx checksum auto calc off (Yukon EX) */
+   F_TX_CHK_AUTO_ON  = 130, /* Tx checksum auto calc off (Yukon EX) */
+
+   /* Receive */
F_M_RX_RAM_DIS  = 124, /* MAC Rx RAM Read Port disable */
-   F_FIFO_LEVEL= 0x1fL16, /* Bit 23..16: # of Qwords in FIFO */
-   F_WATER_MARK= 0x0007ffL, /* Bit 10.. 0: Watermark */
+
+   /* Hardware testbits not used */
 };
 
 /* Queue Prefetch Unit Offsets, use Y2_QADDR() to address (Yukon-2 only)*/

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 4/7] sky2: enable clocks before probe

2007-06-04 Thread Stephen Hemminger

Catch-22: On Yukon EX (88E8071) need to have internal clocks enabled
before reading chip id. It is harmless on other chips.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/drivers/net/sky2.c2007-06-04 08:46:43.0 -0700
+++ b/drivers/net/sky2.c2007-06-04 08:46:45.0 -0700
@@ -2519,6 +2519,9 @@ static int __devinit sky2_init(struct sk
 {
u8 t8;
 
+   /* Enable all clocks */
+   sky2_pci_write32(hw, PCI_DEV_REG3, 0);
+
sky2_write8(hw, B0_CTST, CS_RST_CLR);
 
hw-chip_id = sky2_read8(hw, B2_CHIP_ID);
@@ -2532,10 +2535,6 @@ static int __devinit sky2_init(struct sk
dev_warn(hw-pdev-dev, this driver not yet tested on this 
chip type\n
 Please report success or failure to 
netdev@vger.kernel.org\n);
 
-   /* Make sure and enable all clocks */
-   if (hw-chip_id == CHIP_ID_YUKON_EX || hw-chip_id == 
CHIP_ID_YUKON_EC_U)
-   sky2_pci_write32(hw, PCI_DEV_REG3, 0);
-
hw-chip_rev = (sky2_read8(hw, B2_MAC_CFG)  CFG_CHIP_R_MSK)  4;
 
/* This rev is really old, and requires untested workarounds */

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 5/7] sky2: GPIO register

2007-06-04 Thread Stephen Hemminger

The General Purpose I/O register is yet another hardware workaround
catchall. Enable workaround that vendor driver does to stay
but for bug compatiable.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |5 +
 drivers/net/sky2.h |   14 ++
 2 files changed, 19 insertions(+)

--- a/drivers/net/sky2.h2007-06-04 08:46:43.0 -0700
+++ b/drivers/net/sky2.h2007-06-04 08:46:47.0 -0700
@@ -441,6 +441,20 @@ enum {
TST_CFG_WRITE_OFF= 10, /* Disable Config Reg WR */
 };
 
+/* B2_GPIO */
+enum {
+   GLB_GPIO_CLK_DEB_ENA = 131,   /* Clock Debug Enable */
+   GLB_GPIO_CLK_DBG_MSK = 0xf26, /* Clock Debug */
+
+   GLB_GPIO_INT_RST_D3_DIS = 115, /* Disable Internal Reset After D3 to 
D0 */
+   GLB_GPIO_LED_PAD_SPEED_UP = 114, /* LED PAD Speed Up */
+   GLB_GPIO_STAT_RACE_DIS  = 113, /* Status Race Disable */
+   GLB_GPIO_TEST_SEL_MSK   = 311, /* Testmode Select */
+   GLB_GPIO_TEST_SEL_BASE  = 111,
+   GLB_GPIO_RAND_ENA   = 110, /* Random Enable */
+   GLB_GPIO_RAND_BIT_1 = 19,  /* Random Bit 1 */
+};
+
 /* B2_MAC_CFG   8 bit  MAC Configuration / Chip Revision */
 enum {
CFG_CHIP_R_MSK= 0xf4, /* Bit 7.. 4: Chip Revision */
--- a/drivers/net/sky2.c2007-06-04 08:46:45.0 -0700
+++ b/drivers/net/sky2.c2007-06-04 08:46:47.0 -0700
@@ -230,6 +230,11 @@ static void sky2_power_on(struct sky2_hw
sky2_pci_write32(hw, PCI_DEV_REG5, reg);
 
sky2_pci_write32(hw, PCI_CFG_REG_1, 0);
+
+   /* Enable workaround for dev 4.107 on Yukon-Ultra  Extreme */
+   reg = sky2_read32(hw, B2_GP_IO);
+   reg |= GLB_GPIO_STAT_RACE_DIS;
+   sky2_write32(hw, B2_GP_IO, reg);
}
 }
 

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 7/7] sky2: version 1.15

2007-06-04 Thread Stephen Hemminger

New version because of new chip support.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/sky2.c2007-05-31 11:27:07.0 -0700
+++ b/drivers/net/sky2.c2007-05-31 11:27:08.0 -0700
@@ -50,7 +50,7 @@
 #include sky2.h
 
 #define DRV_NAME   sky2
-#define DRV_VERSION1.14
+#define DRV_VERSION1.15
 #define PFXDRV_NAME  
 
 /*

-- 
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/7] sky2: Add PCI device specfic register 4 5

2007-06-04 Thread Stephen Hemminger

Need to setup more PCI control control registers are on Yukon EX.
Some of these also exist on Yukon EC-U as well.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |   18 
 drivers/net/sky2.h |   77 +
 2 files changed, 89 insertions(+), 6 deletions(-)

--- a/drivers/net/sky2.c2007-06-04 08:46:34.0 -0700
+++ b/drivers/net/sky2.c2007-06-04 08:46:41.0 -0700
@@ -217,13 +217,19 @@ static void sky2_power_on(struct sky2_hw
sky2_write8(hw, B2_Y2_CLK_GATE, 0);
 
if (hw-chip_id == CHIP_ID_YUKON_EC_U || hw-chip_id == 
CHIP_ID_YUKON_EX) {
-   u32 reg1;
+   u32 reg;
 
-   sky2_pci_write32(hw, PCI_DEV_REG3, 0);
-   reg1 = sky2_pci_read32(hw, PCI_DEV_REG4);
-   reg1 = P_ASPM_CONTROL_MSK;
-   sky2_pci_write32(hw, PCI_DEV_REG4, reg1);
-   sky2_pci_write32(hw, PCI_DEV_REG5, 0);
+   reg = sky2_pci_read32(hw, PCI_DEV_REG4);
+   /* set all bits to 0 except bits 15..12 and 8 */
+   reg = P_ASPM_CONTROL_MSK;
+   sky2_pci_write32(hw, PCI_DEV_REG4, reg);
+
+   reg = sky2_pci_read32(hw, PCI_DEV_REG5);
+   /* set all bits to 0 except bits 28  27 */
+   reg = P_CTL_TIM_VMAIN_AV_MSK;
+   sky2_pci_write32(hw, PCI_DEV_REG5, reg);
+
+   sky2_pci_write32(hw, PCI_CFG_REG_1, 0);
}
 }
 
--- a/drivers/net/sky2.h2007-06-04 08:45:25.0 -0700
+++ b/drivers/net/sky2.h2007-06-04 08:46:41.0 -0700
@@ -14,6 +14,8 @@ enum {
PCI_DEV_REG3= 0x80,
PCI_DEV_REG4= 0x84,
PCI_DEV_REG5= 0x88,
+   PCI_CFG_REG_0   = 0x90,
+   PCI_CFG_REG_1   = 0x94,
 };
 
 enum {
@@ -28,6 +30,7 @@ enum {
 enum pci_dev_reg_1 {
PCI_Y2_PIG_ENA   = 131, /* Enable Plug-in-Go (YUKON-2) */
PCI_Y2_DLL_DIS   = 130, /* Disable PCI DLL (YUKON-2) */
+   PCI_SW_PWR_ON_RST= 130, /* SW Power on Reset (Yukon-EX) */
PCI_Y2_PHY2_COMA = 129, /* Set PHY 2 to Coma Mode (YUKON-2) */
PCI_Y2_PHY1_COMA = 128, /* Set PHY 1 to Coma Mode (YUKON-2) */
PCI_Y2_PHY2_POWD = 127, /* Set PHY 2 to Power Down (YUKON-2) */
@@ -67,6 +70,80 @@ enum pci_dev_reg_4 {
  | P_ASPM_CLKRUN_REQUEST | 
P_ASPM_INT_FIFO_EMPTY,
 };
 
+/* PCI_OUR_REG_5   32 bit  Our Register 5 (Yukon-ECU only) */
+enum pci_dev_reg_5 {
+   /* Bit 31..27:  for A3  later */
+   P_CTL_DIV_CORE_CLK_ENA  = 131, /* Divide Core Clock Enable */
+   P_CTL_SRESET_VMAIN_AV   = 130, /* Soft Reset for Vmain_av De-Glitch */
+   P_CTL_BYPASS_VMAIN_AV   = 129, /* Bypass En. for Vmain_av De-Glitch */
+   P_CTL_TIM_VMAIN_AV_MSK  = 327, /* Bit 28..27: Timer Vmain_av Mask */
+/* Bit 26..16: Release Clock on Event 
*/
+   P_REL_PCIE_RST_DE_ASS   = 126, /* PCIe Reset De-Asserted */
+   P_REL_GPHY_REC_PACKET   = 125, /* GPHY Received Packet */
+   P_REL_INT_FIFO_N_EMPTY  = 124, /* Internal FIFO Not Empty */
+   P_REL_MAIN_PWR_AVAIL= 123, /* Main Power Available */
+   P_REL_CLKRUN_REQ_REL= 122, /* CLKRUN Request Release */
+   P_REL_PCIE_RESET_ASS= 121, /* PCIe Reset Asserted */
+   P_REL_PME_ASSERTED  = 120, /* PME Asserted */
+   P_REL_PCIE_EXIT_L1_ST   = 119, /* PCIe Exit L1 State */
+   P_REL_LOADER_NOT_FIN= 118, /* EPROM Loader Not Finished */
+   P_REL_PCIE_RX_EX_IDLE   = 117, /* PCIe Rx Exit Electrical Idle State 
*/
+   P_REL_GPHY_LINK_UP  = 116, /* GPHY Link Up */
+
+   /* Bit 10.. 0: Mask for Gate Clock */
+   P_GAT_PCIE_RST_ASSERTED = 110,/* PCIe Reset Asserted */
+   P_GAT_GPHY_N_REC_PACKET = 19, /* GPHY Not Received Packet */
+   P_GAT_INT_FIFO_EMPTY= 18, /* Internal FIFO Empty */
+   P_GAT_MAIN_PWR_N_AVAIL  = 17, /* Main Power Not Available */
+   P_GAT_CLKRUN_REQ_REL= 16, /* CLKRUN Not Requested */
+   P_GAT_PCIE_RESET_ASS= 15, /* PCIe Reset Asserted */
+   P_GAT_PME_DE_ASSERTED   = 14, /* PME De-Asserted */
+   P_GAT_PCIE_ENTER_L1_ST  = 13, /* PCIe Enter L1 State */
+   P_GAT_LOADER_FINISHED   = 12, /* EPROM Loader Finished */
+   P_GAT_PCIE_RX_EL_IDLE   = 11, /* PCIe Rx Electrical Idle State */
+   P_GAT_GPHY_LINK_DOWN= 10, /* GPHY Link Down */
+
+   PCIE_OUR5_EVENT_CLK_D3_SET = P_REL_GPHY_REC_PACKET |
+P_REL_INT_FIFO_N_EMPTY |
+P_REL_PCIE_EXIT_L1_ST |
+P_REL_PCIE_RX_EX_IDLE |
+P_GAT_GPHY_N_REC_PACKET |
+P_GAT_INT_FIFO_EMPTY |
+P_GAT_PCIE_ENTER_L1_ST |
+

[PATCH 6/7] sky2: Yukon Extreme (88e8071) support.

2007-06-04 Thread Stephen Hemminger

Enable support for Yukon EX chipset (88e8071).
Most of changes are related to new commands to chip for transmit,
and change in status and checksumming.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |  176 +++--
 drivers/net/sky2.h |   43 
 2 files changed, 148 insertions(+), 71 deletions(-)

--- a/drivers/net/sky2.c2007-06-04 08:46:47.0 -0700
+++ b/drivers/net/sky2.c2007-06-04 08:46:54.0 -0700
@@ -130,7 +130,7 @@ static const struct pci_device_id sky2_i
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4368) }, /* 88EC034 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4369) }, /* 88EC042 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x436A) }, /* 88E8058 */
-// { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x436B) }, /* 88E8071 */
+   { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x436B) }, /* 88E8071 */
{ 0 }
 };
 
@@ -661,6 +661,30 @@ static void sky2_wol_init(struct sky2_po
 
 }
 
+static void sky2_set_tx_stfwd(struct sky2_hw *hw, unsigned port)
+{
+   if (hw-chip_id == CHIP_ID_YUKON_EX  hw-chip_rev != 
CHIP_REV_YU_EX_A0) {
+   sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
+TX_STFW_ENA |
+(hw-dev[port]-mtu  ETH_DATA_LEN) ? TX_JUMBO_ENA 
: TX_JUMBO_DIS);
+   } else {
+   if (hw-dev[port]-mtu  ETH_DATA_LEN) {
+   /* set Tx GMAC FIFO Almost Empty Threshold */
+   sky2_write32(hw, SK_REG(port, TX_GMF_AE_THR),
+(ECU_JUMBO_WM  16) | ECU_AE_THR);
+
+   sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
+TX_JUMBO_ENA | TX_STFW_DIS);
+
+   /* Can't do offload because of lack of store/forward */
+   hw-dev[port]-features = ~(NETIF_F_TSO | NETIF_F_SG
+| NETIF_F_ALL_CSUM);
+   } else
+   sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
+TX_JUMBO_DIS | TX_STFW_ENA);
+   }
+}
+
 static void sky2_mac_init(struct sky2_hw *hw, unsigned port)
 {
struct sky2_port *sky2 = netdev_priv(hw-dev[port]);
@@ -741,8 +765,11 @@ static void sky2_mac_init(struct sky2_hw
 
/* Configure Rx MAC FIFO */
sky2_write8(hw, SK_REG(port, RX_GMF_CTRL_T), GMF_RST_CLR);
-   sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T),
-GMF_OPER_ON | GMF_RX_F_FL_ON);
+   reg = GMF_OPER_ON | GMF_RX_F_FL_ON;
+   if (hw-chip_id == CHIP_ID_YUKON_EX)
+   reg |= GMF_RX_OVER_ON;
+
+   sky2_write32(hw, SK_REG(port, RX_GMF_CTRL_T), reg);
 
/* Flush Rx MAC FIFO on any flow control or error */
sky2_write16(hw, SK_REG(port, RX_GMF_FL_MSK), GMR_FS_ANY_ERR);
@@ -758,16 +785,7 @@ static void sky2_mac_init(struct sky2_hw
sky2_write8(hw, SK_REG(port, RX_GMF_LP_THR), 768/8);
sky2_write8(hw, SK_REG(port, RX_GMF_UP_THR), 1024/8);
 
-   /* set Tx GMAC FIFO Almost Empty Threshold */
-   sky2_write32(hw, SK_REG(port, TX_GMF_AE_THR),
-(ECU_JUMBO_WM  16) | ECU_AE_THR);
-
-   if (hw-dev[port]-mtu  ETH_DATA_LEN)
-   sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
-TX_JUMBO_ENA | TX_STFW_DIS);
-   else
-   sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T),
-TX_JUMBO_DIS | TX_STFW_ENA);
+   sky2_set_tx_stfwd(hw, port);
}
 
 }
@@ -950,14 +968,16 @@ static void rx_set_checksum(struct sky2_
 {
struct sky2_rx_le *le;
 
-   le = sky2_next_rx(sky2);
-   le-addr = cpu_to_le32((ETH_HLEN  16) | ETH_HLEN);
-   le-ctrl = 0;
-   le-opcode = OP_TCPSTART | HW_OWNER;
-
-   sky2_write32(sky2-hw,
-Q_ADDR(rxqaddr[sky2-port], Q_CSR),
-sky2-rx_csum ? BMU_ENA_RX_CHKSUM : BMU_DIS_RX_CHKSUM);
+   if (sky2-hw-chip_id != CHIP_ID_YUKON_EX) {
+   le = sky2_next_rx(sky2);
+   le-addr = cpu_to_le32((ETH_HLEN  16) | ETH_HLEN);
+   le-ctrl = 0;
+   le-opcode = OP_TCPSTART | HW_OWNER;
+
+   sky2_write32(sky2-hw,
+Q_ADDR(rxqaddr[sky2-port], Q_CSR),
+sky2-rx_csum ? BMU_ENA_RX_CHKSUM : 
BMU_DIS_RX_CHKSUM);
+   }
 
 }
 
@@ -1296,6 +1316,10 @@ static int sky2_up(struct net_device *de
 
sky2_qset(hw, txqaddr[port]);
 
+   /* This is copied from sk98lin 10.0.5.3; no one tells me about 
erratta's */
+   if (hw-chip_id == CHIP_ID_YUKON_EX  hw-chip_rev == 
CHIP_REV_YU_EX_B0)
+   sky2_write32(hw, Q_ADDR(txqaddr[port], Q_TEST), 
F_TX_CHK_AUTO_OFF);
+
/* Set almost empty threshold */
if (hw-chip_id ==

Re: [RFC][PATCH] Add suspend and resume support to uli526x

2007-06-04 Thread Valerie Henson

On Sun, Jun 03, 2007 at 12:37:35PM +0200, Rafael J. Wysocki wrote:
 From: Rafael J. Wysocki [EMAIL PROTECTED]
 
 Add suspend/resume support to the uli526x network driver (tested on x86_64,
 with 'Ethernet controller: ALi Corporation M5263 Ethernet Controller, rev 
 40').
 
 Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]

Nice work!  Looks good to me.

Signed-off-by: Val Henson [EMAIL PROTECTED]

-VAL

  drivers/net/tulip/uli526x.c |  120 
 ++--
  1 file changed, 105 insertions(+), 15 deletions(-)
 
 Index: linux-2.6.22-rc3/drivers/net/tulip/uli526x.c
 ===
 --- linux-2.6.22-rc3.orig/drivers/net/tulip/uli526x.c 2007-06-03 
 12:10:41.0 +0200
 +++ linux-2.6.22-rc3/drivers/net/tulip/uli526x.c  2007-06-03 
 12:14:33.0 +0200
 @@ -220,6 +220,10 @@ static int mode = 8;
  static int uli526x_open(struct net_device *);
  static int uli526x_start_xmit(struct sk_buff *, struct net_device *);
  static int uli526x_stop(struct net_device *);
 +#ifdef CONFIG_PM
 +static int uli526x_suspend(struct pci_dev *pdev, pm_message_t state);
 +static int uli526x_resume(struct pci_dev *pdev);
 +#endif
  static struct net_device_stats * uli526x_get_stats(struct net_device *);
  static void uli526x_set_filter_mode(struct net_device *);
  static const struct ethtool_ops netdev_ethtool_ops;
 @@ -425,21 +429,14 @@ static void __devexit uli526x_remove_one
  
  
  /*
 - *   Open the interface.
 - *   The interface is opened whenever ifconfig activates it.
 + *   Bring the interface up.
 + *   Used by uli526x_open and uli526x_resume.
   */
  
 -static int uli526x_open(struct net_device *dev)
 +static void uli526x_up(struct net_device *dev)
  {
 - int ret;
   struct uli526x_board_info *db = netdev_priv(dev);
  
 - ULI526X_DBUG(0, uli526x_open, 0);
 -
 - ret = request_irq(dev-irq, uli526x_interrupt, IRQF_SHARED, dev-name, 
 dev);
 - if (ret)
 - return ret;
 -
   /* system variable init */
   db-cr6_data = CR6_DEFAULT | uli526x_cr6_user_set;
   db-tx_packet_cnt = 0;
 @@ -467,7 +464,25 @@ static int uli526x_open(struct net_devic
   db-timer.data = (unsigned long)dev;
   db-timer.function = uli526x_timer;
   add_timer(db-timer);
 +}
 +
 +
 +/*
 + *   Open the interface.
 + *   The interface is opened whenever ifconfig activates it.
 + */
  
 +static int uli526x_open(struct net_device *dev)
 +{
 + int ret;
 +
 + ULI526X_DBUG(0, uli526x_open, 0);
 +
 + ret = request_irq(dev-irq, uli526x_interrupt, IRQF_SHARED, dev-name, 
 dev);
 + if (ret)
 + return ret;
 +
 + uli526x_up(dev);
   return 0;
  }
  
 @@ -613,17 +628,15 @@ static int uli526x_start_xmit(struct sk_
  
  
  /*
 - *   Stop the interface.
 - *   The interface is stopped when it is brought.
 + *   Take the interface down.
 + *   Used by uli526x_stop and uli526x_suspend.
   */
  
 -static int uli526x_stop(struct net_device *dev)
 +static void uli526x_down(struct net_device *dev)
  {
   struct uli526x_board_info *db = netdev_priv(dev);
   unsigned long ioaddr = dev-base_addr;
  
 - ULI526X_DBUG(0, uli526x_stop, 0);
 -
   /* disable system */
   netif_stop_queue(dev);
  
 @@ -634,6 +647,21 @@ static int uli526x_stop(struct net_devic
   outl(ULI526X_RESET, ioaddr + DCR0);
   udelay(5);
   phy_write(db-ioaddr, db-phy_addr, 0, 0x8000, db-chip_id);
 +}
 +
 +
 +/*
 + *   Stop the interface.
 + *   The interface is stopped when it is brought.
 + */
 +
 +static int uli526x_stop(struct net_device *dev)
 +{
 + struct uli526x_board_info *db = netdev_priv(dev);
 +
 + ULI526X_DBUG(0, uli526x_stop, 0);
 +
 + uli526x_down(dev);
  
   /* free interrupt */
   free_irq(dev-irq, dev);
 @@ -654,6 +682,64 @@ static int uli526x_stop(struct net_devic
  }
  
  
 +#ifdef CONFIG_PM
 +
 +/*
 + *   Suspend the interface.
 + */
 +
 +static int uli526x_suspend(struct pci_dev *pdev, pm_message_t state)
 +{
 + struct net_device *dev = pci_get_drvdata(pdev);
 +
 + ULI526X_DBUG(0, uli526x_suspend, 0);
 +
 + if (dev  netdev_priv(dev)) {
 + if (netif_running(dev)) {
 + netif_device_detach(dev);
 + uli526x_down(dev);
 + }
 + pci_save_state(pdev);
 + pci_enable_wake(pdev, pci_choose_state(pdev, state), 0);
 + pci_disable_device(pdev);
 + pci_set_power_state(pdev, pci_choose_state(pdev, state));
 + }
 + return 0;
 +}
 +
 +/*
 + *   Resume the interface.
 + */
 +
 +static int uli526x_resume(struct pci_dev *pdev)
 +{
 + struct net_device *dev = pci_get_drvdata(pdev);
 + struct uli526x_board_info *db = netdev_priv(dev);
 + int err;
 +
 + ULI526X_DBUG(0, uli526x_resume, 0);
 +
 + if (dev  db) {
 + pci_set_power_state(pdev, PCI_D0);
 + err = pci_enable_device(pdev);
 + if (err) {
 +

Re: [2/5 revised][BNX2]: Add missing wait in bnx2_init_5709_context().

2007-06-04 Thread Jeff Garzik

On Mon, Jun 04, 2007 at 05:58:49PM -0700, Michael Chan wrote:
 [BNX2]: Add missing wait in bnx2_init_5709_context().
 
 For correctness, we need to wait for the MEM_INIT bit to be cleared
 in the BNX2_CTX_COMMAND register before proceeding.
 
 [Added return -EBUSY when the MEM_INIT bit doesn't clear, suggested
 by Jeff Garzik.]
 
 Signed-off-by: Michael Chan [EMAIL PROTECTED]

ACK


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] sundance: PHY address form 0, only for device ID 0x0200 (IP100A) (20070605)

2007-06-04 Thread Jesse Huang

From: Jesse Huang [EMAIL PROTECTED]

Change Logs:
Search PHY address form 0, only for device ID 0x0200 (IP100A). Other device are 
from PHY address 1.

Signed-off-by: Jesse Huang [EMAIL PROTECTED]
---

 drivers/net/sundance.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

a093fe5e3e7b9f58458171e481b5096cc1d3abfe
diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
index e1f912d..23d049d 100644
--- a/drivers/net/sundance.c
+++ b/drivers/net/sundance.c
@@ -562,7 +562,11 @@ #endif
 * It seems some phys doesn't deal well with address 0 being accessed
 * first, so leave address zero to the end of the loop (32  31).
 */
-   for (phy = 1; phy = 32  phy_idx  MII_CNT; phy++) {
+   if(sundance_pci_tbl[np-chip_id].device == 0x0200) 
+   phy = 0;
+   else 
+   phy = 1;
+   for (; phy = 32  phy_idx  MII_CNT; phy++) {
int phyx = phy  0x1f;
int mii_status = mdio_read(dev, phyx, MII_BMSR);
if (mii_status != 0xmii_status != 0x) {
-- 
1.3.GIT



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] sundance: PHY address form 0, only for device ID 0x0200 (IP100A) (20070605)

2007-06-04 Thread Jesse Huang

From: Jesse Huang [EMAIL PROTECTED]

Change Logs:
Search PHY address form 0, only for device ID 0x0200 (IP100A). Other device are 
from PHY address 1.

Signed-off-by: Jesse Huang [EMAIL PROTECTED]
---

 drivers/net/sundance.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

a093fe5e3e7b9f58458171e481b5096cc1d3abfe
diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
index e1f912d..23d049d 100644
--- a/drivers/net/sundance.c
+++ b/drivers/net/sundance.c
@@ -562,7 +562,11 @@ #endif
 * It seems some phys doesn't deal well with address 0 being accessed
 * first, so leave address zero to the end of the loop (32  31).
 */
-   for (phy = 1; phy = 32  phy_idx  MII_CNT; phy++) {
+   if(sundance_pci_tbl[np-chip_id].device == 0x0200) 
+   phy = 0;
+   else 
+   phy = 1;
+   for (; phy = 32  phy_idx  MII_CNT; phy++) {
int phyx = phy  0x1f;
int mii_status = mdio_read(dev, phyx, MII_BMSR);
if (mii_status != 0xmii_status != 0x) {
-- 
1.3.GIT

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] bugfix GFP_KERNEL - GFP_ATOMIC in spin_locked region

2007-06-04 Thread Andrew Morton

On Mon, 04 Jun 2007 18:25:28 +0200 Yoann Padioleau [EMAIL PROTECTED] wrote:

 
 In a few files a function such as usb_submit_urb is taking GFP_KERNEL
 as an argument whereas this function call is inside a
 spin_lock_irqsave region of code. Documentation says that it must be
 GFP_ATOMIC instead.
 
 Me and my colleagues have made a tool targeting program
 transformations in device drivers. We have designed a scripting
 language for specifying easily program transformations and a
 transformation engine for performing them. In the spirit of Linux
 development practice, the language is based on the patch syntax. We
 call our scripts semantic patches because as opposed to traditional
 patches, our semantic patches do not work at the line level but on a
 high level representation of the C program.
 
 FYI here is our semantic patch detecting invalid use of GFP_KERNEL and
 fixing the problem:
 
 @@
 identifier fn;
 @@
 
  spin_lock_irqsave(...)
  ... when != spin_unlock_irqrestore(...)
  fn(...,
 - GFP_KERNEL
 + GFP_ATOMIC
,...
)

I think I read that paper.

 And now the real patch resulting from the automated transformation:
 
  net/wan/lmc/lmc_main.c|2 +-
  scsi/megaraid/megaraid_mm.c   |2 +-
  usb/serial/io_ti.c|2 +-
  usb/serial/ti_usb_3410_5052.c |2 +-
  usb/serial/whiteheat.c|6 +++---
  5 files changed, 7 insertions(+), 7 deletions(-)

This patch covers three areas of maintainer responsibility, so poor old me
has to split it up and redo the changelogs.  Oh well.

 
 diff --git a/drivers/net/wan/lmc/lmc_main.c b/drivers/net/wan/lmc/lmc_main.c
 index ae132c1..750b3ef 100644
 --- a/drivers/net/wan/lmc/lmc_main.c
 +++ b/drivers/net/wan/lmc/lmc_main.c
 @@ -483,7 +483,7 @@ #endif /* end ifdef _DBG_EVENTLOG */
  break;
  }
  
 -data = kmalloc(xc.len, GFP_KERNEL);
 +data = kmalloc(xc.len, GFP_ATOMIC);
  if(data == 0x0){
  printk(KERN_WARNING %s: Failed to allocate 
 memory for copy\n, dev-name);
  ret = -ENOMEM;

A few lines later we do:

if(copy_from_user(data, xc.data, xc.len))

which also is illegal under spinlock.


Frankly, I think that a better use of this tool would be to just report on
the errors, let humans fix them up.

Nobody maintains this ATM code afaik.

 index e075a52..edee220 100644
 --- a/drivers/scsi/megaraid/megaraid_mm.c
 +++ b/drivers/scsi/megaraid/megaraid_mm.c
 @@ -547,7 +547,7 @@ mraid_mm_attach_buf(mraid_mmadp_t *adp, 
  
   kioc-pool_index= right_pool;
   kioc-free_buf  = 1;
 - kioc-buf_vaddr = pci_pool_alloc(pool-handle, GFP_KERNEL,
 + kioc-buf_vaddr = pci_pool_alloc(pool-handle, GFP_ATOMIC,
   kioc-buf_paddr);
   spin_unlock_irqrestore(pool-lock, flags);

Again, a better fix would probably be to move the pci_pool_alloc() to
before the spin_lock_irqsave(), so we can continue to use the stronger
GFP_KERNEL.

But the locking in there looks basically nonsensical or wrong anyway.  It
appears that local variable `right_pool' cannot be validly used unless
we're holding pool-lock for the whole duration.

Somebody does maintain the megaraid driver, but I'm not sure who, and 
the MAINTAINERS file isn't very useful.  So I'll spray it around a bit.
We definitely have bugs in there.

 diff --git a/drivers/usb/serial/io_ti.c b/drivers/usb/serial/io_ti.c
 index 544098d..9ec38e3 100644
 --- a/drivers/usb/serial/io_ti.c
 +++ b/drivers/usb/serial/io_ti.c
 @@ -2351,7 +2351,7 @@ static int restart_read(struct edgeport_
   urb-complete = edge_bulk_in_callback;
   urb-context = edge_port;
   urb-dev = edge_port-port-serial-dev;
 - status = usb_submit_urb(urb, GFP_KERNEL);
 + status = usb_submit_urb(urb, GFP_ATOMIC);
   }
   edge_port-ep_read_urb_state = EDGE_READ_URB_RUNNING;
   edge_port-shadow_mcr |= MCR_RTS;
 diff --git a/drivers/usb/serial/ti_usb_3410_5052.c 
 b/drivers/usb/serial/ti_usb_3410_5052.c
 index 4203e2b..10dc36f 100644
 --- a/drivers/usb/serial/ti_usb_3410_5052.c
 +++ b/drivers/usb/serial/ti_usb_3410_5052.c
 @@ -1559,7 +1559,7 @@ static int ti_restart_read(struct ti_por
   urb-complete = ti_bulk_in_callback;
   urb-context = tport;
   urb-dev = tport-tp_port-serial-dev;
 - status = usb_submit_urb(urb, GFP_KERNEL);
 + status = usb_submit_urb(urb, GFP_ATOMIC);
   }
   tport-tp_read_urb_state = TI_READ_URB_RUNNING;
  
 diff --git a/drivers/usb/serial/whiteheat.c b/drivers/usb/serial/whiteheat.c
 index 27c5f8f..1b01207 100644
 --- a/drivers/usb/serial/whiteheat.c
 +++ b/drivers/usb/serial/whiteheat.c
 @@ -1116,7 +1116,7 @@ static int firm_send_command (struct usb
   memcpy (transfer_buffer[1], data, datasize);

Re: [PATCH] bugfix GFP_KERNEL - GFP_ATOMIC in spin_locked region

2007-06-04 Thread Andrew Morton

On Mon, 4 Jun 2007 21:00:18 -0700 Andrew Morton [EMAIL PROTECTED] wrote:

  diff --git a/drivers/usb/serial/io_ti.c b/drivers/usb/serial/io_ti.c
  index 544098d..9ec38e3 100644
  --- a/drivers/usb/serial/io_ti.c
  +++ b/drivers/usb/serial/io_ti.c
  @@ -2351,7 +2351,7 @@ static int restart_read(struct edgeport_
  urb-complete = edge_bulk_in_callback;
  urb-context = edge_port;
  urb-dev = edge_port-port-serial-dev;
  -   status = usb_submit_urb(urb, GFP_KERNEL);
  +   status = usb_submit_urb(urb, GFP_ATOMIC);
  }
  edge_port-ep_read_urb_state = EDGE_READ_URB_RUNNING;
  edge_port-shadow_mcr |= MCR_RTS;
  diff --git a/drivers/usb/serial/ti_usb_3410_5052.c 
  b/drivers/usb/serial/ti_usb_3410_5052.c
  index 4203e2b..10dc36f 100644
  --- a/drivers/usb/serial/ti_usb_3410_5052.c
  +++ b/drivers/usb/serial/ti_usb_3410_5052.c
  @@ -1559,7 +1559,7 @@ static int ti_restart_read(struct ti_por
  urb-complete = ti_bulk_in_callback;
  urb-context = tport;
  urb-dev = tport-tp_port-serial-dev;
  -   status = usb_submit_urb(urb, GFP_KERNEL);
  +   status = usb_submit_urb(urb, GFP_ATOMIC);
  }
  tport-tp_read_urb_state = TI_READ_URB_RUNNING;
   
  diff --git a/drivers/usb/serial/whiteheat.c b/drivers/usb/serial/whiteheat.c
  index 27c5f8f..1b01207 100644
  --- a/drivers/usb/serial/whiteheat.c
  +++ b/drivers/usb/serial/whiteheat.c
  @@ -1116,7 +1116,7 @@ static int firm_send_command (struct usb
  memcpy (transfer_buffer[1], data, datasize);
  command_port-write_urb-transfer_buffer_length = datasize + 1;
  command_port-write_urb-dev = port-serial-dev;
  -   retval = usb_submit_urb (command_port-write_urb, GFP_KERNEL);
  +   retval = usb_submit_urb (command_port-write_urb, GFP_ATOMIC);
  if (retval) {
  dbg(%s - submit urb failed, __FUNCTION__);
  goto exit;
  @@ -1316,7 +1316,7 @@ static int start_command_port(struct usb
  usb_clear_halt(serial-dev, command_port-read_urb-pipe);
   
  command_port-read_urb-dev = serial-dev;
  -   retval = usb_submit_urb(command_port-read_urb, GFP_KERNEL);
  +   retval = usb_submit_urb(command_port-read_urb, GFP_ATOMIC);
  if (retval) {
  err(%s - failed submitting read urb, error %d, 
  __FUNCTION__, retval);
  goto exit;
  @@ -1363,7 +1363,7 @@ static int start_port_read(struct usb_se
  wrap = list_entry(tmp, struct whiteheat_urb_wrap, list);
  urb = wrap-urb;
  urb-dev = port-serial-dev;
  -   retval = usb_submit_urb(urb, GFP_KERNEL);
  +   retval = usb_submit_urb(urb, GFP_ATOMIC);
  if (retval) {
  list_add(tmp, info-rx_urbs_free);
  list_for_each_safe(tmp, tmp2, info-rx_urbs_submitted) 
  {
 
 This part might make sense so I'll queue it for the USB guys to look at.
 
 

Everything in USB appears to already be fixed, apart from the io_ti.c bug.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [1/5][BNX2]: Fix netdev watchdog on 5708.

2007-06-04 Thread David Miller

From: Michael Chan [EMAIL PROTECTED]
Date: Mon, 04 Jun 2007 17:00:03 -0700

 [BNX2]: Fix netdev watchdog on 5708.

 There's a bug in the driver that only initializes half of the context
 memory on the 5708.  Surprisingly, this works most of the time except
 for some occasional netdev watchdogs when sending a lot of 64-byte
 packets.  The fix is to add the missing code to initialize the 2nd
 halves of all context memory.

 Signed-off-by: Michael Chan [EMAIL PROTECTED]

Applied, thanks Michael.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [2/5 revised][BNX2]: Add missing wait in bnx2_init_5709_context().

2007-06-04 Thread David Miller

From: Jeff Garzik [EMAIL PROTECTED]
Date: Mon, 4 Jun 2007 21:23:51 -0400

 On Mon, Jun 04, 2007 at 05:58:49PM -0700, Michael Chan wrote:
  [BNX2]: Add missing wait in bnx2_init_5709_context().

  For correctness, we need to wait for the MEM_INIT bit to be cleared
  in the BNX2_CTX_COMMAND register before proceeding.

  [Added return -EBUSY when the MEM_INIT bit doesn't clear, suggested
  by Jeff Garzik.]

  Signed-off-by: Michael Chan [EMAIL PROTECTED]

 ACK

Applied, thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [3/5][BNX2]: Enable DMA on 5709.

2007-06-04 Thread David Miller

From: Michael Chan [EMAIL PROTECTED]
Date: Mon, 04 Jun 2007 17:00:48 -0700

 [BNX2]: Enable DMA on 5709.

 Add missing code to enable DMA on 5709 A1.  The bit is a no-op on
 A0 and therefore can be set on all 5709 chips.

 Signed-off-by: Michael Chan [EMAIL PROTECTED]

Applied, thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [4/5][BNX2]: Fix occasional counter corruption on 5708.

2007-06-04 Thread David Miller

From: Michael Chan [EMAIL PROTECTED]
Date: Mon, 04 Jun 2007 17:01:08 -0700

 [BNX2]: Fix occasional counter corruption on 5708.

 The statistics block DMA on 5708 can be messed up occasionally on the
 average of about once per hour.  If the user is reading the counters
 within one second after the corruption, the counters will be all
 messed up.  One second later, the counters will be ok again until the
 next corruption occurs.

 The workaround is to disable the periodic statistics DMA.  Instead,
 we manually trigger the DMA once a second in bnx2_timer().  This
 manual trigger of the DMA avoids the problem.

 As a consequence, we can only allow 0 or 1 second settings for
 ethtool -C statistics block.

 Thanks to Jean-Daniel Pauget [EMAIL PROTECTED] and
 CaT [EMAIL PROTECTED] for reporting this rare problem.

 Signed-off-by: Michael Chan [EMAIL PROTECTED]

Applied.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [5/5][BNX2]: Update version and reldate.

2007-06-04 Thread David Miller

From: Michael Chan [EMAIL PROTECTED]
Date: Mon, 04 Jun 2007 17:01:18 -0700

 [BNX2]: Update version and reldate.

 Update to version 1.5.11.

 Signed-off-by: Michael Chan [EMAIL PROTECTED]

Also applied.

I'd like to note in passing since I saw it in the context of this diff
that I find the RUN_AT() macro confusing.

It's name implies that the argument given to it is absolute.  Instead,
the argument is relative and it's return value is absolute.

Maybe it's just me :-)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

92 matches

Mail list logo