FC5 iptables-restore failure

2007-02-15 Thread Andrew Morton

I've recently been noticing nasty messages come out of FC5:

sony:/home/akpm# service iptables stop
Flushing firewall rules:   [  OK  ]
Setting chains to policy ACCEPT: filter[  OK  ]
Unloading iptables modules:[  OK  ]
sony:/home/akpm# service iptables start
Applying iptables firewall rules: iptables-restore: line 20 failed
   [FAILED]

Dunno when it started happening, but it's in mainline now.

It's a pretty stupid error message.  line 20 of what?

sony:/home/akpm# rpm -q iptables
iptables-1.3.5-1.2
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: FC5 iptables-restore failure

2007-02-15 Thread Dave Jones
On Thu, Feb 15, 2007 at 02:45:07AM -0800, Andrew Morton wrote:
  
 > I've recently been noticing nasty messages come out of FC5:
 > 
 > sony:/home/akpm# service iptables stop
 > Flushing firewall rules:   [  OK  ]
 > Setting chains to policy ACCEPT: filter[  OK  ]
 > Unloading iptables modules:[  OK  ]
 > sony:/home/akpm# service iptables start
 > Applying iptables firewall rules: iptables-restore: line 20 failed
 >    [FAILED]
 > 
 > Dunno when it started happening, but it's in mainline now.
 > 
 > It's a pretty stupid error message.  line 20 of what?

2.6.18 -> 2.6.19 changes a bunch of netfilter config option names.
Sure you weren't bitten by that ?

Dave

-- 
http://www.codemonkey.org.uk


Re: FC5 iptables-restore failure

2007-02-15 Thread Andrew Morton
On Thu, 15 Feb 2007 06:20:22 -0500 Dave Jones [EMAIL PROTECTED] wrote:

> On Thu, Feb 15, 2007 at 02:45:07AM -0800, Andrew Morton wrote:
>  > 
>  > I've recently been noticing nasty messages come out of FC5:
>  > 
>  > sony:/home/akpm# service iptables stop
>  > Flushing firewall rules:   [  OK  ]
>  > Setting chains to policy ACCEPT: filter[  OK  ]
>  > Unloading iptables modules:[  OK  ]
>  > sony:/home/akpm# service iptables start
>  > Applying iptables firewall rules: iptables-restore: line 20 failed
>  >    [FAILED]
>  > 
>  > Dunno when it started happening, but it's in mainline now.
>  > 
>  > It's a pretty stupid error message.  line 20 of what?
> 
> 2.6.18 -> 2.6.19 changes a bunch of netfilter config option names.
> Sure you weren't bitten by that ?

Yeah, going and madly turning 1000 things on seemed to make it happy.


Re: FC5 iptables-restore failure

2007-02-15 Thread David Hollis
On Thu, 2007-02-15 at 04:10 -0800, Andrew Morton wrote:
> On Thu, 15 Feb 2007 06:20:22 -0500 Dave Jones [EMAIL PROTECTED] wrote:
> 
> > On Thu, Feb 15, 2007 at 02:45:07AM -0800, Andrew Morton wrote:
> >  > 
> >  > I've recently been noticing nasty messages come out of FC5:
> >  > 
> >  > sony:/home/akpm# service iptables stop
> >  > Flushing firewall rules:   [  OK  ]
> >  > Setting chains to policy ACCEPT: filter[  OK  ]
> >  > Unloading iptables modules:[  OK  ]
> >  > sony:/home/akpm# service iptables start
> >  > Applying iptables firewall rules: iptables-restore: line 20 failed
> >  >    [FAILED]
> >  > 
> >  > Dunno when it started happening, but it's in mainline now.
> >  > 
> >  > It's a pretty stupid error message.  line 20 of what?
> > 
> > 2.6.18 -> 2.6.19 changes a bunch of netfilter config option names.
> > Sure you weren't bitten by that ?
> 
> Yeah, going and madly turning 1000 things on seemed to make it happy.

If you ran system-config-securitylevel to do that, that probably made it
regenerate the /etc/sysconfig/iptables file, which is the file that gets
fed to iptables-restore.
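
That also suggests an answer to "line 20 of what?": the failing rule can be
looked up by line number in the saved rules file. A minimal sketch, assuming
the number reported by iptables-restore counts every input line (comments and
blank lines included); the sample dump is made up for illustration and is not
from this thread.

```python
def line_of(dump_text, lineno):
    """Return the 1-based line `lineno` of an iptables-save style dump."""
    lines = dump_text.splitlines()
    if not 1 <= lineno <= len(lines):
        raise IndexError(f"dump has {len(lines)} lines, no line {lineno}")
    return lines[lineno - 1]


# Hypothetical dump, for illustration only.
SAMPLE_DUMP = """\
# Firewall configuration (illustrative)
*filter
:INPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
COMMIT
"""

if __name__ == "__main__":
    # On a real system one would read /etc/sysconfig/iptables instead.
    print(line_of(SAMPLE_DUMP, 4))
```

A rule that names a match or target the running kernel was built without would
be a plausible candidate for the line that fails.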

-- 
David Hollis [EMAIL PROTECTED]



[BUG?/SCHED] netem and tbf seem to busy wait with some clock sources

2007-02-15 Thread Lucas Nussbaum
Hi,

While experimenting with netem and tbf, I ran into some strange results.

Experimental setup:
tc qdisc add dev eth2 root netem delay 10ms
Linux 2.6.20-rc6 ; HZ=250

I measured the latency using a modified ping implementation, to allow
for high frequency measurement (one measure every 0.1ms). I compared the
results using different clock sources.

See http://www-id.imag.fr/~nussbaum/sched/clk-latency.png

The results with CLK_JIFFIES are the ones I expected: one clearly sees
the influence of HZ, with latency varying around 10ms +/- (1/2)*(1000/HZ).
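
As a quick check of that expectation (a sketch of the arithmetic only, using
the numbers from this message): one tick lasts 1000/HZ ms, so at HZ=250 the
latency should stay within 10 ms plus or minus 2 ms.

```python
def latency_window_ms(delay_ms, hz):
    """Expected latency range when the delay is quantized to timer ticks."""
    tick_ms = 1000.0 / hz          # duration of one jiffy in milliseconds
    half_tick = tick_ms / 2.0      # the +/- (1/2)*(1000/HZ) term
    return (delay_ms - half_tick, delay_ms + half_tick)


if __name__ == "__main__":
    print(latency_window_ms(10, 250))  # (8.0, 12.0)
```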

On the other hand, the results with CLK_GETTIMEOFDAY or CLK_CPU don't
seem to be bound to the HZ setting. Looking at the source, I suspect
that netem is actually sort of busy-waiting, by re-setting the timer to
the old value.

I instrumented netem_dequeue to confirm this (see [1]):
PSCHED_US2JIFFIE(delay) returns 0, causing the timer to be rescheduled
at the same jiffy. I could see netem_dequeue being called up to 150
times during a single jiffy (at HZ=250).

So, my question is: Is this the expected behaviour ? Wouldn't it be
better to set the timer to jiffies + 1 if the delay is 0 ? Or to send
the packet immediately if the delay is 0, instead of waiting ?
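
The rounding problem can be sketched in a few lines. This is a user-space
model, not the kernel code: us_to_jiffies() is a simplified stand-in that
assumes the conversion truncates toward zero (the real PSCHED_US2JIFFIE
definition is not quoted here), and next_expiry_clamped() models the
"jiffies + 1" idea raised above.

```python
HZ = 250
US_PER_JIFFY = 1_000_000 // HZ  # 4000 microseconds per jiffy at HZ=250


def us_to_jiffies(delay_us):
    """Assumed behaviour: integer conversion that truncates toward zero."""
    return delay_us // US_PER_JIFFY


def next_expiry(now_jiffies, delay_us):
    """Timer target as described: can equal `now` when the delay is short."""
    return now_jiffies + us_to_jiffies(delay_us)


def next_expiry_clamped(now_jiffies, delay_us):
    """Variant that always moves the timer at least one jiffy ahead."""
    return now_jiffies + max(1, us_to_jiffies(delay_us))


if __name__ == "__main__":
    # 1.5 ms left to wait: the unclamped timer fires again in the same jiffy.
    print(next_expiry(100, 1500), next_expiry_clamped(100, 1500))
```

With the clamp, the remaining wait is rounded up to a full tick, which trades
the busy loop for coarser timing.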

I got similar results with tbf (frames being spaced too well, instead
of bursting).

[1] http://www-id.imag.fr/~nussbaum/sched/netem_dequeue.diff

Thank you,
-- 
| Lucas Nussbaum              PhD student |
| [EMAIL PROTECTED]           LIG / Projet MESCAL |
| jabber: [EMAIL PROTECTED]   +33 (0)6 64 71 41 65 |
| homepage: http://www-id.imag.fr/~nussbaum/ |




[PATCH v2, resend] gianfar: don't duplicate gfar_error()

2007-02-15 Thread Sergei Shtylyov
It was hardly necessary to repeat most of the code from gfar_error() in
gfar_interrupt(), especially having some inconsistencies between the two.
So, make the gfar_interrupt() just call gfar_error(), and not acknowledge
the interrupts itself as gfar_{receive/transmit/error}() do it anyway.
While at it, also clarify/cleanup debug messages in gfar_error()...

Signed-off-by: Sergei Shtylyov [EMAIL PROTECTED]

---
The patch survived netperf stressing on MPC8540ADS realtime kernel. :-)

Sorry, forgot to remove the obsolete regs argument from the gfar_error()
call, so the previous version wasn't even compilable -- I've tested the patch
in the older kernel. Resending now with better placed comments which you won't
have to edit out... :-

 drivers/net/gianfar.c |   85 --
 1 files changed, 15 insertions(+), 70 deletions(-)

Index: linux-2.6/drivers/net/gianfar.c
===
--- linux-2.6.orig/drivers/net/gianfar.c
+++ linux-2.6/drivers/net/gianfar.c
@@ -10,6 +10,7 @@
  * Maintainer: Kumar Gala
  *
  * Copyright (c) 2002-2006 Freescale Semiconductor, Inc.
+ * Copyright (c) 2007 MontaVista Software, Inc.
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the
@@ -1613,71 +1614,17 @@ static irqreturn_t gfar_interrupt(int ir
 	/* Save ievent for future reference */
 	u32 events = gfar_read(&priv->regs->ievent);
 
-	/* Clear IEVENT */
-	gfar_write(&priv->regs->ievent, events);
-
 	/* Check for reception */
-	if ((events & IEVENT_RXF0) || (events & IEVENT_RXB0))
+	if (events & IEVENT_RX_MASK)
 		gfar_receive(irq, dev_id);
 
 	/* Check for transmit completion */
-	if ((events & IEVENT_TXF) || (events & IEVENT_TXB))
+	if (events & IEVENT_TX_MASK)
 		gfar_transmit(irq, dev_id);
 
-	/* Update error statistics */
-	if (events & IEVENT_TXE) {
-		priv->stats.tx_errors++;
-
-		if (events & IEVENT_LC)
-			priv->stats.tx_window_errors++;
-		if (events & IEVENT_CRL)
-			priv->stats.tx_aborted_errors++;
-		if (events & IEVENT_XFUN) {
-			if (netif_msg_tx_err(priv))
-				printk(KERN_WARNING "%s: tx underrun. dropped packet\n", dev->name);
-			priv->stats.tx_dropped++;
-			priv->extra_stats.tx_underrun++;
-
-			/* Reactivate the Tx Queues */
-			gfar_write(&priv->regs->tstat, TSTAT_CLEAR_THALT);
-		}
-	}
-	if (events & IEVENT_BSY) {
-		priv->stats.rx_errors++;
-		priv->extra_stats.rx_bsy++;
-
-		gfar_receive(irq, dev_id);
-
-#ifndef CONFIG_GFAR_NAPI
-		/* Clear the halt bit in RSTAT */
-		gfar_write(&priv->regs->rstat, RSTAT_CLEAR_RHALT);
-#endif
-
-		if (netif_msg_rx_err(priv))
-			printk(KERN_DEBUG "%s: busy error (rhalt: %x)\n",
-					dev->name,
-					gfar_read(&priv->regs->rstat));
-	}
-	if (events & IEVENT_BABR) {
-		priv->stats.rx_errors++;
-		priv->extra_stats.rx_babr++;
-
-		if (netif_msg_rx_err(priv))
-			printk(KERN_DEBUG "%s: babbling error\n", dev->name);
-	}
-	if (events & IEVENT_EBERR) {
-		priv->extra_stats.eberr++;
-		if (netif_msg_rx_err(priv))
-			printk(KERN_DEBUG "%s: EBERR\n", dev->name);
-	}
-	if ((events & IEVENT_RXC) && (netif_msg_rx_err(priv)))
-			printk(KERN_DEBUG "%s: control frame\n", dev->name);
-
-	if (events & IEVENT_BABT) {
-		priv->extra_stats.tx_babt++;
-		if (netif_msg_rx_err(priv))
-			printk(KERN_DEBUG "%s: babt error\n", dev->name);
-	}
+	/* Check for errors */
+	if (events & IEVENT_ERR_MASK)
+		gfar_error(irq, dev_id);
 
 	return IRQ_HANDLED;
 }
@@ -1939,7 +1886,7 @@ static irqreturn_t gfar_error(int irq, v
 	/* Hmm... */
 	if (netif_msg_rx_err(priv) || netif_msg_tx_err(priv))
 		printk(KERN_DEBUG "%s: error interrupt (ievent=0x%08x imask=0x%08x)\n",
-				dev->name, events, gfar_read(&priv->regs->imask));
+		       dev->name, events, gfar_read(&priv->regs->imask));
 
 	/* Update the error counters */
 	if (events & IEVENT_TXE) {
@@ -1951,8 +1898,8 @@ static irqreturn_t gfar_error(int irq, v
 			priv->stats.tx_aborted_errors++;
 		if (events & IEVENT_XFUN) {
 			if (netif_msg_tx_err(priv))
-				printk(KERN_DEBUG "%s: underrun.  packet dropped.\n",
-  

Re: [PATCH] sk98lin: planned removal

2007-02-15 Thread Jeff Garzik

Andrew Morton wrote:
> On Wed, 7 Feb 2007 09:18:30 -0800 Stephen Hemminger [EMAIL PROTECTED] wrote:
>
>> Document planned removal of sk98lin driver.
>>
>> Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
>> ---
>>  Documentation/feature-removal-schedule.txt |7 +++
>>  1 files changed, 7 insertions(+), 0 deletions(-)
>>
>> diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
>> index 0ba6af0..d08a4af 100644
>> --- a/Documentation/feature-removal-schedule.txt
>> +++ b/Documentation/feature-removal-schedule.txt
>> @@ -325,3 +325,10 @@ Why:	Unmaintained for years, superceded
>>  Who:	Jeff Garzik [EMAIL PROTECTED]
>>
>>  ---
>> +
>> +What:   sk98lin network driver
>> +When:   July 2007
>> +Why:    In kernel tree version of driver is unmaintained. Sk98lin driver
>> +	replaced by the skge driver.
>> +Who:    Stephen Hemminger [EMAIL PROTECTED]
>> +
>
> People don't read that file.  I'd suggest the addition of a warning printk
> to the driver's open() method.

Fine with me.  Stephen, wanna cook up a patch?

Jeff





Re: [PATCH 2.6.21-rc1] ehea: dynamic add / remove port

2007-02-15 Thread John Rose
> Second, the probe and remove functions do not communicate whether an add
> or remove was successful.  Combine this with the lack of port
> information in the adapter sysfs directory, and the userspace tool has
> no way of verifying a dynamic add/remove.

One way to communicate a return code is by making the sysfs interface
file read/write, and having the read callback give a return code.  For
an example of this, you can look at drivers/pci/rpadlpar_sysfs.c and
rpadlpar_core.c.

It would still be nice to have some way from the adapter sysfs directory
to list/examine logical ports.  Symlinks would work, but it would be
even nicer to have a new kobject per port with attribute files for
logical id, state, etc.



sky2 bonding problem, 802.3ad

2007-02-15 Thread Holger Eitzenberger
Hi Steven,

I have problems using sky2 v1.10 with the bonding driver (802.3ad),
on 'Marvell 88E8053 PCI-E Gigabit Ethernet Controller'.  I have attached
the full lspci output.

My test was to set up a bond of two physical links (both links same
hardware) and ping 192.168.11.10, which is the address of the switch
itself.

I have tested v1.10 with kernel 2.6.19 and 2.6.16.36 (own backport),
which despite the bonding problem runs fine.  Both kernel 2.6.19 and
2.6.16.36 show the same behaviour.  The 802.3ad aware switch is a Dell
PowerConnect 5324.  VLAN is not configured on all switch ports.  Another
test on a host running kernel 2.6.18.2 with two e1000's bonded runs
fine.  Using sk98lin (v8.41 & v10.0.4) worked also.

The host containing the Yukon-II has a total of 8 NICs, 4 PCI and 4
PCI-E, see attached lspci output.  The failed bond was created from two
PCI-E interfaces.

Find attached a short script which I use to set up the bond on both hosts,
also attached is a procfile (/proc/net/bonding/bond0) from
the e1000 host with a working bond as well as the procfile from the host
with the Yukon-II cards.

When looking at the working bond I see that both slave interfaces are
IFF_UP, and the load is shared over both links.  When looking at the failing
sky2 bond I see that the bond is not IFF_UP, whereas both slave
interfaces are IFF_UP.  The 802.3ad partner MAC address is left all zeros,
and both interfaces have different Aggregator IDs (1 & 2).  One of the
two failing interfaces always has IFF_NOARP set, caused by code
calling bond_main.c:bond_set_slave_inactive_flags().

I used both use_carrier=1 (default) as well as miimon=50 without luck.

Going through the bonding code, and comparing sky2 source to the e100
code, which I am quite familiar with, I see that sky2 does not use the
generic MII interface, which might point in the right direction.

I am currently going through the bonding code, trying to understand the
master -> slave -> sky2 interaction; basically this happens either through
calls to the sky2 net_device ops or through the ethtool ops.

If you need further info or further testing from my side, I will gladly
do that.

Besides that, thanks for a great driver!

   /holger



bonding-prob.tgz
Description: bonding-prob.tgz


Re: [BUG] RTNL and flush_scheduled_work deadlocks

2007-02-15 Thread Ben Greear

Francois Romieu wrote:
> Ben Greear [EMAIL PROTECTED] :
> [...]
>> I seem to be able to trigger this within about 1 minute on a
>> particular 2.6.18.2 system with some 8139too devices, so if someone
>> has a patch that could be tested, I'll gladly test it.  For
>> whatever reason, I haven't hit this problem on 2.6.20 yet, but
>> that could easily be dumb luck, and I haven't been running .20
>> very much.
>
> Bandaid below. It is not complete if your device hits the tx_watchdog
> hard but it should help.

So far, I've been running several hours without problems, so
this does appear to be helping.

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com



Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Stephen Hemminger

> I used both use_carrier=1 (default) as well as miimon=50 without luck.

use_carrier should work (since device reports carrier transitions).

-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Jay Vosburgh
Holger Eitzenberger [EMAIL PROTECTED] wrote:

>I have tested v1.10 with kernel 2.6.19 and 2.6.16.36 (own backport),
>which despite the bonding problem runs fine.  Both kernel 2.6.19 and
>2.6.16.36 show the same behaviour.  The 802.3ad aware switch is a Dell
>PowerConnect 5324.  VLAN is not configured on all switch ports.  Another
>test on a host running kernel 2.6.18.2 with two e1000's bonded runs
>fine.  Using sk98lin (v8.41 & v10.0.4) worked also.

The log you included (with debug turned on) indicates that
bonding is at least attempting to send LACPDUs, but there are no log
entries for having received any LACPDUs.

I'm unfamiliar with your particular switch, but usually this
kind of problem with bonding 802.3ad is in the switch interaction.  The
switches I have (Cisco) require that 802.3ad mode be explicitly enabled
on whichever ports it is desired on, so it may be worthwhile to check
your switch and make sure that it really is configured for 802.3ad on
the sky2 ports.

If the switch is configured, you may want to also check to see
if it has counters for LACPDUs sent and received.  If the switch is not
sending and receiving LACPDUs on the appropriate ports, then it's more
likely to be a communications problem somewhere (vs. an 802.3ad
negotiation problem).

>When looking at the working bond I see that both slave interfaces are
>IFF_UP, and the load is shared over both links.  When looking at the failing
>sky2 bond I see that the bond is not IFF_UP, whereas both slave
>interfaces are IFF_UP.  The 802.3ad partner MAC address is left all zeros,
>and both interfaces have different Aggregator IDs (1 & 2).  One of the
>two failing interfaces always has IFF_NOARP set, caused by code
>calling bond_main.c:bond_set_slave_inactive_flags().

For the version of bonding in your dmesg log, the IFF_NOARP is
expected; 802.3ad will select one aggregator as the active one, the
other aggregators will be marked inactive, and that sets IFF_NOARP.
Since no LACPDUs have been exchanged, bonding is leaving each interface
as a separate aggregator.  Versions of bonding later than February 2006
(your proc-bond0-ok for example) don't set the IFF_NOARP on inactive
slaves (a new mechanism is used that doesn't mess with the flags).
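
The symptom described here (every slave left in its own aggregator, partner
MAC all zeros) can be spotted mechanically in the bond's procfile. A rough
sketch: the field labels ("Slave Interface", "Aggregator ID", "Partner Mac
Address") are my assumption of what the 802.3ad section of
/proc/net/bonding/bond0 prints, and the sample text is invented rather than
taken from the attached files.

```python
import re


def slave_aggregators(procfile_text):
    """Map each slave interface to the aggregator ID listed under it."""
    result = {}
    current = None
    for line in procfile_text.splitlines():
        m = re.match(r"\s*Slave Interface:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"\s*Aggregator ID:\s*(\d+)", line)
        # IDs seen before the first slave belong to the bond summary; skip them.
        if m and current is not None:
            result[current] = int(m.group(1))
    return result


def looks_unnegotiated(procfile_text):
    """True if slaves sit in different aggregators and no partner MAC is known."""
    ids = set(slave_aggregators(procfile_text).values())
    partner_zero = re.search(
        r"Partner Mac Address:\s*00:00:00:00:00:00", procfile_text) is not None
    return len(ids) > 1 and partner_zero


# Hypothetical procfile excerpt, for illustration only.
SAMPLE_FAILING = """\
802.3ad info
Active Aggregator Info:
        Aggregator ID: 1
        Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth4
MII Status: up
Aggregator ID: 1

Slave Interface: eth5
MII Status: up
Aggregator ID: 2
"""

if __name__ == "__main__":
    print(slave_aggregators(SAMPLE_FAILING), looks_unnegotiated(SAMPLE_FAILING))
```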

-J

---
-Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]


Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Holger Eitzenberger
Stephen Hemminger [EMAIL PROTECTED] writes:

>> I used both use_carrier=1 (default) as well as miimon=50 without luck.
>
> use_carrier should work (since device reports carrier transitions).

As you can see in the script I used both use_carrier and miimon (in
combinations) without success.  In fact use_carrier is the default if no
other options are set.

  /holger


Re: [Bugme-new] [Bug 7974] New: BUG: scheduling while atomic: swapper/0x10000100/0

2007-02-15 Thread Andy Gospodarek
On Tue, Feb 13, 2007 at 06:11:15PM -0800, Jay Vosburgh wrote:
> Andy Gospodarek [EMAIL PROTECTED] wrote:
> 
> >This is exactly the problem I've got, Jay.  I'd love to come up with
> >something that will be a smaller patch to solve this in the near term
> >and then focus on a larger set of changes down the road but it doesn't
> >seem likely.
> 
> 	That's more or less the conclusion I've reached as well.  The
> locking model needs to be redone from scratch; there are too many
> players and the conditions have changed since things were originally
> done (in terms of side effects, particularly the places that may sleep
> today that didn't in the past).  It's tricky to redo the bulk of the
> innards in small modular steps.
> 

That might be the case, but I feel like the code is in a place where it
*could* work tomorrow with just some small tweaks (getting everything
into process context).  I'm reluctant to cram a whole bunch of changes
down someone's throat immediately (with a big patch) because it will be
difficult to learn incrementally what is being done well and what isn't
based on user feedback.

> >> 	Andy, one thought: do you think it would work better to simplify
> >> the locking that is there first, i.e., convert the timers to work
> >> queues, have a single dispatcher that handles everything (and can be
> >> suspended for mutexing purposes), as in the patch I sent you?  The
> >> problem isn't just rtnl; there also has to be a release of the bonding
> >> locks themselves (to handle the might sleep issues), and that's tricky
> >> to do with so many entities operating concurrently.  Reducing the number
> >> of involved parties should make the problem simpler.
> >> 
> >
> >I really don't feel like there are that many operations happening
> >concurrently, but having a workqueue that managed and dispatched the
> >operations and detected current link status would probably be helpful
> >for long term maintenance.  It would probably be wise to have individual
> >workqueues that managed any mode-specific operations, so their processing
> >doesn't interfere with any link-checking operations.
> 
> 	I like having the mode-specific monitors simply be special case
> monitors in a unified monitor system. It resolves the link-check
> vs. mode-specific conflict, and allows all of the periodic things to be
> paused as a set for mutexing purposes.  I'm happy to be convinced that
> I'm just blowing hooey here, but that's how it seems to me.
> 

I went back and looked at my patch from a few months ago and I actually
use a single-threaded workqueue for each bond device.  I did not use
different queues to replace each timer -- just different types of work
that is placed on the queue.  So it sounds like we agree -- now we just
need to pick a decent starting point?
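
As a user-space analogy only (this is not the bonding patch, which is not
shown in this thread): a single worker draining one queue serializes the
different kinds of work, so a link check and a mode-specific task never run
concurrently. All names below are invented for illustration.

```python
import queue
import threading


class BondWorker:
    """One worker thread per (hypothetical) bond; items run strictly in order."""

    def __init__(self):
        self._q = queue.Queue()
        self.log = []  # (kind, result) pairs in execution order
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, kind, func):
        """Queue a unit of work tagged with its type (link check, alb, ...)."""
        self._q.put((kind, func))

    def _run(self):
        while True:
            item = self._q.get()
            try:
                if item is None:  # sentinel: shut the worker down
                    return
                kind, func = item
                self.log.append((kind, func()))
            finally:
                self._q.task_done()

    def flush(self):
        """Block until everything queued so far has run."""
        self._q.join()

    def stop(self):
        self._q.put(None)
        self._thread.join()


if __name__ == "__main__":
    worker = BondWorker()
    worker.submit("link_check", lambda: "eth0 up")
    worker.submit("alb_monitor", lambda: "rebalanced")
    worker.flush()
    print(worker.log)
    worker.stop()
```

Because only one thread ever touches the work items, pausing "all of the
periodic things as a set" reduces to not feeding the queue.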

> 	For balance-alb (the biggest offender in terms of locking
> violations), the link monitor, alb mode monitor, transmit activity, and
> user initiated stuff (add or remove slave, for example) all need to be
> mutexed against one another.  The user initiated stuff comes in with
> rtnl and the link monitor needs rtnl if it has to fail over.  All of
> them need the regular bonding lock, but the user initiated stuff and the
> link monitor both need to do things (change mac addresses, call
> dev_open) with that lock released.
> 
> 	Do you have any work in progress patches or descendents of the
> big rework thing I sent you?  I need to get back to this, and whatever
> we do it's probably better if we're at least a little bit coordinated.
> 

I've updated my original patch with some changes that take the rtnetlink
lock deeper into the code without adding any conditional locking.  I'm
going to work with it more today and let you know if I think it is
better or worse than what I had before.




sk98lin: 2 Pair Downshift detected

2007-02-15 Thread Tony Chung
Hi,

I got the following message:

Class:  Hardware failure
Nr:  0x270
Msg:  2 Pair Downshift detected

It is from the sk98lin driver, and my research indicates that it may be
caused by a bad Ethernet cable.  The gigabit port now runs at 100Mbps.

My questions are:
1. What does 0x270 mean? Is there any link or reference for it?
2. How do you recover from it (i.e. negotiate back to 1000Mbps)?
Reboot? Reload the driver?  Can the driver do it automatically?


Thanks.
- Tony


Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Holger Eitzenberger
Jay Vosburgh [EMAIL PROTECTED] writes:

> 	I'm unfamiliar with your particular switch, but usually this
> kind of problem with bonding 802.3ad is in the switch interaction.  The
> switches I have (Cisco) require that 802.3ad mode be explicitly enabled
> on whichever ports it is desired on, so it may be worthwhile to check
> your switch and make sure that it really is configured for 802.3ad on
> the sky2 ports.

I am currently using port 1&2 and port 9&10 for bonding and have
configured all four ports for the same aggregator ID 1, LACP enabled.
I also switched ports, that is, I changed host1 from using port 1&2 to
use port 9&10 and vice versa.  Note that I also used sk98lin, which
worked in my setup.  Do you still think it is a misconfigured
switch?

> 	If the switch is configured, you may want to also check to see
> if it has counters for LACPDUs sent and received.  If the switch is not
> sending and receiving LACPDUs on the appropriate ports, then it's more
> likely to be a communications problem somewhere (vs. an 802.3ad
> negotiation problem).

I will check tomorrow morning whether I see the LACPDUs in the log and
report.

Any more tests which may be helpful?

Thanks.  /holger


Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Andy Gospodarek
On Thu, Feb 15, 2007 at 07:55:42PM +0100, Holger Eitzenberger wrote:
> Hi Steven,
> 
> I have problems using sky2 v1.10 with the bonding driver (802.3ad),
> on 'Marvell 88E8053 PCI-E Gigabit Ethernet Controller'.  I have attached
> the full lspci output.
> 
> My test was to set up a bond of two physical links (both links same
> hardware) and ping 192.168.11.10, which is the address of the switch
> itself.
> 
> I have tested v1.10 with kernel 2.6.19 and 2.6.16.36 (own backport),
> which despite the bonding problem runs fine.  Both kernel 2.6.19 and
> 2.6.16.36 show the same behaviour.  The 802.3ad aware switch is a Dell
> PowerConnect 5324.  VLAN is not configured on all switch ports.  Another
> test on a host running kernel 2.6.18.2 with two e1000's bonded runs
> fine.  Using sk98lin (v8.41 & v10.0.4) worked also.


I get the impression that sky2 has never worked for you.  Is that
correct?  There was an skge problem I noticed a while ago where on reset
the multicast membership list was cleared.  

commit 758140900a82e3ed3bb2be1d4705dd352fe44825
Author: Stephen Hemminger [EMAIL PROTECTED]
Date:   Fri Dec 1 11:41:08 2006 -0800

    [PATCH] skge: don't clear MC state on link down

    I would rather fix Andy's problem by not clearing
    multicast information on link down.

    Also, add code to restore multicast state after ethtool phy reset.

    Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
    Signed-off-by: Jeff Garzik [EMAIL PROTECTED]



Having this list cleared could stop you from receiving 802.3ad
PDUs.  I'll check skge and see if it has the same problem (I'm betting
on it).







Re: sk98lin: 2 Pair Downshift detected

2007-02-15 Thread Stephen Hemminger
On Thu, 15 Feb 2007 12:49:26 -0800
Tony Chung [EMAIL PROTECTED] wrote:

> Hi,
> 
> I got the following message:
> 
> Class:  Hardware failure
> Nr:  0x270
> Msg:  2 Pair Downshift detected
> 
> It is from the sk98lin driver, and my research indicates that it may be
> caused by a bad Ethernet cable.  The gigabit port now runs at 100Mbps.
> 
> My questions are:
> 1. What does 0x270 mean? Is there any link or reference for it?

It is an error code used inside the driver. It looks like a bad job of
internationalization: every message is encoded as a number, then printed out.


> 2. How do you recover from it (i.e. negotiate back to 1000Mbps)?
> Reboot? Reload the driver?  Can the driver do it automatically?

Use 'ethtool -r eth0' to force PHY renegotiation or bring device
down then back up.

 
 Thanks.
 - Tony


-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Neil Horman
On Thu, Feb 15, 2007 at 10:09:40PM +0100, Holger Eitzenberger wrote:
> Jay Vosburgh [EMAIL PROTECTED] writes:
> 
> > 	I'm unfamiliar with your particular switch, but usually this
> > kind of problem with bonding 802.3ad is in the switch interaction.  The
> > switches I have (Cisco) require that 802.3ad mode be explicitly enabled
> > on whichever ports it is desired on, so it may be worthwhile to check
> > your switch and make sure that it really is configured for 802.3ad on
> > the sky2 ports.
> 
> I am currently using port 1&2 and port 9&10 for bonding and have
> configured all four ports for the same aggregator ID 1, LACP enabled.
> I also switched ports, that is, I changed host1 from using port 1&2 to
> use port 9&10 and vice versa.  Note that I also used sk98lin, which
> worked in my setup.  Do you still think it is a misconfigured
> switch?
> 
> > 	If the switch is configured, you may want to also check to see
> > if it has counters for LACPDUs sent and received.  If the switch is not
> > sending and receiving LACPDUs on the appropriate ports, then it's more
> > likely to be a communications problem somewhere (vs. an 802.3ad
> > negotiation problem).
> 
> I will check tomorrow morning whether I see the LACPDUs in the log and
> report.
> 
> Any more tests which may be helpful?
 
If I had to guess, I'd say that sky2 isn't setting its multicast list properly,
or the bonding driver isn't telling it to.  IIRC LACPDUs are received on a
reserved multicast MAC address, which the hardware needs to be told to receive.
You should be able to tell whether the bonding driver is receiving those frames
by looking at the sky2 rx_multicast stat with ethtool: if the value isn't going
up, then you aren't getting LACPDU frames.  The hardware should have the LACPDU
multicast address added during the enslaving process (via bond enslave).  I'd
start instrumenting that part of the driver, as well as sky2.c's
set_multicast_list method, to see if anything is going awry.
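
For reference, a small sketch of what "an LACPDU frame" looks like on the
wire, which may help when instrumenting the receive path or reading a packet
capture. It assumes an untagged Ethernet II frame; the constants are the IEEE
802.3 Slow Protocols values (destination 01:80:C2:00:00:02, EtherType 0x8809,
subtype 1 for LACP), and the function name is my own.

```python
SLOW_PROTOCOLS_MAC = bytes.fromhex("0180c2000002")  # reserved multicast address
SLOW_PROTOCOLS_ETHERTYPE = 0x8809
LACP_SUBTYPE = 0x01


def is_lacpdu(frame: bytes) -> bool:
    """Check destination MAC, EtherType and subtype of an untagged frame."""
    if len(frame) < 15:  # 6 dst + 6 src + 2 type + 1 subtype
        return False
    dst = frame[0:6]
    ethertype = int.from_bytes(frame[12:14], "big")
    subtype = frame[14]
    return (dst == SLOW_PROTOCOLS_MAC
            and ethertype == SLOW_PROTOCOLS_ETHERTYPE
            and subtype == LACP_SUBTYPE)


if __name__ == "__main__":
    src = bytes.fromhex("001122334455")  # made-up source address
    frame = SLOW_PROTOCOLS_MAC + src + (0x8809).to_bytes(2, "big") + b"\x01"
    print(is_lacpdu(frame))
```

If the NIC's multicast filter does not include that destination address,
frames like this would be dropped before the bonding driver ever sees them.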

Regards
Neil

 Thanks.  /holger


Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Andy Gospodarek
On Thu, Feb 15, 2007 at 04:31:36PM -0500, Andy Gospodarek wrote:
 On Thu, Feb 15, 2007 at 07:55:42PM +0100, Holger Eitzenberger wrote:
  Hi Steven,
  
  I have problems using sky2 v1.10 with with bonding driver (802.3ad),
  on 'Marvell 88E8053 PCI-E Gigabit Ethernet Controller'.  I have attached
  the full lspci output.
  
  My test was to setup a bond of two physical links (both links same
  hardware) and ping 192.168.11.10, which is the address of the switch
  itself.
  
  I have tested v1.10 with kernel 2.6.19 and 2.6.16.36 (own backport),
  which despite the bonding problem runs fine.  Both, kernel 2.6.19 and
  2.6.16.36 show the same behaviour.  The 802.3ad aware switch is a Dell
  PowerConnect 5324.  VLAN is not configured on all switch ports.  Another
  test on a host running kernel 2.6.18.2 with two e1000's bonded runs
  fine.  Using sk98lin (v8.41  v10.0.4) worked also.
 
 
 I get the impression that sky2 has never worked for you.  Is that
 correct?  There was an skge problem I noticed a while ago where on reset
 the multicast membership list was cleared.  
 
 commit 758140900a82e3ed3bb2be1d4705dd352fe44825
 Author: Stephen Hemminger [EMAIL PROTECTED]
 Date:   Fri Dec 1 11:41:08 2006 -0800
 
 [PATCH] skge: don't clear MC state on link down
 
 I would rather fix Andy's problem by not clearing
 multicast information on link down.
 
 Also, add code to restore multicast state after ethtool phy reset.
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
 Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
 
 
 
 Having this list cleared could stop you from receiving 802.3ad
 PDUs.  I'll check skge and see if it has the same problem (I'm betting
 on it).
 
 

After a quick peek this doesn't look like it's the issue.  The skge
problem was apparent because when you pulled the link the multicast
memberships disappeared


Re: sky2 bonding problem, 802.3ad

2007-02-15 Thread Holger Eitzenberger
Andy Gospodarek [EMAIL PROTECTED] writes:

 I get the impression that sky2 has never worked for you.  Is that
 correct?  There was an skge problem I noticed a while ago where on reset
 the multicast membership list was cleared.  

Well, when it comes to bonding: yes, almost :).  When I noticed the fact
that IFF_NOARP was set on the other interface I experimented a bit with
these flags, ala

  ip l set dev eth0 arp on|off

and 2-3 times I had a link.

  /holger


[PATCH 2/4] sis190: RTNL and flush_scheduled_work deadlock

2007-02-15 Thread Francois Romieu
Signed-off-by: Francois Romieu [EMAIL PROTECTED]
---
 drivers/net/sis190.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sis190.c b/drivers/net/sis190.c
index 45d91b1..b08508b 100644
--- a/drivers/net/sis190.c
+++ b/drivers/net/sis190.c
@@ -909,6 +909,9 @@ static void sis190_phy_task(struct work_struct *work)
 
rtnl_lock();
 
+   if (!netif_running(dev))
+   goto out_unlock;
+
val = mdio_read(ioaddr, phy_id, MII_BMCR);
if (val & BMCR_RESET) {
// FIXME: needlessly high ?  -- FR 02/07/2005
@@ -981,6 +984,7 @@ static void sis190_phy_task(struct work_struct *work)
netif_carrier_on(dev);
}
 
+out_unlock:
rtnl_unlock();
 }
 
@@ -1102,8 +1106,6 @@ static void sis190_down(struct net_device *dev)
 
netif_stop_queue(dev);
 
-   flush_scheduled_work();
-
do {
spin_lock_irq(&tp->lock);
 
@@ -1857,6 +1859,7 @@ static void __devexit sis190_remove_one(struct pci_dev *pdev)
struct net_device *dev = pci_get_drvdata(pdev);
 
sis190_mii_remove(dev);
+   flush_scheduled_work();
unregister_netdev(dev);
sis190_release_board(pdev);
pci_set_drvdata(pdev, NULL);
-- 
1.4.4.4



[PATCH 1/4] r8169: RTNL and flush_scheduled_work deadlock

2007-02-15 Thread Francois Romieu
flush_scheduled_work() in net_device->close has a slight tendency
to deadlock with tasks on the workqueue that hold RTNL.

rtl8169_close/down simply need the recovery tasks to not meddle
with the hardware while the device is going down.

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
---
 drivers/net/r8169.c |   25 ++---
 1 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 5598d86..13cf06e 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1733,6 +1733,8 @@ rtl8169_remove_one(struct pci_dev *pdev)
assert(dev != NULL);
assert(tp != NULL);
 
+   flush_scheduled_work();
+
unregister_netdev(dev);
rtl8169_release_board(pdev, dev, tp->mmio_addr);
pci_set_drvdata(pdev, NULL);
@@ -2161,10 +2163,13 @@ static void rtl8169_reinit_task(struct work_struct *work)
struct net_device *dev = tp->dev;
int ret;
 
-   if (netif_running(dev)) {
-   rtl8169_wait_for_quiescence(dev);
-   rtl8169_close(dev);
-   }
+   rtnl_lock();
+
+   if (!netif_running(dev))
+   goto out_unlock;
+
+   rtl8169_wait_for_quiescence(dev);
+   rtl8169_close(dev);
 
ret = rtl8169_open(dev);
if (unlikely(ret < 0)) {
@@ -2179,6 +2184,9 @@ static void rtl8169_reinit_task(struct work_struct *work)
}
rtl8169_schedule_work(dev, rtl8169_reinit_task);
}
+
+out_unlock:
+   rtnl_unlock();
 }
 
 static void rtl8169_reset_task(struct work_struct *work)
@@ -2187,8 +2195,10 @@ static void rtl8169_reset_task(struct work_struct *work)
container_of(work, struct rtl8169_private, task.work);
struct net_device *dev = tp->dev;
 
+   rtnl_lock();
+
if (!netif_running(dev))
-   return;
+   goto out_unlock;
 
rtl8169_wait_for_quiescence(dev);
 
@@ -2210,6 +2220,9 @@ static void rtl8169_reset_task(struct work_struct *work)
}
rtl8169_schedule_work(dev, rtl8169_reset_task);
}
+
+out_unlock:
+   rtnl_unlock();
 }
 
 static void rtl8169_tx_timeout(struct net_device *dev)
@@ -2722,8 +2735,6 @@ static void rtl8169_down(struct net_device *dev)
 
netif_stop_queue(dev);
 
-   flush_scheduled_work();
-
 core_down:
spin_lock_irq(&tp->lock);
 
-- 
1.4.4.4
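
Editorial sketch, not part of the patch: the handler pattern applied above
(take RTNL, bail out if the device is no longer running, release RTNL on
every path) can be modelled in plain userspace C. Everything below is a
hypothetical stand-in. The toy_ names, the counter that plays the role of
RTNL and the hw_accesses field are assumptions for illustration, not r8169
code.

```c
#include <stdbool.h>

/* Toy stand-ins for kernel objects; every name here is hypothetical. */
struct toy_dev {
	int  rtnl_depth;   /* 1 while the toy "RTNL" is held, else 0 */
	bool running;      /* models netif_running(dev) */
	int  hw_accesses;  /* counts how often the task touched "hardware" */
};

static void toy_rtnl_lock(struct toy_dev *d)   { d->rtnl_depth++; }
static void toy_rtnl_unlock(struct toy_dev *d) { d->rtnl_depth--; }

/* Recovery task shaped like the patched rtl8169_reset_task(). */
static void toy_reset_task(struct toy_dev *d)
{
	toy_rtnl_lock(d);

	if (!d->running)
		goto out_unlock;   /* device is going down: leave hardware alone */

	d->hw_accesses++;          /* stands in for the real reset sequence */

out_unlock:
	toy_rtnl_unlock(d);        /* every exit path drops the lock */
}
```

With this shape the close path has no need to flush the workqueue: a task
that runs late sees running == false and returns without touching the
hardware.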



[PATCH 4/4] s2io: RTNL and flush_scheduled_work deadlock

2007-02-15 Thread Francois Romieu
Mantra: don't use flush_scheduled_work with RTNL held.

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
---
 drivers/net/s2io.c |   21 ++---
 1 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index e8e0d94..fd85648 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -3758,7 +3758,6 @@ static int s2io_close(struct net_device *dev)
 {
struct s2io_nic *sp = dev->priv;
 
-   flush_scheduled_work();
netif_stop_queue(dev);
/* Reset card, kill tasklet and free Tx and Rx buffers. */
s2io_card_down(sp);
@@ -5847,9 +5846,14 @@ static void s2io_set_link(struct work_struct *work)
register u64 val64;
u16 subid;
 
+   rtnl_lock();
+
+   if (!netif_running(dev))
+   goto out_unlock;
+
if (test_and_set_bit(0, &(nic->link_state))) {
/* The card is being reset, no point doing anything */
-   return;
+   goto out_unlock;
}
 
subid = nic->pdev->subsystem_device;
@@ -5903,6 +5907,9 @@ static void s2io_set_link(struct work_struct *work)
s2io_link(nic, LINK_DOWN);
}
clear_bit(0, &(nic->link_state));
+
+out_unlock:
+   rtnl_unlock();
 }
 
 static int set_rxd_buffer_pointer(struct s2io_nic *sp, struct RxD_t *rxdp,
@@ -6356,6 +6363,11 @@ static void s2io_restart_nic(struct work_struct *work)
struct s2io_nic *sp = container_of(work, struct s2io_nic, rst_timer_task);
struct net_device *dev = sp->dev;
 
+   rtnl_lock();
+
+   if (!netif_running(dev))
+   goto out_unlock;
+
s2io_card_down(sp);
if (s2io_card_up(sp)) {
DBG_PRINT(ERR_DBG, "%s: Device bring up failed\n",
@@ -6364,7 +6376,8 @@ static void s2io_restart_nic(struct work_struct *work)
netif_wake_queue(dev);
DBG_PRINT(ERR_DBG, "%s: was reset by Tx watchdog timer\n",
  dev->name);
-
+out_unlock:
+   rtnl_unlock();
 }
 
 /**
@@ -7173,6 +7186,8 @@ static void __devexit s2io_rem_nic(struct pci_dev *pdev)
return;
}
 
+   flush_scheduled_work();
+
sp = dev->priv;
unregister_netdev(dev);
 
-- 
1.4.4.4



[PATCH 3/4] 8139too: RTNL and flush_scheduled_work deadlock

2007-02-15 Thread Francois Romieu
Your usual dont-flush_scheduled_work-with-RTNL-held stuff.

It is a bit different here since the thread runs permanently
or is only occasionally kicked for recovery depending on the
hardware revision.

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
---
 drivers/net/8139too.c |   40 +---
 1 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/drivers/net/8139too.c b/drivers/net/8139too.c
index 35ad5cf..99304b2 100644
--- a/drivers/net/8139too.c
+++ b/drivers/net/8139too.c
@@ -1109,6 +1109,8 @@ static void __devexit rtl8139_remove_one (struct pci_dev *pdev)
 
assert (dev != NULL);
 
+   flush_scheduled_work();
+
unregister_netdev (dev);
 
__rtl8139_cleanup_dev (dev);
@@ -1603,18 +1605,21 @@ static void rtl8139_thread (struct work_struct *work)
struct net_device *dev = tp->mii.dev;
unsigned long thr_delay = next_tick;
 
+   rtnl_lock();
+
+   if (!netif_running(dev))
+   goto out_unlock;
+
if (tp->watchdog_fired) {
tp->watchdog_fired = 0;
rtl8139_tx_timeout_task(work);
-   } else if (rtnl_trylock()) {
-   rtl8139_thread_iter (dev, tp, tp->mmio_addr);
-   rtnl_unlock ();
-   } else {
-   /* unlikely race.  mitigate with fast poll. */
-   thr_delay = HZ / 2;
-   }
+   } else
+   rtl8139_thread_iter(dev, tp, tp->mmio_addr);
 
-   schedule_delayed_work(&tp->thread, thr_delay);
+   if (tp->have_thread)
+   schedule_delayed_work(&tp->thread, thr_delay);
+out_unlock:
+   rtnl_unlock ();
 }
 
 static void rtl8139_start_thread(struct rtl8139_private *tp)
@@ -1626,19 +1631,11 @@ static void rtl8139_start_thread(struct rtl8139_private *tp)
return;
 
tp->have_thread = 1;
+   tp->watchdog_fired = 0;
 
schedule_delayed_work(&tp->thread, next_tick);
 }
 
-static void rtl8139_stop_thread(struct rtl8139_private *tp)
-{
-   if (tp->have_thread) {
-   cancel_rearming_delayed_work(&tp->thread);
-   tp->have_thread = 0;
-   } else
-   flush_scheduled_work();
-}
-
 static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
 {
tp->cur_tx = 0;
@@ -1696,12 +1693,11 @@ static void rtl8139_tx_timeout (struct net_device *dev)
 {
struct rtl8139_private *tp = netdev_priv(dev);
 
+   tp->watchdog_fired = 1;
if (!tp->have_thread) {
-   INIT_DELAYED_WORK(&tp->thread, rtl8139_tx_timeout_task);
+   INIT_DELAYED_WORK(&tp->thread, rtl8139_thread);
schedule_delayed_work(&tp->thread, next_tick);
-   } else
-   tp->watchdog_fired = 1;
-
+   }
 }
 
 static int rtl8139_start_xmit (struct sk_buff *skb, struct net_device *dev)
@@ -2233,8 +2229,6 @@ static int rtl8139_close (struct net_device *dev)
 
netif_stop_queue (dev);
 
-   rtl8139_stop_thread(tp);
-
if (netif_msg_ifdown(tp))
printk(KERN_DEBUG "%s: Shutting down ethercard, status was 0x%4.4x.\n",
dev->name, RTL_R16 (IntrStatus));
-- 
1.4.4.4



Strange connection slowdown on pcnet32

2007-02-15 Thread Lennart Sorensen
I have encountered a strange behaviour with the pcnet32.

I am transferring data from a server to a client, routing it through my
router.  The router has 2 ethernet ports, both of which are amd 972
chips (pcnet32).  The transfer has so far been either http or ftp (both
see the same problem).  I transfer lots of data, and after a while (I
have seen anywhere from 200 to 700MB or so) the speed suddenly drops to
less than 1KB/s.  If I ping from the router to the server, the ping
requests go out normally (seen by tcpdump on the server) every second,
but on the router the replies are not seen by the kernel for multiple
seconds.  Sometimes I will see 3 ping replies together, sometimes 5 or
even 10.  The turn around times will show 10500, 9500, 8500, ..., 500ms
for the packets received in a batch.  ifconfig on the router shows the
packet receive counts showing up in lumps, just as ping does, and
tcpdump on the interface on the router.

Doing ifconfig down and up on the port connecting to the server makes
the problem clear and it can handle another pile of data before the
problem reappears.

The CPU on the router is not fast enough to ensure there won't ever be
dropped packets at 100Mbps.  When I force the port to the server to
10Mbps I have no problems at all.

Replacing the port to the server with an rtl8139 doesn't show any
problems at 100Mbps, although the transfer rate drops from 6500KBps to
4000KBps compared to using the pcnet32.

Kernel used so far is 2.6.16 and 2.6.18.

I have a tulip card I intend to try with as well just to see if it
affects anything other than the pcnet32.

Does anyone have any hints as to what part of the code to look at for
changes made by doing ifconfig eth1 down; ifconfig eth1 up?  Any ideas
as to what could make the reception of packets suddenly get very very
slow?

On one pass where I was running tcpdump on the router, I saw a wrap of
the sequence number right before the problem occurred, but that has not
been the case every time as far as I can tell, so I am not sure if that
is related to the problem at all.

--
Len Sorensen


Re: [Bugme-new] [Bug 7974] New: BUG: scheduling while atomic: swapper/0x10000100/0

2007-02-15 Thread Jay Vosburgh
Andy Gospodarek [EMAIL PROTECTED] wrote:

[...]
That might be the case, but I feel like the code is in a place where it
*could* work tomorrow with just some small tweaks (getting everything
into process context).  I'm reluctant to cram a whole bunch of changes
down someone's throat immediately (with a big patch) because it will be
difficult to learn incrementally what is being done well and what isn't
based on user feedback.

Fair enough.

  I like having the mode-specific monitors simply be special case
 monitors in a unified monitor system. It resolves the link-check
 vs. mode-specific conflict, and allows all of the periodic things to be
 paused as a set for mutexing purposes.  I'm happy to be convinced that
 I'm just blowing hooey here, but that's how it seems to me.
 

I went back and looked at my patch from a few months ago and I actually
use a single-threaded workqueue for each bond device.  I did not use
different queues to replace each timer -- just different types of work
that is placed on the queue.  So it sounds like we agree -- now we just
need to pick a decent starting point?

For the short term, yes, I don't have any disagreement with
switching the timer based stuff over to workqueues.  Basically a one for
one replacement to get the functions in a process context and tweak the
locking.

I do think we're having a little confusion over details of
terminology; if I'm not mistaken, you're thinking that workqueue means
single threaded: even though each individual monitor thingie is a
separate piece of work, they still can't collide.

That's true, but (unless I've missed a call somewhere) there
isn't a wq_pause_for_a_bit type of call (that, e.g., waits for
anything running to stop, then doesn't run any further work until we
later tell it to), so suspending all of the periodic things running for
the bond is more hassle than if there's just one schedulable work thing,
which internally calls the right functions to do the various things.
This is also single threaded, but easier to stop and start.  It seems to
be simpler to have multiple link monitors running in such a system as
well (without having them thrashing the link state as would happen now).
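
Editorial sketch of the "one schedulable work thing" idea described above,
in plain userspace C: a single tick function calls every registered
monitor in turn, and one flag suspends the whole set. All names
(toy_monitor_set, monitor_add, monitor_tick, count_monitor) are
hypothetical and are not bonding driver interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_MONITORS 4

/* All names here are hypothetical; this is not the bonding driver's API. */
typedef void (*monitor_fn)(void *ctx);

struct toy_monitor_set {
	monitor_fn fns[MAX_MONITORS];
	void      *ctx[MAX_MONITORS];
	size_t     count;
	bool       paused;   /* one flag suspends every periodic monitor */
};

static bool monitor_add(struct toy_monitor_set *s, monitor_fn fn, void *ctx)
{
	if (s->count >= MAX_MONITORS)
		return false;
	s->fns[s->count] = fn;
	s->ctx[s->count] = ctx;
	s->count++;
	return true;
}

/* The single schedulable work item: runs each monitor in turn. */
static void monitor_tick(struct toy_monitor_set *s)
{
	if (s->paused)
		return;
	for (size_t i = 0; i < s->count; i++)
		s->fns[i](s->ctx[i]);
}

/* Example monitor: bumps a counter so callers can see that it ran. */
static void count_monitor(void *ctx)
{
	(*(int *)ctx)++;
}
```

Because the monitors only ever run from monitor_tick(), they cannot collide
with each other, and pausing them needs no per-monitor bookkeeping.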

-J

---
-Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]


[PATCH] Use random32() in net/ipv4/multipath

2007-02-15 Thread Joe Perches
Einar Lueck's email addresses bounce
[EMAIL PROTECTED][EMAIL PROTECTED]

Removed local random number generator function

Signed-off-by: Joe Perches [EMAIL PROTECTED]

diff --git a/net/ipv4/multipath_random.c b/net/ipv4/multipath_random.c
index b8c289f..6448e6c 100644
--- a/net/ipv4/multipath_random.c
+++ b/net/ipv4/multipath_random.c
@@ -33,6 +33,7 @@
 #include <linux/module.h>
 #include <linux/mroute.h>
 #include <linux/init.h>
+#include <linux/random.h>
 #include <net/ip.h>
 #include <net/protocol.h>
 #include <linux/skbuff.h>
@@ -49,21 +50,6 @@
 
 #define MULTIPATH_MAX_CANDIDATES 40
 
-/* interface to random number generation */
-static unsigned int RANDOM_SEED = 93186752;
-
-static inline unsigned int random(unsigned int ubound)
-{
-   static unsigned int a = 1588635695,
-   q = 2,
-   r = 1117695901;
-
-   RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q);
-
-   return RANDOM_SEED % ubound;
-}
-
-
 static void random_select_route(const struct flowi *flp,
struct rtable *first,
struct rtable **rp)
@@ -85,7 +71,7 @@ static void random_select_route(const struct flowi *flp,
if (candidate_count > 1) {
unsigned char i = 0;
unsigned char candidate_no = (unsigned char)
-   random(candidate_count);
+   (random32() % candidate_count);
 
/* find chosen candidate and adjust GC data for all candidates
 * to ensure they stay in cache
diff --git a/net/ipv4/multipath_wrandom.c b/net/ipv4/multipath_wrandom.c
index 92b0482..d6115a3 100644
--- a/net/ipv4/multipath_wrandom.c
+++ b/net/ipv4/multipath_wrandom.c
@@ -33,6 +33,7 @@
 #include <linux/module.h>
 #include <linux/mroute.h>
 #include <linux/init.h>
+#include <linux/random.h>
 #include <net/ip.h>
 #include <net/protocol.h>
 #include <linux/skbuff.h>
@@ -85,18 +86,6 @@ struct multipath_route {
 /* state: primarily weight per route information */
 static struct multipath_bucket state[MULTIPATH_STATE_SIZE];
 
-/* interface to random number generation */
-static unsigned int RANDOM_SEED = 93186752;
-
-static inline unsigned int random(unsigned int ubound)
-{
-   static unsigned int a = 1588635695,
-   q = 2,
-   r = 1117695901;
-   RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q);
-   return RANDOM_SEED % ubound;
-}
-
 static unsigned char __multipath_lookup_weight(const struct flowi *fl,
   const struct rtable *rt)
 {
@@ -194,7 +183,7 @@ static void wrandom_select_route(const struct flowi *flp,
 
/* choose a weighted random candidate */
decision = first;
-   selector = random(power);
+   selector = random32() % power;
last_power = 0;
 
/* select candidate, adjust GC data and cleanup local state */
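
Editorial sketch, not part of the patch: the replacement idiom is simply
"take a 32-bit random value and reduce it modulo the bound". A userspace
approximation follows. toy_random32() is a hypothetical stand-in for the
kernel's random32(), built on rand() only so that the example is
self-contained, and pick_index() is likewise an invented name.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's random32(). */
static uint32_t toy_random32(void)
{
	/* rand() guarantees at least 15 random bits per call; mix three calls. */
	return ((uint32_t)rand() << 30) ^ ((uint32_t)rand() << 15)
	       ^ (uint32_t)rand();
}

/* Map a 32-bit random value onto [0, count), as the patched callers do. */
static unsigned int pick_index(unsigned int count)
{
	return toy_random32() % count;   /* caller must ensure count != 0 */
}
```

Reducing with % carries a slight bias whenever the bound does not divide
2^32; for a handful of route candidates that should be negligible.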




Fw: [Bug 8013] New: select for write hangs on a socket after write returned ECONNRESET

2007-02-15 Thread Stephen Hemminger
Someone want to take a stab at fixing this??

Begin forwarded message:

Date: Wed, 14 Feb 2007 19:32:52 -0800
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bug 8013] New: select for write hangs on a socket after write
returned ECONNRESET


http://bugzilla.kernel.org/show_bug.cgi?id=8013

   Summary: select for write hangs on a socket after write
returned ECONNRESET
Kernel Version: 2.6.16
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Distribution: Debian
Also reproduced on: 2.4 based Redhat.

Hardware Environment:
i686/Xeon
Problem Description:

If you write() to a disconnected socket, write returns ECONNRESET.
If you then select() on that socket, checking for write, the select
never returns.

For example from strace:
write(4, fred, 4) = 4
...
write(4, fred, 4) = -1 ECONNRESET (Connection
reset by peer)
select(5, NULL, [4], NULL, NULL ... hung in select

The select documentation says those in writefds will be watched to see
if a write will not block.
A write on this socket will not block, therefore select should return 
immediately.

When the program is run on Solaris, AIX and HPUX, the select returns 
immediately.

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[PATCH 0/6] sky2: v1.13 patches

2007-02-15 Thread Stephen Hemminger
This set of patches fixes all the problems observed so far on
my machines. The biggest one was not doing transmit flow control
correctly.

--
Stephen Hemminger [EMAIL PROTECTED]



[PATCH 1/6] sky2: dont flush good pause frames

2007-02-15 Thread Stephen Hemminger
Don't mark pause frames as errors. This problem caused the transmitter not
to pause and would effectively take out a gigabit switch because
it can't handle overrun.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- sky2-dev.orig/drivers/net/sky2.h2007-02-13 15:08:30.0 -0800
+++ sky2-dev/drivers/net/sky2.h 2007-02-13 15:12:52.0 -0800
@@ -1589,7 +1589,7 @@
 
GMR_FS_ANY_ERR  = GMR_FS_RX_FF_OV | GMR_FS_CRC_ERR |
  GMR_FS_FRAGMENT | GMR_FS_LONG_ERR |
- GMR_FS_MII_ERR | GMR_FS_GOOD_FC | GMR_FS_BAD_FC |
+ GMR_FS_MII_ERR | GMR_FS_BAD_FC |
  GMR_FS_UN_SIZE | GMR_FS_JABBER,
 };
 

--
Stephen Hemminger [EMAIL PROTECTED]



[PATCH 3/6] sky2: flow control negotiation for Yukon-FE

2007-02-15 Thread Stephen Hemminger
The Yukon-FE chip doesn't do gigabit and has a different PHY internally.
On this chip, the PHY status register doesn't properly reflect the result
of flow control negotiation. To work around the problem and avoid
so much chip-dependent code, compute the result of flow control
by looking at the local and remote advertised bits.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2-dev.orig/drivers/net/sky2.c2007-02-14 10:01:41.0 -0800
+++ sky2-dev/drivers/net/sky2.c 2007-02-14 13:32:00.0 -0800
@@ -1766,10 +1766,10 @@
 {
struct sky2_hw *hw = sky2->hw;
unsigned port = sky2->port;
-   u16 lpa;
+   u16 advert, lpa;
 
+   advert = gm_phy_read(hw, port, PHY_MARV_AUNE_ADV);
lpa = gm_phy_read(hw, port, PHY_MARV_AUNE_LP);
-
if (lpa & PHY_M_AN_RF) {
printk(KERN_ERR PFX "%s: remote fault", sky2->netdev->name);
return -1;
@@ -1784,20 +1784,40 @@
sky2->speed = sky2_phy_speed(hw, aux);
sky2->duplex = (aux & PHY_M_PS_FULL_DUP) ? DUPLEX_FULL : DUPLEX_HALF;
 
-   /* Pause bits are offset (9..8) */
-   if (hw->chip_id == CHIP_ID_YUKON_XL
-   || hw->chip_id == CHIP_ID_YUKON_EC_U
-   || hw->chip_id == CHIP_ID_YUKON_EX)
-   aux >>= 6;
-
-   sky2->flow_status = sky2_flow(aux & PHY_M_PS_RX_P_EN,
- aux & PHY_M_PS_TX_P_EN);
+   /* Since the pause result bits seem to in different positions on
+* different chips. look at registers.
+*/
+   if (!sky2_is_copper(hw)) {
+   /* Shift for bits in fiber PHY */
+   advert &= ~(ADVERTISE_PAUSE_CAP|ADVERTISE_PAUSE_ASYM);
+   lpa &= ~(LPA_PAUSE_CAP|LPA_PAUSE_ASYM);
+
+   if (advert & ADVERTISE_1000XPAUSE)
+   advert |= ADVERTISE_PAUSE_CAP;
+   if (advert & ADVERTISE_1000XPSE_ASYM)
+   advert |= ADVERTISE_PAUSE_ASYM;
+   if (lpa & LPA_1000XPAUSE)
+   lpa |= LPA_PAUSE_CAP;
+   if (lpa & LPA_1000XPAUSE_ASYM)
+   lpa |= LPA_PAUSE_ASYM;
+   }
+
+   sky2->flow_status = FC_NONE;
+   if (advert & ADVERTISE_PAUSE_CAP) {
+   if (lpa & LPA_PAUSE_CAP)
+   sky2->flow_status = FC_BOTH;
+   else if (advert & ADVERTISE_PAUSE_ASYM)
+   sky2->flow_status = FC_RX;
+   } else if (advert & ADVERTISE_PAUSE_ASYM) {
+   if ((lpa & LPA_PAUSE_CAP) && (lpa & LPA_PAUSE_ASYM))
+   sky2->flow_status = FC_TX;
+   }
 
if (sky2->duplex == DUPLEX_HALF && sky2->speed < SPEED_1000
 && !(hw->chip_id == CHIP_ID_YUKON_EC_U || hw->chip_id == CHIP_ID_YUKON_EX))
sky2->flow_status = FC_NONE;
 
-   if (aux & PHY_M_PS_RX_P_EN)
+   if (sky2->flow_status & FC_TX)
sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON);
else
sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);

--
Stephen Hemminger [EMAIL PROTECTED]
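
Editorial sketch, not part of the patch: the resolution step added in the
hunk above can be pulled out into a standalone function and tried in
userspace. The pause bit values are the ones from linux/mii.h. The
flow_control enum values and the function name resolve_flow are
assumptions made for this illustration. The decision logic mirrors the
patch as posted and has not been checked against the 802.3 pause
resolution table.

```c
#include <stdint.h>

/* Pause bits as defined in linux/mii.h (advertisement / link partner). */
#define ADVERTISE_PAUSE_CAP   0x0400
#define ADVERTISE_PAUSE_ASYM  0x0800
#define LPA_PAUSE_CAP         0x0400
#define LPA_PAUSE_ASYM        0x0800

/* Assumed to resemble the driver's enum; the values are a guess. */
enum flow_control { FC_NONE = 0, FC_TX = 1, FC_RX = 2, FC_BOTH = 3 };

/* Hypothetical helper: same decisions as the hunk in the patch. */
static enum flow_control resolve_flow(uint16_t advert, uint16_t lpa)
{
	enum flow_control fc = FC_NONE;

	if (advert & ADVERTISE_PAUSE_CAP) {
		if (lpa & LPA_PAUSE_CAP)
			fc = FC_BOTH;
		else if (advert & ADVERTISE_PAUSE_ASYM)
			fc = FC_RX;
	} else if (advert & ADVERTISE_PAUSE_ASYM) {
		if ((lpa & LPA_PAUSE_CAP) && (lpa & LPA_PAUSE_ASYM))
			fc = FC_TX;
	}
	return fc;
}
```

Working from the advertised bits on both sides avoids depending on where a
particular chip places its pause result bits in the PHY status register.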



[PATCH 2/6] sky2: no need to reset pause bits on shutdown

2007-02-15 Thread Stephen Hemminger
Resetting the pause bits on shutdown is not necessary.
The code was inherited from the vendor driver, and it is currently #ifdef'd
out there as well.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2-dev.orig/drivers/net/sky2.c2007-02-13 15:08:31.0 -0800
+++ sky2-dev/drivers/net/sky2.c 2007-02-13 15:13:03.0 -0800
@@ -1742,13 +1742,6 @@
reg &= ~(GM_GPCR_RX_ENA | GM_GPCR_TX_ENA);
gma_write16(hw, port, GM_GP_CTRL, reg);
 
-   if (sky2->flow_status == FC_RX) {
-   /* restore Asymmetric Pause bit */
-   gm_phy_write(hw, port, PHY_MARV_AUNE_ADV,
-gm_phy_read(hw, port, PHY_MARV_AUNE_ADV)
-| PHY_M_AN_ASP);
-   }
-
netif_carrier_off(sky2->netdev);
netif_stop_queue(sky2->netdev);
 

--
Stephen Hemminger [EMAIL PROTECTED]



[PATCH 6/6] sky2: v1.13

2007-02-15 Thread Stephen Hemminger
New version.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2-dev.orig/drivers/net/sky2.c2007-02-15 15:00:38.0 -0800
+++ sky2-dev/drivers/net/sky2.c 2007-02-15 15:00:56.0 -0800
@@ -49,7 +49,7 @@
 #include "sky2.h"
 
 #define DRV_NAME       "sky2"
-#define DRV_VERSION    "1.12"
+#define DRV_VERSION    "1.13"
 #define PFX            DRV_NAME " "
 
 /*

--
Stephen Hemminger [EMAIL PROTECTED]



[PATCH 4/6] sky2: transmit timeout

2007-02-15 Thread Stephen Hemminger
The transmit timeout code could hang, and it would not clear out
problems if the hardware was stuck.  Change the code to effectively do
a device down/up similar to the suspend/resume code.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2-dev.orig/drivers/net/sky2.c2007-02-15 11:44:39.0 -0800
+++ sky2-dev/drivers/net/sky2.c 2007-02-15 12:47:05.0 -0800
@@ -1866,16 +1866,13 @@
spin_unlock(&sky2->phy_lock);
 }
 
-
 /* Transmit timeout is only called if we are running, carrier is up
  * and tx queue is full (stopped).
- * Called with netif_tx_lock held.
  */
 static void sky2_tx_timeout(struct net_device *dev)
 {
struct sky2_port *sky2 = netdev_priv(dev);
struct sky2_hw *hw = sky2->hw;
-   u32 imask;
 
if (netif_msg_timer(sky2))
printk(KERN_ERR PFX "%s: tx timeout\n", dev->name);
@@ -1885,19 +1882,8 @@
   sky2_read16(hw, sky2->port == 0 ? STAT_TXA1_RIDX : STAT_TXA2_RIDX),
   sky2_read16(hw, Q_ADDR(txqaddr[sky2->port], Q_DONE)));
 
-   imask = sky2_read32(hw, B0_IMSK);   /* block IRQ in hw */
-   sky2_write32(hw, B0_IMSK, 0);
-   sky2_read32(hw, B0_IMSK);
-
-   netif_poll_disable(hw->dev[0]); /* stop NAPI poll */
-   synchronize_irq(hw->pdev->irq);
-
-   netif_start_queue(dev); /* don't wakeup during flush */
-   sky2_tx_complete(sky2, sky2->tx_prod);  /* Flush transmit queue */
-
-   sky2_write32(hw, B0_IMSK, imask);
-
-   sky2_phy_reinit(sky2);  /* this clears flow control etc */
+   /* can't restart safely under softirq */
+   schedule_work(&hw->restart_work);
 }
 
 static int sky2_change_mtu(struct net_device *dev, int new_mtu)
@@ -2651,6 +2637,49 @@
sky2_write8(hw, STAT_ISR_TIMER_CTRL, TIM_START);
 }
 
+static void sky2_restart(struct work_struct *work)
+{
+   struct sky2_hw *hw = container_of(work, struct sky2_hw, restart_work);
+   struct net_device *dev;
+   int i, err;
+
+   dev_dbg(&hw->pdev->dev, "restarting\n");
+
+   del_timer_sync(&hw->idle_timer);
+
+   rtnl_lock();
+   sky2_write32(hw, B0_IMSK, 0);
+   sky2_read32(hw, B0_IMSK);
+
+   netif_poll_disable(hw->dev[0]);
+
+   for (i = 0; i < hw->ports; i++) {
+   dev = hw->dev[i];
+   if (netif_running(dev))
+   sky2_down(dev);
+   }
+
+   sky2_reset(hw);
+   sky2_write32(hw, B0_IMSK, Y2_IS_BASE);
+   netif_poll_enable(hw->dev[0]);
+
+   for (i = 0; i < hw->ports; i++) {
+   dev = hw->dev[i];
+   if (netif_running(dev)) {
+   err = sky2_up(dev);
+   if (err) {
+   printk(KERN_INFO PFX "%s: could not restart %d\n",
+  dev->name, err);
+   dev_close(dev);
+   }
+   }
+   }
+
+   sky2_idle_start(hw);
+
+   rtnl_unlock();
+}
+
 static inline u8 sky2_wol_supported(const struct sky2_hw *hw)
 {
return sky2_is_copper(hw) ? (WAKE_PHY | WAKE_MAGIC) : 0;
@@ -3613,6 +3642,8 @@
}
 
setup_timer(&hw->idle_timer, sky2_idle, (unsigned long) hw);
+   INIT_WORK(&hw->restart_work, sky2_restart);
+
sky2_idle_start(hw);
 
pci_set_drvdata(pdev, hw);
@@ -3649,6 +3680,8 @@
 
del_timer_sync(&hw->idle_timer);
 
+   flush_scheduled_work();
+
sky2_write32(hw, B0_IMSK, 0);
synchronize_irq(hw->pdev->irq);
 
--- sky2-dev.orig/drivers/net/sky2.h2007-02-15 11:58:51.0 -0800
+++ sky2-dev/drivers/net/sky2.h 2007-02-15 11:59:07.0 -0800
@@ -1933,6 +1933,7 @@
dma_addr_t   st_dma;
 
struct timer_list    idle_timer;
+   struct work_struct   restart_work;
int  msi;
wait_queue_head_t    msi_wait;
 };

--
Stephen Hemminger [EMAIL PROTECTED]



[PATCH 5/6] sky2: receive error handling improvements

2007-02-15 Thread Stephen Hemminger
Don't drop an oversize frame; it might be a VLAN frame (untagged).
Use a different counter for FIFO overrun vs. FIFO error.
Print an error on FIFO overrun.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2-dev.orig/drivers/net/sky2.c2007-02-15 14:33:52.0 -0800
+++ sky2-dev/drivers/net/sky2.c 2007-02-15 15:00:38.0 -0800
@@ -2056,9 +2056,6 @@
if (!(status & GMR_FS_RX_OK))
goto resubmit;
 
-   if (length > dev->mtu + ETH_HLEN)
-   goto oversize;
-
if (length < copybreak)
skb = receive_copy(sky2, re, length);
else
@@ -2068,14 +2065,10 @@
 
return skb;
 
-oversize:
-   ++sky2->net_stats.rx_over_errors;
-   goto resubmit;
-
 error:
++sky2->net_stats.rx_errors;
if (status & GMR_FS_RX_FF_OV) {
-   sky2->net_stats.rx_fifo_errors++;
+   sky2->net_stats.rx_over_errors++;
goto resubmit;
}
 

--
Stephen Hemminger [EMAIL PROTECTED]
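
Editorial sketch, not part of the patch: a small calculation shows why the
removed length check could reject legitimate frames. It assumes that the
reported length still includes a 4-byte 802.1Q tag. The helper name
old_check_drops is hypothetical; ETH_HLEN and VLAN_HLEN carry their usual
values from linux/if_ether.h and linux/if_vlan.h.

```c
#define ETH_HLEN   14   /* Ethernet header length */
#define VLAN_HLEN   4   /* 802.1Q tag length */

/* The check removed by the patch: anything longer than MTU plus header. */
static int old_check_drops(unsigned int length, unsigned int mtu)
{
	return length > mtu + ETH_HLEN;
}
```

With an MTU of 1500 the old limit is 1514 bytes, so a full-size frame that
still carries a VLAN tag (1518 bytes) would have been counted as oversize
and dropped.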



Re: [PATCH] bcm43xx: Fix code for spec changes of 2/7/2007

2007-02-15 Thread Johannes Berg
On Wed, 2007-02-14 at 22:40 +0100, Michael Buesch wrote:
 On Wednesday 14 February 2007 14:18, Johannes Berg wrote:
  On Sat, 2007-02-10 at 06:55 +0100, Michael Buesch wrote:
  
   It's likely that old cards still work with v4 firmware,
  
  No, it's absolutely impossible. Rev 2/4 cores have a totally different
  instruction set in the microcode.
 
 Ok, I was not talking about _that_ old cards. ;)

Are there cards where they have new microcode instruction set but no v4
firmware?

johannes




Re: [PATCH] bcm43xx: Fix code for spec changes of 2/7/2007

2007-02-15 Thread Michael Buesch
On Thursday 15 February 2007 16:07, Johannes Berg wrote:
 On Wed, 2007-02-14 at 22:40 +0100, Michael Buesch wrote:
  On Wednesday 14 February 2007 14:18, Johannes Berg wrote:
   On Sat, 2007-02-10 at 06:55 +0100, Michael Buesch wrote:
   
It's likely that old cards still work with v4 firmware,
   
   No, it's absolutely impossible. Rev 2/4 cores have a totally different
   instruction set in the microcode.
  
  Ok, I was not talking about _that_ old cards. ;)
 
 Are there cards where they have new microcode instruction set but no v4
 firmware?

I don't know. I guessed so. Am I wrong? That would be good :)

-- 
Greetings Michael.


Re: [PATCH] bcm43xx: Fix code for spec changes of 2/7/2007

2007-02-15 Thread Johannes Berg
On Thu, 2007-02-15 at 16:13 +0100, Michael Buesch wrote:
 On Thursday 15 February 2007 16:07, Johannes Berg wrote:
  On Wed, 2007-02-14 at 22:40 +0100, Michael Buesch wrote:
   On Wednesday 14 February 2007 14:18, Johannes Berg wrote:
    On Sat, 2007-02-10 at 06:55 +0100, Michael Buesch wrote:
    
     It's likely that old cards still work with v4 firmware,
    
    No, it's absolutely impossible. Rev 2/4 cores have a totally different
    instruction set in the microcode.
   
   Ok, I was not talking about _that_ old cards. ;)
  
  Are there cards where they have new microcode instruction set but no v4
  firmware?
 
 I don't know. I guessed so. Am I wrong? That would be good :)

I wouldn't think so since we have rev5 v4 firmware and that should be
the oldest post-rev4 right?

johannes




Re: [PATCH] bcm43xx: Fix code for spec changes of 2/7/2007

2007-02-15 Thread Johannes Berg
On Thu, 2007-02-15 at 17:51 +0100, Martin Langer wrote:

 Yep. We have all kinds of firmware with the new instruction set. It's 
 only ucode2 (old instruction set) that's missing. But the later ucode4 
 which also uses the old instruction set is available in v4.
 OTOH, ucode13 isn't available in v3. We can't offer one firmware version 
 for all card revisions. Both are limited to a specific range of 
 revisions.
 
 v3    rev2 ... rev12
 v4    rev4 ... >= rev13

The upper limit doesn't really matter, afaict the effect of running a
rev9 instead of rev13 will just be a missing performance increase due to
bigger FIFOs not being used fully.

johannes




[PATCH] - drivers/net/hamradio remove local random function, use random32()

2007-02-15 Thread Joe Perches
remove local random function, use random32() instead

Signed-off-by: Joe Perches [EMAIL PROTECTED]

diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
index 153b6dc..84aa211 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -52,6 +52,7 @@
 #include <linux/hdlcdrv.h>
 #include <linux/baycom.h>
 #include <linux/jiffies.h>
+#include <linux/random.h>
 #include <net/ax25.h> 
 #include <asm/uaccess.h>
 
@@ -433,16 +434,6 @@ static void encode_hdlc(struct baycom_state *bc)
 
 /* -- */
 
-static unsigned short random_seed;
-
-static inline unsigned short random_num(void)
-{
-   random_seed = 28629 * random_seed + 157;
-   return random_seed;
-}
-
-/* -- */
-
 static int transmit(struct baycom_state *bc, int cnt, unsigned char stat)
 {
struct parport *pp = bc->pdev->port;
@@ -464,7 +455,7 @@ static int transmit(struct baycom_state *bc, int cnt, unsigned char stat)
if ((--bc->hdlctx.slotcnt) > 0)
return 0;
bc->hdlctx.slotcnt = bc->ch_params.slottime;
-   if ((random_num() % 256) > bc->ch_params.ppersist)
+   if ((random32() % 256) > bc->ch_params.ppersist)
return 0;
}
}
diff --git a/drivers/net/hamradio/hdlcdrv.c b/drivers/net/hamradio/hdlcdrv.c
index 452873e..f5a17ad 100644
--- a/drivers/net/hamradio/hdlcdrv.c
+++ b/drivers/net/hamradio/hdlcdrv.c
@@ -56,6 +56,7 @@
 #include <linux/if_arp.h>
 #include <linux/skbuff.h>
 #include <linux/hdlcdrv.h>
+#include <linux/random.h>
 #include <net/ax25.h> 
 #include <asm/uaccess.h>
 
@@ -371,16 +372,6 @@ static void start_tx(struct net_device *dev, struct hdlcdrv_state *s)
 
 /* -- */
 
-static unsigned short random_seed;
-
-static inline unsigned short random_num(void)
-{
-   random_seed = 28629 * random_seed + 157;
-   return random_seed;
-}
-
-/* -- */
-
 void hdlcdrv_arbitrate(struct net_device *dev, struct hdlcdrv_state *s)
 {
if (!s || s->magic != HDLCDRV_MAGIC || s->hdlctx.ptt || !s->skb) 
@@ -396,7 +387,7 @@ void hdlcdrv_arbitrate(struct net_device *dev, struct hdlcdrv_state *s)
if ((--s->hdlctx.slotcnt) > 0)
return;
s->hdlctx.slotcnt = s->ch_params.slottime;
-   if ((random_num() % 256) > s->ch_params.ppersist)
+   if ((random32() % 256) > s->ch_params.ppersist)
return;
start_tx(dev, s);
 }
diff --git a/drivers/net/hamradio/yam.c b/drivers/net/hamradio/yam.c
index 6d74f08..efc0bcd 100644
--- a/drivers/net/hamradio/yam.c
+++ b/drivers/net/hamradio/yam.c
@@ -50,6 +50,7 @@
 #include <linux/slab.h>
 #include <linux/errno.h>
 #include <linux/bitops.h>
+#include <linux/random.h>
 #include <asm/io.h>
 #include <asm/system.h>
 #include <linux/interrupt.h>
@@ -566,14 +567,6 @@ static void yam_start_tx(struct net_device *dev, struct yam_port *yp)
ptt_on(dev);
 }
 
-static unsigned short random_seed;
-
-static inline unsigned short random_num(void)
-{
-   random_seed = 28629 * random_seed + 157;
-   return random_seed;
-}
-
 static void yam_arbitrate(struct net_device *dev)
 {
struct yam_port *yp = netdev_priv(dev);
@@ -600,7 +593,7 @@ static void yam_arbitrate(struct net_device *dev)
yp->slotcnt = yp->slot / 10;
 
/* is random > persist ? */
-   if ((random_num() % 256) > yp->pers)
+   if ((random32() % 256) > yp->pers)
return;
 
yam_start_tx(dev, yp);
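
For readers following the patch, here is a rough userspace sketch (not part of the patch) of the two pieces involved: the 16-bit linear congruential generator that the three drivers each carried, and the p-persistence test it feeds. The names lcg_next() and ppersist_may_transmit() are made up for illustration; the kernel code calls random32() directly and open-codes the comparison.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Same recurrence as the removed random_num(): seed = 28629 * seed + 157,
 * truncated to 16 bits. The function name is hypothetical.
 */
static unsigned short lcg_next(unsigned short *seed)
{
    *seed = (unsigned short)(28629u * *seed + 157u);
    return *seed;
}

/*
 * p-persistence decision as in the arbitrate routines: the drivers back
 * off when (random % 256) > ppersist, so transmission is allowed when
 * the reduced value is less than or equal to ppersist.
 */
static bool ppersist_may_transmit(unsigned int rnd, unsigned int ppersist)
{
    assert(ppersist <= 255);
    return (rnd % 256) <= ppersist;
}
```

Only the low 8 bits of the random value matter for the decision, so any source with reasonably distributed low bits gives the same channel-access behaviour.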




Re: [PATCH v2, resend] gianfar: don't duplicate gfar_error()

2007-02-15 Thread Andy Fleming


On Feb 15, 2007, at 07:56, Sergei Shtylyov wrote:

It was hardly necessary to repeat most of the code from gfar_error() in
gfar_interrupt(), especially having some inconsistencies between the two.
So, make the gfar_interrupt() just call gfar_error(), and not acknowledge
the interrupts itself as gfar_{receive/transmit/error}() do it anyway.
While at it, also clarify/cleanup debug messages in gfar_error()...

Signed-off-by: Sergei Shtylyov [EMAIL PROTECTED]


Acked-by: Andy Fleming [EMAIL PROTECTED]



---
The patch survived netperf stressing on MPC8540ADS realtime kernel. :-)

Sorry, forgot to remove the obsolete regs argument from the gfar_error()
call, so the previous version wasn't even compilable -- I've tested the patch
in the older kernel. Resending now with better placed comments which you won't
have to edit out... :-)

 drivers/net/gianfar.c |   85 ++++--
 1 files changed, 15 insertions(+), 70 deletions(-)

Index: linux-2.6/drivers/net/gianfar.c
===
--- linux-2.6.orig/drivers/net/gianfar.c
+++ linux-2.6/drivers/net/gianfar.c
@@ -10,6 +10,7 @@
  * Maintainer: Kumar Gala
  *
  * Copyright (c) 2002-2006 Freescale Semiconductor, Inc.
+ * Copyright (c) 2007 MontaVista Software, Inc.
  *
  * This program is free software; you can redistribute  it and/or modify it
  * under  the terms of  the GNU General  Public License as published by the

@@ -1613,71 +1614,17 @@ static irqreturn_t gfar_interrupt(int ir
/* Save ievent for future reference */
u32 events = gfar_read(&priv->regs->ievent);

-   /* Clear IEVENT */
-   gfar_write(&priv->regs->ievent, events);
-
/* Check for reception */
-   if ((events & IEVENT_RXF0) || (events & IEVENT_RXB0))
+   if (events & IEVENT_RX_MASK)
gfar_receive(irq, dev_id);

/* Check for transmit completion */
-   if ((events & IEVENT_TXF) || (events & IEVENT_TXB))
+   if (events & IEVENT_TX_MASK)
gfar_transmit(irq, dev_id);

-   /* Update error statistics */
-   if (events & IEVENT_TXE) {
-   priv->stats.tx_errors++;
-
-   if (events & IEVENT_LC)
-   priv->stats.tx_window_errors++;
-   if (events & IEVENT_CRL)
-   priv->stats.tx_aborted_errors++;
-   if (events & IEVENT_XFUN) {
-   if (netif_msg_tx_err(priv))
-   printk(KERN_WARNING "%s: tx underrun. dropped packet\n", dev->name);
-   priv->stats.tx_dropped++;
-   priv->extra_stats.tx_underrun++;
-
-   /* Reactivate the Tx Queues */
-   gfar_write(&priv->regs->tstat, TSTAT_CLEAR_THALT);
-   }
-   }
-   if (events & IEVENT_BSY) {
-   priv->stats.rx_errors++;
-   priv->extra_stats.rx_bsy++;
-
-   gfar_receive(irq, dev_id);
-
-#ifndef CONFIG_GFAR_NAPI
-   /* Clear the halt bit in RSTAT */
-   gfar_write(&priv->regs->rstat, RSTAT_CLEAR_RHALT);
-#endif
-
-   if (netif_msg_rx_err(priv))
-   printk(KERN_DEBUG "%s: busy error (rhalt: %x)\n",
-   dev->name,
-   gfar_read(&priv->regs->rstat));
-   }
-   if (events & IEVENT_BABR) {
-   priv->stats.rx_errors++;
-   priv->extra_stats.rx_babr++;
-
-   if (netif_msg_rx_err(priv))
-   printk(KERN_DEBUG "%s: babbling error\n", dev->name);
-   }
-   if (events & IEVENT_EBERR) {
-   priv->extra_stats.eberr++;
-   if (netif_msg_rx_err(priv))
-   printk(KERN_DEBUG "%s: EBERR\n", dev->name);
-   }
-   if ((events & IEVENT_RXC) && (netif_msg_rx_err(priv)))
-   printk(KERN_DEBUG "%s: control frame\n", dev->name);
-
-   if (events & IEVENT_BABT) {
-   priv->extra_stats.tx_babt++;
-   if (netif_msg_rx_err(priv))
-   printk(KERN_DEBUG "%s: babt error\n", dev->name);
-   }
+   /* Check for errors */
+   if (events & IEVENT_ERR_MASK)
+   gfar_error(irq, dev_id);

return IRQ_HANDLED;
 }
@@ -1939,7 +1886,7 @@ static irqreturn_t gfar_error(int irq, v
/* Hmm... */
if (netif_msg_rx_err(priv) || netif_msg_tx_err(priv))
 		printk(KERN_DEBUG "%s: error interrupt (ievent=0x%08x imask=0x%08x)\n",
-   dev->name, events, gfar_read(&priv->regs->imask));
+  dev->name, events, gfar_read(&priv->regs->imask));

/* Update the error counters */
if (events & IEVENT_TXE) {
@@ -1951,8 +1898,8 @@ static irqreturn_t gfar_error(int irq, v
priv->stats.tx_aborted_errors++;
if (events & IEVENT_XFUN) {
if (netif_msg_tx_err(priv))
- 

degradation in bridging performance of 5% in 2.6.20 when compared to 2.6.19

2007-02-15 Thread kalyan tejaswi

Hi all,
I have been comparing bridging performance for 2.6.20 and 2.6.19
kernels. The kernel configurations are identical for both the kernels.
I use D-Link cards (8139too driver) for the Malta 4Kc board.

The setup is:

netperf client --- Malta 4Kc --- netperf server

The throughput statistics (in 10^6 bits/second) are:

            2.6.19   2.6.20
routing     30.2     30.16
bridging    32.35    30.81

I observe that there has been a degradation in bridging performance of
5% in 2.6.20 when compared to 2.6.19.
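
For reference, the percentage can be recomputed from the table above. The helper below is only an illustrative sketch (percent_drop() is a made-up name); with the posted figures it gives roughly 4.8% for bridging and about 0.1% for routing, which is consistent with the "5%" quoted above after rounding.

```c
#include <assert.h>

/*
 * Relative throughput drop, in percent, going from `before` to `after`.
 * Hypothetical helper for checking the numbers in the table.
 */
static double percent_drop(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

For example, percent_drop(32.35, 30.81) is the bridging case and percent_drop(30.2, 30.16) is the routing case.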

Has anyone observed similar behaviour?
Any inputs or suggestions are welcome.


Regards
Kalyan


[TIPC] Missing null check in the socket code.

2007-02-15 Thread Max Krasnyansky
Fixes an oops in the non-blocking mode.

Signed-off-by: Max Krasnyansky [EMAIL PROTECTED]
---
 net/tipc/socket.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 2a6a5a6..767f791 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -862,6 +862,10 @@ restart:
/* Get access to first message in receive queue */
 
buf = skb_peek(&sock->sk->sk_receive_queue);
+   if (NULL == buf) {
+   res = -EAGAIN;
+   goto exit;
+   }
msg = buf_msg(buf);
sz = msg_data_sz(msg);
err = msg_errcode(msg);
-- 
1.4.4.2
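
To make the failure mode concrete, here is a small userspace sketch of the same pattern (not TIPC code): peeking at an empty queue yields NULL, so the caller has to check before dereferencing and report -EAGAIN instead. All demo_* names are hypothetical stand-ins for the sk_buff queue and skb_peek().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for an sk_buff and its receive queue. */
struct demo_buf {
    struct demo_buf *next;
    int data_sz;
};

struct demo_queue {
    struct demo_buf *head;
};

/* Like a peek operation: returns the first buffer, or NULL if empty. */
static struct demo_buf *demo_peek(const struct demo_queue *q)
{
    assert(q != NULL);
    return q->head;
}

/*
 * Non-blocking receive step: without the NULL check, an empty queue
 * would lead to a NULL dereference on buf->data_sz.
 */
static int demo_recv_size(const struct demo_queue *q)
{
    struct demo_buf *buf = demo_peek(q);

    if (buf == NULL)
        return -EAGAIN;
    return buf->data_sz;
}
```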



[TIPC] Properly mask header fields

2007-02-15 Thread Max Krasnyansky
TIPC code is a bit inconsistent in masking out upper bits of various
message fields when packing them into the headers. For the most part
things seem to be ok but we happened to hit a corner case in our labs
when broadcast counter reached certain value (don't remember exact
details) and was messing up status bits in the header. At which point
the link was busted and required a reset to bring it back up.
It's much safer to apply proper mask in the function that does the
actual packing rather than doing it all over the place, and missing a
few ;-).

Signed-off-by: Max Krasnyansky [EMAIL PROTECTED]
---
 net/tipc/msg.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 6699aaf..4e681c8 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -72,7 +72,7 @@ static inline void msg_set_bits(struct tipc_msg *m, u32 w,
u32 pos, u32 mask, u32 val)
 {
u32 word = msg_word(m,w) & ~(mask << pos);
-   msg_set_word(m, w, (word |= (val << pos)));
+   msg_set_word(m, w, (word |= ((val & mask) << pos)));
 }
 
 /* 
-- 
1.4.4.2
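
The effect of the one-line change can be seen in isolation with the sketch below. It works on a plain host-order 32-bit word and ignores the byte-order handling of the real msg_word()/msg_set_word() helpers; the two function names are invented for the comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: val is shifted in as-is, so bits of val above the
 * field width spill into the neighbouring field. */
static uint32_t set_bits_unmasked(uint32_t word, uint32_t pos,
                                  uint32_t mask, uint32_t val)
{
    assert(pos < 32);
    word &= ~(mask << pos);
    return word | (val << pos);
}

/* Patched behaviour: val is clipped to the field width first. */
static uint32_t set_bits_masked(uint32_t word, uint32_t pos,
                                uint32_t mask, uint32_t val)
{
    assert(pos < 32);
    word &= ~(mask << pos);
    return word | ((val & mask) << pos);
}
```

With an 8-bit field at bit 8 and val = 0x1ff, the unmasked version sets bit 16, which belongs to the next field; the masked version leaves it clear. This is the kind of corruption a wrapped counter could cause in the header status bits.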



Re: [BUG] RTNL and flush_scheduled_work deadlocks

2007-02-15 Thread Jarek Poplawski
On 14-02-2007 22:27, Stephen Hemminger wrote:
 Ben found this but the problem seems pretty widespread.
 
 The following places are subject to deadlock between flush_scheduled_work
 and the RTNL mutex. What can happen is that a work queue routine (like
 bridge port_carrier_check) is waiting forever for RTNL, and the driver
 routine has called flush_scheduled_work with RTNL held and is waiting
 for the work queue to clear.
 
 Several other places have comments like: can't call flush_scheduled_work
 here or it will deadlock. Most of the problem places are in device close
 routine. My recommendation would be to add a check for device netif_running in
 what ever work routine is used, and move the flush_scheduled_work to the
 remove routine.
 
 8139too.c: rtl8139_close -> rtl8139_stop_thread
 r8169.c:   rtl8169_down
 cassini.c: cas_change_mtu
 iseries_veth.c: veth_stop_connection
 s2io.c: s2io_close
 sis190.c: sis190_down
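
The recommendation above can be modelled in a few lines of plain C. This is a toy, single-threaded state model, not kernel code: the struct fields stand in for the RTNL mutex, netif_running() and a queued work item, and every model_* name is hypothetical. It only shows why a flush with RTNL held cannot complete, and how checking the running flag in the work routine lets the flush move to the remove path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
    bool rtnl_held;     /* stands in for the RTNL mutex being held */
    bool running;       /* stands in for netif_running(dev) */
    bool work_pending;  /* a queued work item that needs RTNL */
    int work_done;      /* how often the work body actually ran */
};

/* A flush waits for pending work; pending work waits for RTNL.
 * If the flusher itself holds RTNL, neither side can progress. */
static bool flush_would_deadlock(const struct model_dev *d)
{
    return d->rtnl_held && d->work_pending;
}

/* Work routine: takes the lock, then bails out if the device is down. */
static void model_work(struct model_dev *d)
{
    assert(!d->rtnl_held);      /* it could not get the lock otherwise */
    d->rtnl_held = true;        /* models rtnl_lock() */
    if (d->running)
        d->work_done++;
    d->work_pending = false;
    d->rtnl_held = false;       /* models rtnl_unlock() */
}

/* Close path, called with RTNL held: mark the device down, no flush. */
static void model_close(struct model_dev *d)
{
    d->running = false;
}

/* Remove path, called without RTNL: flushing is safe here. */
static bool model_remove(struct model_dev *d)
{
    if (flush_would_deadlock(d))
        return false;
    if (d->work_pending)
        model_work(d);
    return true;
}
```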
 

There is probably more than this...

I think the same problem exists with
cancel_rearming_delayed_work, plus indirect calls to
these functions, e.g. via ieee80211softmac_stop.

I found these dangerous places (probably not all):

cxgb3/cxgb3_main.c (cxgb_close -> cxgb_down),

macb.c (macb_close),

skge.c (skge_down),

wireless/bcm43xx/bcm43xx_main.c (bcm_net_stop both
ieee80211...  and flush_...),
wireless/zd1211rw/zd_mac.c (zd_mac_stop ->
housekeeping_disable),

chelsio/my3126.c (t1_interrupts_disable ->
my3126_interrupt_disable), /* not sure */

drivers/usb/net/kaweth.c (kaweth_close ->
kaweth_kill_urbs)

Regards,
Jarek P.


Re: [BUG] RTNL and flush_scheduled_work deadlocks

2007-02-15 Thread Ben Greear

Jarek Poplawski wrote:

On 14-02-2007 22:27, Stephen Hemminger wrote:
  

Ben found this but the problem seems pretty widespread.

The following places are subject to deadlock between flush_scheduled_work
and the RTNL mutex. What can happen is that a work queue routine (like
bridge port_carrier_check) is waiting forever for RTNL, and the driver
routine has called flush_scheduled_work with RTNL held and is waiting
for the work queue to clear.

Several other places have comments like: can't call flush_scheduled_work
here or it will deadlock. Most of the problem places are in device close
routine. My recommendation would be to add a check for device netif_running in
what ever work routine is used, and move the flush_scheduled_work to the
remove routine.

8139too.c: rtl8139_close -> rtl8139_stop_thread
r8169.c:   rtl8169_down
cassini.c: cas_change_mtu
iseries_veth.c: veth_stop_connection
s2io.c: s2io_close
sis190.c: sis190_down




There is probably more than this...
  


Maybe there should be something like an ASSERT_NOT_RTNL() in the
flush_scheduled_work() method?  If it's performance critical, #ifdef it
out if we're not debugging locks?


Ben

--
Ben Greear [EMAIL PROTECTED] 
Candela Technologies Inc  http://www.candelatech.com





Re: [PATCH 3/4] 8139too: RTNL and flush_scheduled_work deadlock

2007-02-15 Thread Jarek Poplawski
On 15-02-2007 23:37, Francois Romieu wrote:
 Your usual dont-flush_scheduled_work-with-RTNL-held stuff.
 
 It is a bit different here since the thread runs permanently
 or is only occasionally kicked for recovery depending on the
 hardware revision.
 
 Signed-off-by: Francois Romieu [EMAIL PROTECTED]
 ---
  drivers/net/8139too.c |   40 +---
  1 files changed, 17 insertions(+), 23 deletions(-)
 
 diff --git a/drivers/net/8139too.c b/drivers/net/8139too.c
 index 35ad5cf..99304b2 100644
 --- a/drivers/net/8139too.c
 +++ b/drivers/net/8139too.c
 @@ -1109,6 +1109,8 @@ static void __devexit rtl8139_remove_one (struct pci_dev *pdev)
  
   assert (dev != NULL);
  
 + flush_scheduled_work();
 +

IMHO cancel_rearming_delayed_work should be used here
instead of this.

   unregister_netdev (dev);
  
   __rtl8139_cleanup_dev (dev);
 @@ -1603,18 +1605,21 @@ static void rtl8139_thread (struct work_struct *work)
   struct net_device *dev = tp->mii.dev;
   unsigned long thr_delay = next_tick;
  
 + rtnl_lock();
 +
 + if (!netif_running(dev))
 + goto out_unlock;

I wonder why you don't check netif_running before
taking rtnl_lock? It's an atomic operation.

And I'm not sure that widening the rtnl_lock range
is really needed here.

Regards,
Jarek P.


[PATCH] cxgb3 Fix copyrights in the cxgb3 driver.

2007-02-15 Thread Steve Wise

Fix copyrights in the cxgb3 driver.

Remove the Open Grid Computing copyright.  It shouldn't be there.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/net/cxgb3/cxgb3_defs.h    |    1 -
 drivers/net/cxgb3/cxgb3_offload.c |    1 -
 drivers/net/cxgb3/cxgb3_offload.h |    1 -
 drivers/net/cxgb3/l2t.c           |    1 -
 drivers/net/cxgb3/l2t.h           |    1 -
 drivers/net/cxgb3/t3cdev.h        |    1 -
 6 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_defs.h b/drivers/net/cxgb3/cxgb3_defs.h
index 16e0049..e14862b 100644
--- a/drivers/net/cxgb3/cxgb3_defs.h
+++ b/drivers/net/cxgb3/cxgb3_defs.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006-2007 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index c6b7266..b2cf5f6 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006-2007 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/net/cxgb3/cxgb3_offload.h b/drivers/net/cxgb3/cxgb3_offload.h
index 0e6beb6..f15446a 100644
--- a/drivers/net/cxgb3/cxgb3_offload.h
+++ b/drivers/net/cxgb3/cxgb3_offload.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006-2007 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006-2007 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/net/cxgb3/l2t.c b/drivers/net/cxgb3/l2t.c
index 3c0cb85..d660af7 100644
--- a/drivers/net/cxgb3/l2t.c
+++ b/drivers/net/cxgb3/l2t.c
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2003-2007 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006-2007 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/net/cxgb3/l2t.h b/drivers/net/cxgb3/l2t.h
index ba5d2cb..d790013 100644
--- a/drivers/net/cxgb3/l2t.h
+++ b/drivers/net/cxgb3/l2t.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2003-2007 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006-2007 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/net/cxgb3/t3cdev.h b/drivers/net/cxgb3/t3cdev.h
index 9af3bcd..fa4099b 100644
--- a/drivers/net/cxgb3/t3cdev.h
+++ b/drivers/net/cxgb3/t3cdev.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (C) 2006-2007 Chelsio Communications.  All rights reserved.
- * Copyright (C) 2006-2007 Open Grid Computing, Inc.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU



[PATCH] s2io: add PCI error recovery support

2007-02-15 Thread Linas Vepstas

Koushik, Raju,

Please review, comment, and if you find this acceptable, 
please forward upstream. This patch incorporates all of 
fixes resulting from the last set of discussions, circa 
November 2006.

--linas

This patch adds PCI error recovery support to the 
s2io 10-Gigabit ethernet device driver. Fourth revision,
blocks interrupts and the watchdog. Adds a flag to 
s2io_down(), to avoid doing I/O when PCI bus is offline.

Tested, seems to work well.

Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
Acked-by: Ramkrishna Vepa [EMAIL PROTECTED]
Cc: Raghavendra Koushik [EMAIL PROTECTED]
Cc: Ananda Raju [EMAIL PROTECTED]
Cc: Wen Xiong [EMAIL PROTECTED]


 drivers/net/s2io.c |  116 ++---
 drivers/net/s2io.h |5 ++
 2 files changed, 116 insertions(+), 5 deletions(-)

Index: linux-2.6.20-git4/drivers/net/s2io.c
===
--- linux-2.6.20-git4.orig/drivers/net/s2io.c   2007-02-15 15:39:35.0 -0600
+++ linux-2.6.20-git4/drivers/net/s2io.c    2007-02-15 16:15:10.0 -0600
@@ -435,11 +435,18 @@ static struct pci_device_id s2io_tbl[] _
 
 MODULE_DEVICE_TABLE(pci, s2io_tbl);
 
+static struct pci_error_handlers s2io_err_handler = {
+   .error_detected = s2io_io_error_detected,
+   .slot_reset = s2io_io_slot_reset,
+   .resume = s2io_io_resume,
+};
+
 static struct pci_driver s2io_driver = {
   .name = "S2IO",
   .id_table = s2io_tbl,
   .probe = s2io_init_nic,
   .remove = __devexit_p(s2io_rem_nic),
+  .err_handler = &s2io_err_handler,
 };
 
 /* A simplifier macro used both by init and free shared_mem Fns(). */
@@ -2577,6 +2584,9 @@ static void s2io_netpoll(struct net_devi
u64 val64 = 0xULL;
int i;
 
+   if (pci_channel_offline(nic->pdev))
+   return;
+
disable_irq(dev->irq);
 
atomic_inc(&nic->isr_cnt);
@@ -3079,6 +3089,8 @@ static void alarm_intr_handler(struct s2
int i;
if (atomic_read(&nic->card_state) == CARD_DOWN)
return;
+   if (pci_channel_offline(nic->pdev))
+   return;
nic->mac_control.stats_info->sw_stat.ring_full_cnt = 0;
/* Handling the XPAK counters update */
if(nic->mac_control.stats_info->xpak_stat.xpak_timer_count < 72000) {
@@ -4117,6 +4129,10 @@ static irqreturn_t s2io_isr(int irq, voi
struct mac_info *mac_control;
struct config_param *config;
 
+   /* Pretend we handled any irq's from a disconnected card */
+   if (pci_channel_offline(sp->pdev))
+   return IRQ_NONE;
+
atomic_inc(&sp->isr_cnt);
mac_control = &sp->mac_control;
config = &sp->config;
@@ -6188,7 +6204,7 @@ static void s2io_rem_isr(struct s2io_nic
} while(cnt < 5);
 }
 
-static void s2io_card_down(struct s2io_nic * sp)
+static void do_s2io_card_down(struct s2io_nic * sp, int do_io)
 {
int cnt = 0;
struct XENA_dev_config __iomem *bar0 = sp->bar0;
@@ -6203,7 +6219,8 @@ static void s2io_card_down(struct s2io_n
atomic_set(&sp->card_state, CARD_DOWN);
 
/* disable Tx and Rx traffic on the NIC */
-   stop_nic(sp);
+   if (do_io)
+   stop_nic(sp);
 
s2io_rem_isr(sp);
 
@@ -6211,7 +6228,7 @@ static void s2io_card_down(struct s2io_n
tasklet_kill(&sp->task);
 
/* Check if the device is Quiescent and then Reset the NIC */
-   do {
+   while(do_io) {
/* As per the HW requirement we need to replenish the
 * receive buffer to avoid the ring bump. Since there is
 * no intention of processing the Rx frame at this pointwe are
@@ -6236,8 +6253,9 @@ static void s2io_card_down(struct s2io_n
  (unsigned long long) val64);
break;
}
-   } while (1);
-   s2io_reset(sp);
+   }
+   if (do_io)
+   s2io_reset(sp);
 
spin_lock_irqsave(&sp->tx_lock, flags);
/* Free all Tx buffers */
@@ -6252,6 +6270,11 @@ static void s2io_card_down(struct s2io_n
clear_bit(0, &(sp->link_state));
 }
 
+static void s2io_card_down(struct s2io_nic * sp)
+{
+   do_s2io_card_down(sp, 1);
+}
+
 static int s2io_card_up(struct s2io_nic * sp)
 {
int i, ret = 0;
@@ -7536,3 +7559,86 @@ static void lro_append_pkt(struct s2io_n
sp->mac_control.stats_info->sw_stat.clubbed_frms_cnt++;
return;
 }
+
+/**
+ * s2io_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current pci conneection state
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ */
+static pci_ers_result_t s2io_io_error_detected(struct pci_dev *pdev,
+   pci_channel_state_t state)
+{
+   struct net_device *netdev = pci_get_drvdata(pdev);
+   struct s2io_nic *sp = netdev->priv;
+
+ 

[PATCH] 2.6.21 iw_cxgb3 Fail posts synchronously when in TERMINATE state.

2007-02-15 Thread Steve Wise
From: Steve Wise [EMAIL PROTECTED]

Fail posts synchronously when in TERMINATE state.

For T3B devices, mark user qp in error once we transition
to TERMINATE.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/iwch_qp.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index e066727..da13a38 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -846,6 +846,8 @@ int iwch_modify_qp(struct iwch_dev *rhp,
break;
case IWCH_QP_STATE_TERMINATE:
qhp->attr.state = IWCH_QP_STATE_TERMINATE;
+   if (t3b_device(qhp->rhp))
+   cxio_set_wq_in_error(&qhp->wq);
if (!internal)
terminate = 1;
break;



[PATCH] iw_cxgb3 Fix copyrights in the iw_cxgb3 driver.

2007-02-15 Thread Steve Wise

Fix copyrights in the iw_cxgb3 driver.

Remove the Open Grid Computing copyright.  It shouldn't be there.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/cxio_dbg.c      |    1 -
 drivers/infiniband/hw/cxgb3/cxio_hal.c      |    1 -
 drivers/infiniband/hw/cxgb3/cxio_hal.h      |    1 -
 drivers/infiniband/hw/cxgb3/cxio_resource.c |    1 -
 drivers/infiniband/hw/cxgb3/cxio_resource.h |    1 -
 drivers/infiniband/hw/cxgb3/cxio_wr.h       |    1 -
 drivers/infiniband/hw/cxgb3/iwch.c          |    1 -
 drivers/infiniband/hw/cxgb3/iwch.h          |    1 -
 drivers/infiniband/hw/cxgb3/iwch_cm.c       |    1 -
 drivers/infiniband/hw/cxgb3/iwch_cm.h       |    1 -
 drivers/infiniband/hw/cxgb3/iwch_cq.c       |    1 -
 drivers/infiniband/hw/cxgb3/iwch_ev.c       |    1 -
 drivers/infiniband/hw/cxgb3/iwch_mem.c      |    1 -
 drivers/infiniband/hw/cxgb3/iwch_provider.c |    1 -
 drivers/infiniband/hw/cxgb3/iwch_provider.h |    1 -
 drivers/infiniband/hw/cxgb3/iwch_qp.c       |    1 -
 drivers/infiniband/hw/cxgb3/iwch_user.h     |    1 -
 17 files changed, 0 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/cxio_dbg.c b/drivers/infiniband/hw/cxgb3/cxio_dbg.c
index 5a7306f..75f7b16 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_dbg.c
+++ b/drivers/infiniband/hw/cxgb3/cxio_dbg.c
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c
index 82fa720..114ac3b 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_hal.c
+++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.h b/drivers/infiniband/hw/cxgb3/cxio_hal.h
index 1b97e80..8ab04a7 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_hal.h
+++ b/drivers/infiniband/hw/cxgb3/cxio_hal.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/infiniband/hw/cxgb3/cxio_resource.c b/drivers/infiniband/hw/cxgb3/cxio_resource.c
index 997aa32..65bf577 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_resource.c
+++ b/drivers/infiniband/hw/cxgb3/cxio_resource.c
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/infiniband/hw/cxgb3/cxio_resource.h b/drivers/infiniband/hw/cxgb3/cxio_resource.h
index a6bbe83..a2703a3 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_resource.h
+++ b/drivers/infiniband/hw/cxgb3/cxio_resource.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/infiniband/hw/cxgb3/cxio_wr.h b/drivers/infiniband/hw/cxgb3/cxio_wr.h
index 103fc42..90d7b89 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_wr.h
+++ b/drivers/infiniband/hw/cxgb3/cxio_wr.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/infiniband/hw/cxgb3/iwch.c b/drivers/infiniband/hw/cxgb3/iwch.c
index 4611afa..0315c9d 100644
--- a/drivers/infiniband/hw/cxgb3/iwch.c
+++ b/drivers/infiniband/hw/cxgb3/iwch.c
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
diff --git a/drivers/infiniband/hw/cxgb3/iwch.h b/drivers/infiniband/hw/cxgb3/iwch.h
index 6517ef8..caf4e60 100644
--- a/drivers/infiniband/hw/cxgb3/iwch.h
+++ b/drivers/infiniband/hw/cxgb3/iwch.h
@@ -1,6 +1,5 @@
 /*
  * Copyright (c) 2006 Chelsio, Inc. All rights reserved.
- * Copyright (c) 2006 Open Grid 

Re: [PATCH] 2.6.21 iw_cxgb3 Fail posts synchronously when in TERMINATE state.

2007-02-15 Thread Roland Dreier
thanks, applied.


Re: [PATCH] iw_cxgb3 Fix copyrights in the iw_cxgb3 driver.

2007-02-15 Thread Roland Dreier
thanks, applied


[patch 21/21] Xen-paravirt: Add the Xen virtual network device driver.

2007-02-15 Thread Jeremy Fitzhardinge
The network device frontend driver allows the kernel to access network
devices exported by a virtual machine containing a physical
network device driver.

Signed-off-by: Ian Pratt [EMAIL PROTECTED]
Signed-off-by: Christian Limpach [EMAIL PROTECTED]
Signed-off-by: Chris Wright [EMAIL PROTECTED]
Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED]
Cc: netdev@vger.kernel.org

---
 drivers/net/Kconfig|   12 
 drivers/net/Makefile   |2 
 drivers/net/xen-netfront.c | 2066 
 include/xen/events.h   |2 
 4 files changed, 2082 insertions(+)

===
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2525,6 +2525,18 @@ source drivers/atm/Kconfig
 
 source "drivers/s390/net/Kconfig"
 
+config XEN_NETDEV_FRONTEND
+   tristate "Xen network device frontend driver"
+   depends on XEN
+   default y
+   help
+ The network device frontend driver allows the kernel to
+ access network devices exported by a virtual
+ machine containing a physical network device driver. The
+ frontend driver is intended for unprivileged guest domains;
+ if you are compiling a kernel for a Xen guest, you almost
+ certainly want to enable this.
+
 config ISERIES_VETH
    tristate "iSeries Virtual Ethernet driver support"
    depends on PPC_ISERIES
===
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -218,3 +218,5 @@ obj-$(CONFIG_FS_ENET) += fs_enet/
 obj-$(CONFIG_FS_ENET) += fs_enet/
 
 obj-$(CONFIG_NETXEN_NIC) += netxen/
+
+obj-$(CONFIG_XEN_NETDEV_FRONTEND) += xen-netfront.o
===
--- /dev/null
+++ b/drivers/net/xen-netfront.c
@@ -0,0 +1,2066 @@
+/**
+ * Virtual network driver for conversing with remote driver backends.
+ *
+ * Copyright (c) 2002-2005, K A Fraser
+ * Copyright (c) 2005, XenSource Ltd
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <linux/module.h>
+#include <linux/version.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/netdevice.h>
+#include <linux/inetdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/skbuff.h>
+#include <linux/init.h>
+#include <linux/bitops.h>
+#include <linux/ethtool.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/io.h>
+#include <linux/moduleparam.h>
+#include <net/sock.h>
+#include <net/pkt_sched.h>
+#include <net/arp.h>
+#include <net/route.h>
+#include <asm/uaccess.h>
+#include <xen/xenbus.h>
+#include <xen/interface/io/netif.h>
+#include <xen/interface/memory.h>
+#ifdef CONFIG_XEN_BALLOON
+#include <xen/balloon.h>
+#endif
+#include <asm/page.h>
+#include <xen/interface/grant_table.h>
+
+#include <xen/events.h>
+#include <xen/page.h>
+#include <xen/grant_table.h>
+
+/*
+ * Mutually-exclusive module options to select receive data path:
+ *  rx_copy : Packets are copied by network backend into local memory
+ *  rx_flip : Page containing packet data is transferred to our ownership
+ * For fully-virtualised guests there is no option - copying must be used.
+ * For paravirtualised guests, flipping is the default.
+ */
+#ifdef CONFIG_XEN
+static int MODPARM_rx_copy = 0;
+module_param_named(rx_copy, MODPARM_rx_copy, bool, 0);
+MODULE_PARM_DESC(rx_copy, "Copy packets from network card (rather than flip)");
+static int MODPARM_rx_flip = 0;
+module_param_named(rx_flip, MODPARM_rx_flip, bool,