Re: sky2: hangs on 2.6.16

2006-03-25 Thread MichaelM
On Fri, Mar 24, 2006 at 02:32:41PM -0800, Stephen Hemminger wrote:
 On Fri, 24 Mar 2006 22:13:54 +
 Michael Menegakis [EMAIL PROTECTED] wrote:
 
 
  Were they of any help?
 
 The first thing to check is whether packets are showing up (and being transmitted),
 by running
   ethtool -S eth0
 Since in this driver stats come out of the PHY, it is possible for the PHY
 to be receiving packets but have the bus interface wedged.
 
 It also will tell you if you have pause frames going back and forth.
 You might have a bad switch that doesn't do flow-control properly.
 
 Next you can turn on debug with:
   ethtool -s eth0 msglvl 0xfff
 
 and see if packets are being received and transmitted. 
 
 In your case, it looks like the driver is receiving and transmitting fine;
 so it probably is in the upper layers. So look into higher level statistics
 like: netstat (or ip and ss).
 
 
 Other possibilities:
 
 * turn off TSO
   ethtool -K eth0 tso off
 * turn off Tx checksum
   ethtool -K eth0 tx off
 * turn off Rx checksum
   ethtool -K eth0 rx off
 
 If you get things really wedged and want to dig into the driver, you can
 look at all the registers, but it really takes a lot of time to decode...
 
   ethtool -d eth0 raw on > /tmp/eth0.dump
   hexdump /tmp/eth0.dump

I hope this helps at all, since my knowledge of all this is very limited,
if not nonexistent.

The iface seems to receive but not transmit after the hang. The debug
option above produced output similar to what debug=16 wrote
to the logs. The turn-off options didn't seem to alter the
logging or affect networking, though I may have missed something.

One indicator I managed to get is that connections open during
the hang seem to be stuck in netstat as FIN_WAIT1. That is the case if I
ctrl-c the application which tests downloading over multiple http
connections, or if I wait for those connections to time out.
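
To put a number on the FIN_WAIT1 observation, the netstat output can be summarised per TCP state. This is only a sketch: the function name tcp_state_counts is made up for illustration, and it assumes the usual `netstat -tn` layout in which the state is the last column.

```shell
# Summarise TCP connection states from `netstat -tn` output read on stdin.
# Assumes the state is the last field of every line that starts with "tcp".
tcp_state_counts() {
    awk '$1 ~ /^tcp/ { count[$NF]++ }
         END { for (s in count) printf "%s %d\n", s, count[s] }' | sort
}

# Example use during the hang:
#   netstat -tn | tcp_state_counts
```

A FIN_WAIT1 count that stays high across repeated runs would be consistent with the FINs never getting out (or never being ACKed), which fits "receives but does not transmit".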

In ethereal before/during/after the hang I got many "TCP segment of a
reassembled PDU" sent from here, "TCP Dup ACK" sent from the other end, "TCP
Retransmission" - "TCP segment of a reassembled PDU" from here, "TCP Keep
Alive" from here, "TCP Out-of-order" - "TCP segment of a reassembled
PDU" ..
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: sky2: hangs on 2.6.16

2006-03-24 Thread Stephen Hemminger
On Fri, 24 Mar 2006 17:14:49 +
michael [EMAIL PROTECTED] wrote:

 transfer of data hangs with  sky2 very often on a
 
 :02:00.0 Ethernet controller: Marvell Technology Group
 Ltd. 88E8036 Fast Ethernet Controller (rev 10) 
 
 found on toshiba laptops,
 
 when using 2.6.16, which proves a critical problem since the proprietary
 driver does not support .16 at the moment.
 
 debug=16 doesn't produce any info I can understand, most of it looks
 like
 
 ..
 eth0: tx queued, slot 15, len 78
 sky2 eth0: rx slot 3 status 0x4e0100 len 78
 ..
 
 before, during, and after transfer of data stops. 
 
 let me know if you need that full data or anything else that may help.
 
 thanks.
 
 ps. please cc me any replies

I sent an update post-2.6.16 that should fix most of these problems,
or try this, which is the important bit. Unfortunately, I may have to bend
the rules to get this in for -stable.


--- linux-2.6.16/drivers/net/sky2.c.orig	2006-03-24 09:36:42.0 -0800
+++ linux-2.6.16/drivers/net/sky2.c	2006-03-24 09:36:51.0 -0800
@@ -96,6 +96,10 @@ static int copybreak __read_mostly = 256
 module_param(copybreak, int, 0);
 MODULE_PARM_DESC(copybreak, "Receive copy threshold");
 
+static int disable_msi = 0;
+module_param(disable_msi, int, 0);
+MODULE_PARM_DESC(disable_msi, "Disable Message Signaled Interrupt (MSI)");
+
 static const struct pci_device_id sky2_id_table[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9000) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9E00) },
@@ -504,9 +508,9 @@ static void sky2_phy_init(struct sky2_hw
 /* Force a renegotiation */
 static void sky2_phy_reinit(struct sky2_port *sky2)
 {
-	down(&sky2->phy_sema);
+	spin_lock_bh(&sky2->phy_lock);
 	sky2_phy_init(sky2->hw, sky2->port);
-	up(&sky2->phy_sema);
+	spin_unlock_bh(&sky2->phy_lock);
 }
 
 static void sky2_mac_init(struct sky2_hw *hw, unsigned port)
@@ -571,9 +575,9 @@ static void sky2_mac_init(struct sky2_hw
 
 	sky2_read16(hw, SK_REG(port, GMAC_IRQ_SRC));
 
-	down(&sky2->phy_sema);
+	spin_lock_bh(&sky2->phy_lock);
 	sky2_phy_init(hw, port);
-	up(&sky2->phy_sema);
+	spin_unlock_bh(&sky2->phy_lock);
 
 	/* MIB clear */
 	reg = gma_read16(hw, port, GM_PHY_ADDR);
@@ -886,9 +890,9 @@ static int sky2_ioctl(struct net_device 
 	case SIOCGMIIREG: {
 		u16 val = 0;
 
-		down(&sky2->phy_sema);
+		spin_lock_bh(&sky2->phy_lock);
 		err = __gm_phy_read(hw, sky2->port, data->reg_num & 0x1f, &val);
-		up(&sky2->phy_sema);
+		spin_unlock_bh(&sky2->phy_lock);
 
 		data->val_out = val;
 		break;
@@ -898,10 +902,10 @@ static int sky2_ioctl(struct net_device 
 		if (!capable(CAP_NET_ADMIN))
 			return -EPERM;
 
-		down(&sky2->phy_sema);
+		spin_lock_bh(&sky2->phy_lock);
 		err = gm_phy_write(hw, sky2->port, data->reg_num & 0x1f,
 				   data->val_in);
-		up(&sky2->phy_sema);
+		spin_unlock_bh(&sky2->phy_lock);
 		break;
 	}
 	return err;
@@ -1014,7 +1018,7 @@ static int sky2_up(struct net_device *de
 	struct sky2_port *sky2 = netdev_priv(dev);
 	struct sky2_hw *hw = sky2->hw;
 	unsigned port = sky2->port;
-	u32 ramsize, rxspace;
+	u32 ramsize, rxspace, imask;
 	int err = -ENOMEM;
 
 	if (netif_msg_ifup(sky2))
@@ -1079,10 +1083,10 @@ static int sky2_up(struct net_device *de
 		goto err_out;
 
 	/* Enable interrupts from phy/mac for port */
-	spin_lock_irq(&hw->hw_lock);
-	hw->intr_mask |= (port == 0) ? Y2_IS_PORT_1 : Y2_IS_PORT_2;
-	sky2_write32(hw, B0_IMSK, hw->intr_mask);
-	spin_unlock_irq(&hw->hw_lock);
+	imask = sky2_read32(hw, B0_IMSK);
+	imask |= (port == 0) ? Y2_IS_PORT_1 : Y2_IS_PORT_2;
+	sky2_write32(hw, B0_IMSK, imask);
+
 	return 0;
 
 err_out:
@@ -1375,6 +1379,7 @@ static int sky2_down(struct net_device *
 	struct sky2_hw *hw = sky2->hw;
 	unsigned port = sky2->port;
 	u16 ctrl;
+	u32 imask;
 
 	/* Never really got started! */
 	if (!sky2->tx_le)
@@ -1386,14 +1391,6 @@ static int sky2_down(struct net_device *
 	/* Stop more packets from being queued */
 	netif_stop_queue(dev);
 
-	/* Disable port IRQ */
-	spin_lock_irq(&hw->hw_lock);
-	hw->intr_mask &= ~((sky2->port == 0) ? Y2_IS_IRQ_PHY1 : Y2_IS_IRQ_PHY2);
-	sky2_write32(hw, B0_IMSK, hw->intr_mask);
-	spin_unlock_irq(&hw->hw_lock);
-
-	flush_scheduled_work();
-
 	sky2_phy_reset(hw, port);
 
 	/* Stop transmitter */
@@ -1437,6 +1434,11 @@ static int sky2_down(struct net_device *

Re: sky2: hangs on 2.6.16

2006-03-24 Thread michael
On Fri, Mar 24, 2006 at 09:38:44AM -0800, Stephen Hemminger wrote:
 I sent an update post-2.6.16 that should fix most of these problems,
 or try this, which is the important bit. Unfortunately, I may have to bend
 the rules to get this in for -stable.

Maybe the hang I notice is a different one, because it persists with this patch
and appears primarily after using multiple connections.

An easy way to reproduce it here is this:

axel -n 40 http://..file (a downloader that opens 40 connections)

It will go on downloading normally without problems.

If I ctrl-c that process and immediately re-run it, it will not start
again (nor will any other internet connection).

rmmod, modprobe and reconfiguring the iface bring it back to normal.

I haven't checked whether all 40 connections are opened or whether at some
point it stops opening more than 10 or so.

Let me know if I can help somehow further.


Re: sky2: hangs on 2.6.16

2006-03-24 Thread Stephen Hemminger
On Fri, 24 Mar 2006 18:18:57 +
michael [EMAIL PROTECTED] wrote:

 Maybe the hang I notice is a different one, because it persists with this patch
 and appears primarily after using multiple connections.
 
 An easy way to reproduce it here is this:
 
 axel -n 40 http://..file (a downloader that opens 40 connections)
 
 It will go on downloading normally without problems.
 
 If I ctrl-c that process and immediately re-run it, it will not start
 again (nor will any other internet connection).
 
 rmmod, modprobe and reconfiguring the iface bring it back to normal.
 
 I haven't checked whether all 40 connections are opened or whether at some
 point it stops opening more than 10 or so.
 
 Let me know if I can help somehow further.

How far away is the site you are downloading from?  Perhaps it just means you
have lots of connections open, and memory is getting low.  I can't reproduce
it here (downloading from kernel.org on a P4 with a motherboard chip and the
latest 2.6.16-git), but I have 2G of memory.


Re: sky2: hangs on 2.6.16

2006-03-24 Thread fs
On Fri, Mar 24, 2006 at 10:40:00AM -0800, Stephen Hemminger wrote:
 How far away is the site you are downloading from?  Perhaps it just means you
 have lots of connections open, and memory is getting low.  I can't reproduce
 it here (downloading from kernel.org on a P4 with a motherboard chip and the
 latest 2.6.16-git), but I have 2G of memory.

The problem doesn't appear with the proprietary driver on .15 on the
same sites. Basically, when the hang happens I can't open an inet
connection without rmmod, modprobe and iface reconfiguration. I have 1.5 GB of memory.


Re: sky2: hangs on 2.6.16

2006-03-24 Thread Stephen Hemminger
On Fri, 24 Mar 2006 18:48:37 +
fs [EMAIL PROTECTED] wrote:

 The problem doesn't appear with the proprietary driver on .15 on the
 same sites. Basically, when the hang happens I can't open an inet
 connection without rmmod, modprobe, iface reconfiguration. I have 1.5gb.

Do you see anything in the console log (dmesg), like transmit timeouts
or allocation failures?


Re: sky2: hangs on 2.6.16

2006-03-24 Thread fs
On Fri, Mar 24, 2006 at 11:24:35AM -0800, Stephen Hemminger wrote:
 Do you see anything in the console log (dmesg), like transmit timeouts
 or allocation failures?

no, nothing out of the ordinary on dmesg.


Re: sky2: hangs on 2.6.16

2006-03-24 Thread Michael Menegakis
On Fri, Mar 24, 2006 at 07:29:33PM +, fs wrote:
 no, nothing out of the ordinary on dmesg.

So, any idea how I should go about debugging this? I don't have much
experience in debugging, and none at all in network debugging.

debug=16 was repeating things like

..
Mar 24 18:05:34 localhost kernel: eth0: tx done, up to 184
Mar 24 18:05:34 localhost kernel: sky2 eth0: rx slot 138 status 0x59a0100 len 14
..

Were they of any help?


Re: sky2: hangs on 2.6.16

2006-03-24 Thread Stephen Hemminger
On Fri, 24 Mar 2006 22:13:54 +
Michael Menegakis [EMAIL PROTECTED] wrote:


 Were they of any help?

The first thing to check is whether packets are showing up (and being transmitted),
by running
ethtool -S eth0
Since in this driver stats come out of the PHY, it is possible for the PHY
to be receiving packets but have the bus interface wedged.

It also will tell you if you have pause frames going back and forth.
You might have a bad switch that doesn't do flow-control properly.
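
One way to read the counters is to take two snapshots a few seconds apart and see which ones moved. The helper below is a rough sketch, not part of ethtool: the name stats_diff is invented here, and it assumes the "name: value" line layout that `ethtool -S` prints.

```shell
# stats_diff BEFORE AFTER
# Print every counter whose value changed between two saved
# `ethtool -S` snapshots (lines of the form "     name: value").
stats_diff() {
    awk -F': *' '
        { name = $1; sub(/^[ \t]+/, "", name) }   # strip the leading indent
        NR == FNR { before[name] = $2; next }     # first file: remember values
        (name in before) && before[name] != $2 {  # second file: report changes
            printf "%s: %s -> %s\n", name, before[name], $2
        }
    ' "$1" "$2"
}

# Example use on the real interface (eth0 as in the thread):
#   ethtool -S eth0 > /tmp/stats.before
#   sleep 5
#   ethtool -S eth0 > /tmp/stats.after
#   stats_diff /tmp/stats.before /tmp/stats.after
```

If receive counters keep advancing while transmit counters stand still, that points at the transmit side; pause-frame counters that keep climbing would point at flow control instead.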

Next you can turn on debug with:
ethtool -s eth0 msglvl 0xfff

and see if packets are being received and transmitted. 

In your case, it looks like the driver is receiving and transmitting fine;
so it probably is in the upper layers. So look into higher level statistics
like: netstat (or ip and ss).


Other possibilities:

* turn off TSO
ethtool -K eth0 tso off
* turn off Tx checksum
ethtool -K eth0 tx off
* turn off Rx checksum
ethtool -K eth0 rx off
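
To tell which of these offloads (if any) matters, it may help to switch off one at a time around a single reproduction run. The wrapper below is a sketch under assumptions: with_offload_off and the ETHTOOL override are invented for illustration, and it assumes the offload was enabled beforehand, since it turns it back on afterwards.

```shell
# with_offload_off DEV FEATURE COMMAND...
# Disable one offload, run the test command, then switch the offload
# back on. Set ETHTOOL=echo for a dry run that only prints the calls.
with_offload_off() {
    dev=$1; feature=$2; shift 2
    "${ETHTOOL:-ethtool}" -K "$dev" "$feature" off || return 1
    "$@"; status=$?
    "${ETHTOOL:-ethtool}" -K "$dev" "$feature" on
    return "$status"
}

# Example use (the test command is whatever reproduces the hang):
#   with_offload_off eth0 tso ./reproduce.sh
```

Here ./reproduce.sh is a placeholder for the reproduction case; running the wrapper once each for tso, tx and rx keeps the three experiments independent.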

If you get things really wedged and want to dig into the driver, you can
look at all the registers, but it really takes a lot of time to decode...

ethtool -d eth0 raw on > /tmp/eth0.dump
hexdump /tmp/eth0.dump
