Re: PROBLEM: Network sky2 Module, kernel version 2.6.23-rc7

2007-09-28 Thread Bill Davidsen

ben soo wrote:
i spoke too soon.  The Gbit interface still dies.  Lasted around 19hrs. 
or so.  i can't tell if there are hardware issues: yesterday a Gbit NIC 
on the firewall died.  Different chip (Realtek), different driver, 
different machine, same segment.  Segment is a mix of 100Mbit and 1Gbit 
machines.


Symptoms of the failure are it just stops functioning with no error 
messages.  ifconfig says there are packets being TX'd and none being 
RX'd.  Interface can't be brought up again.




If you search through my exchanges with Adrian Bunk WRT sk98lin removal, 
I mentioned a very similar problem. When I wrote that I had some notes 
in front of me, which are now archived and not quickly available. 
unloading and reloading the modules didn't fix it, IIRC.


I will be doing an update on the machine in question by the end of the 
year, and at that time I will try shy2 again, since I'll be able to do 
better testing.


--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PROBLEM: Network sky2 Module, kernel version 2.6.23-rc7

2007-09-28 Thread Bill Davidsen

ben soo wrote:
i spoke too soon.  The Gbit interface still dies.  Lasted around 19hrs. 
or so.  i can't tell if there are hardware issues: yesterday a Gbit NIC 
on the firewall died.  Different chip (Realtek), different driver, 
different machine, same segment.  Segment is a mix of 100Mbit and 1Gbit 
machines.


Symptoms of the failure are it just stops functioning with no error 
messages.  ifconfig says there are packets being TX'd and none being 
RX'd.  Interface can't be brought up again.




If you search through my exchanges with Adrian Bunk WRT sk98lin removal, 
I mentioned a very similar problem. When I wrote that I had some notes 
in front of me, which are now archived and not quickly available. 
unloading and reloading the modules didn't fix it, IIRC.


I will be doing an update on the machine in question by the end of the 
year, and at that time I will try shy2 again, since I'll be able to do 
better testing.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PROBLEM: Network sky2 Module, kernel version 2.6.23-rc7

2007-09-24 Thread Stephen Hemminger
On Sat, 22 Sep 2007 21:56:04 -0400
ben soo <[EMAIL PROTECTED]> wrote:

> i spoke too soon.  The Gbit interface still dies.  Lasted around 
> 19hrs. or so.  i can't tell if there are hardware issues: yesterday 
> a Gbit NIC on the firewall died.  Different chip (Realtek), 
> different driver, different machine, same segment.  Segment is a 
> mix of 100Mbit and 1Gbit machines.
> 
> Symptoms of the failure are it just stops functioning with no error 
> messages.  ifconfig says there are packets being TX'd and none 
> being RX'd.  Interface can't be brought up again.
> 
> i use this box as the secondary DNS for our Internet domains 
> (unblocked by firewall), as well as the database server for our 
> developmental web CMS and the LAN ntp server (both services 
> invisible outside the LAN).
> 
> 
> dmesg says this on bootup:
> 
> > ACPI: PCI Interrupt :01:00.0[A] -> GSI 18 (level, low) -> IRQ 16
> > PCI: Setting latency timer of device :01:00.0 to 64
> > sky2 :01:00.0: v1.17 addr 0xdfefc000 irq 16 Yukon-EC (0xb6) rev 2
> > sky2 eth0: addr 00:01:29:d6:35:f4
> > ACPI: PCI Interrupt :02:00.0[A] -> GSI 19 (level, low) -> IRQ 17
> > PCI: Setting latency timer of device :02:00.0 to 64
> > sky2 :02:00.0: v1.17 addr 0xdfcfc000 irq 17 Yukon-EC (0xb6) rev 2
> > sky2 eth1: addr 00:01:29:d6:36:26

Does your network switch support flowcontrol?  The Yukon-EC may have
a hardware problem that happens if FIFO gets full. If flow control is enabled,
that shouldn't happen. 

The newest version of the driver has a watchdog to try
and detect and do an automatic reset, but it still means there would be about
2 seconds of downtime.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PROBLEM: Network sky2 Module, kernel version 2.6.23-rc7

2007-09-24 Thread Stephen Hemminger
On Sat, 22 Sep 2007 21:56:04 -0400
ben soo [EMAIL PROTECTED] wrote:

 i spoke too soon.  The Gbit interface still dies.  Lasted around 
 19hrs. or so.  i can't tell if there are hardware issues: yesterday 
 a Gbit NIC on the firewall died.  Different chip (Realtek), 
 different driver, different machine, same segment.  Segment is a 
 mix of 100Mbit and 1Gbit machines.
 
 Symptoms of the failure are it just stops functioning with no error 
 messages.  ifconfig says there are packets being TX'd and none 
 being RX'd.  Interface can't be brought up again.
 
 i use this box as the secondary DNS for our Internet domains 
 (unblocked by firewall), as well as the database server for our 
 developmental web CMS and the LAN ntp server (both services 
 invisible outside the LAN).
 
 
 dmesg says this on bootup:
 
  ACPI: PCI Interrupt :01:00.0[A] - GSI 18 (level, low) - IRQ 16
  PCI: Setting latency timer of device :01:00.0 to 64
  sky2 :01:00.0: v1.17 addr 0xdfefc000 irq 16 Yukon-EC (0xb6) rev 2
  sky2 eth0: addr 00:01:29:d6:35:f4
  ACPI: PCI Interrupt :02:00.0[A] - GSI 19 (level, low) - IRQ 17
  PCI: Setting latency timer of device :02:00.0 to 64
  sky2 :02:00.0: v1.17 addr 0xdfcfc000 irq 17 Yukon-EC (0xb6) rev 2
  sky2 eth1: addr 00:01:29:d6:36:26

Does your network switch support flowcontrol?  The Yukon-EC may have
a hardware problem that happens if FIFO gets full. If flow control is enabled,
that shouldn't happen. 

The newest version of the driver has a watchdog to try
and detect and do an automatic reset, but it still means there would be about
2 seconds of downtime.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PROBLEM: Network sky2 Module, kernel version 2.6.23-rc7

2007-09-22 Thread ben soo
i spoke too soon.  The Gbit interface still dies.  Lasted around 
19hrs. or so.  i can't tell if there are hardware issues: yesterday 
a Gbit NIC on the firewall died.  Different chip (Realtek), 
different driver, different machine, same segment.  Segment is a 
mix of 100Mbit and 1Gbit machines.


Symptoms of the failure are it just stops functioning with no error 
messages.  ifconfig says there are packets being TX'd and none 
being RX'd.  Interface can't be brought up again.


i use this box as the secondary DNS for our Internet domains 
(unblocked by firewall), as well as the database server for our 
developmental web CMS and the LAN ntp server (both services 
invisible outside the LAN).



dmesg says this on bootup:


ACPI: PCI Interrupt :01:00.0[A] -> GSI 18 (level, low) -> IRQ 16
PCI: Setting latency timer of device :01:00.0 to 64
sky2 :01:00.0: v1.17 addr 0xdfefc000 irq 16 Yukon-EC (0xb6) rev 2
sky2 eth0: addr 00:01:29:d6:35:f4
ACPI: PCI Interrupt :02:00.0[A] -> GSI 19 (level, low) -> IRQ 17
PCI: Setting latency timer of device :02:00.0 to 64
sky2 :02:00.0: v1.17 addr 0xdfcfc000 irq 17 Yukon-EC (0xb6) rev 2
sky2 eth1: addr 00:01:29:d6:36:26


Motherboard manufacturer identifies the chips as:

Dual Gigabit LAN - Marvell 88E8052 and Marvell 88E8053 Gigabit PCI LAN 


board is DFI LANPARTY UT CFX3200-DR/G running an Opteron 165.

i've brought up the other Gbit network interface on the motherboard 
and am using that.


ben soo wrote:
i turned off the motherboard Marvell Gbit devices and installed a 
Realtek 8169 card, all the while keeping to kernel version 2.6.23-rc6, 
and saw all the network problems go away.  Must mean it was a sky2 
driver bug.


Am currently running 2.6.23-rc7 on the affected server and the rc7 sky2 
driver is holding up fine so far.


thank you!

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PROBLEM: Network sky2 Module, kernel version 2.6.23-rc7

2007-09-22 Thread ben soo
i spoke too soon.  The Gbit interface still dies.  Lasted around 
19hrs. or so.  i can't tell if there are hardware issues: yesterday 
a Gbit NIC on the firewall died.  Different chip (Realtek), 
different driver, different machine, same segment.  Segment is a 
mix of 100Mbit and 1Gbit machines.


Symptoms of the failure are it just stops functioning with no error 
messages.  ifconfig says there are packets being TX'd and none 
being RX'd.  Interface can't be brought up again.


i use this box as the secondary DNS for our Internet domains 
(unblocked by firewall), as well as the database server for our 
developmental web CMS and the LAN ntp server (both services 
invisible outside the LAN).



dmesg says this on bootup:


ACPI: PCI Interrupt :01:00.0[A] - GSI 18 (level, low) - IRQ 16
PCI: Setting latency timer of device :01:00.0 to 64
sky2 :01:00.0: v1.17 addr 0xdfefc000 irq 16 Yukon-EC (0xb6) rev 2
sky2 eth0: addr 00:01:29:d6:35:f4
ACPI: PCI Interrupt :02:00.0[A] - GSI 19 (level, low) - IRQ 17
PCI: Setting latency timer of device :02:00.0 to 64
sky2 :02:00.0: v1.17 addr 0xdfcfc000 irq 17 Yukon-EC (0xb6) rev 2
sky2 eth1: addr 00:01:29:d6:36:26


Motherboard manufacturer identifies the chips as:

Dual Gigabit LAN - Marvell 88E8052 and Marvell 88E8053 Gigabit PCI LAN 


board is DFI LANPARTY UT CFX3200-DR/G running an Opteron 165.

i've brought up the other Gbit network interface on the motherboard 
and am using that.


ben soo wrote:
i turned off the motherboard Marvell Gbit devices and installed a 
Realtek 8169 card, all the while keeping to kernel version 2.6.23-rc6, 
and saw all the network problems go away.  Must mean it was a sky2 
driver bug.


Am currently running 2.6.23-rc7 on the affected server and the rc7 sky2 
driver is holding up fine so far.


thank you!

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/