David Sommerseth wrote:
[email protected] wrote:
PCI-X dual port Broadcom NetXtreme BCM5704 Gigabit Ethernet (rev 03)
adapter is working fine here driven by tg3, 2.6.27-hardened-r1. The driver
doesn't seem to be borked with my card.

Did you check out the "error" field of ifconfig's output for the interface
of your card?

Regards,
Dw.

Hmmm ... No, I have not had that opportunity. The server is located 2000 km away from me, and I usually call a guy (who is not a technician) to go in and press CTRL-ALT-DEL on a keyboard. That is the short-term "fix". But I'm going to have a physical look at the server in a couple of weeks, so if I get positive feedback from others as well regarding the 2.6.27 kernel, I'm willing to try that upgrade.

This interface is an on-board interface in an IBM eServer. The first time it happened, there had been no problems for about 28 days. This time it took 13 days. So I expect it to happen again soon enough.

I'll try to hack the shutdown scripts to dump the ifconfig info somewhere somehow.

Then it happened again ... and I have ifconfig stats for the interface:

eth0      Link encap:Ethernet  HWaddr 00:14:5e:5d:3c:d0
          inet6 addr: fe80::214:5eff:fe5d:3cd0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10551633 errors:4294967239 dropped:767 overruns:0 frame:170
          TX packets:9371606 errors:4294967239 dropped:0 overruns:0 carrier:0
          collisions:4294967239 txqueuelen:1000
          RX bytes:28237000 (26.9 MiB)  TX bytes:163377979 (155.8 MiB)
          Interrupt:16

From the kernel log I see this:

Dec 12 12:19:21 fw [74355.059369] tg3: tg3_abort_hw timed out for world, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
Dec 12 12:19:24 fw [74357.842979] tg3: world: No firmware running.
Dec 12 12:19:41 fw [74374.992867] tg3: world: Link is down.

I'm surprised by the error and collision numbers here, as I checked them the other day and all of them were 0. I also know that the TX and RX values were above 3-4 GB, but I don't remember which was which.

Could this be an overflow bug of some kind?
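
As a quick sanity check on the overflow idea, here is a minimal Python sketch that reinterprets the counter values from the ifconfig output as signed 32-bit integers. The helper name `as_signed32` is made up for illustration. All three counters show the same value, which sits just below 2**32, so they are consistent with a small negative number stored in an unsigned 32-bit field. This only shows the numbers fit that pattern; it does not say where in the driver or hardware it happens. Note also that the kernel log reports MAC_TX_MODE=ffffffff, and all-ones register reads are what one commonly sees when a PCI device has stopped responding, so the statistics may simply have been computed from garbage reads. That part is a guess.

```python
# Counter values copied from the ifconfig output for eth0.
counters = {
    "rx_errors": 4294967239,
    "tx_errors": 4294967239,
    "collisions": 4294967239,
}


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


for name, value in counters.items():
    # Distance below 2**32 and the signed interpretation of the same bits.
    print(f"{name}: 2**32 - {2**32 - value}, signed: {as_signed32(value)}")
```

Each counter comes out as 2**32 - 57, that is -57 when read as a signed value.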

I have also found out that IBM has released updated firmware for this network device, so I'll try to upgrade it during Christmas when I'm close to the box again. In the meantime I have a little ping script which restarts the network (including reloading the tg3 module) when the network dies. This restart gives me minimal downtime.
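
For anyone wanting a similar stopgap, a ping watchdog of this kind could look roughly like the Python sketch below. It is not the actual script. The ping target, the failure threshold, the polling interval and the init script path (`/etc/init.d/net.eth0`) are all assumptions and need adjusting for the real system. It must run as root for `modprobe` to work.

```python
import subprocess
import time

# Everything below is an assumption for illustration, not the real setup.
PING_TARGET = "192.0.2.1"  # hypothetical address of a host that should always answer
RECOVERY_COMMANDS = [
    ["modprobe", "-r", "tg3"],            # unload the driver
    ["modprobe", "tg3"],                  # load it again
    ["/etc/init.d/net.eth0", "restart"],  # assumed init script path
]


def host_alive(target, run=subprocess.run):
    """Return True if a single ping to target gets a reply."""
    result = run(["ping", "-c", "1", "-W", "2", target],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def recover(run=subprocess.run):
    """Reload the tg3 module and restart the network, in order."""
    for cmd in RECOVERY_COMMANDS:
        run(cmd, check=False)


def watchdog_step(failures, threshold, alive):
    """Return (new_failure_count, should_recover) after one ping attempt."""
    if alive:
        return 0, False
    failures += 1
    if failures >= threshold:
        return 0, True
    return failures, False


def main(interval=30, threshold=3):
    """Poll forever; recover after `threshold` consecutive failed pings."""
    failures = 0
    while True:
        failures, should_recover = watchdog_step(
            failures, threshold, host_alive(PING_TARGET))
        if should_recover:
            recover()
        time.sleep(interval)
```

Calling `main()` from an init script starts the loop. Requiring several consecutive failures before reloading the module avoids bouncing the interface on a single lost ping.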

But I do not understand why this box was so rock solid until I upgraded from 2.6.22-hardened-r8 to 2.6.25-hardened-r8. The new kernel driver obviously does something it didn't do before. Unfortunately, I can't find anything in the kernel git logs for the tg3.[ch] files that would pinpoint the cause.


Does anyone have any experience with firmware upgrades on these cards? The instructions seem pretty straightforward, but if you know of anything to watch out for, whatever it is, I would appreciate hearing about it.


kind regards,

David Sommerseth
