Re: [E1000-devel] [PATCH] e100: Add missing dma sync for proper operation with non-coherent caches.
David Daney wrote:
> I am running the e100 driver on a MIPS 4KEc system (32 bit mips with
> non-coherent DMA). There was a problem where received packets would
> get 'stuck' for several seconds at a time and then be released all at
> once.
>
> The cause was that if an interrupt were received when no RX packets
> were available, the status for the receive buffer would be stuck in
> the cache, so when the next interrupt arrived the old status value was
> read (indicating no packets available) instead of the new value.
>
> The fix is to call pci_dma_sync_single_for_device on the RX if the
> packet is not available to invalidate the cache so that at the next
> interrupt valid status is returned.
>
> The driver currently calls pci_dma_sync_single_for_cpu before reading
> the status, and this is indeed needed for cases like the R1 CPU where
> the cache can be polluted by speculative execution, but for most
> machines it is a nop.
>
> The patch was tested on 2.6.17-rc4 on a MIPS 4KEc.

lol, that's a bit old :)

A LOT of work has gone into 2.6.26+'s version of e100 that addresses
specifically non-coherent DMA machines, including a patch that looks very
close to what you write below here.

can you test the latest git tree and see if you still need (a modified or
updated version) of your patch? I know for sure that the guys maintaining
e100 do not have anything close to your system, so they can't test that.

Cheers,

Auke

> Signed-off-by: David Daney [EMAIL PROTECTED]
> ---
>  drivers/net/e100.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/e100.c b/drivers/net/e100.c
> index 19d32a2..fb8d551 100644
> --- a/drivers/net/e100.c
> +++ b/drivers/net/e100.c
> @@ -1840,6 +1840,11 @@ static int e100_rx_indicate(struct nic *nic, struct rx *rx,
>  			if (readb(&nic->csr->scb.status) & rus_no_res)
>  				nic->ru_running = RU_SUSPENDED;
> +		/* We are done looking at the buffer.  Prepare it for
> +		 * more DMA.  */
> +		pci_dma_sync_single_for_device(nic->pdev, rx->dma_addr,
> +					       sizeof(struct rfd),
> +					       PCI_DMA_FROMDEVICE);
>  		return -ENODATA;
>  	}

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
Re: [E1000-devel] [PATCH] e100: Add missing dma sync for proper operation with non-coherent caches.
Kok, Auke wrote:
> David Daney wrote:
>> I am running the e100 driver on a MIPS 4KEc system (32 bit mips with
>> non-coherent DMA). There was a problem where received packets would
>> get 'stuck' for several seconds at a time and then be released all at
>> once.
>>
>> The cause was that if an interrupt were received when no RX packets
>> were available, the status for the receive buffer would be stuck in
>> the cache, so when the next interrupt arrived the old status value was
>> read (indicating no packets available) instead of the new value.
>>
>> The fix is to call pci_dma_sync_single_for_device on the RX if the
>> packet is not available to invalidate the cache so that at the next
>> interrupt valid status is returned.
>>
>> The driver currently calls pci_dma_sync_single_for_cpu before reading
>> the status, and this is indeed needed for cases like the R1 CPU where
>> the cache can be polluted by speculative execution, but for most
>> machines it is a nop.
>>
>> The patch was tested on 2.6.17-rc4 on a MIPS 4KEc.
>
> lol, that's a bit old :)

It was a typo. The real version is 2.6.27-rc4 (aka the HEAD). The bug is
present on the HEAD.

Sorry for the confusion.

David Daney.

> A LOT of work has gone into 2.6.26+'s version of e100 that addresses
> specifically non-coherent DMA machines, including a patch that looks very
> close to what you write below here.
>
> can you test the latest git tree and see if you still need (a modified or
> updated version) of your patch? I know for sure that the guys maintaining
> e100 do not have anything close to your system, so they can't test that.
>
> Cheers,
>
> Auke
>
>> Signed-off-by: David Daney [EMAIL PROTECTED]
>> ---
>>  drivers/net/e100.c |    5 +++++
>>  1 files changed, 5 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/net/e100.c b/drivers/net/e100.c
>> index 19d32a2..fb8d551 100644
>> --- a/drivers/net/e100.c
>> +++ b/drivers/net/e100.c
>> @@ -1840,6 +1840,11 @@ static int e100_rx_indicate(struct nic *nic, struct rx *rx,
>>  			if (readb(&nic->csr->scb.status) & rus_no_res)
>>  				nic->ru_running = RU_SUSPENDED;
>> +		/* We are done looking at the buffer.  Prepare it for
>> +		 * more DMA.  */
>> +		pci_dma_sync_single_for_device(nic->pdev, rx->dma_addr,
>> +					       sizeof(struct rfd),
>> +					       PCI_DMA_FROMDEVICE);
>>  		return -ENODATA;
>>  	}
Re: [E1000-devel] how to repair corrupted EEPROM/NVM?
Pierre Ossman wrote:
>>> I've just noticed that the e1000e has delightfully made poo poo all
>>> over my EEPROM (something David Vrabel also has reported). Shit
>>> happens and all that I guess, but how do I get the thing back in a
>>> working order? Couldn't find anything useful on the interwebs...
>>
>> were you testing eeprom writes? We don't recommend changing your
>> eeprom except in extreme cases. We don't have publicly available
>> tools to fix eeproms due to the design specific changes that are made
>> to each eeprom.
>
> I haven't done anything outside of normal end user usage. The card just
> got corrupted as of late, and when I saw David Vrabel's comments I
> assumed it was the driver that borked it.

driver corruption is highly unlikely, the driver doesn't write to the
eeprom unless told to do so by the user. More than likely either your
system is being managed by iAMT and was somehow corrupted by that
firmware, or you have a hardware failure. One avenue at this point is
contacting the customer support of your laptop vendor.

>> if you saved a dump of your eeprom before you started writing to it,
>> you can program each byte back into place using ethtool (if the device
>> shows up)
>
> The device shows up on the PCI bus, but no network device (as the driver
> refuses to continue once it detects a bad checksum). I have no dump
> though since it had never occurred to me that I needed to keep backups
> of my NIC's EEPROM. :)

you can try bypassing the return when the eeprom checksum fails, and see
if the driver will load. If the MAC address check fails after that you
can try bypassing that too.

> A colleague has an identical machine though, could that be used with
> some manual fiddling of the MAC address bytes?

yes. I would start by getting the driver to load (if it will) and then
doing ethtool -e on both machines and comparing results.

>> If you happen to be using a LOM or e1000e part that is integrated with
>> a chipset like ich8/9 it is likely one of the bios update(s) will
>> upgrade/repair the eeprom (since on those parts the eeprom is embedded
>> as part of the BIOS flash rom).
>
> This is a laptop, so it is most probably built in. But isn't the MAC
> address stored in EEPROM? A BIOS image is not machine specific, so I
> don't see how that would be able to fully restore the data.

depending on the laptop it might be a discrete part with a discrete
eeprom, or it could be integrated. I don't know because you never
mentioned what type of hardware you're using nor did you post an
lspci -vvv
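The "manual fiddling of the MAC address bytes" discussed above also has to leave the checksum valid, otherwise the driver will refuse to load again. A minimal sketch of that bookkeeping, assuming the generic Intel NVM layout (MAC address in words 0 to 2, low byte first, and the 16-bit sum of words 0x00 through 0x3F equal to 0xBABA). The function names are hypothetical, this is not driver code, and whether a given ICH8/9 part follows this layout exactly is an assumption to check against the driver source:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVM_CHECKSUM_WORD 0x3F    /* word that holds the checksum */
#define NVM_SUM           0xBABA  /* required 16-bit total of words 0x00..0x3F */

/* Sum the first `count` 16-bit words, wrapping modulo 2^16. */
uint16_t nvm_sum_words(const uint16_t *words, size_t count)
{
    uint16_t sum = 0;

    assert(words != NULL);
    for (size_t i = 0; i < count; i++)
        sum = (uint16_t)(sum + words[i]);
    return sum;
}

/* Returns 1 if words 0x00..0x3F add up to NVM_SUM, else 0. */
int nvm_checksum_ok(const uint16_t *words)
{
    return nvm_sum_words(words, NVM_CHECKSUM_WORD + 1) == NVM_SUM;
}

/* Store a 6-byte MAC address into words 0..2, low byte first. */
void nvm_set_mac(uint16_t *words, const uint8_t mac[6])
{
    assert(words != NULL && mac != NULL);
    for (size_t i = 0; i < 3; i++)
        words[i] = (uint16_t)(mac[2 * i] | (mac[2 * i + 1] << 8));
}

/* Recompute the checksum word so that the total becomes NVM_SUM. */
void nvm_fix_checksum(uint16_t *words)
{
    uint16_t sum = nvm_sum_words(words, NVM_CHECKSUM_WORD);

    words[NVM_CHECKSUM_WORD] = (uint16_t)(NVM_SUM - sum);
}
```

With a colleague's image loaded into an array of 64 words, the order would be nvm_set_mac() with the machine's own address followed by nvm_fix_checksum(), then writing the changed bytes back with ethtool as suggested above.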
Re: [E1000-devel] [PATCH] e100: Add missing dma sync for proper operation with non-coherent caches.
David Daney wrote:
> I am running the e100 driver on a MIPS 4KEc system (32 bit mips with
> non-coherent DMA). There was a problem where received packets would
> get 'stuck' for several seconds at a time and then be released all at
> once.
>
> The cause was that if an interrupt were received when no RX packets
> were available, the status for the receive buffer would be stuck in
> the cache, so when the next interrupt arrived the old status value was
> read (indicating no packets available) instead of the new value.
>
> The fix is to call pci_dma_sync_single_for_device on the RX if the
> packet is not available to invalidate the cache so that at the next
> interrupt valid status is returned.
>
> The driver currently calls pci_dma_sync_single_for_cpu before reading
> the status, and this is indeed needed for cases like the R1 CPU where
> the cache can be polluted by speculative execution, but for most
> machines it is a nop.
>
> The patch was tested on 2.6.17-rc4 on a MIPS 4KEc.
>
> Signed-off-by: David Daney [EMAIL PROTECTED]
> ---
>  drivers/net/e100.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/e100.c b/drivers/net/e100.c
> index 19d32a2..fb8d551 100644
> --- a/drivers/net/e100.c
> +++ b/drivers/net/e100.c
> @@ -1840,6 +1840,11 @@ static int e100_rx_indicate(struct nic *nic, struct rx *rx,
>  			if (readb(&nic->csr->scb.status) & rus_no_res)
>  				nic->ru_running = RU_SUSPENDED;
> +		/* We are done looking at the buffer.  Prepare it for
> +		 * more DMA.  */
> +		pci_dma_sync_single_for_device(nic->pdev, rx->dma_addr,
> +					       sizeof(struct rfd),
> +					       PCI_DMA_FROMDEVICE);
>  		return -ENODATA;
>  	}

Should the call to pci_dma_sync_single_for_device be DMA_TO_DEVICE since
we are giving the memory back to the device?

-ack
Re: [E1000-devel] [PATCH] e100: Add missing dma sync for proper operation with non-coherent caches.
David Acker wrote:
> David Daney wrote:
>> diff --git a/drivers/net/e100.c b/drivers/net/e100.c
>> index 19d32a2..fb8d551 100644
>> --- a/drivers/net/e100.c
>> +++ b/drivers/net/e100.c
>> @@ -1840,6 +1840,11 @@ static int e100_rx_indicate(struct nic *nic, struct rx *rx,
>>  			if (readb(&nic->csr->scb.status) & rus_no_res)
>>  				nic->ru_running = RU_SUSPENDED;
>> +		/* We are done looking at the buffer.  Prepare it for
>> +		 * more DMA.  */
>> +		pci_dma_sync_single_for_device(nic->pdev, rx->dma_addr,
>> +					       sizeof(struct rfd),
>> +					       PCI_DMA_FROMDEVICE);
>>  		return -ENODATA;
>>  	}
>
> Should the call to pci_dma_sync_single_for_device be DMA_TO_DEVICE since
> we are giving the memory back to the device?

No. We are giving the memory back to the device, but the direction of
the data transfer is from the device to memory.

David Daney
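The stale-status behaviour described in this thread can be illustrated with a toy userspace model. This is only a sketch, not kernel code: every name in it is hypothetical, the "cache" is a single copied word, and toy_sync_for_device() merely stands in for the invalidate that, according to the patch description, pci_dma_sync_single_for_device(..., PCI_DMA_FROMDEVICE) performs on the non-coherent MIPS platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CB_COMPLETE 0x8000u   /* hypothetical "packet ready" bit */

/* One RX status word as seen from both sides of a non-coherent bus. */
struct toy_rfd {
    uint16_t ram_status;     /* what the device last wrote by DMA */
    uint16_t cached_status;  /* copy held in the CPU cache */
    bool     cache_valid;    /* false once the line is invalidated */
};

/* Device side: DMA writes land in RAM and never update the CPU cache. */
void toy_device_complete(struct toy_rfd *rfd)
{
    assert(rfd != NULL);
    rfd->ram_status |= TOY_CB_COMPLETE;
}

/* CPU side: a load hits the cache if the line is valid, else refills it. */
uint16_t toy_cpu_read_status(struct toy_rfd *rfd)
{
    assert(rfd != NULL);
    if (!rfd->cache_valid) {
        rfd->cached_status = rfd->ram_status;
        rfd->cache_valid = true;
    }
    return rfd->cached_status;
}

/* Stand-in for handing the buffer back to the device: drop the cached copy. */
void toy_sync_for_device(struct toy_rfd *rfd)
{
    assert(rfd != NULL);
    rfd->cache_valid = false;
}

/* Poll once. Returns true if a completed packet was seen. When nothing is
 * ready and sync_when_empty is set, the buffer is handed back to the device,
 * which is the step the patch adds. */
bool toy_rx_poll(struct toy_rfd *rfd, bool sync_when_empty)
{
    bool done = (toy_cpu_read_status(rfd) & TOY_CB_COMPLETE) != 0;

    if (!done && sync_when_empty)
        toy_sync_for_device(rfd);
    return done;
}
```

In this model, polling an empty buffer without the sync leaves the old status cached, so a later toy_device_complete() goes unnoticed; with the sync, the next poll refills from RAM and sees the packet. The direction argument in the real call stays "from device" in both cases because it describes which way the data moves, not who owns the buffer next.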
Re: [E1000-devel] NFS regression? Odd delays and lockups accessing an NFS export.
On Mon, Aug 25, 2008 at 11:04:08AM -0500, Tom Tucker wrote:
> Trond Myklebust wrote:
>> On Sun, 2008-08-24 at 23:09 +0100, Ian Campbell wrote:
>>> (added some quoting from previous mail to save replying twice)
>>>
>>> On Sun, 2008-08-24 at 15:19 -0400, Trond Myklebust wrote:
>>>> On Sun, 2008-08-24 at 15:17 -0400, Trond Myklebust wrote:
>>>>> From the tcpdump, it looks as if the NFS server is failing to
>>>>> close the socket, when the client closes its side. You therefore
>>>>> end up getting stuck in the FIN_WAIT2 state (as netstat clearly
>>>>> shows above).
>>>>
>>>> Is the server keeping the client in this state for a very long
>>>> period?
>>>
>>> Well, it had been around an hour and a half on this occasion. Next
>>> time it happens I can wait longer but I'm pretty sure I've come back
>>> from time away and it's been wedged for at least a day. How long
>>> would you expect it to remain in this state for?
>>
>> The server should ideally start to close the socket as soon as it
>> receives the FIN from the client. I'll have a look at the code.
>
> I don't think it should matter how long the connection stays in FIN
> WAIT, the client should reconnect anyway. Since the client seems to be
> the variable, I would think it might be an issue with the client
> reconnect logic?
>
> That said, 2.6.25 is when the server side transport switch logic went
> in.

Any chance you could help Trond figure out why the server might be doing
this? If not, I'll get to it, but not as soon as I should.

--b.
[E1000-devel] Port showing link down at random times
Hi All,
I'm having a problem with my e1000 cards intermittently shutting down.
This is happening across multiple cards, on multiple systems. I get the
following messages in the syslog at the time:

Aug 26 22:58:36 ElizaQOS kernel: e1000: eth13: e1000_watchdog: NIC Link is Down

ethtool and mii-tool both show a different status for the affected port:

---
[EMAIL PROTECTED]:~$ sudo mii-tool -v eth13
eth13: negotiated 100baseTx-FD flow-control, link ok
  product info: vendor 00:50:43, model 2 rev 5
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

[EMAIL PROTECTED]:~$ sudo ethtool eth13
Settings for eth13:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: Unknown! (65535)
        Duplex: Unknown! (255)
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: umbg
        Wake-on: d
        Current message level: 0x0007 (7)
        Link detected: no
---

The port can be brought up by issuing this command:

ethtool -s eth13 speed 100 duplex full autoneg on

which I assume simply restarts autonegotiation. Turning off autoneg and
setting the speed/duplex doesn't seem to help.

This is on a Debian 2.6.18 kernel. lspci and lspci -t attached.

Any ideas what might be happening?

---
[EMAIL PROTECTED]:~$ lspci
00:00.0 Host bridge: Intel Corporation Q963/Q965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation Q963/Q965 PCI Express Root Port (rev 02)
00:02.0 VGA compatible controller: Intel Corporation Q963/Q965 Integrated Graphics Controller (rev 02)
00:03.0 Communication controller: Intel Corporation Q963/Q965 HECI Controller (rev 02)
00:19.0 Ethernet controller: Intel Corporation 82566DM Gigabit Network Connection (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HO (ICH8DO) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
03:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
04:0c.0 PCI bridge: Pericom Semiconductor Unknown device 01a7
04:0d.0 PCI bridge: Pericom Semiconductor Unknown device 01a7
04:0e.0 PCI bridge: Pericom Semiconductor Unknown device 01a7
04:0f.0 PCI bridge: Pericom Semiconductor Unknown device 01a7
05:0b.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
05:0b.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
05:0d.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
05:0d.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
06:0b.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
06:0b.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
06:0d.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
06:0d.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
07:0b.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
07:0b.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
07:0d.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
07:0d.1 Ethernet controller: Intel Corporation
Re: [E1000-devel] Port showing link down at random times
I think there is some confusion here. Either both sides of the link need
to be in auto-neg mode or both sides need to be forced to something like
100M/Full Duplex. This is by spec. If one side is forced and the other
is set to auto-neg, unpredictable things can happen to the link. Please
try forcing both sides of the link to 100/full and see if the problem
goes away.

I'm assuming that you are on a switch and not a hub, is this correct?
And that it is a 10/100Mbps switch. Is this also correct?

Cheers,
John
---
"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.", Benjamin
Franklin 1755
Re: [E1000-devel] Port showing link down at random times
Leigh Sharpe wrote:
> Hi All,
> I'm having a problem with my e1000 cards intermittently shutting down.
> This is happening across multiple cards, on multiple systems. I get the
> following messages in the syslog at the time:
>
> Aug 26 22:58:36 ElizaQOS kernel: e1000: eth13: e1000_watchdog: NIC Link is Down

This message is from the driver and it means that the interface lost
link for some reason. What is the link partner?

> ethtool and mii-tool both show a different status for the affected

Use ethtool.

> [EMAIL PROTECTED]:~$ sudo ethtool eth13
> Settings for eth13:
>         Supported ports: [ TP ]
>         Supported link modes:   10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Supports auto-negotiation: Yes
>         Advertised link modes:  10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Advertised auto-negotiation: Yes
>         Speed: Unknown! (65535)
>         Duplex: Unknown! (255)
>         Port: Twisted Pair
>         PHYAD: 0
>         Transceiver: internal
>         Auto-negotiation: on
>         Supports Wake-on: umbg
>         Wake-on: d
>         Current message level: 0x0007 (7)
>         Link detected: no
> ---

Ethtool output matches with the e1000 link down message.

> The port can be brought up by issuing this command:
>
> ethtool -s eth13 speed 100 duplex full autoneg on

This command is actually incorrect. To force speed/duplex you need to
use autoneg off, but as John pointed out this will be an invalid
configuration if your link partner is set to auto. To reset the
autonegotiation use ethtool -r instead.

> which I assume simply restarts autonegotiation. Turning off autoneg and
> setting the speed/duplex doesn't seem to help.
>
> This is on a Debian 2.6.18 kernel. lspci and lspci -t attached.

What is the version of e1000?

> Any ideas what might be happening?

It's not obvious and I have not seen the link go down unless the cable
was disconnected, or perhaps a problem with the link partner of some
sort.

Thanks,
Emil
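The "Unknown! (65535)" and "Unknown! (255)" values in the quoted ethtool output are what one would expect if a driver stores -1 into a 16-bit speed field and an 8-bit duplex field while it has no link, which fits the "Link detected: no" line. A minimal sketch of that truncation, assuming the field widths of the legacy ethtool settings structure; the struct and function names are hypothetical and this is not the e1000 source:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the speed/duplex part of the legacy ethtool
 * settings: a 16-bit speed in Mb/s and an 8-bit duplex code. */
struct toy_link_settings {
    uint16_t speed;
    uint8_t  duplex;
};

/* Report the negotiated values when the carrier is up, otherwise store -1
 * in both fields, which reads back as 65535 and 255. */
void toy_get_settings(int carrier_ok, uint16_t link_speed,
                      uint8_t link_duplex, struct toy_link_settings *out)
{
    assert(out != NULL);
    if (carrier_ok) {
        out->speed = link_speed;
        out->duplex = link_duplex;
    } else {
        out->speed = (uint16_t)-1;
        out->duplex = (uint8_t)-1;
    }
}
```

Read this way, the two "Unknown!" lines carry no extra information beyond the link being down; they do not by themselves point at a speed or duplex mismatch.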
Re: [E1000-devel] Port showing link down at random times
The same thing is happening irrespective of whether I force both sides
to the same speed and duplex, or if I use autoneg on both sides.
It's happening when connected to a switch, or to a PC, or a hub.

-----Original Message-----
From: Ronciak, John [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 27 August 2008 9:47 AM
To: Leigh Sharpe; e1000-devel@lists.sourceforge.net
Subject: RE: [E1000-devel] Port showing link down at random times

I think there is some confusion here. Either both sides of the link need
to be in auto-neg mode or both sides need to be forced to something like
100M/Full Duplex. This is by spec. If one side is forced and the other
is set to auto-neg, unpredictable things can happen to the link. Please
try forcing both sides of the link to 100/full and see if the problem
goes away.

I'm assuming that you are on a switch and not a hub, is this correct?
And that it is a 10/100Mbps switch. Is this also correct?

Cheers,
John
Re: [E1000-devel] Port showing link down at random times
On Wed, 2008-08-27 at 09:26 +1000, Leigh Sharpe wrote:
> Hi All,
> I'm having a problem with my e1000 cards intermittently shutting down.
> This is happening across multiple cards, on multiple systems. I get the
> following messages in the syslog at the time:
>
> Aug 26 22:58:36 ElizaQOS kernel: e1000: eth13: e1000_watchdog: NIC Link is Down
>
> ethtool and mii-tool both show a different status for the affected port:
>
> ---
> [EMAIL PROTECTED]:~$ sudo mii-tool -v eth13
> eth13: negotiated 100baseTx-FD flow-control, link ok
>   product info: vendor 00:50:43, model 2 rev 5
>   basic mode:   autonegotiation enabled
>   basic status: autonegotiation complete, link ok
>   capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
>   advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
>   link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

what does ethtool -i eth13 say?

also, can you tell us if you see the Intel AMT bios pre-boot screen come
up? we have heard lots of reports of people having interaction problems
with AMT, but believe the driver to have solved most of them now.
ethtool -i will say but you didn't report your driver version.

Would you be willing to try the e1000e driver from sourceforge? version
0.4.1.7 would be the best. you would have to manually remove e1000 and
install e1000e in its place, changing modprobe.conf is probably
necessary.

Another option is to run a more recent kernel, but that is much harder
to get set up than just upgrading our driver.
Re: [E1000-devel] Port showing link down at random times
> what does ethtool -i eth13 say?

[EMAIL PROTECTED]:~$ sudo ethtool -i eth13
driver: e1000
version: 7.1.9-k4-NAPI
firmware-version: N/A
bus-info: :08:0b.0
[EMAIL PROTECTED]:~$

> also, can you tell us if you see the Intel AMT bios pre-boot screen
> come up?

No pre-boot screen.
I'll have a go at the new driver and let you know.