Kernel v2.6.16.y (+ patches)
e1000e v0.5.11.2

Hi,

I'm facing NVM corruption on both ports of a dual-port NIC module:

  PCI: Enabling device 0000:09:00.0 (0000 -> 0003)
  ACPI: PCI Interrupt 0000:09:00.0[A] -> GSI 19 (level, low) -> IRQ 193
  PCI: Setting latency timer of device 0000:09:00.0 to 64
  0000:09:00.0: 0000:09:00.0: The NVM Checksum Is Not Valid
  ACPI: PCI interrupt for device 0000:09:00.0 disabled
  e1000e: probe of 0000:09:00.0 failed with error -5
  PCI: Enabling device 0000:09:00.1 (0000 -> 0003)
  ACPI: PCI Interrupt 0000:09:00.1[B] -> GSI 16 (level, low) -> IRQ 169
  PCI: Setting latency timer of device 0000:09:00.1 to 64
  0000:09:00.1: 0000:09:00.1: The NVM Checksum Is Not Valid

Output of lspci is available here [1], here [2] and here [3].

There are three other identical modules in that box which do not show
the issue.  Reportedly the interfaces more or less worked before the
driver upgrade (the previous version was the all-in-one e1000 driver
v7.6.15.5).  However, both of these interfaces reportedly also failed
several times before the driver upgrade.

With regard to http://lkml.org/lkml/2008/9/25/510 and the patches
mentioned therein, I specifically backported

 e1000e: allow bad checksum

As expected, the interface does not work after that, but the output
is different:

  PCI: Enabling device 0000:09:00.1 (0000 -> 0003)
  ACPI: PCI Interrupt 0000:09:00.1[B] -> GSI 16 (level, low) -> IRQ 169
  PCI: Setting latency timer of device 0000:09:00.1 to 64
  0000:09:00.1: 0000:09:00.1: The NVM Checksum Is Not Valid
  0000:09:00.1: 0000:09:00.1: Invalid MAC Address: 00:00:00:00:00:00
  0000:09:00.1: eth7: (PCI Express:2.5GB/s:Width x4) f79b4118M
  0000:09:00.1: eth7: Intel(R) PRO/1000 Network Connection
  0000:09:00.1: eth7: MAC: 1, PHY: 1, PBA No: ffffff-0ff

As I'm only somewhat familiar with the HW documentation available for
82571EB modules, I need your help:

1. Can I safely modify commit 4a7703582836f55 (Linus' tree)

 e1000e: write protect ICHx NVM to prevent malicious write/erase

to include our modules as well in order to find out who is overwriting
memory?

2. I copied an apparently correct EEPROM image from another box
(ethtool -e) and tried to apply it (ethtool -E) on the broken box:

 # ethtool -E eth6 < e1000e-eeprom-eth6 
 Cannot set EEPROM data: Invalid argument

(I specifically made sure that the above-mentioned patch to
write-protect the NVRAM was disabled).  Maybe I'm just stupid, but
what is wrong here?
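
To rule out a damaged image before writing it, the dump can be checked
offline.  The following is only a minimal sketch, assuming the image was
saved as raw binary (ethtool -e eth6 raw on) and that the 82571EB follows
the usual e1000 convention in which the little-endian 16-bit words 0x00
through 0x3F sum to 0xBABA.  The function name nvm_checksum_ok is made up
for illustration and is not part of any driver or tool:

```python
import struct

# Assumption: e1000-family convention. Words 0x00..0x3F (64 words) are
# covered by the checksum, and word 0x3F is chosen so the sum is 0xBABA.
NVM_CHECKSUM_WORDS = 0x40
NVM_SUM = 0xBABA


def nvm_checksum_ok(image: bytes) -> bool:
    """Return True if the first 64 words of a raw NVM dump sum to 0xBABA."""
    needed = NVM_CHECKSUM_WORDS * 2  # two bytes per 16-bit word
    if len(image) < needed:
        raise ValueError("image is shorter than 64 words")
    # Interpret the first 128 bytes as 64 little-endian unsigned 16-bit words.
    words = struct.unpack("<%dH" % NVM_CHECKSUM_WORDS, image[:needed])
    # The sum is taken modulo 2^16, as a 16-bit register would wrap.
    return (sum(words) & 0xFFFF) == NVM_SUM
```

For example, nvm_checksum_ok(open("e1000e-eeprom-eth6", "rb").read())
would tell whether the copied image itself passes the check.  This says
nothing about why ethtool -E returns "Invalid argument"; it only helps
to exclude a bad source image.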

Any help welcome.

Regards.

 /holger

[1] http://people.astaro.com/heitzenberger/e1000e/lspci_tv
[2] http://people.astaro.com/heitzenberger/e1000e/lspci_vvx
[3] http://people.astaro.com/heitzenberger/e1000e/lspci_vvxn


_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
