This is a description of rescuing an older Intel e1000 NIC that had
a corrupted EEPROM. Maybe someone else can use the info from this
success for their own rescue.
I stumbled across a homeless Dell Precision 650, and since it looked like
an interesting (old) target for boot testing, I gave it a
temporary(!) home. After putting a common Linux distro on it, I got this:
--------------------------------------------------
[ 2.997690] e1000: /*********************/
[ 2.997697] e1000: Current EEPROM Checksum : 0x1e5e
[ 2.997699] e1000: Calculated : 0x2b5f
[ 2.997702] e1000: Offset Values
[ 2.997704] e1000: ======== ======
[ 2.997708] 00000000: ff ff 56 16 16 fc 10 0b ff ff ff ff ff ff ff ff
[ 2.997711] 00000010: 01 00 03 00 0b 46 2c 01 28 10 0f 10 86 80 68 b0
[ 2.997714] 00000020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 2.997717] 00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 2.997719] 00000040: 0c c3 61 78 08 1c 02 21 c8 0c ff ff ff ff ff ff
[ 2.997722] 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 01
[ 2.997725] 00000060: 64 01 02 40 05 12 ff ff ff ff ff ff ff ff ff ff
[ 2.997728] 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 5e 1e
[ 2.997730] e1000: Include this output when contacting your support provider.
[ 2.997732] e1000: This is not a software error! Something bad happened to
[ 2.997735] e1000: your hardware or EEPROM image. Ignoring this problem could
[ 2.997737] e1000: result in further problems, possibly loss of data,
[ 2.997739] e1000: corruption or system hangs!
[ 2.997741] e1000: The MAC Address will be reset to 00:00:00:00:00:00,
[ 2.997743] e1000: which is invalid and requires you to set the proper MAC
[ 2.997745] e1000: address manually before continuing to enable this network
[ 2.997748] e1000: device. Please inspect the EEPROM dump and report the
[ 2.997750] e1000: issue to your hardware vendor or Intel Customer Support.
[ 2.997752] e1000: /*********************/
[ 2.997759] e1000 0000:03:0e.0: (unregistered net_device): Invalid MAC Address
--------------------------------------------------
Great. Driver fail, with a handful of binary gobbledygook. A bit of
digging and it turns out we can get the same data from ethtool on demand:
root@crapbox:~# ethtool -e eth2 | head -n 10
Offset Values
------ ------
0x0000: ff ff 56 16 16 fc 10 0b ff ff ff ff ff ff ff ff
0x0010: 01 00 03 00 0b 46 2c 01 28 10 0f 10 86 80 68 b0
0x0020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0040: 0c c3 61 78 08 1c 02 21 c8 0c ff ff ff ff ff ff
0x0050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 01
0x0060: 64 01 02 40 05 12 ff ff ff ff ff ff ff ff ff ff
0x0070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 5e 1e
root@crapbox:~#
Note the 5e 1e in the last 2 bytes of the data dump. Read as a
little-endian 16-bit word that is 0x1e5e, the same value the driver
reported above as the current checksum.
Would be nice to know what the bits-n-bytes are though. Turns out
that it is actually documented:
http://www.intel.com/design/network/applnots/ap470.htm
The above takes you to "82546GB/EB and 82545GM/EM Gigabit Ethernet
Controller EEPROM Map and Programming Information Application Note
(AP-470)".
With that, I find out that the first chunk of EEPROM is for the MAC (no real
surprise there). And that the last two bytes hold the checksum word: the
value needed for the first 0x40 16-bit words (0x80 bytes), added up with
any carry thrown away, to come to 0xBABA.
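Here is a minimal Python sketch of that checksum rule, run against the damaged dump shown above. The function names (parse_dump, to_words, expected_checksum) are my own and not from the driver or from AP-470; the only things taken from the source are the dump itself, the little-endian word order and the 0xBABA target.

```python
CHECKSUM_TARGET = 0xBABA   # what the first 0x40 words must sum to
CHECKSUM_WORDS = 0x40      # 0x40 words = 0x80 bytes

# The damaged dump from above, as printed by `ethtool -e eth2`.
DUMP = """\
Offset Values
------ ------
0x0000: ff ff 56 16 16 fc 10 0b ff ff ff ff ff ff ff ff
0x0010: 01 00 03 00 0b 46 2c 01 28 10 0f 10 86 80 68 b0
0x0020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0040: 0c c3 61 78 08 1c 02 21 c8 0c ff ff ff ff ff ff
0x0050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 01
0x0060: 64 01 02 40 05 12 ff ff ff ff ff ff ff ff ff ff
0x0070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 5e 1e
"""


def parse_dump(text):
    """Turn `ethtool -e` style lines into a flat list of byte values."""
    data = []
    for line in text.splitlines():
        offset, sep, values = line.partition(":")
        if not sep or not offset.strip().lower().startswith("0x"):
            continue  # skip the "Offset Values" header and the ruler
        data.extend(int(byte, 16) for byte in values.split())
    return data


def to_words(data):
    """Pair bytes into 16-bit little-endian words (low byte first)."""
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data) - 1, 2)]


def checksum(words):
    """Sum of the first 0x40 words with the carry thrown away."""
    return sum(words[:CHECKSUM_WORDS]) & 0xFFFF


def expected_checksum(words):
    """The value word 0x3F would need for the total to hit 0xBABA."""
    return (CHECKSUM_TARGET - sum(words[:CHECKSUM_WORDS - 1])) & 0xFFFF


words = to_words(parse_dump(DUMP))
print(f"stored   : {words[CHECKSUM_WORDS - 1]:#06x}")   # 0x1e5e
print(f"expected : {expected_checksum(words):#06x}")    # 0x2b5f
print(f"sum      : {checksum(words):#06x}")             # 0xadb9, not 0xbaba
```

The "expected" value comes out as the same 0x2b5f the driver printed as "Calculated", which is a decent sign the sketch follows the same rule.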
So I test this on another old Dell I have nearby:
------------------------------------------------------------------
root@gx270:~# ethtool -e eth1 | head -n 10
Offset Values
------ ------
0x0000: 00 0f 1f d7 8a f5 10 0b 98 99 ff ff ff ff ff ff
0x0010: 05 00 01 a0 0b 66 51 01 28 10 0e 10 86 80 20 b0
0x0020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0040: 04 e3 61 78 07 1b 03 21 c8 0c ff ff ff ff ff ff
0x0050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 01
0x0060: ec 01 02 40 05 12 ff ff ff ff ff ff ff ff ff ff
0x0070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 2a e9
root@gx270:~# cat bc-script
obase=16
ibase=16
0F00+D71F+F58A+0B10+9998+FFFF+FFFF+FFFF+0005+A001+660B+0151+1028+100E+8086+B020+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+E304+7861+1B07+2103+0CC8+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+0100+01EC+4002+1205+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+E92A
root@gx270:~# bc < bc-script
30BABA
root@gx270:~# ifconfig eth1|grep eth1
eth1 Link encap:Ethernet HWaddr 00:0f:1f:d7:8a:f5
root@gx270:~#
------------------------------------------------------------------
Sure enough, it works. bc prints 30BABA; drop the carry (keep only the
low 16 bits) and the checksum is 0xBABA, just like the in-kernel driver
code checks for. And we see the MAC in the first 6 bytes of the dump.
The Dell GX270 has an 82540EM, where the Precision 650 has an 82545EM,
so we expect the EEPROM contents to be different.
On the other hand, they are quite similar. After the MAC address, we
see 10 0b in both. Really it is only the leading 0xff 0xff in the
Precision 650 that looks rather suspicious as "erased". Taking that
one step further, let's assume that the damage is limited to those two
bytes. So we are looking for a Dell MAC that starts with XX:XX:56 maybe.
Knowing that the first three bytes of a MAC are the vendor-specific
prefix (the OUI), I look for a list for Dell. Here is one such site:
http://www.coffer.com/mac_find/?string=Dell
There are about two dozen, but only one matches the xx:xx:56,
that being "000D56 -- Dell PCBA Test". Let's plug that into our
possibly corrupted EEPROM, with just those two values changed,
and see what the checksum comes out to be:
------------------------------------------
root@crapbox:~# cat bc-script
obase=16
ibase=16
0D00+1656+FC16+0B10+FFFF+FFFF+FFFF+FFFF+0001+0003+460B+012C+1028+100F+8086+B068+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+C30C+7861+1C08+2102+0CC8+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+0100+0164+4002+1205+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+FFFF+1E5E
root@crapbox:~# bc < bc-script
2EBABA
root@crapbox:~#
------------------------------------------
Woot! We've confirmed that replacing the two leading 0xff values
with 0x00, 0x0d (the start of the "Dell PCBA Test" prefix 00:0D:56)
makes the EEPROM image pass the checksum test: the sum, carry dropped,
is 0xBABA. That strongly suggests the corruption is limited to the
first two bytes. So now we just need to write those back to the EEPROM.
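The same check can be turned around: if exactly one 16-bit word is assumed to be damaged, the checksum pins down what that word must have been. A small Python sketch of that idea follows; solve_word is a made-up helper name, and the word list is just the damaged dump above typed in as little-endian words.

```python
CHECKSUM_TARGET = 0xBABA
FF = 0xFFFF

# Words 0x00-0x3F of the damaged Precision 650 dump, little-endian.
WORDS = (
    [0xFFFF, 0x1656, 0xFC16, 0x0B10] + [FF] * 4
    + [0x0001, 0x0003, 0x460B, 0x012C, 0x1028, 0x100F, 0x8086, 0xB068]
    + [FF] * 16
    + [0xC30C, 0x7861, 0x1C08, 0x2102, 0x0CC8] + [FF] * 3
    + [FF] * 7 + [0x0100]
    + [0x0164, 0x4002, 0x1205] + [FF] * 5
    + [FF] * 7 + [0x1E5E]
)


def solve_word(words, index):
    """Value words[index] must hold for the whole image to sum to 0xBABA.

    Assumes every other word, including the stored checksum, is intact.
    """
    others = sum(words) - words[index]
    return (CHECKSUM_TARGET - others) & 0xFFFF


fixed = solve_word(WORDS, 0)
low, high = fixed & 0xFF, fixed >> 8   # EEPROM byte order is low byte first
print(f"word 0 must be {fixed:#06x} -> bytes {low:02x} {high:02x}")
# word 0 must be 0x0d00 -> bytes 00 0d
```

Note this always produces some answer, so it proves nothing by itself. What makes it convincing here is that the answer lands exactly on a known Dell prefix.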
Turns out that ethtool can do this too:
# ethtool -E eth2 magic 0x100f8086 offset 0x0 value 0x00
# ethtool -E eth2 magic 0x100f8086 offset 0x01 value 0x0D
Be sure to select the correct device, if you have multiple
cards like I do! The magic value is there for that reason.
The "magic" is just the PCI device ID (upper 16 bits, here 0x100f)
and vendor ID (lower 16 bits, here 0x8086), as shown by lspci -nvv.
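If there are more than a couple of bytes to write, it may be easier to generate the command lines than to type them. The Python sketch below only builds the strings and never touches the hardware. It assumes the magic layout seen in this post (device ID in the upper half, vendor ID in the lower half); eeprom_magic and ethtool_write_cmds are hypothetical helper names.

```python
def eeprom_magic(vendor_id, device_id):
    """Combine PCI IDs the way the 0x100f8086 value in this post is laid out."""
    return (device_id << 16) | vendor_id


def ethtool_write_cmds(iface, magic, offset, data):
    """One `ethtool -E` command per byte, starting at `offset`."""
    return [
        f"ethtool -E {iface} magic {magic:#010x} "
        f"offset {offset + i:#x} value {byte:#04x}"
        for i, byte in enumerate(data)
    ]


magic = eeprom_magic(vendor_id=0x8086, device_id=0x100F)
cmds = ethtool_write_cmds("eth2", magic, 0x0, bytes([0x00, 0x0D]))
for cmd in cmds:
    print(cmd)
```

Review the printed commands against an lspci listing before running any of them as root; a wrong interface name is the easy mistake to make.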
Dumping the contents shows the writes "stuck" and now the driver
loads without any complaints and I have a working gigE interface!
-----------------
root@crapbox:~# ethtool -e eth2 | head -n 10
Offset Values
------ ------
0x0000: 00 0d 56 16 16 fc 10 0b ff ff ff ff ff ff ff ff
0x0010: 01 00 03 00 0b 46 2c 01 28 10 0f 10 86 80 68 b0
0x0020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0040: 0c c3 61 78 08 1c 02 21 c8 0c ff ff ff ff ff ff
0x0050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 01
0x0060: 64 01 02 40 05 12 ff ff ff ff ff ff ff ff ff ff
0x0070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff 5e 1e
root@crapbox:~#
---------------
Obviously, if your EEPROM corruption is more extensive, you may need
to find a similar system that you can "steal" the EEPROM data from.
Knowing the above, you could tweak the MAC (to make it unique) and
then adjust the checksum word in the last two bytes so the total still
comes to the magic 0xBABA. (or just re-use the MAC if the computers
are worlds apart!)
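A rough Python sketch of that "borrow an image, swap the MAC, fix the checksum" step is below. It uses the GX270 dump from earlier purely as a stand-in donor (a real donor should carry the same controller as the broken card), and retarget_image is a made-up name. It only builds the new 0x80 bytes in memory; writing them out is still a job for ethtool -E.

```python
CHECKSUM_TARGET = 0xBABA

# First 0x80 bytes of the GX270 dump shown earlier, used here as the donor.
DONOR = bytes.fromhex(
    "00 0f 1f d7 8a f5 10 0b 98 99 ff ff ff ff ff ff "
    "05 00 01 a0 0b 66 51 01 28 10 0e 10 86 80 20 b0 "
    "ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff "
    "ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff "
    "04 e3 61 78 07 1b 03 21 c8 0c ff ff ff ff ff ff "
    "ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 01 "
    "ec 01 02 40 05 12 ff ff ff ff ff ff ff ff ff ff "
    "ff ff ff ff ff ff ff ff ff ff ff ff ff ff 2a e9"
)


def le_words(image):
    """16-bit little-endian words of a byte string."""
    return [image[i] | (image[i + 1] << 8) for i in range(0, len(image) - 1, 2)]


def retarget_image(donor, mac):
    """Copy the donor's first 0x80 bytes, set a new MAC, redo the checksum."""
    image = bytearray(donor[:0x80])
    image[0:6] = bytes.fromhex(mac.replace(":", ""))
    # Sum words 0x00-0x3E, then pick word 0x3F so the total is 0xBABA.
    partial = sum(le_words(image[:0x7E]))
    csum = (CHECKSUM_TARGET - partial) & 0xFFFF
    image[0x7E] = csum & 0xFF   # low byte first
    image[0x7F] = csum >> 8
    return bytes(image)


new_image = retarget_image(DONOR, "00:0d:56:16:16:fc")
print(new_image[:6].hex(":"), new_image[0x7E:0x80].hex(" "))
```

Feeding the donor its own MAC back should return the donor image unchanged, which makes a handy sanity check before trusting the result.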
See also this page, which contained useful info:
http://blog.vodkamelone.de/archives/146-Unbricking-an-Intel-Pro1000-e1000-network-interface.html
Good luck in your own rescue attempts!
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel