EBSA-285 SDRAM problems

Jim Fischer Sat, 20 Jan 2001 16:01:23 -0800

I've got an EBSA-285 that has ARM's Angel Debugger in flash 0 and R.King's
ARM/Linux bios v1.10 in flash 6.

The EBSA is configured for host bridge mode and is plugged into the PICMG
"system" slot (J102) on the "co-host" side of an EB554 (an Intel 21554
non-transparent PCI-PCI bridge evaluation board). Also on the co-host side
of the EB554 is a 3Com 3c905C NIC.

    "Host side primary PCI bus"

    (Host PC's PCI slot)
    +-- EB554 : 21554 non-transparent PCI-PCI bridge --+
    (PICMG slot) EBSA285 (host bridge mode)
    (misc. PCI slot)  3Com 3c905C NIC
    (misc. PCI slot)  <empty>
    (misc. PCI slot)  <empty>

    "Co-host side primary PCI bus"


I recently upgraded the EBSA285's memory by adding a 128MB DIMM from Viking,

    Item #    H6523
    Descr.    HP; 128MB ECC PC100 DIMM D65

where,

    EBSA285-J11 : Viking 128MB SDRAM DIMM
    EBSA285-J12 : Original 16MB SDRAM DIMM

The total installed memory is now 128MB + 16MB or 144MB (0x09000000).


Looking at the debug output emitted by the ARM/Linux bios during its POST /
device initialization sequence, the bios seems to correctly detect the sizes
and types of the two DIMMs:


Bank address   : 0x00000000
    Test result: loc1 0x00000000, loc2 0x00000040, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x00000000, loc2 0x00100000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x00000000, loc2 0x00400000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x00000000, loc2 0x00200000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x00000000, loc2 0x00040000, wrote 0x55AA55AA, read 0x55AA55AA
  Detect result: 0x0F
  Bank size reg: 0x00000047
Bank address   : 0x04000000
    Test result: loc1 0x04000000, loc2 0x04000040, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x04000000, loc2 0x04100000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x04000000, loc2 0x04400000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x04000000, loc2 0x04200000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x04000000, loc2 0x04040000, wrote 0x55AA55AA, read 0x55AA55AA
  Detect result: 0x0F
  Bank size reg: 0x04000047
Bank address   : 0x08000000
    Test result: loc1 0x08000000, loc2 0x08000040, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x08000000, loc2 0x08100000, wrote 0xAA55AA55, read 0x55AA55AA
    Test result: loc1 0x08000000, loc2 0x08400000, wrote 0xAA55AA55, read 0x55AA55AA
    Test result: loc1 0x08000000, loc2 0x08200000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x08000000, loc2 0x08040000, wrote 0x55AA55AA, read 0x55AA55AA
  Detect result: 0x03
  Bank size reg: 0x08000014
Bank address   : 0x0C000000
    Test result: loc1 0x0C000000, loc2 0x0C000040, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x0C000000, loc2 0x0C100000, wrote 0xAA55AA55, read 0x55AA55AA
    Test result: loc1 0x0C000000, loc2 0x0C400000, wrote 0xAA55AA55, read 0x55AA55AA
    Test result: loc1 0x0C000000, loc2 0x0C200000, wrote 0x55AA55AA, read 0x55AA55AA
    Test result: loc1 0x0C000000, loc2 0x0C040000, wrote 0x55AA55AA, read 0x55AA55AA
  Detect result: 0x03
  Bank size reg: 0x0C000014
Total RAM size : 0x09000000
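
As a cross-check on the log above, here is a minimal Python sketch that
decodes the four "Bank size reg" values and sums them. The field layout is an
assumption, not something the BIOS output states: the low three bits are
taken as a size code n meaning 2^(n-1) MB (0 = bank disabled), and bits 27:20
as the bank base address. Under that assumption the per-bank sizes add up to
the reported total of 0x09000000.

```python
# Bank size register values copied from the BIOS log.
BANK_SIZE_REGS = [0x00000047, 0x04000047, 0x08000014, 0x0C000014]

def bank_size_bytes(reg):
    """ASSUMED encoding: low 3 bits n -> 2**(n-1) MB, n == 0 -> disabled."""
    code = reg & 0x7
    return 0 if code == 0 else (1 << (code - 1)) * 1024 * 1024

def bank_base(reg):
    """ASSUMED encoding: bits 27:20 hold the bank base address."""
    return reg & 0x0FF00000

total = 0
for reg in BANK_SIZE_REGS:
    size = bank_size_bytes(reg)
    total += size
    print(f"base 0x{bank_base(reg):08X}  size {size >> 20:3d} MB")

# Compare against "Total RAM size" and the "147456 KB SDRAM OK" message.
print(f"total 0x{total:08X} ({total >> 10} KB)")
```

With this decoding, the two 0x..47 banks come out at 64MB each (the 128MB
DIMM) and the two 0x..14 banks at 8MB each (the 16MB DIMM).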


The bios's "quick check" memory test apparently passes, because it emits the
(expected) diagnostic

    147456 KB SDRAM OK

If I now press 's' twice to enter the manual boot menu, and then press 'm'
to run the "in depth" memory checks, these memory tests also pass; no error
messages are emitted and the tests run to completion on all blocks of
memory:

    Checking chunk 0x08f000c0...Ok
    Checking chunk 0x00000000... (thru 0x08e80000)


Following the BIOS POST/init phase, the BIOS successfully downloads (via
bootp/tftp/NFS) and boots a Linux kernel (2.4.0-test8-rmk5). The Linux
kernel also seems to detect the correct amount of memory:

[root@ebsa285: /root]$ cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  146206720  7913472 138293248        0        0  4751360
Swap:        0        0        0
MemTotal:    142780 kB
MemFree:     135052 kB
MemShared:        0 kB
Buffers:          0 kB
Cached:        4640 kB
HighTotal:        0 kB
HighFree:         0 kB
LowTotal:    142780 kB
LowFree:     135052 kB
SwapTotal:        0 kB
SwapFree:         0 kB
[root@ebsa285: /root]$
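
To make "the correct amount" concrete, the sketch below (Python, purely
illustrative) parses a few lines copied from the /proc/meminfo output and
compares MemTotal with the 147456 KB the BIOS reported. The helper name
parse_meminfo is hypothetical. The gap between the two figures is presumably
memory the kernel keeps for itself at boot; that reading is an assumption.

```python
# Lines copied from the /proc/meminfo output in this message.
MEMINFO = """\
MemTotal:    142780 kB
MemFree:     135052 kB
Cached:        4640 kB
"""

# 0x09000000 bytes as reported by the BIOS, expressed in KB.
INSTALLED_KB = 0x09000000 // 1024

def parse_meminfo(text):
    """Return {field name: value in kB} for 'Name:  <number> kB' lines."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

info = parse_meminfo(MEMINFO)
reserved_kb = INSTALLED_KB - info["MemTotal"]
print(f"installed {INSTALLED_KB} kB, MemTotal {info['MemTotal']} kB, "
      f"difference {reserved_kb} kB")
```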

So far, so good.


Here's the problem (finally!): After I log on to the EBSA285, the system
works fine for a while -- sometimes for a few minutes, sometimes overnight,
etc. Eventually, however, the 3Com 3c905C NIC on the co-host side fails and
the NIC's Linux device driver begins emitting error messages via the EBSA's
serial port. For example, if I try to build Perl (recall I'm working with an
NFS/root file system), the NIC ultimately fails and the EBSA emits the
diagnostics shown below:


$ make #perl

... runs for a while ...

 Making Fcntl (dynamic)
Writing Makefile for Fcntl
make[1]: Entering directory `/usr/src/redhat/BUILD/perl-5.6.0/ext/Fcntl'
mkdir ../../lib/auto/Fcntl
make[1]: Leaving directory `/usr/src/redhat/BUILD/perl-5.6.0/ext/Fcntl'
make[1]: Entering directory `/usr/src/redhat/BUILD/perl-5.6.0/ext/Fcntl'
cp Fcntl.pm ../../lib/Fcntl.pm
../../miniperl -I../../lib -I../../lib ../../lib/ExtUtils/xsubpp -noprototypes -typemap ../../lib/ExtUtils/typemap Fcntl.xs > Fcntl.xsc && mv Fcntl.xsc Fcntl.c
cc -c  -DOVR_DBL_DIG=14 -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2     -DVERSION=\"1.03\" -DXS_VERSION=\"1.03\"  -I../..  Fcntl.c
Running Mkbootstrap for Fcntl ()
chmod 644 Fcntl.bs
LD_RUN_PATH="" cc -o ../../lib/auto/Fcntl/Fcntl.o  -r  -L/usr/local/lib Fcntl.o
eth0: Host error, FIFO diagnostic register 0000.
eth0: Host error, ... (etc., 31 more times) ...
eth0: Host error, FIFO diagnostic register 0000.
eth0: Too much work in interrupt, status e003.
# then 32 more of these:
eth0: Host error, FIFO diagnostic register 0000.
eth0: Too much work in interrupt, status e003.
eth0: Host error, FIFO diagnostic register 0000.
# etc...
eth0: Host error, FIFO diagnostic register 0000.
eth0: Too much work in interrupt, status e003.
eth0: Host error, FIFO diagnostic register 0000.
# etc., etc...
eth0: Host error, FIFO diagnostic register 0000.
eth0: Too much work in interrupt, status e003.
nfs: server 129.65.26.95 not responding, still trying
nfs: server 129.65.26.95 not responding, still trying
nfs: server 129.65.26.95 not responding, still trying
nfs: server 129.65.26.95 not responding, still trying
eth0: Host error, FIFO diagnostic register 0000.
etc...
eth0: Host error, FIFO diagnostic register 0000.
eth0: Too much work in interrupt, status e003.
nfs: server 129.65.26.95 not responding, still trying
eth0: Host error, FIFO diagnostic register 0000.

and so on...
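
For anyone reading the "status e003" value in those messages, here is a
small Python sketch that splits it into bits. The bit names are taken from
memory of the Linux 3c59x driver's status enum, and the reading of bits 15:13
as the current register window is likewise an assumption; both should be
checked against the driver source before relying on them.

```python
# Status-bit names as recalled from the Linux 3c59x driver (ASSUMPTION).
STATUS_BITS = {
    0x0001: "IntLatch",
    0x0002: "HostError",
    0x0004: "TxComplete",
    0x0008: "TxAvailable",
    0x0010: "RxComplete",
    0x0020: "RxEarly",
    0x0040: "IntReq",
    0x0080: "StatsFull",
}

def decode_status(status):
    """Split a 16-bit status word into (window, [names of set flag bits])."""
    window = (status >> 13) & 0x7  # ASSUMED: bits 15:13 = register window
    flags = [name for bit, name in STATUS_BITS.items() if status & bit]
    return window, flags

window, flags = decode_status(0xE003)
print(f"window {window}, flags {flags}")
```

Under these assumptions the only event bit set besides the interrupt latch is
HostError, which matches the "Host error" lines the driver prints.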

Note that if I remove the 128MB DIMM and move the (original) 16MB DIMM from
J12 back to J11, everything works fine (except, of course, that I cannot
compile Perl because the make runs out of memory!).


So, does anyone have any ideas as to what might be causing the NIC to fail
when the 128MB DIMM is installed?


Jim



_______________________________________________
http://lists.arm.linux.org.uk/mailman/listinfo/linux-arm
Please visit the above address for information on this list.