Hi

On 09/05/06 18:50, Dana H. Myers wrote:
Now that we've had a chance to do additional investigation, we've nailed down
the root cause of the DL385/585 run-time hang.  The specific hardware
configuration of the HP DL385/585 machines trips over Opteron Erratum #99
"Background Scrubbing Must Be Disabled With Non-Contiguous Memory Map", and
CR 6391605 is filed against this.  This problem is present in Nevada build 34
and later and Solaris 10 Update 2; it is not present in Solaris 10 Update 1
or FCS.

Thanks, Dana, for all your work on this and I'm sorry I've only caught up with
the more recent part of this thread now or I'd have recognised the issue
sooner :-(

In the "fma for amd64" project we (Solaris) take charge of some platform
configuration normally touched by the BIOS alone.  In particular we prefer
to enable dram, dcache and l2cache hardware scrubbing if this was not
already enabled in the BIOS since those are desirable RAS features.

Some Opteron errata affect dram scrubbing for certain chip revisions.
The BIOS on these systems will have seen that erratum #99 applies to the
installed chip revision(s) and will have chosen not to enable DRAM
scrubbing (the option will be greyed out, or will have no effect
even if you do manage to select it).  Unfortunately Solaris then
blindly goes ahead and enables dram scrubbing, and *if* the
system has discontiguous chip-select ranges on a node (from
manual memory hoisting to reclaim the dram hole) then when the
dram scrubber attempts to access the hole the system wedges (and
always after much the same uptime for the same memory configuration).

To be clear, there is nothing about the HP DL385/585 systems that
makes them uniquely subject to this problem.  It is a combination
of the chip revisions in use, the amount of physical memory installed
on each node, and that Solaris by default enables the dram scrubber.
Erratum #99 affects chip revisions D and earlier;  in revision E a new
dram hole register is provided and memory hoisting is achieved without
mapping chip-selects discontiguously (and as long as your BIOS version
is rev E aware it will do the right thing).  You also need to have
4GB or more physical memory installed on a single node before the
erratum applies, and must have asked the BIOS to remap the hole or
have a BIOS that just does this automatically.
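
The conditions above can be collected into a single predicate.  This is
only a sketch: the function name and its parameters are hypothetical, the
revision is assumed to be given as a letter such as "D" or "E", and "4GB"
is taken to mean 4 GiB.

```python
GIB = 1 << 30


def erratum99_hang_possible(rev, node_mem_bytes, hole_remapped, dram_scrub_enabled):
    """Return True if every condition described in the text holds for a node.

    rev                 -- chip revision letter, e.g. "D" or "E" (assumed format)
    node_mem_bytes      -- physical memory installed on this one node
    hole_remapped       -- BIOS hoisted memory to reclaim the dram hole
    dram_scrub_enabled  -- the dram scrubber is running (the Solaris default)
    """
    affected_rev = rev[:1].upper() <= "D"      # revisions D and earlier
    enough_memory = node_mem_bytes >= 4 * GIB  # 4GB or more on a single node
    return affected_rev and enough_memory and hole_remapped and dram_scrub_enabled


# Hypothetical examples, not taken from any real machine:
print(erratum99_hang_possible("D", 4 * GIB, True, True))   # all conditions met
print(erratum99_hang_possible("E", 8 * GIB, True, True))   # revision E is not affected
print(erratum99_hang_possible("D", 4 * GIB, True, False))  # scrubber disabled
```

Setting the dram scrub rate to 0 corresponds to passing
dram_scrub_enabled=False here.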

The other erratum that affects dram scrubbing is #101 - Solaris 10 Update 2
(and Nevada/OpenSolaris) do have code to avoid that erratum (to do
with dram scrubbing in the presence of node memory interleaving).

If you don't know what revision chips you have you can use the AMD
revision guide (search for publication # 25759 on www.amd.com)
and the Solaris boot info:

home-amd64:~> dmesg | grep step
Sep  6 09:24:13 home-amd64 unix: [ID 950921 kern.info] cpu0: x86 (AuthenticAMD family 15 model 47 step 0 clock 2211 MHz)

So my home system is family 0xf (Opteron/Athlon 64/Turion), model
47 (hex 0x2f - extended model 2 and regular model f), stepping 0.  Some
staring at the revguide will show you that is a DH-E3 Athlon 64 socket 939,
i.e. revision E.
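
To illustrate the arithmetic, here is a minimal sketch that splits the
decimal model number printed at boot into its two hex nibbles.  The helper
name is hypothetical; it only restates the 47 = 0x2f split described above.

```python
def split_model(display_model):
    """Split a boot-message model number into (extended model, base model)."""
    extended = (display_model >> 4) & 0xF  # high nibble
    base = display_model & 0xF             # low nibble
    return extended, base


ext, base = split_model(47)
print(f"model 47 = {47:#x}: extended model {ext:#x}, regular model {base:#x}")
```

The extended and regular model digits are what you then look up in the
revision guide.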

Alternatively, if the fma x64 stuff is installed:

# echo "*mc_list::list mc_t mc_next | ::print mc_t mc_revname" | mdb -k

mc_revname = 0xfffffffff3ff9fa8 "E"

(more lines would appear on multicpu systems).

We can also use mdb to view your chip-select mapping:

# echo "*mc_list::list mc_t mc_next | ::print mc_t mc_cslist | ::list mc_cs_t mccs_next | ::print mc_cs_t mccs_base mccs_size" | mdb -k
mccs_base = 0
mccs_size = 0x40000000
mccs_base = 0x40000000
mccs_size = 0x20000000
mccs_base = 0x60000000
mccs_size = 0x20000000

(there'd be more for multicpu systems).  My chip-selects are contiguous
since each base + size gives the next base.  So even if I had rev D or
earlier chip revision I would not be at risk (and I only have 2GB total,
which is less than 4GB, so we knew that anyway).
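
The "base + size gives the next base" check can be done mechanically.  A
minimal sketch follows, with the mdb values above entered by hand; the
function name is hypothetical and the second, non-contiguous layout is
invented purely for illustration.

```python
def is_contiguous(chip_selects):
    """chip_selects is a list of (base, size) pairs for one node."""
    ordered = sorted(chip_selects)
    # Each chip-select must end exactly where the next one begins.
    return all(b0 + s0 == b1 for (b0, s0), (b1, _) in zip(ordered, ordered[1:]))


# Values from the mdb output above.
home_system = [
    (0x00000000, 0x40000000),
    (0x40000000, 0x20000000),
    (0x60000000, 0x20000000),
]

# Invented example of a hoisted layout that leaves a gap below 4GB.
hoisted = [
    (0x000000000, 0xC0000000),
    (0x100000000, 0x40000000),
]

print(is_contiguous(home_system))
print(is_contiguous(hoisted))
print(hex(sum(size for _, size in home_system)))  # total memory on the node
```

A node that fails this check on an affected chip revision is the case where
the dram scrubber must stay disabled.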

We've confirmed that the correct work-around for 6391605 is to add the following
line to /etc/system and reboot:

  set cpu\.AuthenticAMD\.15:ao_scrub_rate_dram = 0

This is the preferred workaround.  It will work if your BIOS is aware of
erratum #99 (most will be).  With reference to

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/i86pc/cpu/amd_opteron/ao_cpu.c

note that the default for ao_scrub_policy is AO_SCRUB_MAX (value 2), in which
case the scrub rate we choose is computed as


    ao_scrub_rate_dram =
        ao_scrubber_max(ao_scrub_rate_dram, ao_scrub_bios,
        AMD_NB_SCRUBCTL_DRAM_MASK, AMD_NB_SCRUBCTL_DRAM_SHIFT);

The workaround sets ao_scrub_rate_dram to 0 and an erratum #99 aware BIOS
will have ao_scrub_bios with a dram scrub rate of 0, so the result will
be 0 - no dram scrubbing.  *If* the BIOS had not set a rate of 0 for
an affected system (which I guess could happen with down-rev BIOS;
this does not happen on these HP systems) then this workaround could
fail in which case the following bigger hammer will work:

    set cpu\.AuthenticAMD\.15:ao_scrub_policy = 1
    set cpu\.AuthenticAMD\.15:ao_scrub_rate_dram = 0

which sets the AO_SCRUB_FIXED policy (just use the Solaris preferred
rates without MAX'ing with the BIOS rates) and sets the Solaris
preferred dram scrub rate to 0 (disable).
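
The difference between the two workarounds can be modelled in a few lines.
This is a simplified sketch and not the code in ao_cpu.c: the function is
hypothetical, rates are plain integers with 0 meaning disabled, and it
assumes that among non-zero rate codes the smaller one is the faster scrub
(an assumption about the encoding, not something stated above).

```python
AO_SCRUB_FIXED = 1  # value used in the /etc/system lines above
AO_SCRUB_MAX = 2    # the default policy, per the text


def effective_dram_rate(policy, solaris_rate, bios_rate):
    """Model which dram scrub rate ends up programmed (0 = scrubbing off)."""
    if policy == AO_SCRUB_FIXED:
        # Use the Solaris preferred rate and ignore the BIOS setting.
        return solaris_rate
    if policy == AO_SCRUB_MAX:
        if solaris_rate == 0 or bios_rate == 0:
            # At most one side wants scrubbing; that side wins.
            return solaris_rate or bios_rate
        # Assumption: smaller non-zero code means faster scrubbing.
        return min(solaris_rate, bios_rate)
    raise ValueError(f"policy {policy} is not modelled in this sketch")


# BIOS aware of erratum #99 (rate 0): the one-line workaround is enough.
print(effective_dram_rate(AO_SCRUB_MAX, 0, 0))
# BIOS left a non-zero rate (0x16 is an arbitrary example): MAX keeps it.
print(effective_dram_rate(AO_SCRUB_MAX, 0, 0x16))
# The bigger hammer: FIXED policy forces the Solaris rate of 0.
print(effective_dram_rate(AO_SCRUB_FIXED, 0, 0x16))
```

Only the zero versus non-zero outcome matters for the workaround, so the
ordering assumption does not affect the three cases shown.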

A proper fix for this in Nevada and patch for S10U2 are in-progress.

I have a fix for this tied up in a soon-to-be-putback wad for
FMA support for the recently-launched "Next-Generation AMD Opteron"
processors (also known as Socket F processors, or revision F).
This is due for putback in the next week or so.

Thanks to everyone for your help; I know this has been frustrating.

My apologies for the delay in fixing this, and that it is present in
Solaris 10 Update 2 at all.  I logged 6391605 back in February but
the fix for it is mildly messy and most easily addressed with the
additional structure that the revision F wad necessitated.

Regards

Gavin

--
Gavin Maltby, Solaris Kernel Development.
_______________________________________________
opensolaris-discuss mailing list