Paul,

Thanks for the info. I believe it is OK to keep the memory module until 
we get a replacement.
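
In the meantime, a rough way to keep an eye on the fault is to save the
'fmadm faulty' output periodically and pull out the fields of interest.
Below is a minimal sketch in Python, not an FMA tool: the function name
parse_fault_fields and the SAMPLE text are my own, the field labels are
simply the ones that appear in the output quoted below, and it assumes
one fault record per input.

```python
import re

# Field labels as they appear in the 'fmadm faulty' output quoted in this
# thread (assumption: other releases may print a different set of labels).
LABELS = ("Fault class", "Affects", "Problem in", "Description",
          "Response", "Impact", "Action")
FIELD_RE = re.compile(r"^(%s)\s*:\s*(.*)$" % "|".join(LABELS))


def parse_fault_fields(text):
    """Return {label: value} for one fault record; wrapped lines are joined."""
    fields = {}
    current = None
    for raw in text.splitlines():
        # Drop mail quote markers and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2).strip()
        elif current is not None:
            # Continuation of the previous field (wrapped or indented line).
            fields[current] = (fields[current] + " " + line).strip()
    return fields


# Hypothetical saved output, copied from the report in this thread.
SAMPLE = """\
Fault class : fault.memory.generic-x86.dimm_ce
Affects     :
hc://:product-id=Sun-Fire-X4240:chassis-id=0826QAS0AC:server-id=coupe/motherboard=0/chip=1/memory-controller=0/dram-channel=0/chip-select=0
                   faulted but still in service
"""

if __name__ == "__main__":
    record = parse_fault_fields(SAMPLE)
    print(record["Fault class"])
    print(record["Affects"].split()[0])  # the FMRI, without the status text
```

The status text ("faulted but still in service") ends up appended to the
"Affects" value, so splitting on whitespace separates the resource from
its state.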

Antonello

Paul Dorian wrote:
> 
> Hi Antonello,
> 
>     I have every reason to believe the pages are being retired.
> 
>     Look at 'fmdump -a'; it will show you the retired pages.
> 
>     Sorry, I don't know the mdb command, but I'm sure you
>     could query and find out what memory has been retired.
> 
>  HTH
> 
>  -Paul
> 
> 
> Antonello Cruz wrote:
>> One of our build servers (coupe.sfbay) reported a memory module
>> "experiencing excessive correctable errors affecting large numbers of
>> pages."
>>
>> The output of 'fmadm faulty' is:
>> --------------- ------------------------------------  -------------- ---------
>> TIME            EVENT-ID                              MSG-ID         SEVERITY
>> --------------- ------------------------------------  -------------- ---------
>> Nov 13 12:34:32 bd61220f-bd1c-45d0-9884-f4d1c8b94dc1  GMCA-8000-YN   Major
>>
>> Fault class : fault.memory.generic-x86.dimm_ce
>> Affects     : hc://:product-id=Sun-Fire-X4240:chassis-id=0826QAS0AC:server-id=coupe/motherboard=0/chip=1/memory-controller=0/dram-channel=0/chip-select=0
>>                   faulted but still in service
>> Problem in  : hc://:product-id=Sun-Fire-X4240:chassis-id=0826QAS0AC:server-id=coupe/motherboard=0/chip=1/memory-controller=0/dram-channel=0/chip-select=0
>>                   faulty
>>
>> Description : A memory module is experiencing excessive correctable errors
>>                affecting large numbers of pages.  Refer to
>>                http://sun.com/msg/GMCA-8000-YN for more information.
>>
>> Response    : If Solaris is running on bare-metal (native) then affected memory
>>                pages associated with this memory module will be removed from
>>                service.
>>
>> Impact      : Page retirement (where supported) is capped at a small fraction
>>                of memory to avoid performance impact.
>>
>> Action      : Schedule a repair action to replace the memory module indicated
>>                by 'fmadm faulty'.  While the errors are correctable in nature
>>                they may be a precursor to an uncorrectable error which will
>>                result in downtime.
>>
>>
>> Lab support said it may take about a week to get a replacement memory
>> module.
>>
>> My question: if the faulty pages are not retired automatically by FMA,
>> how do I retire them while we wait for the replacement? I would prefer
>> not to remove the memory module, if possible. The machine is an X4240.
>>
>> Thanks,
>>
>> Antonello
>> _______________________________________________
>> fm-discuss mailing list
>> fm-discuss@opensolaris.org
>>   
> 
