On 2022-06-21 12:23, Larry Rosenman wrote:
> On 06/21/2022 1:23 pm, Chris wrote:
>> On 2022-06-20 17:23, Larry Rosenman wrote:
>>> I'm seeing them constantly:
>>>
>>> root@freenas[~]# mcelog --dmi
>>> [snip]
>>
>> FWIW it looks like a sync(ing) problem between your RAM && CPU cache.
>> Are your clocks set correctly for your CPU && RAM? Is your CPU too
>> hot? Is the CPU cache ECC?
>
> Hrm. IIRC all the BIOS parameters are default (I could be mistaken).
> It's a SuperMicro X8DTN+ motherboard with:
>
> CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (2400.22-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
>   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>   Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
>   AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
>   AMD Features2=0x1<LAHF>
>   Structured Extended Features3=0x9c000000<IBPB,STIBP,L1DFL,SSBD>
>   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
>   TSC: P-state invariant, performance statistics
> real memory  = 77309411328 (73728 MB)
> avail memory = 75186962432 (71703 MB)
>
> (2 packages, 6 core, 12-threads each) and 18 4GB sticks.
>
> This ONE slot seems to be a problem.
>
> How would you recommend looking for an issue modulo pulling the 2 cpu
> packages?

When I ran into these errors it turned out to be a hot CPU, as I recall.
While I'm familiar with the hardware you're using, I have no history
with *your* equipment. The first 2 things I'd do, given that ECC is so
sensitive: replace/swap the PSU with a known good one, and re-seat &&
re-grease the CPU(s). Do the fans operate as intended? At that point a
long session with sysutils/memtest86 or a buildworld session should
tell you if everything is AOK. Frankly, as to testing memory, working
with a single stick at a time would be more conclusive, resulting in a
shorter time to conclusion.

HTH

Chris
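
As a side check before pulling sticks one at a time, the "real memory" figure quoted above can be compared against the 18 x 4GB population to see whether every stick is at least being counted. The sketch below is only arithmetic on the numbers from this thread; the function names are made up for illustration, and it assumes "4GB" means 4 GiB per stick. It says nothing about whether the suspect slot is healthy, only whether its capacity shows up.

```python
# Numbers copied from the dmesg excerpt quoted in this thread.
REPORTED_REAL_BYTES = 77309411328   # "real memory"
STICK_COUNT = 18                    # "18 4GB sticks"
STICK_GIB = 4                       # assumption: 4 GiB per stick

GIB = 1024 ** 3
MIB = 1024 ** 2


def expected_bytes(stick_count: int, stick_gib: int) -> int:
    """Total capacity if every stick is detected at full size."""
    return stick_count * stick_gib * GIB


def missing_sticks(reported_bytes: int, stick_count: int, stick_gib: int) -> int:
    """Whole sticks' worth of capacity absent from the reported total."""
    shortfall = expected_bytes(stick_count, stick_gib) - reported_bytes
    return max(shortfall, 0) // (stick_gib * GIB)


if __name__ == "__main__":
    total = expected_bytes(STICK_COUNT, STICK_GIB)
    print(f"expected: {total} bytes ({total // MIB} MB)")
    print(f"reported: {REPORTED_REAL_BYTES} bytes ({REPORTED_REAL_BYTES // MIB} MB)")
    print(f"sticks unaccounted for: "
          f"{missing_sticks(REPORTED_REAL_BYTES, STICK_COUNT, STICK_GIB)}")
```

With the figures above, the expected and reported totals agree, so the board is counting all 18 sticks; the errors would then point at the quality of what is in (or feeding) that slot rather than at a stick going undetected.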