On Tue, 2018-12-04 at 11:39 +0100, Tom Hetmer wrote:
> Sure. It seems there's a similar ticket already:
> https://github.com/chu11/freeipmi-mirror/issues/19

Ahh, if you could, update it with info from ipmitool / ipmiutil. I was
reluctant to add support based on reverse engineering. But if other
tools have "official" interpretations from Supermicro, I'm more
confident in the addition.

> Yep, that's the code. ipmitool and a few others decode it too.
>
> We have a *lot* of Supermicros so I can help with testing if needed -
> but we don't get that much CRC errors though :)

The one thing I'll need is product ID numbers (you can get from
bmc-info) and the name of the product. This goes into the
documentation and some of the code.

Thanks,

Al

> So I guess we'd have to wait till one pops up. But I hope the 'ver 2'
> method from ipmiutil works fine.
> We used ipmitool in our monitoring before and it was accurate but
> slow, that's why I rewrote it all to use freeipmi.
>
> Thanks!
>
> Best,
> Tom Hetmer
>
> CDN77 Operations
> supp...@cdn77.com / +44 (0) 20 3514 2399 / www.cdn77.com
>
> ----- Original message -----
> > From: "Albert Chu" <ch...@llnl.gov>
> > To: "Tom Hetmer" <tomas.het...@cdn77.com>, freeipmi-users@gnu.org
> > Date: 12/03/18 21:06
> > Subject: Re: [Freeipmi-users] Decoding ram errors on supermicro
> >
> > Hi Tom,
> >
> > Thanks for the pointer to ipmiutil's code. I assume you found this
> > comment:
> >
> > ---
> > /* ver 2 method: 2A 80 = P1_DIMMB1 */
> > /* SuperMicro says:
> >  * pair: %c (data2 >> 4) + 0x40 + (data3 & 0x3) * 3, (='B')
> >  * dimm: %c (data2 & 0xf) + 0x27,
> >  * cpu:  %x (data3 & 0x03) + 1);
> >  */
> > ---
> >
> > I can definitely add it to my todo list.
> >
> > Would you mind writing up an issue on github here?
> >
> > https://github.com/chu11/freeipmi-mirror
> >
> > Al
> >
> > On Mon, 2018-12-03 at 17:55 +0100, Tom Hetmer wrote:
> > > Hi,
> > >
> > > it'd be good if freeipmi supported decoding the supermicro ECC
> > > errors.
> > >
> > > Manufacturer: Supermicro
> > > Product Name: X10DRH LN4
> > > eg.
> > > freeipmi
> > > 1,Dec-01-2018,06:37:53,Sensor #0,Memory,Critical,Uncorrectable
> > > memory error ; OEM Event Data2 code = 3Ah ; OEM Event Data3 code = 81h
> > >
> > > web interface
> > > 1 | 12/01/2018 | 06:37:53 | Memory | Uncorrectable ECC
> > > (@DIMMG1(CPU2)) | Asserted
> > >
> > > something like this worked for me (stolen from ipmiutil)
> > >
> > > $cpu = ($data3 & 0x03) + 1;
> > >
> > > $NPAIRS = 26;
> > > $rgpairs = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
> > >
> > > $bdata = "0x".$data2.$data3;
> > > $bdata = hexdec($bdata);
> > > $pair = (($bdata & 0xF0) >> 4) - 1;
> > >
> > > if ($pair < 0) $pair = 0;
> > > if ($pair > $NPAIRS) $pair = $NPAIRS - 1;
> > >
> > > $pair = $rgpairs[$pair - 1];
> > >
> > > $dimm = $bdata & 0x0F;
> > >
> > > $dimm may be incorrect as the original code decrements 9, but on
> > > that board it was wrong so i changed it to get the right result -
> > > we'll see if it keeps getting the right values.
> > >
> > > Best,
> > > Tom Hetmer
> > >
> > > CDN77 Operations
> > > supp...@cdn77.com / +44 (0) 20 3514 2399 / www.cdn77.com
> > >
> > > _______________________________________________
> > > Freeipmi-users mailing list
> > > Freeipmi-users@gnu.org
> > > https://lists.gnu.org/mailman/listinfo/freeipmi-users
> >
> > --
> > Albert Chu
> > ch...@llnl.gov
> > Computer Scientist
> > High Performance Systems Division
> > Lawrence Livermore National Laboratory

--
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory