On Tue, Feb 12, 2019 at 12:11:11AM +0100, Hans van Kranenburg wrote:
> This means you will have to do things like hop on the upstream
> development mailing list, build a reproducable failure case, search for
> a developer that has similar hardware and wants to spend time on it,
> donate hardware to someone to reproduce the error scenarios or learn how
> to do it yourself, or whatever it takes. :)

I had hoped to avoid doing that.  The problem is that I have to use so
many pieces of software that joining the mailing list for each of them
would be akin to trying to read all of Usenet.  I may not be able to
avoid that here, but...

Looks like Xen's MCE support is in near-useless shape.  The code in the
git repository mentions documentation for family 10h; the problem is that
family 10h is almost entirely decade-old processors.  The last apparently
significant change was in 2014.  The copyright belongs to AMD, so I guess
that means they need more funding.  Looks like Intel has been offering
more support to Xen.  :-(

I'm surprised at Xen's handling of MCE.  Given Xen's approach to things,
I would expect MCE handling to be done more by Domain 0.  Let Domain 0
handle talking to the memory controller and merely have Xen map the
physical address to a domain and domain address.  Domain 0 can log all
correctable memory errors to a single location, and in case of an
uncorrectable error it can panic the machine.  (Plus, Linux's MCE support
is in better shape.)

Handling MCEs in a domain other than Domain 0 only seems to make sense
for HVM, where you want to simulate memory errors.

-- 
(\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
 \BS ( | ehem+sig...@m5p.com PGP 87145445 | ) /
  \_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445