On Mon, Dec 07, 2015 at 08:41:43PM -0500, Raj, Ashok wrote:
> On Tue, Dec 08, 2015 at 12:25:24AM +0100, Borislav Petkov wrote:
> >
> > Did you miss my statement in my previous mail where I said that the MCE
> > is being raised only on the cores of node 0?
> >
>
> That's right.. but I think if MCE is only given to node0, then the system
> would panic every time with or without the patch, which is why I got confused.
>
> I somehow misunderstood that with this patch the system didn't panic.

No, the system panicked both times. The "strange" observation is that the
MCE gets reported only on the cores of node 0. Or at least only the printks
from mce_panic() on the cores of node 0 reach the serial console.

If we really broadcast only on node 0, that would be a problem if the
corrupted data leaves the node and manages to corrupt storage when written
out on some of the other nodes. I'm not sure whether the kernel panics the
whole system in time, or whether there is a small window between the
detection and the panic in which the corruption might happen.

If so, this would defeat the purpose of MCE broadcasting, but I'm just
hypothesizing here.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.