It's possible (unlikely, but possible) that the memory in both machines is from the same batch and is indeed bad. Perhaps before the reboot the bad blocks were in a wired but unused, or not critically used, section, so they didn't cause a crash. I would boot into the old kernel on one of them and see if the problem persists. If so, swap the memory and see if it still persists.
b

On Mon, Jul 19, 2010 at 5:29 PM, Wiley Sanders <[email protected]> wrote:
>> How did you decide that the problem isn't a failed memory module?
>
> 1) The Sun ran reliably for many months up until the kernel update.
> 2) Both X4100 and R510 are crashing regularly since the kernel upgrade
> was applied to both hosts on successive days.
>
> The Dell is actually halting with a "kernel panic - not syncing: nmi
> watchdog" on the console and has needed a power cycle each time it has
> crashed. So, I'm not ruling out hardware problems.
>
> -w
>
> _______________________________________________
> Linux-PowerEdge mailing list
> [email protected]
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
