Re: [ceph-users] Self shutdown of 1 whole system: Oops, it did it again (not yet anymore)

2018-07-31 Thread Nicolas Huillard
Hi all, The latest hint I received (thanks!) was to replace failing hardware. Before that, I updated the BIOS, which included a CPU microcode fix for Meltdown/Spectre and probably other things. Last time I had checked, the vendor didn't have that fix yet. Since this update, not CATERR

Re: [ceph-users] Self shutdown of 1 whole system: Oops, it did it again

2018-07-24 Thread Nicolas Huillard
Hi all, The same server did it again with the same CATERR exactly 3 days after rebooting (+/- 30 seconds). If it weren't for the exact +3 days, I would think it's a random event. But exactly 3 days after reboot does not seem random. Nothing I added got me more information (mcelog, pstore, BMC