Re: [ceph-users] "CPU CATERR Fault" Was: Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Nicolas Huillard
On Monday, 23 July 2018 at 12:43 +0200, Oliver Freyermuth wrote:
> > There ARE chassis/BMC/IPMI level events, one of which is "CPU CATERR
> > Fault", with a timestamp matching the timestamps below, and no more
> > information.
>
> If this kind of failure (or a less severe one) also happens

Re: [ceph-users] "CPU CATERR Fault" Was: Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Oliver Freyermuth
On 23.07.2018 at 11:39, Nicolas Huillard wrote:
> On Monday, 23 July 2018 at 10:28 +0200, Caspar Smit wrote:
>> Do you have any hardware watchdog running in the system? A watchdog could
>> trigger a powerdown if it meets some value. Any event logs from the chassis
>> itself?
>
> Nice

Re: [ceph-users] "CPU CATERR Fault" Was: Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Nicolas Huillard
On Monday, 23 July 2018 at 10:28 +0200, Caspar Smit wrote:
> Do you have any hardware watchdog running in the system? A watchdog could
> trigger a powerdown if it meets some value. Any event logs from the chassis
> itself?

Nice suggestions ;-) I see some [watchdog/N] and one [watchdogd]