Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Oliver Freyermuth
On 23.07.2018 at 14:59, Nicolas Huillard wrote:
> On Monday, 23 July 2018 at 12:40 +0200, Oliver Freyermuth wrote:
>> On 23.07.2018 at 11:18, Nicolas Huillard wrote:
>>> On Monday, 23 July 2018 at 18:23 +1000, Brad Hubbard wrote:
>>>> Ceph doesn't shut down systems as in kill or reboot the

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Nicolas Huillard
On Monday, 23 July 2018 at 12:40 +0200, Oliver Freyermuth wrote:
> On 23.07.2018 at 11:18, Nicolas Huillard wrote:
> > On Monday, 23 July 2018 at 18:23 +1000, Brad Hubbard wrote:
> > > Ceph doesn't shut down systems as in kill or reboot the box if that's what you're saying?

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Nicolas Huillard
On Monday, 23 July 2018 at 11:40 +0100, Matthew Vernon wrote:
> > One of my servers silently shut down last night, with no explanation whatsoever in any logs. According to the existing logs, the shutdown
> We have seen similar things with our SuperMicro servers; our current best theory

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Oliver Freyermuth
On 23.07.2018 at 11:18, Nicolas Huillard wrote:
> On Monday, 23 July 2018 at 18:23 +1000, Brad Hubbard wrote:
>> Ceph doesn't shut down systems as in kill or reboot the box if that's what you're saying?
> That's the first part of what I was saying, yes. I was pretty sure Ceph doesn't

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Matthew Vernon
Hi,
> One of my servers silently shut down last night, with no explanation whatsoever in any logs. According to the existing logs, the shutdown
We have seen similar things with our SuperMicro servers; our current best theory is that it's related to CPU power management. Disabling it in BIOS

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Nicolas Huillard
On Monday, 23 July 2018 at 18:23 +1000, Brad Hubbard wrote:
> Ceph doesn't shut down systems as in kill or reboot the box if that's what you're saying?
That's the first part of what I was saying, yes. I was pretty sure Ceph doesn't reboot/shutdown/reset, but now it's 100% sure, thanks. Maybe

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Caspar Smit
Do you have any hardware watchdog running in the system? A watchdog could trigger a powerdown if it meets some value. Any event logs from the chassis itself?
Kind regards,
Caspar
2018-07-21 10:31 GMT+02:00 Nicolas Huillard:
> Hi all,
> One of my servers silently shut down last night, with no

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Brad Hubbard
Ceph doesn't shut down systems as in kill or reboot the box if that's what you're saying?
On Mon, Jul 23, 2018 at 5:04 PM, Nicolas Huillard wrote:
> On Monday, 23 July 2018 at 11:07 +0700, Konstantin Shalygin wrote:
>> > I even have no fancy kernel or device, just real standard Debian.

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Nicolas Huillard
On Monday, 23 July 2018 at 11:07 +0700, Konstantin Shalygin wrote:
> > I even have no fancy kernel or device, just real standard Debian. The uptime was 6 days since the upgrade from 12.2.6...
> Nicolas, you should upgrade your 12.2.6 to 12.2.7 due to bugs in this release.
That was

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-22 Thread Konstantin Shalygin
> I even have no fancy kernel or device, just real standard Debian. The uptime was 6 days since the upgrade from 12.2.6...
Nicolas, you should upgrade your 12.2.6 to 12.2.7 due to bugs in this release.
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-July/028153.html
k

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-22 Thread Nicolas Huillard
On Sunday, 22 July 2018 at 02:44 +0200, Oliver Freyermuth wrote:
> Since all services are running on these machines - are you by any chance running low on memory?
> Do you have a monitoring of this?
I have Munin monitoring on all hosts, but nothing special to notice, except for a +3°C

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-21 Thread Oliver Freyermuth
Since all services are running on these machines - are you by any chance running low on memory? Do you have a monitoring of this? We observe some strange issues with our servers if they run for a long while, and with high memory pressure (more memory is ordered...). Then, it seems our

Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-21 Thread Nicolas Huillard
I forgot to mention that this server, along with all the other Ceph servers in my cluster, does not run anything other than Ceph, and each runs all the Ceph daemons (mon, mgr, mds, 2×osd).
On Saturday, 21 July 2018 at 10:31 +0200, Nicolas Huillard wrote:
> Hi all,
> One of my servers silently

[ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-21 Thread Nicolas Huillard
Hi all,
One of my servers silently shut down last night, with no explanation whatsoever in any logs. According to the existing logs, the shutdown (without reboot) happened between 03:58:20.061452 (last timestamp from /var/log/ceph/ceph-mgr.oxygene.log) and 03:59:01.515308 (new MON election called,