[ceph-users] Re: Cluster healthy, but 16.2.7 osd daemon upgrade says its unsafe to stop them?

2022-02-09 Thread Anthony D'Atri
Speculation: might the devicehealth pool be involved? It seems to typically have just 1 PG.

> On Feb 9, 2022, at 1:41 PM, Zach Heise (SSCC) wrote:
>
> Good afternoon, thank you for your reply. Yes I know you are right,
> eventually we'll switch to an odd number of mons rather than even.
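
If the device-health pool is the suspect, a quick way to check is to list pools with their PG counts; on Pacific this pool is usually named device_health_metrics (the name may differ on other releases):

  ceph osd pool ls detail                            # per-pool size, min_size, pg_num
  ceph osd pool get device_health_metrics pg_num     # just the PG count for that pool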

[ceph-users] Re: Cluster healthy, but 16.2.7 osd daemon upgrade says its unsafe to stop them?

2022-02-09 Thread Zach Heise (SSCC)
Good afternoon, thank you for your reply. Yes, I know you are right; eventually we'll switch to an odd number of mons rather than even. We're still in 'testing' mode right now and only my coworkers and I are using the cluster. Of the 7 pools, all but 2 are replica x3. The last two are EC 2+2.
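
A sketch of how to confirm each pool's replication and EC settings from the CLI (the profile name below is a placeholder):

  ceph osd pool ls detail                        # size/min_size per pool, EC profile for EC pools
  ceph osd erasure-code-profile get <profile>    # k, m and crush-failure-domain

Note that an EC 2+2 pool typically defaults to min_size = k+1 = 3, which is one reason the orchestrator can refuse to stop an OSD if doing so would drop a PG below min_size.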

[ceph-users] Re: Cluster healthy, but 16.2.7 osd daemon upgrade says its unsafe to stop them?

2022-02-09 Thread sascha a.
Hello, are all your pools running replica > 1? Also, having 4 monitors is pretty bad for split-brain situations.

Zach Heise (SSCC) wrote on Wed, Feb 9, 2022, 22:02:

> Hello,
>
> ceph health detail says my 5-node cluster is healthy, yet when I ran
> ceph orch upgrade start --ceph-version 16.2.7
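
With 4 monitors, quorum still requires 3, so there is no extra failure tolerance over 3 mons and a 2-2 split stalls the cluster. A sketch of how to check mon count and quorum:

  ceph mon stat
  ceph quorum_status --format json-pretty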

[ceph-users] Cluster healthy, but 16.2.7 osd daemon upgrade says its unsafe to stop them?

2022-02-09 Thread Zach Heise (SSCC)
Hello, ceph health detail says my 5-node cluster is healthy, yet when I ran ceph orch upgrade start --ceph-version 16.2.7 everything seemed to go fine until we got to the OSD section. Now, for the past hour, every 15 seconds a new log entry of 'Upgrade: unsafe to stop osd(s) at this time (1
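
The orchestrator runs a safety check that can also be run by hand; as a sketch (the OSD id is an example), this usually shows why it considers stopping unsafe:

  ceph orch upgrade status                         # current target and progress
  ceph osd ok-to-stop 0                            # replace 0 with the OSD the upgrade is stuck on
  ceph pg dump pgs_brief | grep -v 'active+clean'  # PGs that are not active+clean can block the check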

[ceph-users] Re: Monitoring slow ops

2022-02-09 Thread Trey Palmer
Thank y'all. This metric is exactly what we need. Turns out it was introduced in 14.2.17 and we have 14.2.9.

On Wed, Feb 9, 2022 at 2:32 AM Konstantin Shalygin wrote:

> Hi,
>
> On 9 Feb 2022, at 09:03, Benoît Knecht wrote:
>
> I don't remember in which Ceph release it was introduced, but
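
Until the cluster runs a release that exposes the metric, slow and blocked ops can still be inspected directly; a sketch (osd.0 is an example id, the daemon commands run on that OSD's host):

  ceph versions                         # confirm the running release across daemons
  ceph daemon osd.0 dump_ops_in_flight
  ceph daemon osd.0 dump_blocked_ops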

[ceph-users] Not able to start MDS after upgrade to 16.2.7

2022-02-09 Thread Izzy Kulbe
Hi, last weekend we upgraded one of our clusters from 16.2.5 to 16.2.7 using cephadm. The upgrade itself seemed to run without a problem, but shortly after the upgrade we noticed the servers holding the MDS containers being laggy, then unresponsive, then crashing outright due to getting reaped by
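
Daemons getting reaped often points at memory pressure; a sketch of first checks after such an upgrade (the daemon name is a placeholder):

  ceph fs status
  ceph orch ps --daemon-type mds
  ceph config get mds mds_cache_memory_limit
  cephadm logs --name mds.<fs>.<host>.<id>    # run on the host carrying the MDS container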

[ceph-users] Re: managed block storage stopped working

2022-02-09 Thread Michael Thomas
On 1/7/22 16:49, Marc wrote:
>> Where else can I look to find out why the managed block storage isn't accessible anymore?
> ceph -s ? I guess it is not showing any errors, and there is probably nothing wrong with Ceph. You can do an rbdmap and see if you can just map an image. Then try mapping an
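
A sketch of the manual mapping test being suggested (pool and image names are placeholders; run on a client host with a valid keyring):

  rbd ls <pool>
  sudo rbd map <pool>/<image>
  rbd showmapped
  sudo rbd unmap /dev/rbd0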

[ceph-users] Re: Advice on enabling autoscaler

2022-02-09 Thread Maarten van Ingen
Hi, thanks so far for the suggestions. We enabled the balancer first to make sure PG distribution is more optimal; after a few additions/replacements and data growth it was not. We enabled upmap, as this was suggested to be better than the default setting. To limit simultaneous
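
For reference, a sketch of the balancer and autoscaler knobs being discussed (the pool name is a placeholder):

  ceph balancer mode upmap
  ceph balancer on
  ceph balancer status
  ceph osd pool autoscale-status
  ceph osd pool set <pool> pg_autoscale_mode on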