Hello ceph users, 
my Ceph configuration is the following (CRUSH layout sketched below):
- Ceph version 17.2.5 on Ubuntu 20.04
- stretch mode
- 2 rooms with OSDs and monitors, plus an additional room for the tiebreaker monitor
- 4 OSD servers in each room
- 6 OSDs per OSD server
- Ceph installation/administration is manual (without Ansible, the orchestrator, or any
other such tool)
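
Schematically, the CRUSH tree looks like this (host names are just placeholders;
as I understand stretch mode, every PG keeps two copies in each of the two data rooms):

    root default
        room room1
            host srv1 .. srv4   (6 OSDs each)
        room room2
            host srv5 .. srv8   (6 OSDs each)
    (tiebreaker monitor in the third room, no OSDs)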

Ceph health is currently OK, raw usage is around 60%, and pool usage is below 75%.

I need to replace all OSD disks in the cluster with larger-capacity disks (500G
to 1000G), so the final configuration will contain the same number of OSDs and
servers.
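
For reference, the per-OSD replacement I have in mind looks roughly like this
(just a sketch; I deploy OSDs manually with ceph-volume, and OSD id 12 and
/dev/sdX are placeholders):

    # drain the old OSD and wait until all PGs are active+clean again
    ceph osd out 12
    ceph -s
    # stop and remove the old OSD
    systemctl stop ceph-osd@12
    ceph osd purge 12 --yes-i-really-mean-it
    # physically swap the 500G disk for the 1000G one, then create the new OSD
    ceph-volume lvm create --data /dev/sdX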

I understand I can replace the OSDs one by one, following the documented procedure
(remove the old OSD, add the new one to the configuration, roughly as sketched
above) and waiting for HEALTH_OK after each step. But in that case Ceph will
probably copy data around like crazy after every replacement. So, my question is:

What is the recommended procedure for replacing ALL of the disks while keeping
Ceph operational during the upgrade?

In particular:
Should I set any of the "nobackfill, norebalance, norecover..." flags during the
process (e.g. something like the sequence sketched below)? If yes, which ones?
Should I do one OSD at a time, one server at a time, or even a whole room at a time?
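
By "using the flags" I mean something like the following (just my guess at the
sequence, not a tested procedure):

    ceph osd set norebalance
    ceph osd set nobackfill
    # replace one or more OSDs as above while data movement is paused
    ceph osd unset nobackfill
    ceph osd unset norebalance
    # then let backfill run and wait for HEALTH_OK before the next batch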

Thanks for the suggestions.

regards,
Zoran
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
