Hi,

the docs [0] contain the OSD removal process:

ceph orch osd rm <osd_id(s)> [--replace] [--force] [--zap]

So in your case I'd just remove and zap the faulty OSDs:

ceph orch osd rm 0 3 --force --zap
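You can monitor the removal progress with:

ceph orch osd rm status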

If you have a managed OSD spec matching your setup, the orchestrator will simply redeploy OSDs on the wiped disks. Of course, in a production environment you'd need to be sure it's safe to wipe an OSD, so maybe try without --force first to see whether it would result in inactive PGs. Since those two OSDs are already dead, there's no real danger here, but I wanted to mention it.
You can also start with one OSD and see if the process works for you.
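Something like this (an untested sketch, using the OSD IDs from your output) could be used to check first and then try a single OSD:

ceph orch ls osd --export      # show the managed OSD spec(s) the orchestrator applies
ceph osd safe-to-destroy 0     # check that destroying osd.0 won't leave PGs without enough copies
ceph orch osd rm 0 --zap       # remove and zap only osd.0
ceph orch osd rm status        # follow the removal progress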

Regards,
Eugen

[0] https://docs.ceph.com/en/latest/cephadm/services/osd/#remove-an-osd

Quoting lejeczek <pelj...@yahoo.co.uk>:

Hi guys.

I've been browsing the net in search of a relatively clear "howto" but failed to find one. There are rather many, sometimes differing notes/thoughts on how to deal with such/similar situations. I have a 3-node containerized cluster which lost an OSD - it crashed; there is nothing wrong with the node, nothing wrong with the disk, but never mind that.
Is there a howto which covers containerized environment?
One example I followed is: https://docs.redhat.com/en/documentation/red_hat_ceph_storage/1.2.3/html/red_hat_ceph_administration_guide/setting_unsetting_overrides
but it is not clear - to me - what to do with the "broken" containers.
This is where I've got to:
-> $ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.68359  root default
-3               0      host podster1
-7         0.34180      host podster2
 2    hdd  0.04880          osd.2          up   1.00000  1.00000
 4    hdd  0.29300          osd.4          up   1.00000  1.00000
-5         0.34180      host podster3
 1    hdd  0.04880          osd.1          up   1.00000  1.00000
 5    hdd  0.29300          osd.5          up   1.00000  1.00000

yet:
-> $ ceph orch ps --daemon-type=osd
NAME   HOST                PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID      CONTAINER ID
osd.0  podster1.mine.priv         error          7m ago     3w   -        4096M    <unknown>  <unknown>     <unknown>
osd.1  podster3.mine.priv         running (25h)  7m ago     3w   942M     4096M    19.2.3     aade1b12b8e6  d71051ea79dc
osd.2  podster2.mine.priv         running (6d)   7m ago     3w   1192M    4096M    19.2.3     aade1b12b8e6  e8d05142a73a
osd.3  podster1.mine.priv         error          7m ago     2w   -        4096M    <unknown>  <unknown>     <unknown>
osd.4  podster2.mine.priv         running (6d)   7m ago     2w   3293M    4096M    19.2.3     aade1b12b8e6  6116277f69d1
osd.5  podster3.mine.priv         running (25h)  7m ago     2w   2963M    4096M    19.2.3     aade1b12b8e6  d671bf73cc01

What would be the next bits needed to complete such a removal & reuse/re-creation of the OSD(s)? p.s. This is a 'lab' setup so I'm not worried, but it'd be great to complete this process in a healthy manner.
many thanks, L.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

