Hi,
the docs [0] describe the OSD removal process:
ceph orch osd rm <osd_id(s)> [--replace] [--force] [--zap]
So in your case I'd just remove and zap the faulty OSDs:
ceph orch osd rm 0 3 --force --zap
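You can then watch the progress with:

ceph orch osd rm status

which lists the OSDs queued for removal and their drain state until
they're gone.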
If you have a managed OSD spec matching your setup, the orchestrator
will simply redeploy OSDs on the wiped disks.
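To verify that beforehand, you could dump the applied OSD spec(s) and
check that the orchestrator sees the wiped disks as available again,
for example:

# show the OSD service spec(s) currently applied
ceph orch ls --service-type osd --export
# list the disks the orchestrator sees on each host
ceph orch device ls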
Of course, in a production
environment you'd need to be sure it's safe to wipe an OSD. So maybe
try without --force first to see whether it would result in inactive
PGs (the orchestrator checks that before stopping the OSDs). Now that
those two OSDs are already dead, there's no real danger here, but I
just wanted to mention it.
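If you want an explicit pre-check, the mons can tell you directly
whether removing the OSDs is safe:

# would stopping them leave PGs inactive?
ceph osd ok-to-stop 0 3
# would destroying them reduce data durability?
ceph osd safe-to-destroy 0 3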
You can also start with one OSD and see if the process works for you.
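For example, a single-OSD round trip could look like this (assuming
you start with osd.0 and your managed spec picks the disk up again):

ceph orch osd rm 0 --zap
ceph orch osd rm status
ceph osd tree    # wait for the replacement OSD to come back up
ceph -s          # and for the cluster to return to HEALTH_OK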
Regards,
Eugen
[0] https://docs.ceph.com/en/latest/cephadm/services/osd/#remove-an-osd
Quoting lejeczek <pelj...@yahoo.co.uk>:
Hi guys.
I've been browsing the net in search of a relatively clear
"howto" but failed to find one. Instead there are rather many,
sometimes conflicting, notes/thoughts on how to deal with such/similar
situations.
I have a 3-node containerized cluster which lost an OSD - it crashed;
there is nothing wrong with the node, nothing wrong with the disk,
but never mind that.
Is there a howto which covers containerized environment?
One example I followed is:
https://docs.redhat.com/en/documentation/red_hat_ceph_storage/1.2.3/html/red_hat_ceph_administration_guide/setting_unsetting_overrides
but it is not clear - to me - what to do with the "broken" containers.
I've got to this point:
-> $ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         0.68359  root default
-3               0      host podster1
-7         0.34180      host podster2
 2    hdd  0.04880          osd.2          up   1.00000  1.00000
 4    hdd  0.29300          osd.4          up   1.00000  1.00000
-5         0.34180      host podster3
 1    hdd  0.04880          osd.1          up   1.00000  1.00000
 5    hdd  0.29300          osd.5          up   1.00000  1.00000
yet:
-> $ ceph orch ps --daemon-type=osd
NAME HOST PORTS STATUS REFRESHED AGE MEM
USE MEM LIM VERSION IMAGE ID CONTAINER ID
osd.0 podster1.mine.priv error 7m ago 3w
- 4096M <unknown> <unknown> <unknown>
osd.1 podster3.mine.priv running (25h) 7m ago 3w
942M 4096M 19.2.3 aade1b12b8e6 d71051ea79dc
osd.2 podster2.mine.priv running (6d) 7m ago 3w
1192M 4096M 19.2.3 aade1b12b8e6 e8d05142a73a
osd.3 podster1.mine.priv error 7m ago 2w
- 4096M <unknown> <unknown> <unknown>
osd.4 podster2.mine.priv running (6d) 7m ago 2w
3293M 4096M 19.2.3 aade1b12b8e6 6116277f69d1
osd.5 podster3.mine.priv running (25h) 7m ago 2w
2963M 4096M 19.2.3 aade1b12b8e6 d671bf73cc01
What would be the next bits needed to complete such a
removal & reuse/re-creation of OSD(s)?
p.s. This is a 'lab' setup so I'm not worried, but it'd be great to
complete this process in a healthy manner.
many thanks, L.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io