On 20/11/13 22:27, Robert van Leeuwen wrote:
Hi,

What is the easiest way to replace a failed disk / OSD.
It looks like the documentation here is not really compatible with ceph_deploy:
http://ceph.com/docs/master/rados/operations/add-or-rm-osds/

It is talking about adding stuff to the ceph.conf while ceph_deploy works in a 
different way.
(I've tried it without adding to ceph.conf and that obviously did not work)

Is there an easy way to replace a single failed OSD which has been deployed with 
ceph_deploy?
You could remove the OSD and add a new one but I would prefer to just reuse the 
current config / OSD numbers.
Basically I would like to do a partition/format and some ceph commands to get 
stuff working again...


I looked at this a while back (experimenting with how to add a failed osd back in after rebuilding it). Unfortunately I can't find my notes right now, but I recall the key steps being (following the docs for removing an osd):

- checking it is stopped/failed
- remove it from the crushmap (ceph osd crush remove)
- delete its auth key (ceph auth del)
- remove the osd (ceph osd rm)
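
The removal steps above could be sketched roughly as follows; this assumes the failed OSD is osd.12 (a made-up id, substitute your own), and of course needs a live cluster and an admin keyring:

```shell
OSD_ID=12

# 1. Check the osd is stopped / marked down
ceph osd tree | grep "osd.${OSD_ID}"

# 2. Remove it from the crushmap
ceph osd crush remove osd.${OSD_ID}

# 3. Delete its auth key
ceph auth del osd.${OSD_ID}

# 4. Remove the osd from the cluster map
ceph osd rm ${OSD_ID}
```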

Then recreate it using ceph-deploy (osd create or similar), which would (usually?) reuse the now-vacant osd number slot.
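
The recreate step might look something like this (the host name "cephnode1" and disk /dev/sdb are hypothetical, substitute your own; run from the deploy/admin node):

```shell
# Zap the replacement disk first -- this destroys any existing
# partition table on it, so double-check the device!
ceph-deploy disk zap cephnode1:/dev/sdb

# Create the new osd; the cluster normally hands out the lowest
# free osd id, i.e. the slot just vacated by "ceph osd rm".
ceph-deploy osd create cephnode1:/dev/sdb
```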

No editing of ceph.conf was needed, as I recall.

I'll take a better look at this tomorrow (unless someone who knows this better - which is probably a big list - corrects my musings)!

Cheers

Mark

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com