On 22/10/13 06:17, Gregory Farnum wrote:
On Mon, Oct 21, 2013 at 9:57 AM, Loic Dachary <[email protected]> wrote:
On 21/10/2013 18:49, Gregory Farnum wrote:
I'm not quite sure what questions you're actually asking here...
I guess I was asking if my understanding was correct.
In general, the OSD is not removed from the system without explicit
admin intervention. When it is removed, all traces of it should be
zapped (including its key), so it can't reconnect.
Ok. So reusing osd ids is not an issue. If you reconnect a disk after osd rm
removed the id it had, you get what you deserve, and you can zap the disk so
that it is formatted again. Is that what you mean?
I was actually under the impression that "osd rm" would, itself, clear
out the keys as well as the osd map state, but it does not do so. That
looks like a serious bug! http://tracker.ceph.com/issues/6605
-Greg
FWIW the docs here
http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ say that
you need to do:
$ ceph osd crush remove osd.{osd-num}
$ ceph auth del osd.{osd-num}
$ ceph osd rm {osd-num}
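For the osd 42 of the scenario below, substituting the id, the complete
removal would be:
$ ceph osd crush remove osd.42
$ ceph auth del osd.42
$ ceph osd rm 42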
Regards
If it hasn't been removed, then indeed it will continue working
properly even if moved to a different box.
Cool.
That makes it real simple from the point of view of tools like puppet :-)
Cheers
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Mon, Oct 21, 2013 at 9:15 AM, Loic Dachary <[email protected]> wrote:
Hi Ceph,
In the context of the ceph puppet module (
https://wiki.openstack.org/wiki/Puppet-openstack/ceph-blueprint ) I tried to
think about what should be provided to deal with disks / OSDs when they are
removed or moved around.
Here is a possible scenario:
* Machine A, which contains OSD 42, dies
* ceph osd rm 42 is done to get rid of the OSD
* ceph-disk prepare is called on a new disk and gets OSD id 42 (see the
command line sketch after this scenario)
ceph/src/mon/OSDMonitor.cc
// allocate a new id: pick the lowest id that does not exist in the
// current osdmap and is not already claimed by a pending change, so an
// id freed by "ceph osd rm" is eligible for immediate reuse
for (i=0; i < osdmap.get_max_osd(); i++) {
  if (!osdmap.exists(i) &&                          // not in the osdmap
      pending_inc.new_up_client.count(i) == 0 &&    // no pending boot
      (pending_inc.new_state.count(i) == 0 ||       // no pending state, or
       (pending_inc.new_state[i] & CEPH_OSD_EXISTS) == 0)) // not pending EXISTS
    goto done;
}
* The disk of machine A is still good and is plugged into machine C
* The udev logic sees that it has the ceph magic uuid and contains a well-formed
osd file system, and runs the osd daemon on it. The osd daemon fails and dies
because its key does not match. It will try again and fail in the same way when
the machine reboots.
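If I read the allocation loop above correctly, the reuse is easy to reproduce
from the command line (a sketch, assuming a cluster where ids 0 to 41 are
still in the osdmap):
$ ceph osd rm 42
removed osd.42
$ ceph osd create    # goes through the loop above, returns the lowest free id
42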
If the osd id were not reused, the disk would find its way back into the cluster
and be reused without manual intervention. Since ceph-disk uses the osd uuid to
create the disk, it does not matter that it has been removed:
https://github.com/ceph/ceph/blob/master/src/ceph-disk#L458 . I'm not sure I
understand how the key that was previously registered is re-imported. If I
understand correctly, it is created with ceph-osd --mkkey
https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1301 and stored in the
osd tree at the location specified by --keyring
https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1307 .
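If that reading is right, the flow is roughly the following (a sketch with a
hypothetical id 42 and a $OSD_UUID placeholder, using the default paths, not
the exact ceph-disk code):
$ ceph-osd -i 42 --mkfs --mkkey --osd-uuid $OSD_UUID   # creates the data dir
                                                       # and a fresh key on disk
$ ceph auth add osd.42 osd 'allow *' mon 'allow rwx' \
      -i /var/lib/ceph/osd/ceph-42/keyring             # registers the new key
So when the disk is later moved to another machine, the key stored on the disk
is still the one registered with the monitors, and the daemon can authenticate
without any re-import.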
At this point my understanding is that (as long as OSD ids are not reused)
removing a disk, or moving it from machine to machine even over a long period
of time, does not require any action. The OSD id is an int, which is probably
large enough for any kind of cluster in the near future. The OSD ids that are
no longer used but were never removed could be cleaned up from time to time to
garbage collect the space they use.
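Such a garbage collection pass might look like this (hypothetical and
deliberately simplistic; each id should obviously be double checked before
deletion):
  # remove osds that are in the osdmap but marked down and out,
  # using the full removal sequence from the docs quoted above
  for id in $(ceph osd dump | \
      awk '$1 ~ /^osd\./ && $2 == "down" && $3 == "out" {sub(/osd\./,"",$1); print $1}'); do
    ceph osd crush remove osd.$id
    ceph auth del osd.$id
    ceph osd rm $id
  done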
Please let me know if I've missed something :-)
--
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.