On Mon, Oct 21, 2013 at 9:57 AM, Loic Dachary <[email protected]> wrote:
>
>
> On 21/10/2013 18:49, Gregory Farnum wrote:
>> I'm not quite sure what questions you're actually asking here...
>
> I guess I was asking if my understanding was correct.
>
>> In general, the OSD is not removed from the system without explicit
>> admin intervention. When it is removed, all traces of it should be
>> zapped (including its key), so it can't reconnect.
>
> Ok. So reusing osd ids is not an issue. If you reconnect a disk after osd rm
> the id it had, you get what you deserve, and you can zap the disk so that it
> is formatted again. Is that what you mean?
I was actually under the impression that "osd rm" would, itself, clear
out the keys as well as the osd map state, but it does not do so. That
looks like a serious bug! http://tracker.ceph.com/issues/6605
-Greg

>
>> If it hasn't been removed, then indeed it will continue working
>> properly even if moved to a different box.
>
> Cool.
>
> That makes it real simple from the point of view of tools like puppet :-)
>
> Cheers
>
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Mon, Oct 21, 2013 at 9:15 AM, Loic Dachary <[email protected]> wrote:
>>> Hi Ceph,
>>>
>>> In the context of the ceph puppet module (
>>> https://wiki.openstack.org/wiki/Puppet-openstack/ceph-blueprint ) I tried
>>> to think about what should be provided to deal with disks / OSDs when they
>>> are removed or moved around.
>>>
>>> Here is a possible scenario:
>>>
>>> * Machine A dies and contains OSD 42
>>> * ceph osd rm 42 is done to get rid of the OSD
>>> * ceph-prepare is called on a new disk and gets OSD id 42
>>>
>>> ceph/src/mon/OSDMonitor.cc
>>>
>>>   // allocate a new id
>>>   for (i = 0; i < osdmap.get_max_osd(); i++) {
>>>     if (!osdmap.exists(i) &&
>>>         pending_inc.new_up_client.count(i) == 0 &&
>>>         (pending_inc.new_state.count(i) == 0 ||
>>>          (pending_inc.new_state[i] & CEPH_OSD_EXISTS) == 0))
>>>       goto done;
>>>   }
>>>
>>> * The disk of machine A is still good and is plugged into machine C
>>> * The udev logic sees it has the ceph magic uuid and contains a well-formed
>>>   osd file system, and runs the osd daemon on it. The osd daemon fails and
>>>   dies because its key does not match. It will try again and fail in the
>>>   same way when the machine reboots.
>>>
>>> If the osd id was not reused, the disk would find its way back into the
>>> cluster and be reused without manual intervention. Since ceph-disk uses the
>>> osd uuid to create the disk, it does not matter that it has been removed:
>>> https://github.com/ceph/ceph/blob/master/src/ceph-disk#L458 .
>>> I'm not sure
>>> I understand how the key that was previously registered is re-imported. If
>>> I understand correctly, it is created with ceph-osd --mkkey
>>> https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1301 and stored in
>>> the osd tree at the location specified by --keyring
>>> https://github.com/ceph/ceph/blob/master/src/ceph-disk#L1307 .
>>>
>>> At this point my understanding is that, as long as OSD ids are not reused,
>>> removing a disk or moving it from machine to machine, even over a long
>>> period of time, does not require any action. The OSD id is an int, which is
>>> probably large enough for any kind of cluster in the near future. The OSD
>>> ids that are not used and not removed could be cleaned from time to time to
>>> garbage collect the space they use.
>>>
>>> Please let me know if I've missed something :-)
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>>> All that is necessary for the triumph of evil is that good people do
>>> nothing.
>>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
> All that is necessary for the triumph of evil is that good people do nothing.
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
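[Editor's note: the id-allocation loop from OSDMonitor.cc quoted in the thread can be sketched in Python to show why, after "ceph osd rm 42", the very next prepared disk receives id 42 again, which is what leaves the old disk's key stranded. This is a simplified model: the function name `allocate_osd_id` and the plain `set` standing in for the osdmap are illustrative, not Ceph's actual API, and the `pending_inc` checks are folded away.]

```python
def allocate_osd_id(exists, max_osd):
    """Return the lowest id absent from the osdmap, mirroring the loop
    in OSDMonitor.cc: the first i with !osdmap.exists(i) wins.

    `exists` is a set of live OSD ids (a stand-in for the osdmap);
    `max_osd` plays the role of osdmap.get_max_osd().
    """
    for i in range(max_osd):
        if i not in exists:
            return i
    # No hole found: the map would have to grow to admit a new id.
    return max_osd

# A cluster with OSDs 0..42 up; "ceph osd rm 42" drops id 42 from the map...
live = set(range(43))
live.discard(42)

# ...so the next allocation hands id 42 to a brand-new disk. When the old
# disk (still carrying osd.42's data and key) is later plugged into another
# machine, its key no longer matches the one registered for the new osd.42.
assert allocate_osd_id(live, 1000) == 42

# With no hole in 0..42, a fresh id (43) would be allocated instead.
assert allocate_osd_id(set(range(43)), 1000) == 43
```

This reuse is harmless only if removal also deletes the old key (the gap Greg files as issue 6605): with the stale key gone, the replugged disk simply fails authentication once and can be zapped, rather than colliding with a live OSD's identity.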
