I did do that. It didn't add the auth key to ceph, so I had to do that manually. Then it said that osd.0 was set as destroyed; it was indeed still in the crushmap.
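For reference, the destroy-and-recreate flow Wido recommends further down the thread can be sketched as a small dry-run script. The OSD id and device path below are placeholders taken from this thread, not a prescription:

```shell
#!/bin/sh
# Sketch of replacing a filestore OSD with bluestore while keeping its id,
# per the "destroy first, then ceph-volume create" flow discussed below.
# Dry run by default: RUN=echo just prints the commands; set RUN= (empty)
# on a real cluster to actually execute them.
set -eu
ID=0            # OSD id to replace (placeholder)
DEV=/dev/sdc    # backing device (placeholder)
RUN=${RUN:-echo}

# Mark the OSD destroyed: keeps its id and crush entry, revokes its cephx key
$RUN ceph osd destroy "$ID" --yes-i-really-mean-it
# Wipe leftover LVM/partition metadata from the device
$RUN ceph-volume lvm zap "$DEV"
# Re-create the OSD as bluestore, reusing the old id
$RUN ceph-volume lvm create --bluestore --data "$DEV" --osd-id "$ID"
```

Running it without clearing RUN only prints the three commands, which makes it safe to review before use.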
I followed the docs to a point.

> On 26 Jan 2018, at 18:50, Wido den Hollander <[email protected]> wrote:
>
> On 01/26/2018 06:37 PM, David Majchrzak wrote:
>> Ran:
>> ceph auth del osd.0
>> ceph auth del osd.6
>> ceph auth del osd.7
>> ceph osd rm osd.0
>> ceph osd rm osd.6
>> ceph osd rm osd.7
>> which seems to have removed them.
>
> Did you destroy the OSD prior to running ceph-volume?
>
> $ ceph osd destroy 6
>
> After you've done that you can use ceph-volume to re-create the OSD.
>
> Wido
>
>> Thanks for the help Reed!
>> Kind Regards,
>> David Majchrzak
>>
>>> On 26 Jan 2018, at 18:32, David Majchrzak <[email protected]> wrote:
>>>
>>> Thanks, that helped!
>>>
>>> Since I had already "halfway" created an LVM volume, I wanted to start from the beginning and zap it.
>>>
>>> I tried to zap the raw device but failed, since --destroy doesn't seem to be in 12.2.2:
>>>
>>> http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
>>>
>>> root@int1:~# ceph-volume lvm zap /dev/sdc --destroy
>>> usage: ceph-volume lvm zap [-h] [DEVICE]
>>> ceph-volume lvm zap: error: unrecognized arguments: --destroy
>>>
>>> So I zapped it with the vg/lv instead:
>>> ceph-volume lvm zap /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
>>>
>>> However, I couldn't run create on it since the LVM was already there. So I zapped it with sgdisk and ran dmsetup remove. After that I was able to create it again.
>>>
>>> However, each "ceph-volume lvm create" run that failed still successfully added an osd to the crush map ;)
>>>
>>> So I've got this now:
>>>
>>> root@int1:~# ceph osd df tree
>>> ID CLASS WEIGHT  REWEIGHT SIZE  USE    AVAIL  %USE  VAR  PGS TYPE NAME
>>> -1       2.60959        - 2672G  1101G  1570G 41.24 1.00   - root default
>>> -2       0.87320        -  894G   369G   524G 41.36 1.00   -     host int1
>>>  3   ssd 0.43660  1.00000  447G   358G 90295M 80.27 1.95 301         osd.3
>>>  8   ssd 0.43660  1.00000  447G 11273M   436G  2.46 0.06  19         osd.8
>>> -3       0.86819        -  888G   366G   522G 41.26 1.00   -     host int2
>>>  1   ssd 0.43159  1.00000  441G   167G   274G 37.95 0.92 147         osd.1
>>>  4   ssd 0.43660  1.00000  447G   199G   247G 44.54 1.08 173         osd.4
>>> -4       0.86819        -  888G   365G   523G 41.09 1.00   -     host int3
>>>  2   ssd 0.43159  1.00000  441G   193G   248G 43.71 1.06 174         osd.2
>>>  5   ssd 0.43660  1.00000  447G   172G   274G 38.51 0.93 146         osd.5
>>>  0             0        0     0      0      0     0    0   0         osd.0
>>>  6             0        0     0      0      0     0    0   0         osd.6
>>>  7             0        0     0      0      0     0    0   0         osd.7
>>>
>>> I guess I can just remove them from crush, auth and rm them?
>>>
>>> Kind Regards,
>>>
>>> David Majchrzak
>>>
>>>> On 26 Jan 2018, at 18:09, Reed Dier <[email protected]> wrote:
>>>>
>>>> This is the exact issue that I ran into when starting my bluestore conversion journey.
>>>>
>>>> See my thread here: https://www.spinics.net/lists/ceph-users/msg41802.html
>>>>
>>>> Specifying --osd-id causes it to fail.
>>>>
>>>> Below are my steps for OSD replace/migrate from filestore to bluestore.
>>>>
>>>> BIG caveat here in that I am doing destructive replacement, in that I am not allowing my objects to be migrated off of the OSD I'm replacing before nuking it.
>>>> With 8TB drives it just takes way too long, and I trust my failure domains and other hardware to get me through the backfills.
>>>> So instead of 1) reading data off, writing data elsewhere, 2) remove/re-add, 3) reading data elsewhere, writing back on, I am taking step one out and trusting my two other copies of the objects.
>>>> Just wanted to clarify my steps.
>>>>
>>>> I also set the norecover and norebalance flags immediately prior to running these commands so that it doesn't try to start moving data unnecessarily. Then, when done, remove those flags and let it backfill.
>>>>
>>>>> systemctl stop ceph-osd@$ID.service
>>>>> ceph-osd -i $ID --flush-journal
>>>>> umount /var/lib/ceph/osd/ceph-$ID
>>>>> ceph-volume lvm zap /dev/$ID
>>>>> ceph osd crush remove osd.$ID
>>>>> ceph auth del osd.$ID
>>>>> ceph osd rm osd.$ID
>>>>> ceph-volume lvm create --bluestore --data /dev/$DATA --block.db /dev/$NVME
>>>>
>>>> So essentially I fully remove the OSD from crush and the osdmap, and when I add the OSD back, like I would a new OSD, it fills in the numeric gap with the $ID it had before.
>>>>
>>>> Hope this is helpful.
>>>> It's been working well for me so far, doing 3 OSDs at a time (half of a failure domain).
>>>>
>>>> Reed
>>>>
>>>>> On Jan 26, 2018, at 10:01 AM, David <[email protected]> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> On luminous 12.2.2.
>>>>>
>>>>> I'm migrating some OSDs from filestore to bluestore using the "simple" method ("Mark out and Replace") as described in the docs:
>>>>> http://docs.ceph.com/docs/master/rados/operations/bluestore-migration/#convert-existing-osds
>>>>>
>>>>> However, at step 9 (ceph-volume create --bluestore --data $DEVICE --osd-id $ID) it seems to create the bluestore, but it fails to authenticate with the old osd-id auth.
>>>>> (The command above is also missing lvm or simple.)
>>>>>
>>>>> I think it's related to this:
>>>>> http://tracker.ceph.com/issues/22642
>>>>>
>>>>> # ceph-volume lvm create --bluestore --data /dev/sdc --osd-id 0
>>>>> Running command: sudo vgcreate --force --yes ceph-efad7df8-721d-43d8-8d02-449406e70b90 /dev/sdc
>>>>>  stderr: WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
>>>>>  stdout: Physical volume "/dev/sdc" successfully created
>>>>>  stdout: Volume group "ceph-efad7df8-721d-43d8-8d02-449406e70b90" successfully created
>>>>> Running command: sudo lvcreate --yes -l 100%FREE -n osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 ceph-efad7df8-721d-43d8-8d02-449406e70b90
>>>>>  stderr: WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
>>>>>  stdout: Logical volume "osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9" created.
>>>>> Running command: sudo mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
>>>>> Running command: chown -R ceph:ceph /dev/dm-4
>>>>> Running command: sudo ln -s /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 /var/lib/ceph/osd/ceph-0/block
>>>>> Running command: sudo ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
>>>>>  stderr: got monmap epoch 2
>>>>> Running command: ceph-authtool /var/lib/ceph/osd/ceph-0/keyring --create-keyring --name osd.0 --add-key XXXXXXXX
>>>>>  stdout: creating /var/lib/ceph/osd/ceph-0/keyring
>>>>>  stdout: added entity osd.0 auth auth(auid = 18446744073709551615 key= XXXXXXXX with 0 caps)
>>>>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
>>>>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
>>>>> Running command: sudo ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --key **************************************** --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 138ce507-f28a-45bf-814c-7fa124a9d9b9 --setuser ceph --setgroup ceph
>>>>>  stderr: 2018-01-26 14:59:10.039549 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding
>>>>>  stderr: 2018-01-26 14:59:10.039744 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding
>>>>>  stderr: 2018-01-26 14:59:10.039925 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past end of struct encoding
>>>>>  stderr: 2018-01-26 14:59:10.039984 7fd7ef951cc0 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
>>>>>  stderr: 2018-01-26 14:59:11.359951 7fd7ef951cc0 -1 key XXXXXXXX
>>>>>  stderr: 2018-01-26 14:59:11.888476 7fd7ef951cc0 -1 created object store /var/lib/ceph/osd/ceph-0/ for osd.0 fsid efad7df8-721d-43d8-8d02-449406e70b90
>>>>> Running command: sudo ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 --path /var/lib/ceph/osd/ceph-0
>>>>> Running command: sudo ln -snf /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 /var/lib/ceph/osd/ceph-0/block
>>>>> Running command: chown -R ceph:ceph /dev/dm-4
>>>>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
>>>>> Running command: sudo systemctl enable ceph-volume@lvm-0-138ce507-f28a-45bf-814c-7fa124a9d9b9
>>>>>  stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/[email protected] to /lib/systemd/system/[email protected].
>>>>> Running command: sudo systemctl start ceph-osd@0
>>>>>
>>>>> ceph-osd.0.log shows:
>>>>>
>>>>> 2018-01-26 15:09:07.379039 7f545d3b9cc0  4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2859] Recovered from manifest file:db/MANIFEST-000095 succeeded, manifest_file_number is 95, next_file_number is 97, last_sequence is 21, log_number is 0, prev_log_number is 0, max_column_family is 0
>>>>> 2018-01-26 15:09:07.379046 7f545d3b9cc0  4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2867] Column family [default] (ID 0), log number is 94
>>>>> 2018-01-26 15:09:07.379087 7f545d3b9cc0  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516979347379083, "job": 1, "event": "recovery_started", "log_files": [96]}
>>>>> 2018-01-26 15:09:07.379091 7f545d3b9cc0  4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:482] Recovering log #96 mode 0
>>>>> 2018-01-26 15:09:07.379102 7f545d3b9cc0  4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2395] Creating manifest 98
>>>>> 2018-01-26 15:09:07.380466 7f545d3b9cc0  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516979347380463, "job": 1, "event": "recovery_finished"}
>>>>> 2018-01-26 15:09:07.381331 7f545d3b9cc0  4 rocksdb: [/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:1063] DB pointer 0x556ecb8c3000
>>>>> 2018-01-26 15:09:07.381353 7f545d3b9cc0  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_db opened rocksdb path db options compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152
>>>>> 2018-01-26 15:09:07.381616 7f545d3b9cc0  1 freelist init
>>>>> 2018-01-26 15:09:07.381660 7f545d3b9cc0  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc opening allocation metadata
>>>>> 2018-01-26 15:09:07.381679 7f545d3b9cc0  1 bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 447 G in 1 extents
>>>>> 2018-01-26 15:09:07.382077 7f545d3b9cc0  0 _get_class not permitted to load kvs
>>>>> 2018-01-26 15:09:07.382309 7f545d3b9cc0  0 <cls> /build/ceph-12.2.2/src/cls/cephfs/cls_cephfs.cc:197: loading cephfs
>>>>> 2018-01-26 15:09:07.382583 7f545d3b9cc0  0 _get_class not permitted to load sdk
>>>>> 2018-01-26 15:09:07.382827 7f545d3b9cc0  0 <cls> /build/ceph-12.2.2/src/cls/hello/cls_hello.cc:296: loading cls_hello
>>>>> 2018-01-26 15:09:07.385755 7f545d3b9cc0  0 _get_class not permitted to load lua
>>>>> 2018-01-26 15:09:07.386073 7f545d3b9cc0  0 osd.0 0 crush map has features 288232575208783872, adjusting msgr requires for clients
>>>>> 2018-01-26 15:09:07.386078 7f545d3b9cc0  0 osd.0 0 crush map has features 288232575208783872 was 8705, adjusting msgr requires for mons
>>>>> 2018-01-26 15:09:07.386079 7f545d3b9cc0  0 osd.0 0 crush map has features 288232575208783872, adjusting msgr requires for osds
>>>>> 2018-01-26 15:09:07.386132 7f545d3b9cc0  0 osd.0 0 load_pgs
>>>>> 2018-01-26 15:09:07.386134 7f545d3b9cc0  0 osd.0 0 load_pgs opened 0 pgs
>>>>> 2018-01-26 15:09:07.386137 7f545d3b9cc0  0 osd.0 0 using weightedpriority op queue with priority op cut off at 64.
>>>>> 2018-01-26 15:09:07.386580 7f545d3b9cc0 -1 osd.0 0 log_to_monitors {default=true}
>>>>> 2018-01-26 15:09:07.388077 7f545d3b9cc0 -1 osd.0 0 init authentication failed: (1) Operation not permitted
>>>>>
>>>>> The old osd is still there.
>>>>>
>>>>> # ceph osd tree
>>>>> ID CLASS WEIGHT  TYPE NAME      STATUS    REWEIGHT PRI-AFF
>>>>> -1       2.60458 root default
>>>>> -2       0.86819     host int1
>>>>>  0   ssd 0.43159         osd.0  destroyed        0 1.00000
>>>>>  3   ssd 0.43660         osd.3  up         1.00000 1.00000
>>>>> -3       0.86819     host int2
>>>>>  1   ssd 0.43159         osd.1  up         1.00000 1.00000
>>>>>  4   ssd 0.43660         osd.4  up         1.00000 1.00000
>>>>> -4       0.86819     host int3
>>>>>  2   ssd 0.43159         osd.2  up         1.00000 1.00000
>>>>>  5   ssd 0.43660         osd.5  up         1.00000 1.00000
>>>>>
>>>>> What's the best course of action? Purging osd.0, zapping the device again and creating without --osd-id set?
>>>>>
>>>>> Kind Regards,
>>>>>
>>>>> David Majchrzak
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> [email protected]
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
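The cleanup David describes earlier in the thread, removing the phantom entries that the failed "ceph-volume lvm create" runs left behind (osd.0, osd.6 and osd.7 in his "ceph osd df tree" output), can be sketched the same dry-run way. The ids are the ones from this thread; adjust for your own cluster:

```shell
#!/bin/sh
# Remove leftover OSD entries from the crush map, auth database and osdmap.
# Dry run by default: RUN=echo prints the commands; set RUN= (empty) on a
# real cluster to execute them.
set -eu
RUN=${RUN:-echo}
for ID in 0 6 7; do
    $RUN ceph osd crush remove "osd.$ID"   # drop the crush map entry
    $RUN ceph auth del "osd.$ID"           # delete the cephx key
    $RUN ceph osd rm "osd.$ID"             # remove the osd from the osdmap
done
```

On luminous this three-command sequence is what the thread ends up using; newer releases fold it into a single "ceph osd purge" call.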
