I did do that.
It didn't add the auth key to Ceph, so I had to do that manually. Then it said 
that osd.0 was set as destroyed, which was true, since it was still in the crushmap.
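
For reference, re-adding the key by hand was roughly the following (caps as in 
the manual-deployment docs; adjust to your own setup):

ceph auth add osd.0 osd 'allow *' mon 'allow profile osd' \
    -i /var/lib/ceph/osd/ceph-0/keyring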

I followed the docs to a point.


> On 26 Jan 2018, at 18:50, Wido den Hollander <[email protected]> wrote:
> 
> 
> 
> On 01/26/2018 06:37 PM, David Majchrzak wrote:
>> Ran:
>> ceph auth del osd.0
>> ceph auth del osd.6
>> ceph auth del osd.7
>> ceph osd rm osd.0
>> ceph osd rm osd.6
>> ceph osd rm osd.7
>> which seems to have removed them.
> 
> Did you destroy the OSD prior to running ceph-volume?
> 
> $ ceph osd destroy 6
> 
> After you've done that you can use ceph-volume to re-create the OSD.
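> 
> For example, assuming osd.6's data device is /dev/sdc (substitute your own):
> 
> $ ceph osd destroy 6 --yes-i-really-mean-it
> $ ceph-volume lvm create --bluestore --data /dev/sdc --osd-id 6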
> 
> Wido
> 
>> Thanks for the help Reed!
>> Kind Regards,
>> David Majchrzak
>>> On 26 Jan 2018, at 18:32, David Majchrzak <[email protected]> wrote:
>>> 
>>> Thanks that helped!
>>> 
>>> Since I had already "halfway" created an LVM volume, I wanted to start from 
>>> scratch and zap it.
>>> 
>>> I tried to zap the raw device, but it failed since --destroy doesn't seem to 
>>> be available in 12.2.2:
>>> 
>>> http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
>>> 
>>> root@int1:~# ceph-volume lvm zap /dev/sdc --destroy
>>> usage: ceph-volume lvm zap [-h] [DEVICE]
>>> ceph-volume lvm zap: error: unrecognized arguments: --destroy
>>> 
>>> So I zapped it via the VG/LV path instead:
>>> ceph-volume lvm zap 
>>> /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
>>> 
>>> However, I couldn't run create on it since the LVM was still there. So I 
>>> zapped it with sgdisk and ran dmsetup remove. After that I was able to 
>>> create it again.
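>>> 
>>> For the record, that was roughly the following (the dmsetup argument is the 
>>> device-mapper name of the zapped LV, as shown by "dmsetup ls"):
>>> 
>>> sgdisk --zap-all /dev/sdc
>>> dmsetup remove ceph--efad7df8--721d--43d8--8d02--449406e70b90-osd--block--138ce507--f28a--45bf--814c--7fa124a9d9b9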
>>> 
>>> However, each "ceph-volume lvm create" run that failed still successfully 
>>> added an OSD to the crush map ;)
>>> 
>>> So I've got this now:
>>> 
>>> root@int1:~# ceph osd df tree
>>> ID CLASS WEIGHT  REWEIGHT SIZE  USE    AVAIL  %USE  VAR  PGS TYPE NAME
>>> -1       2.60959        - 2672G  1101G  1570G 41.24 1.00   - root default
>>> -2       0.87320        -  894G   369G   524G 41.36 1.00   -     host int1
>>>  3   ssd 0.43660  1.00000  447G   358G 90295M 80.27 1.95 301         osd.3
>>>  8   ssd 0.43660  1.00000  447G 11273M   436G  2.46 0.06  19         osd.8
>>> -3       0.86819        -  888G   366G   522G 41.26 1.00   -     host int2
>>>  1   ssd 0.43159  1.00000  441G   167G   274G 37.95 0.92 147         osd.1
>>>  4   ssd 0.43660  1.00000  447G   199G   247G 44.54 1.08 173         osd.4
>>> -4       0.86819        -  888G   365G   523G 41.09 1.00   -     host int3
>>>  2   ssd 0.43159  1.00000  441G   193G   248G 43.71 1.06 174         osd.2
>>>  5   ssd 0.43660  1.00000  447G   172G   274G 38.51 0.93 146         osd.5
>>>  0             0        0     0      0      0     0    0   0 osd.0
>>>  6             0        0     0      0      0     0    0   0 osd.6
>>>  7             0        0     0      0      0     0    0   0 osd.7
>>> 
>>> I guess I can just remove them from crush and auth, and rm them?
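>>> 
>>> i.e. something like this for each stray ID:
>>> 
>>> ceph osd crush remove osd.0
>>> ceph auth del osd.0
>>> ceph osd rm osd.0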
>>> 
>>> Kind Regards,
>>> 
>>> David Majchrzak
>>> 
>>>> On 26 Jan 2018, at 18:09, Reed Dier <[email protected]> wrote:
>>>> 
>>>> This is the exact issue that I ran into when starting my bluestore 
>>>> conversion journey.
>>>> 
>>>> See my thread here: https://www.spinics.net/lists/ceph-users/msg41802.html
>>>> 
>>>> Specifying --osd-id causes it to fail.
>>>> 
>>>> Below are my steps for OSD replace/migrate from filestore to bluestore.
>>>> 
>>>> BIG caveat here: I am doing a destructive replacement, meaning I am not 
>>>> letting my objects migrate off of the OSD I'm replacing before nuking it.
>>>> With 8TB drives that just takes way too long, and I trust my failure 
>>>> domains and other hardware to get me through the backfills.
>>>> So instead of 1) reading data off and writing it elsewhere, 2) removing and 
>>>> re-adding the OSD, 3) reading the data back and writing it onto the new 
>>>> OSD, I am cutting step one out and trusting my two other copies of the 
>>>> objects. Just wanted to clarify my steps.
>>>> 
>>>> I also set norecover and norebalance flags immediately prior to running 
>>>> these commands so that it doesn’t try to start moving data unnecessarily. 
>>>> Then when done, remove those flags, and let it backfill.
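>>>> 
>>>> For reference, the flag handling is just the standard CLI:
>>>> 
>>>> ceph osd set norecover
>>>> ceph osd set norebalance
>>>> # ... replacement steps below ...
>>>> ceph osd unset norecover
>>>> ceph osd unset norebalance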
>>>> 
>>>>> systemctl stop ceph-osd@$ID.service
>>>>> ceph-osd -i $ID --flush-journal
>>>>> umount /var/lib/ceph/osd/ceph-$ID
>>>>> ceph-volume lvm zap /dev/$DATA
>>>>> ceph osd crush remove osd.$ID
>>>>> ceph auth del osd.$ID
>>>>> ceph osd rm osd.$ID
>>>>> ceph-volume lvm create --bluestore --data /dev/$DATA --block.db /dev/$NVME
>>>> 
>>>> So essentially I fully remove the OSD from crush and the osdmap, and when 
>>>> I add the OSD back, like I would a new OSD, it fills in the numeric gap 
>>>> with the $ID it had before.
>>>> 
>>>> Hope this is helpful.
>>>> Been working well for me so far, doing 3 OSDs at a time (half of a failure 
>>>> domain).
>>>> 
>>>> Reed
>>>> 
>>>>> On Jan 26, 2018, at 10:01 AM, David <[email protected]> wrote:
>>>>> 
>>>>> 
>>>>> Hi!
>>>>> 
>>>>> On Luminous 12.2.2.
>>>>> 
>>>>> I'm migrating some OSDs from filestore to bluestore using the "simple" 
>>>>> method ("Mark out and replace") as described in the docs: 
>>>>> http://docs.ceph.com/docs/master/rados/operations/bluestore-migration/#convert-existing-osds
>>>>> 
>>>>> However, at step 9: ceph-volume create --bluestore --data $DEVICE --osd-id $ID
>>>>> it seems to create the bluestore OSD, but it fails to authenticate with 
>>>>> the old osd-id's auth key.
>>>>> (The command in the docs is also missing the lvm or simple subcommand.)
>>>>> 
>>>>> I think it's related to this:
>>>>> http://tracker.ceph.com/issues/22642
>>>>> 
>>>>> # ceph-volume lvm create --bluestore --data /dev/sdc --osd-id 0
>>>>> Running command: sudo vgcreate --force --yes 
>>>>> ceph-efad7df8-721d-43d8-8d02-449406e70b90 /dev/sdc
>>>>>  stderr: WARNING: lvmetad is running but disabled. Restart lvmetad before 
>>>>> enabling it!
>>>>>  stdout: Physical volume "/dev/sdc" successfully created
>>>>>  stdout: Volume group "ceph-efad7df8-721d-43d8-8d02-449406e70b90" 
>>>>> successfully created
>>>>> Running command: sudo lvcreate --yes -l 100%FREE -n 
>>>>> osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9 
>>>>> ceph-efad7df8-721d-43d8-8d02-449406e70b90
>>>>>  stderr: WARNING: lvmetad is running but disabled. Restart lvmetad before 
>>>>> enabling it!
>>>>>  stdout: Logical volume "osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9" 
>>>>> created.
>>>>> Running command: sudo mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
>>>>> Running command: chown -R ceph:ceph /dev/dm-4
>>>>> Running command: sudo ln -s 
>>>>> /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
>>>>>  /var/lib/ceph/osd/ceph-0/block
>>>>> Running command: sudo ceph --cluster ceph --name client.bootstrap-osd 
>>>>> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o 
>>>>> /var/lib/ceph/osd/ceph-0/activate.monmap
>>>>>  stderr: got monmap epoch 2
>>>>> Running command: ceph-authtool /var/lib/ceph/osd/ceph-0/keyring 
>>>>> --create-keyring --name osd.0 --add-key XXXXXXXX
>>>>>  stdout: creating /var/lib/ceph/osd/ceph-0/keyring
>>>>>  stdout: added entity osd.0 auth auth(auid = 18446744073709551615 key= 
>>>>> XXXXXXXX with 0 caps)
>>>>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
>>>>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
>>>>> Running command: sudo ceph-osd --cluster ceph --osd-objectstore bluestore 
>>>>> --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --key 
>>>>> **************************************** --osd-data 
>>>>> /var/lib/ceph/osd/ceph-0/ --osd-uuid 138ce507-f28a-45bf-814c-7fa124a9d9b9 
>>>>> --setuser ceph --setgroup ceph
>>>>>  stderr: 2018-01-26 14:59:10.039549 7fd7ef951cc0 -1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to 
>>>>> decode label at offset 102: buffer::malformed_input: void 
>>>>> bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past 
>>>>> end of struct encoding
>>>>>  stderr: 2018-01-26 14:59:10.039744 7fd7ef951cc0 -1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to 
>>>>> decode label at offset 102: buffer::malformed_input: void 
>>>>> bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past 
>>>>> end of struct encoding
>>>>>  stderr: 2018-01-26 14:59:10.039925 7fd7ef951cc0 -1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to 
>>>>> decode label at offset 102: buffer::malformed_input: void 
>>>>> bluestore_bdev_label_t::decode(ceph::buffer::list::iterator&) decode past 
>>>>> end of struct encoding
>>>>>  stderr: 2018-01-26 14:59:10.039984 7fd7ef951cc0 -1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
>>>>>  stderr: 2018-01-26 14:59:11.359951 7fd7ef951cc0 -1 key XXXXXXXX
>>>>>  stderr: 2018-01-26 14:59:11.888476 7fd7ef951cc0 -1 created object store 
>>>>> /var/lib/ceph/osd/ceph-0/ for osd.0 fsid 
>>>>> efad7df8-721d-43d8-8d02-449406e70b90
>>>>> Running command: sudo ceph-bluestore-tool --cluster=ceph prime-osd-dir 
>>>>> --dev 
>>>>> /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
>>>>>  --path /var/lib/ceph/osd/ceph-0
>>>>> Running command: sudo ln -snf 
>>>>> /dev/ceph-efad7df8-721d-43d8-8d02-449406e70b90/osd-block-138ce507-f28a-45bf-814c-7fa124a9d9b9
>>>>>  /var/lib/ceph/osd/ceph-0/block
>>>>> Running command: chown -R ceph:ceph /dev/dm-4
>>>>> Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
>>>>> Running command: sudo systemctl enable 
>>>>> ceph-volume@lvm-0-138ce507-f28a-45bf-814c-7fa124a9d9b9
>>>>>  stderr: Created symlink from 
>>>>> /etc/systemd/system/multi-user.target.wants/[email protected] 
>>>>>  to /lib/systemd/system/[email protected].
>>>>> Running command: sudo systemctl start ceph-osd@0
>>>>> 
>>>>> ceph-osd.0.log shows:
>>>>> 
>>>>> 2018-01-26 15:09:07.379039 7f545d3b9cc0  4 rocksdb: 
>>>>> [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2859] Recovered from manifest 
>>>>> file:db/MANIFEST-000095 succeeded,manifest_file_number is 95, 
>>>>> next_file_number is 97, last_sequence is 21, log_number is 
>>>>> 0,prev_log_number is 0,max_column_family is 0
>>>>> 
>>>>> 2018-01-26 15:09:07.379046 7f545d3b9cc0  4 rocksdb: 
>>>>> [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2867] Column family 
>>>>> [default] (ID 0), log number is 94
>>>>> 
>>>>> 2018-01-26 15:09:07.379087 7f545d3b9cc0  4 rocksdb: EVENT_LOG_v1 
>>>>> {"time_micros": 1516979347379083, "job": 1, "event": "recovery_started", 
>>>>> "log_files": [96]}
>>>>> 2018-01-26 15:09:07.379091 7f545d3b9cc0  4 rocksdb: 
>>>>> [/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:482] Recovering log #96 mode 0
>>>>> 2018-01-26 15:09:07.379102 7f545d3b9cc0  4 rocksdb: 
>>>>> [/build/ceph-12.2.2/src/rocksdb/db/version_set.cc:2395] Creating manifest 98
>>>>> 
>>>>> 2018-01-26 15:09:07.380466 7f545d3b9cc0  4 rocksdb: EVENT_LOG_v1 
>>>>> {"time_micros": 1516979347380463, "job": 1, "event": "recovery_finished"}
>>>>> 2018-01-26 15:09:07.381331 7f545d3b9cc0  4 rocksdb: 
>>>>> [/build/ceph-12.2.2/src/rocksdb/db/db_impl_open.cc:1063] DB pointer 0x556ecb8c3000
>>>>> 2018-01-26 15:09:07.381353 7f545d3b9cc0  1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-0) _open_db opened rocksdb path db 
>>>>> options 
>>>>> compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152
>>>>> 2018-01-26 15:09:07.381616 7f545d3b9cc0  1 freelist init
>>>>> 2018-01-26 15:09:07.381660 7f545d3b9cc0  1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc opening allocation 
>>>>> metadata
>>>>> 2018-01-26 15:09:07.381679 7f545d3b9cc0  1 
>>>>> bluestore(/var/lib/ceph/osd/ceph-0) _open_alloc loaded 447 G in 1 extents
>>>>> 2018-01-26 15:09:07.382077 7f545d3b9cc0  0 _get_class not permitted to 
>>>>> load kvs
>>>>> 2018-01-26 15:09:07.382309 7f545d3b9cc0  0 <cls> 
>>>>> /build/ceph-12.2.2/src/cls/cephfs/cls_cephfs.cc:197: loading cephfs
>>>>> 2018-01-26 15:09:07.382583 7f545d3b9cc0  0 _get_class not permitted to 
>>>>> load sdk
>>>>> 2018-01-26 15:09:07.382827 7f545d3b9cc0  0 <cls> 
>>>>> /build/ceph-12.2.2/src/cls/hello/cls_hello.cc:296: loading cls_hello
>>>>> 2018-01-26 15:09:07.385755 7f545d3b9cc0  0 _get_class not permitted to 
>>>>> load lua
>>>>> 2018-01-26 15:09:07.386073 7f545d3b9cc0  0 osd.0 0 crush map has features 
>>>>> 288232575208783872, adjusting msgr requires for clients
>>>>> 2018-01-26 15:09:07.386078 7f545d3b9cc0  0 osd.0 0 crush map has features 
>>>>> 288232575208783872 was 8705, adjusting msgr requires for mons
>>>>> 2018-01-26 15:09:07.386079 7f545d3b9cc0  0 osd.0 0 crush map has features 
>>>>> 288232575208783872, adjusting msgr requires for osds
>>>>> 2018-01-26 15:09:07.386132 7f545d3b9cc0  0 osd.0 0 load_pgs
>>>>> 2018-01-26 15:09:07.386134 7f545d3b9cc0  0 osd.0 0 load_pgs opened 0 pgs
>>>>> 2018-01-26 15:09:07.386137 7f545d3b9cc0  0 osd.0 0 using weightedpriority 
>>>>> op queue with priority op cut off at 64.
>>>>> 2018-01-26 15:09:07.386580 7f545d3b9cc0 -1 osd.0 0 log_to_monitors 
>>>>> {default=true}
>>>>> 2018-01-26 15:09:07.388077 7f545d3b9cc0 -1 osd.0 0 init authentication 
>>>>> failed: (1) Operation not permitted
>>>>> 
>>>>> 
>>>>> The old OSD is still there.
>>>>> 
>>>>> # ceph osd tree
>>>>> ID CLASS WEIGHT  TYPE NAME     STATUS    REWEIGHT PRI-AFF
>>>>> -1       2.60458 root default
>>>>> -2       0.86819     host int1
>>>>>  0   ssd 0.43159         osd.0 destroyed        0 1.00000
>>>>>  3   ssd 0.43660         osd.3        up  1.00000 1.00000
>>>>> -3       0.86819     host int2
>>>>>  1   ssd 0.43159         osd.1        up  1.00000 1.00000
>>>>>  4   ssd 0.43660         osd.4        up  1.00000 1.00000
>>>>> -4       0.86819     host int3
>>>>>  2   ssd 0.43159         osd.2        up  1.00000 1.00000
>>>>>  5   ssd 0.43660         osd.5        up  1.00000 1.00000
>>>>> 
>>>>> 
>>>>> What's the best course of action? Purging osd.0, zapping the device again 
>>>>> and creating without --osd-id set?
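>>>>> 
>>>>> i.e. roughly this (purge combines crush remove, auth del and osd rm):
>>>>> 
>>>>> ceph osd purge 0 --yes-i-really-mean-it
>>>>> ceph-volume lvm zap /dev/sdc
>>>>> ceph-volume lvm create --bluestore --data /dev/sdc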
>>>>> 
>>>>> 
>>>>> Kind Regards,
>>>>> 
>>>>> David Majchrzak
>>>> 
>>> 

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
