Hi,

We have a full BlueStore cluster and had to deal with read errors on the SSD holding the block.db. Something like the following helped us recreate a pre-existing OSD without rebalancing, just backfilling its PGs. I would zap the journal device and let ceph-osd recreate it. The command is very similar to what shows up in your ceph-disk output, but maybe you get more useful information if you run it manually:

ceph-osd [--cluster-uuid <CLUSTER_UUID>] [--osd-objectstore filestore] --mkfs -i <OSD_ID> --osd-journal <PATH_TO_SSD> --osd-data /var/lib/ceph/osd/ceph-<OSD_ID>/ --mkjournal --setuser ceph --setgroup ceph --osd-uuid <OSD_UUID>

Maybe after zapping the journal this will work. At least it would rule out the old journal as the show-stopper.
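Just to spell out the zapping part, this is roughly what I mean (a rough sketch only; /dev/sda8 and OSD id 21 are taken from your mail, and the dd count is arbitrary, it only needs to overwrite the old journal header):

systemctl stop ceph-osd@21    # make sure the old OSD is really down
dd if=/dev/zero of=/dev/sda8 bs=1M count=100 oflag=direct    # wipe the stale journal on the SSD partition

Wiping the partition in place keeps its GUID, so a journal symlink by partuuid would still resolve, and the --mkjournal above (or a fresh ceph-disk prepare) can then initialize it again.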

Regards,
Eugen


Quoting David Majchrzak <da...@oderland.se>:

Hi!
I'm trying to replace an OSD on a Jewel cluster (filestore data on HDD + journal device on SSD). I've set noout, removed the flapping drive (read errors) and replaced it with a new one.

I've noted down the OSD UUID so that I can prepare the new disk with the same osd ID. The journal device is the same as the previous one (should I delete the partition and recreate it?).
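(For anyone wondering where to find it: an OSD's UUID can be read from the cluster with something like

ceph osd dump | grep '^osd.21 '

where the uuid is the last field of that line, or from the fsid file in the old data directory while it is still mounted; 21 is just the id of the OSD in question here.)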
However, running ceph-disk prepare returns:
# ceph-disk -v prepare --cluster-uuid c51a2683-55dc-4634-9d9d-f0fec9a6f389 --osd-uuid dc49691a-2950-4028-91ea-742ffc9ed63f --journal-dev --data-dev --fs-type xfs /dev/sdo /dev/sda8
command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5371, in run
    main(sys.argv[1:])
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5322, in main
    args.func(args)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1900, in main
    Prepare.factory(args).prepare()
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1896, in factory
    return PrepareFilestore(args)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1909, in __init__
    self.journal = PrepareJournal(args)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 2221, in __init__
    raise Error('journal specified but not allowed by osd backend')
ceph_disk.main.Error: Error: journal specified but not allowed by osd backend

I tried googling first, of course. It could be because we have set setuser_match_path globally in ceph.conf (like in this bug report: https://tracker.ceph.com/issues/19642), since the cluster was originally created on Dumpling a long time ago. What's the best practice to fix it? Create [osd.X] sections and set setuser_match_path there only for the old OSDs? And should I take any other steps first if I want to reuse the same OSD UUID? So far I've only stopped ceph-osd@21, removed the physical disk, inserted the new one and tried running prepare.
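To make the [osd.X] idea concrete, what I had in mind is roughly the following in ceph.conf (the ids and path are only an example, and I'm not sure this is the recommended approach):

[osd.0]
        setuser_match_path = /var/lib/ceph/osd/$cluster-$id
[osd.1]
        setuser_match_path = /var/lib/ceph/osd/$cluster-$id

i.e. keep the match path only for the pre-Jewel OSDs and drop the global setting, so that newly prepared OSDs run as ceph:ceph.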
Kind Regards,
David



