1. Stopped osd.60-69: no problem
2. Skipped this and went to #3 to check first
3. Here, `find /etc/systemd/system | grep ceph-volume` returned nothing. I see in that directory

/etc/systemd/system/ceph-disk@60.service # and 61 - 69.

No ceph-volume entries.

On Tue, Nov 6, 2018 at 11:43 AM, Hayashida, Mami <mami.hayash...@uky.edu> wrote:

> Ok. I will go through this this afternoon and let you guys know the
> result. Thanks!
>
> On Tue, Nov 6, 2018 at 11:32 AM, Hector Martin <hec...@marcansoft.com>
> wrote:
>
>> On 11/7/18 1:00 AM, Hayashida, Mami wrote:
>> > I see. Thank you for clarifying lots of things along the way -- this
>> > has been extremely helpful. Neither "df | grep osd" nor "mount | grep
>> > osd" shows ceph-60 through 69.
>>
>> OK, that isn't right then. I suggest you try this:
>>
>> 1) bring down OSD 60-69 (systemctl stop ceph-osd@60 etc)
>>
>> 2) move those directories out of the way, as in:
>>
>> mkdir /var/lib/ceph/osd_old
>> mv /var/lib/ceph/osd/ceph-6[0-9] /var/lib/ceph/osd_old
>>
>> (if this all works out you can delete them, just want to make sure you
>> don't accidentally wipe something important)
>>
>> 2) run `find /etc/systemd/system | grep ceph-volume` and check the
>> output. You're looking for symlinks in multi-user.target.wants or similar.
>>
>> There should be a single "ceph-volume@lvm-<id>-<uuid>" entry for each
>> OSD, and the id and uuid should match the "ceph.osd_id" and
>> "ceph.osd_fsid" LVM tags from `ceph-volume lvm list`. You can also use
>> `lvs -o vg_name,name,lv_tags`
>>
>> If you see anything of the format "ceph-volume@simple-..." then that is
>> old junk from previous attempts at using ceph-volume. They should be
>> symlinks and you should delete them and run `systemctl daemon-reload`.
>> Same story if you see any @lvm symlinks but with incorrect OSD IDs or
>> fsids. All of this should be recreated by the next step anyway if
>> deleted, so it should be safe to delete any symlinks in there that you
>> think might be wrong.
>>
>> 3) Run `ceph-volume lvm activate --all`
>>
>> At this point `df` and `mount` should show tmpfs mounts for all your LVM
>> OSDs, and they should be up.
>> List the OSD directories and check that
>> both `block` and `block.db` entries are symlinks to the right devices.
>> The right target symlinks should also have been created/enabled in
>> /etc/systemd/system/multi-user.target.wants.
>>
>> The LVM dump you provided is correct. I suspect what happened is that
>> somewhere during this experiment OSDs were activated into the root
>> filesystem (instead of a tmpfs), perhaps using the ceph-volume simple
>> mode, perhaps something else. Since all the metadata is in LVM, it's
>> safe to move or delete all those OSD directories for BlueStore OSDs and
>> try activating them cleanly again, which hopefully will do the right
>> thing.
>>
>> In the end this all might fix your device ownership woes too, making the
>> udev rule unnecessary. If it all works out, try a reboot and see if
>> everything comes back up as it should.
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>>
>
>
>
> --
> *Mami Hayashida*
>
> *Research Computing Associate*
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
>

--
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
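
For anyone following the thread: the `find ... | grep ceph-volume` check in the quoted instructions can be sketched roughly as below. This is only an illustration, not part of ceph-volume. The function name `classify_units`, the regular expression, and every fsid in the sample data are hypothetical; the expected id/fsid pairs are assumed to have been copied by hand from the "ceph.osd_id" and "ceph.osd_fsid" LVM tags.

```python
import os
import re

# Matches names shaped like "ceph-volume@lvm-<id>-<uuid>", with or without
# a ".service" suffix. The pattern is an assumption based on the format
# described in the quoted message.
UNIT_RE = re.compile(
    r"^ceph-volume@(?P<mode>[a-z]+)-(?P<osd_id>\d+)-"
    r"(?P<fsid>[0-9a-fA-F-]+?)(?:\.service)?$"
)


def classify_units(paths, expected):
    """Split unit paths into (ok, stale) and list OSD ids with no good unit.

    paths:    paths found under /etc/systemd/system (e.g. output of `find`)
    expected: dict mapping OSD id (int) -> osd fsid (str) from the LVM tags
    """
    ok, stale, seen = [], [], set()
    for path in paths:
        m = UNIT_RE.match(os.path.basename(path))
        if m is None:
            continue  # not a ceph-volume unit, e.g. ceph-disk@60.service
        osd_id = int(m.group("osd_id"))
        if m.group("mode") == "lvm" and expected.get(osd_id) == m.group("fsid"):
            ok.append(path)
            seen.add(osd_id)
        else:
            # "simple" entries, or lvm entries with a wrong id/fsid
            stale.append(path)
    missing = sorted(set(expected) - seen)
    return ok, stale, missing


# Hypothetical sample data; the fsids are made up for illustration.
expected = {
    60: "11111111-2222-3333-4444-555555555555",
    61: "66666666-7777-8888-9999-aaaaaaaaaaaa",
}
wants = "/etc/systemd/system/multi-user.target.wants/"
paths = [
    wants + "ceph-volume@lvm-60-11111111-2222-3333-4444-555555555555.service",
    wants + "ceph-volume@simple-61-66666666-7777-8888-9999-aaaaaaaaaaaa.service",
    "/etc/systemd/system/ceph-disk@60.service",
]
ok, stale, missing = classify_units(paths, expected)
print("ok:", ok)
print("stale:", stale)
print("missing:", missing)
```

With only `ceph-disk@NN.service` files present, as reported at the top of this message, every expected OSD id would end up in `missing` and both other lists would be empty.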
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com