From what I observed, however, until I made that last change in the UDEV
rule, I simply could not get those OSDs started.  I will try converting the
next 10 OSDs (osd.70-79) tomorrow, following all the steps you have shown
me in this email thread, and will report back to you guys if/where I
encounter any errors.  I am planning on trying to start the OSDs (once they
are converted to Bluestore) without the udev rule first.
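
For reference, here is roughly what I am planning to run for each OSD
before falling back to the udev rule (just a sketch; the OSD IDs and LV
names assume the same layout as osd.60, e.g. osd.70):

# check that the block.db symlink and its target are owned by ceph:ceph
ls -l /var/lib/ceph/osd/ceph-70/
ls -lL /var/lib/ceph/osd/ceph-70/block.db

# then try starting the daemon and watch for permission errors
systemctl start ceph-osd@70
journalctl -u ceph-osd@70 -n 50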

On Mon, Nov 5, 2018 at 4:42 PM, Alfredo Deza <ad...@redhat.com> wrote:

> On Mon, Nov 5, 2018 at 4:21 PM Hayashida, Mami <mami.hayash...@uky.edu>
> wrote:
> >
> > Yes, I still have the volume log showing the activation process for
> > ssd0/db60 (and 61-69 as well).  I will email it to you directly as an
> > attachment.
>
> In the logs, I see that ceph-volume does set the permissions correctly:
>
> [2018-11-02 16:20:07,238][ceph_volume.process][INFO  ] Running command: chown -h ceph:ceph /dev/hdd60/data60
> [2018-11-02 16:20:07,242][ceph_volume.process][INFO  ] Running command: chown -R ceph:ceph /dev/dm-10
> [2018-11-02 16:20:07,246][ceph_volume.process][INFO  ] Running command: ln -s /dev/hdd60/data60 /var/lib/ceph/osd/ceph-60/block
> [2018-11-02 16:20:07,249][ceph_volume.process][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-60/activate.monmap
> [2018-11-02 16:20:07,530][ceph_volume.process][INFO  ] stderr got monmap epoch 2
> [2018-11-02 16:20:07,547][ceph_volume.process][INFO  ] Running command: ceph-authtool /var/lib/ceph/osd/ceph-60/keyring --create-keyring --name osd.60 --add-key AQBysdxbNgdBNhAA6NQ/UWDHqGAZfFuryCWfxQ==
> [2018-11-02 16:20:07,579][ceph_volume.process][INFO  ] stdout creating /var/lib/ceph/osd/ceph-60/keyring
> added entity osd.60 auth auth(auid = 18446744073709551615 key=AQBysdxbNgdBNhAA6NQ/UWDHqGAZfFuryCWfxQ== with 0 caps)
> [2018-11-02 16:20:07,583][ceph_volume.process][INFO  ] Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-60/keyring
> [2018-11-02 16:20:07,587][ceph_volume.process][INFO  ] Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-60/
> [2018-11-02 16:20:07,591][ceph_volume.process][INFO  ] Running command: chown -h ceph:ceph /dev/ssd0/db60
> [2018-11-02 16:20:07,594][ceph_volume.process][INFO  ] Running command: chown -R ceph:ceph /dev/dm-0
>
> And the failures from osd.60 are *before* those successful chown calls
> (15:39:00). I wonder if somehow in the process there was a missing step
> and then it all got corrected. I am certain that the UDEV rule should
> *not* be in place for this to work.
>
> The changes in the /dev/dm-* paths are expected, as those nodes are
> recreated every time the system boots.
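>
> (If you want to confirm which dm node an LV currently maps to, something
> like "ls -l /dev/ssd0/db60" or "dmsetup info -c ssd0-db60" will show it;
> the dm-N number is not stable across boots, but the VG/LV name is.)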
>
> >
> >
> > On Mon, Nov 5, 2018 at 4:14 PM, Alfredo Deza <ad...@redhat.com> wrote:
> >>
> >> On Mon, Nov 5, 2018 at 4:04 PM Hayashida, Mami <mami.hayash...@uky.edu>
> >> wrote:
> >> >
> >> > WOW.  With you two guiding me through every step, the 10 OSDs in
> >> > question are now added back to the cluster as Bluestore disks!!!  Here
> >> > are my responses to the last email from Hector:
> >> >
> >> > 1. I first checked the permissions, and they looked like this:
> >> >
> >> > root@osd1:/var/lib/ceph/osd/ceph-60# ls -l
> >> > total 56
> >> > -rw-r--r-- 1 ceph ceph         384 Nov  2 16:20 activate.monmap
> >> > -rw-r--r-- 1 ceph ceph 10737418240 Nov  2 16:20 block
> >> > lrwxrwxrwx 1 ceph ceph          14 Nov  2 16:20 block.db -> /dev/ssd0/db60
> >> >
> >> > root@osd1:~# ls -l /dev/ssd0/
> >> > ...
> >> > lrwxrwxrwx 1 root root 7 Nov  5 12:38 db60 -> ../dm-2
> >> >
> >> > root@osd1:~# ls -la /dev/
> >> > ...
> >> > brw-rw----  1 root disk    252,   2 Nov  5 12:38 dm-2
> >>
> >> This looks like a bug. You mentioned you are running 12.2.9, and we
> >> haven't seen reports of ceph-volume failing to update the permissions
> >> on OSD devices. No one should need a UDEV rule to set the permissions
> >> for devices; this is a ceph-volume task.
> >>
> >> When a system starts and the OSD activation happens, it always ensures
> >> that the permissions are set correctly. Could you find the section of
> >> the logs in /var/log/ceph/ceph-volume.log that shows the activation
> >> process for ssd0/db60 ?
> >>
> >> Hopefully you still have those around, it would help us determine why
> >> the permissions aren't being set correctly.
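> >>
> >> Something like this should pull the relevant section out, assuming the
> >> default log location:
> >>
> >>     grep -n 'ssd0/db60' /var/log/ceph/ceph-volume.log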
> >>
> >> > ...
> >> >
> >> > 2. I then ran "ceph-volume lvm activate --all" again.  I saw the same
> >> > error for osd.67 that I described many emails ago.  None of the
> >> > permissions changed.  I tried restarting ceph-osd@60, but got the same
> >> > error as before:
> >> >
> >> > 2018-11-05 15:34:52.001782 7f5a15744e00  0 set uid:gid to 64045:64045 (ceph:ceph)
> >> > 2018-11-05 15:34:52.001808 7f5a15744e00  0 ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable), process ceph-osd, pid 36506
> >> > 2018-11-05 15:34:52.021717 7f5a15744e00  0 pidfile_write: ignore empty --pid-file
> >> > 2018-11-05 15:34:52.033478 7f5a15744e00  0 load: jerasure load: lrc load: isa
> >> > 2018-11-05 15:34:52.033557 7f5a15744e00  1 bdev create path /var/lib/ceph/osd/ceph-60/block type kernel
> >> > 2018-11-05 15:34:52.033572 7f5a15744e00  1 bdev(0x5651bd1b8d80 /var/lib/ceph/osd/ceph-60/block) open path /var/lib/ceph/osd/ceph-60/block
> >> > 2018-11-05 15:34:52.033888 7f5a15744e00  1 bdev(0x5651bd1b8d80 /var/lib/ceph/osd/ceph-60/block) open size 10737418240 (0x280000000, 10GiB) block_size 4096 (4KiB) rotational
> >> > 2018-11-05 15:34:52.033958 7f5a15744e00  1 bluestore(/var/lib/ceph/osd/ceph-60) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
> >> > 2018-11-05 15:34:52.033984 7f5a15744e00  1 bdev(0x5651bd1b8d80 /var/lib/ceph/osd/ceph-60/block) close
> >> > 2018-11-05 15:34:52.318993 7f5a15744e00  1 bluestore(/var/lib/ceph/osd/ceph-60) _mount path /var/lib/ceph/osd/ceph-60
> >> > 2018-11-05 15:34:52.319064 7f5a15744e00  1 bdev create path /var/lib/ceph/osd/ceph-60/block type kernel
> >> > 2018-11-05 15:34:52.319073 7f5a15744e00  1 bdev(0x5651bd1b8fc0 /var/lib/ceph/osd/ceph-60/block) open path /var/lib/ceph/osd/ceph-60/block
> >> > 2018-11-05 15:34:52.319356 7f5a15744e00  1 bdev(0x5651bd1b8fc0 /var/lib/ceph/osd/ceph-60/block) open size 10737418240 (0x280000000, 10GiB) block_size 4096 (4KiB) rotational
> >> > 2018-11-05 15:34:52.319415 7f5a15744e00  1 bluestore(/var/lib/ceph/osd/ceph-60) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
> >> > 2018-11-05 15:34:52.319491 7f5a15744e00  1 bdev create path /var/lib/ceph/osd/ceph-60/block.db type kernel
> >> > 2018-11-05 15:34:52.319499 7f5a15744e00  1 bdev(0x5651bd1b9200 /var/lib/ceph/osd/ceph-60/block.db) open path /var/lib/ceph/osd/ceph-60/block.db
> >> > 2018-11-05 15:34:52.319514 7f5a15744e00 -1 bdev(0x5651bd1b9200 /var/lib/ceph/osd/ceph-60/block.db) open open got: (13) Permission denied
> >> > 2018-11-05 15:34:52.319648 7f5a15744e00 -1 bluestore(/var/lib/ceph/osd/ceph-60) _open_db add block device(/var/lib/ceph/osd/ceph-60/block.db) returned: (13) Permission denied
> >> > 2018-11-05 15:34:52.319666 7f5a15744e00  1 bdev(0x5651bd1b8fc0 /var/lib/ceph/osd/ceph-60/block) close
> >> > 2018-11-05 15:34:52.598249 7f5a15744e00 -1 osd.60 0 OSD:init: unable to mount object store
> >> > 2018-11-05 15:34:52.598269 7f5a15744e00 -1  ** ERROR: osd init failed: (13) Permission denied
> >> >
> >> > 3. Finally, I literally copied and pasted the udev rule Hector wrote
> >> > out for me, then rebooted the server.
> >> >
> >> > 4. I tried restarting ceph-osd@60 -- this time it came right up!!!
> >> > I was able to start all the rest, including ceph-osd@67, which I
> >> > thought had not been activated by lvm.
> >> >
> >> > 5. I checked from the admin node and verified that osd.60-69 are all
> >> > in the cluster as Bluestore OSDs, and they indeed are.
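> >> >
> >> > (In case it helps anyone following along, a quick way to double-check
> >> > is something like "ceph osd metadata 60 | grep osd_objectstore", which
> >> > should report "bluestore" for a converted OSD.)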
> >> >
> >> > ********************
> >> > Thank you SO MUCH, both of you, for putting up with my novice
> >> > questions all the way.  I am planning to convert the rest of the
> >> > cluster the same way by reviewing this entire thread to trace what
> >> > steps need to be taken.
> >> >
> >> > Mami
> >> >
> >> > On Mon, Nov 5, 2018 at 3:00 PM, Hector Martin <hec...@marcansoft.com>
> >> > wrote:
> >> >>
> >> >>
> >> >>
> >> >> On 11/6/18 3:31 AM, Hayashida, Mami wrote:
> >> >> > 2018-11-05 12:47:01.075573 7f1f2775ae00 -1 bluestore(/var/lib/ceph/osd/ceph-60) _open_db add block device(/var/lib/ceph/osd/ceph-60/block.db) returned: (13) Permission denied
> >> >>
> >> >> Looks like the permissions on the block.db device are wrong. As far
> >> >> as I know, ceph-volume is responsible for setting this at activation
> >> >> time.
> >> >>
> >> >> > I already ran the "ceph-volume lvm activate --all" command right
> >> >> > after I prepared (using "lvm prepare") those OSDs.  Do I need to
> >> >> > run the "activate" command again?
> >> >>
> >> >> The activation is required on every boot to create the
> >> >> /var/lib/ceph/osd/* directory, but that should be automatically done
> >> >> by systemd units (since you didn't run it after the reboot and yet
> >> >> the directories exist, it seems to have worked).
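> >> >>
> >> >> (The units in question should be the ceph-volume@lvm-<id>-<osd uuid>
> >> >> instances; something like "systemctl list-units --all 'ceph-volume@*'"
> >> >> will show whether they ran.)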
> >> >>
> >> >> Can you ls -l the OSD directory (/var/lib/ceph/osd/ceph-60/) and also
> >> >> any devices symlinked to from there, to see the permissions?
> >> >>
> >> >> Then run the activate command again and list the permissions again
> >> >> to see if they have changed, and if they have, try to start the OSD
> >> >> again.
> >> >>
> >> >> I found one Ubuntu bug that suggests there may be a race condition:
> >> >>
> >> >> https://bugs.launchpad.net/bugs/1767087
> >> >>
> >> >> I get the feeling the ceph-osd activation may be happening before
> >> >> the block.db device is ready, so when it gets created by LVM it's
> >> >> already too late and doesn't have the right permissions. You could
> >> >> fix it with a udev rule (like Ubuntu did), but if this is indeed
> >> >> your issue then it sounds like something that should be fixed in
> >> >> Ceph. Perhaps all you need is a systemd unit override to make sure
> >> >> ceph-volume@* services only start after LVM is ready.
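> >> >>
> >> >> An untested sketch of such an override (unit names vary by distro,
> >> >> so check what your LVM activation units are actually called), e.g.
> >> >> dropped in as /etc/systemd/system/ceph-volume@.service.d/after-lvm.conf:
> >> >>
> >> >> [Unit]
> >> >> After=lvm2-activation.service lvm2-activation-early.service
> >> >>
> >> >> Run "systemctl daemon-reload" afterwards so systemd picks it up.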
> >> >>
> >> >> A usable udev rule could look like this (e.g. put it in
> >> >> /etc/udev/rules.d/90-lvm-permissions.rules):
> >> >>
> >> >> ACTION=="change", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \
> >> >> ENV{DM_LV_NAME}=="db*", ENV{DM_VG_NAME}=="ssd0", \
> >> >> OWNER="ceph", GROUP="ceph", MODE="660"
> >> >>
> >> >> Reboot after that and see if the OSDs come up without further action.
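> >> >>
> >> >> (To test without a full reboot, "udevadm control --reload-rules"
> >> >> followed by "udevadm trigger --subsystem-match=block" should also
> >> >> apply the rule, though a clean reboot is the better test.)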
> >> >>
> >> >> --
> >> >> Hector Martin (hec...@marcansoft.com)
> >> >> Public Key: https://mrcn.st/pub
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Mami Hayashida
> >> > Research Computing Associate
> >> >
> >> > Research Computing Infrastructure
> >> > University of Kentucky Information Technology Services
> >> > 301 Rose Street | 102 James F. Hardymon Building
> >> > Lexington, KY 40506-0495
> >> > mami.hayash...@uky.edu
> >> > (859)323-7521
> >
> >
> >
> >
> > --
> > Mami Hayashida
> > Research Computing Associate
> >
> > Research Computing Infrastructure
> > University of Kentucky Information Technology Services
> > 301 Rose Street | 102 James F. Hardymon Building
> > Lexington, KY 40506-0495
> > mami.hayash...@uky.edu
> > (859)323-7521
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521