[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-29 Thread Dimitri John Ledkov
** Package changed: systemd (Ubuntu) => ceph (Ubuntu)

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1828617

Title:
  Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

Status in ceph package in Ubuntu:
  New

Bug description:
  Ubuntu 18.04.2 Ceph deployment.

  Ceph OSD devices use LVM volumes that point to udev-based physical
  devices.
  The LVM layer is supposed to create PVs from the devices using the links in
  the /dev/disk/by-dname/ folder, which are created by udev.
  However, on reboot it sometimes happens (not always; it looks like a race
  condition) that the Ceph services cannot start and pvdisplay does not show
  any volumes. The /dev/disk/by-dname/ folder nevertheless has all the
  necessary devices created by the end of the boot process.

  The behaviour can be fixed manually by running "/sbin/lvm pvscan
  --cache --activate ay /dev/nvme0n1" to re-activate the LVM
  components, after which the services can be started.
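
  A minimal sketch of that workaround, assuming a single affected data device
  and the usual systemd units for the OSDs (the device path and unit name here
  are illustrative, not taken from the affected hosts):

  # Re-scan the device and activate any VGs/LVs found on it.
  /sbin/lvm pvscan --cache --activate ay /dev/nvme0n1
  # Once the LVs are back, start the Ceph OSD services again.
  systemctl start ceph-osd.target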

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1828617/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-29 Thread Corey Bryant
Thanks for all the details.

I need to confirm this but I think the block.db and block.wal symlinks
are created as a result of 'ceph-volume lvm prepare --bluestore
--data <data device> --block.wal <wal device> --block.db <db device>'.

That's coded in the ceph-osd charm around here:
https://opendev.org/openstack/charm-ceph-osd/src/branch/master/lib/ceph/utils.py#L1558

Can you confirm that the symlinks are ok prior to reboot? I'd like to
figure out if they are correctly set up by the charm initially.
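
A couple of hedged checks that should show whether the symlinks and their
ownership are sane before a reboot (osd.4 and its path are only an example,
taken from the listings elsewhere in this thread):

# How ceph-volume believes each OSD is laid out (block/db/wal devices).
ceph-volume lvm list
# The symlinks and their ownership for a single OSD.
ls -l /var/lib/ceph/osd/ceph-4/block*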



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread Xav Paice
udevadm info -e >/tmp/1828617-2.out

~# ls -l /var/lib/ceph/osd/ceph*
-rw------- 1 ceph ceph  69 May 21 08:44 /var/lib/ceph/osd/ceph.client.osd-upgrade.keyring

/var/lib/ceph/osd/ceph-11:
total 24
lrwxrwxrwx 1 ceph ceph 93 May 28 22:12 block -> /dev/ceph-33de740d-bd8c-4b47-a601-3e6e634e489a/osd-block-33de740d-bd8c-4b47-a601-3e6e634e489a
-rw------- 1 ceph ceph 37 May 28 22:12 ceph_fsid
-rw------- 1 ceph ceph 37 May 28 22:12 fsid
-rw------- 1 ceph ceph 56 May 28 22:12 keyring
-rw------- 1 ceph ceph  6 May 28 22:12 ready
-rw------- 1 ceph ceph 10 May 28 22:12 type
-rw------- 1 ceph ceph  3 May 28 22:12 whoami

/var/lib/ceph/osd/ceph-18:
total 24
lrwxrwxrwx 1 ceph ceph 93 May 28 22:12 block -> /dev/ceph-eb5270dc-1110-420f-947e-aab7fae299c9/osd-block-eb5270dc-1110-420f-947e-aab7fae299c9
lrwxrwxrwx 1 ceph ceph 94 May 28 22:12 block.db -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-eb5270dc-1110-420f-947e-aab7fae299c9
lrwxrwxrwx 1 ceph ceph 95 May 28 22:12 block.wal -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-eb5270dc-1110-420f-947e-aab7fae299c9
-rw------- 1 ceph ceph 37 May 28 22:12 ceph_fsid
-rw------- 1 ceph ceph 37 May 28 22:12 fsid
-rw------- 1 ceph ceph 56 May 28 22:12 keyring
-rw------- 1 ceph ceph  6 May 28 22:12 ready
-rw------- 1 ceph ceph 10 May 28 22:12 type
-rw------- 1 ceph ceph  3 May 28 22:12 whoami

/var/lib/ceph/osd/ceph-24:
total 24
lrwxrwxrwx 1 ceph ceph 93 May 28 22:12 block -> /dev/ceph-d38a7e91-cf06-4607-abbe-53eac89ac5ea/osd-block-d38a7e91-cf06-4607-abbe-53eac89ac5ea
-rw------- 1 ceph ceph 37 May 28 22:12 ceph_fsid
-rw------- 1 ceph ceph 37 May 28 22:12 fsid
-rw------- 1 ceph ceph 56 May 28 22:12 keyring
-rw------- 1 ceph ceph  6 May 28 22:12 ready
-rw------- 1 ceph ceph 10 May 28 22:12 type
-rw------- 1 ceph ceph  3 May 28 22:12 whoami

/var/lib/ceph/osd/ceph-31:
total 24
lrwxrwxrwx 1 ceph ceph 93 May 28 22:12 block -> /dev/ceph-053e000a-76ed-427e-98b3-e5373e263f2d/osd-block-053e000a-76ed-427e-98b3-e5373e263f2d
lrwxrwxrwx 1 ceph ceph 94 May 28 22:12 block.db -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-053e000a-76ed-427e-98b3-e5373e263f2d
lrwxrwxrwx 1 ceph ceph 95 May 28 22:12 block.wal -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-053e000a-76ed-427e-98b3-e5373e263f2d
-rw------- 1 ceph ceph 37 May 28 22:12 ceph_fsid
-rw------- 1 ceph ceph 37 May 28 22:12 fsid
-rw------- 1 ceph ceph 56 May 28 22:12 keyring
-rw------- 1 ceph ceph  6 May 28 22:12 ready
-rw------- 1 ceph ceph 10 May 28 22:12 type
-rw------- 1 ceph ceph  3 May 28 22:12 whoami

/var/lib/ceph/osd/ceph-38:
total 24
lrwxrwxrwx 1 ceph ceph 93 May 28 22:12 block -> /dev/ceph-c2669da2-63aa-42e2-b049-cf00a478e076/osd-block-c2669da2-63aa-42e2-b049-cf00a478e076
lrwxrwxrwx 1 ceph ceph 94 May 28 22:12 block.db -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-c2669da2-63aa-42e2-b049-cf00a478e076
lrwxrwxrwx 1 ceph ceph 95 May 28 22:12 block.wal -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-c2669da2-63aa-42e2-b049-cf00a478e076
-rw------- 1 ceph ceph 37 May 28 22:12 ceph_fsid
-rw------- 1 ceph ceph 37 May 28 22:12 fsid
-rw------- 1 ceph ceph 56 May 28 22:12 keyring
-rw------- 1 ceph ceph  6 May 28 22:12 ready
-rw------- 1 ceph ceph 10 May 28 22:12 type
-rw------- 1 ceph ceph  3 May 28 22:12 whoami

/var/lib/ceph/osd/ceph-4:
total 24
lrwxrwxrwx 1 ceph ceph 93 May 28 22:12 block -> /dev/ceph-7478edfc-f321-40a2-a105-8e8a2c8ca3f6/osd-block-7478edfc-f321-40a2-a105-8e8a2c8ca3f6
lrwxrwxrwx 1 ceph ceph 94 May 28 22:12 block.db -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-7478edfc-f321-40a2-a105-8e8a2c8ca3f6
lrwxrwxrwx 1 ceph ceph 95 May 28 22:12 block.wal -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-7478edfc-f321-40a2-a105-8e8a2c8ca3f6
-rw------- 1 ceph ceph 37 May 28 22:12 ceph_fsid
-rw------- 1 ceph ceph 37 May 28 22:12 fsid
-rw------- 1 ceph ceph 55 May 28 22:12 keyring
-rw------- 1 ceph ceph  6 May 28 22:12 ready
-rw------- 1 ceph ceph 10 May 28 22:12 type
-rw------- 1 ceph ceph  2 May 28 22:12 whoami

/var/lib/ceph/osd/ceph-45:
total 24
lrwxrwxrwx 1 ceph ceph 93 May 28 22:12 block -> /dev/ceph-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e/osd-block-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e
lrwxrwxrwx 1 ceph ceph 94 May 28 22:12 block.db -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e
lrwxrwxrwx 1 ceph ceph 95 May 28 22:12 block.wal -> /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e
-rw------- 1 ceph ceph 37 May 28 22:12 ceph_fsid
-rw------- 1 ceph ceph 37 May 28 22:12 fsid
-rw------- 1 ceph ceph 56 May 28 22:12 keyring
-rw------- 1 ceph ceph  6 May 28 22:12 ready
-rw------- 1 ceph ceph 10 May 28 22:12 type
-rw------- 1 ceph ceph  3 May 28 22:12 whoami


** Attachment added: "1828617-2.out"
   
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1828617/+attachment/5267247/+files/1828617-2.out


[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread Xav Paice
journalctl --no-pager -lu systemd-udevd.service >/tmp/1828617-1.out

Hostname obfuscated

lsblk:

NAME                                                  MAJ:MIN  RM   SIZE RO TYPE  MOUNTPOINT
loop0                                                   7:0     0  88.4M  1 loop  /snap/core/6964
loop1                                                   7:1     0  89.4M  1 loop  /snap/core/6818
loop2                                                   7:2     0   8.4M  1 loop  /snap/canonical-livepatch/77
sda                                                     8:0     0   1.8T  0 disk
├─sda1                                                  8:1     0   476M  0 part  /boot/efi
├─sda2                                                  8:2     0   3.7G  0 part  /boot
└─sda3                                                  8:3     0   1.7T  0 part
  └─bcache7                                           252:896   0   1.7T  0 disk  /
sdb                                                     8:16    0   1.8T  0 disk
└─bcache0                                             252:0     0   1.8T  0 disk
sdc                                                     8:32    0   1.8T  0 disk
└─bcache6                                             252:768   0   1.8T  0 disk
  └─crypt-7478edfc-f321-40a2-a105-8e8a2c8ca3f6        253:0     0   1.8T  0 crypt
    └─ceph--7478edfc--f321--40a2--a105--8e8a2c8ca3f6-osd--block--7478edfc--f321--40a2--a105--8e8a2c8ca3f6  253:2  0  1.8T  0 lvm
sdd                                                     8:48    0   1.8T  0 disk
└─bcache4                                             252:512   0   1.8T  0 disk
  └─crypt-33de740d-bd8c-4b47-a601-3e6e634e489a        253:4     0   1.8T  0 crypt
    └─ceph--33de740d--bd8c--4b47--a601--3e6e634e489a-osd--block--33de740d--bd8c--4b47--a601--3e6e634e489a  253:5  0  1.8T  0 lvm
sde                                                     8:64    0   1.8T  0 disk
└─bcache3                                             252:384   0   1.8T  0 disk
  └─crypt-eb5270dc-1110-420f-947e-aab7fae299c9        253:1     0   1.8T  0 crypt
    └─ceph--eb5270dc--1110--420f--947e--aab7fae299c9-osd--block--eb5270dc--1110--420f--947e--aab7fae299c9  253:3  0  1.8T  0 lvm
sdf                                                     8:80    0   1.8T  0 disk
└─bcache1                                             252:128   0   1.8T  0 disk
  └─crypt-d38a7e91-cf06-4607-abbe-53eac89ac5ea        253:6     0   1.8T  0 crypt
    └─ceph--d38a7e91--cf06--4607--abbe--53eac89ac5ea-osd--block--d38a7e91--cf06--4607--abbe--53eac89ac5ea  253:7  0  1.8T  0 lvm
sdg                                                     8:96    0   1.8T  0 disk
└─bcache5                                             252:640   0   1.8T  0 disk
  └─crypt-053e000a-76ed-427e-98b3-e5373e263f2d        253:8     0   1.8T  0 crypt
    └─ceph--053e000a--76ed--427e--98b3--e5373e263f2d-osd--block--053e000a--76ed--427e--98b3--e5373e263f2d  253:9  0  1.8T  0 lvm
sdh                                                     8:112   0   1.8T  0 disk
└─bcache8                                             252:1024  0   1.8T  0 disk
  └─crypt-c2669da2-63aa-42e2-b049-cf00a478e076        253:25    0   1.8T  0 crypt


[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread Corey Bryant
Andrey, I don't know if you saw James' comment, as yours may have
coincided with it, but if you can get the ceph-osd package version that
would be helpful. Thanks!



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread Andrey Grebennikov
Yes, it is the latest - the cluster is being re-deployed as part of a
Bootstack handover.

Corey,
The bug you point to fixes the ordering of ceph/udev. Here, however, udev
can't create the devices, because they don't seem to exist at the time udev
runs - once the host has booted and settled down, there are no PVs at all.



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread James Page
Please can you confirm which version of the ceph-osd package you have
installed; older versions rely on a charm-shipped udev ruleset rather
than one provided by the packaging.
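
For example, something along these lines should answer both questions
(package names as in a stock Ubuntu deployment):

# Installed ceph-osd version.
dpkg-query -W -f='${Version}\n' ceph-osd
# Does the package itself ship udev rules, or only the charm?
dpkg -L ceph-osd | grep -i udev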



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread Corey Bryant
This feels similar to https://bugs.launchpad.net/charm-ceph-osd/+bug/1812925.
First question: are you running with the latest stable charms, which have
the fix for that bug?



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread James Page
The ceph-osd package provides udev rules which should switch the owner
of all Ceph-related LVM VGs to ceph:ceph.

# OSD LVM layout example
# VG prefix: ceph-
# LV prefix: osd-
ACTION=="add", SUBSYSTEM=="block", \
  ENV{DEVTYPE}=="disk", \
  ENV{DM_LV_NAME}=="osd-*", \
  ENV{DM_VG_NAME}=="ceph-*", \
  OWNER:="ceph", GROUP:="ceph", MODE:="660"
ACTION=="change", SUBSYSTEM=="block", \
  ENV{DEVTYPE}=="disk", \
  ENV{DM_LV_NAME}=="osd-*", \
  ENV{DM_VG_NAME}=="ceph-*", \
  OWNER="ceph", GROUP="ceph", MODE="660"



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread James Page
The by-dname udev rules are created by MAAS/curtin as part of the server
install, I think.



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread Andrey Grebennikov
Steve,
It is MAAS that creates these udev rules. We requested this feature so that
we could use persistent names in subsequent service configuration (via
templating). We couldn't use /dev/sdX names, as they may change after a
reboot, and we can't use WWN names, as they are unique per node and don't
allow us to use templates with FCB.
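
For reference, the generated rules and the links they produce can be
inspected like this (curtin normally writes the rules under
/etc/udev/rules.d/, but the exact file names vary per deployment):

grep -r "by-dname" /etc/udev/rules.d/
ls -l /dev/disk/by-dname/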



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-28 Thread Steve Langasek
> LVM module is supposed to create PVs from devices using the links in 
> /dev/disk/by-dname/
> folder that are created by udev.

Created by udev how?  disk/by-dname is not part of the hierarchy that is
populated by the standard udev rules, nor is this created by lvm2.  Is
there something in the ceph-osd packaging specifically which generates
these links - and, in turn, depends on them for assembling LVs?

Can you provide udev logs (journalctl --no-pager -lu systemd-
udevd.service; udevadm info -e) from the system following a boot when
this race is hit?
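
For example, captured straight after an affected boot (the output file names
are only a suggestion):

journalctl --no-pager -lu systemd-udevd.service > /tmp/1828617-udevd.log
udevadm info -e > /tmp/1828617-udev-db.txt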

** Changed in: systemd (Ubuntu)
   Status: Confirmed => Incomplete



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-22 Thread Xav Paice
Just one update: if I change the ownership of the symlink that was made
(with chown -h), the OSD will actually start.

After rebooting, however, I found that the links I had made were gone
again, and the whole process needed repeating in order to start the OSD.
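
Putting the two workarounds together, the manual recovery for one OSD looked
roughly like this (the OSD id and LV paths are the ones from the ceph-4
example elsewhere in this thread; they will differ per host):

cd /var/lib/ceph/osd/ceph-4
# Recreate the missing symlinks.
ln -s /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 block.db
ln -s /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 block.wal
# chown the links themselves (-h), not their targets.
chown -h ceph:ceph block.db block.wal
systemctl start ceph-osd@4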



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-22 Thread Xav Paice
I'm seeing this in a slightly different manner, on Bionic/Queens.

We have the LVs encrypted (thanks, Vault), and rebooting a host fairly
consistently results in at least one OSD not returning.  The LVs appear in
the list; the difference between a working and a non-working OSD is that the
non-working OSD is missing the links to block.db and block.wal.

See https://pastebin.canonical.com/p/rW3VgMMkmY/ for some info.

If I made the links manually:

cd /var/lib/ceph/osd/ceph-4
ln -s /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-db-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 block.db
ln -s /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/osd-wal-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 block.wal

This resulted in a permissions error when accessing the device:
"bluestore(/var/lib/ceph/osd/ceph-4) _open_db
/var/lib/ceph/osd/ceph-4/block.db symlink exists but target unusable:
(13) Permission denied"

ls -l /dev/ceph-wal-4de27554-2d05-440e-874a-9921dfc6f47e/
total 0
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-053e000a-76ed-427e-98b3-e5373e263f2d -> ../dm-20
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e -> ../dm-24
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-33de740d-bd8c-4b47-a601-3e6e634e489a -> ../dm-14
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-db-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 -> ../dm-12
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-db-c2669da2-63aa-42e2-b049-cf00a478e076 -> ../dm-22
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-db-d38a7e91-cf06-4607-abbe-53eac89ac5ea -> ../dm-18
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-db-eb5270dc-1110-420f-947e-aab7fae299c9 -> ../dm-16
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-053e000a-76ed-427e-98b3-e5373e263f2d -> ../dm-19
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-12e68fcb-d2b6-459f-97f2-d3eb4e28c75e -> ../dm-23
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-33de740d-bd8c-4b47-a601-3e6e634e489a -> ../dm-13
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-wal-7478edfc-f321-40a2-a105-8e8a2c8ca3f6 -> ../dm-11
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-wal-c2669da2-63aa-42e2-b049-cf00a478e076 -> ../dm-21
lrwxrwxrwx 1 root root 8 May 22 23:04 osd-wal-d38a7e91-cf06-4607-abbe-53eac89ac5ea -> ../dm-17
lrwxrwxrwx 1 ceph ceph 8 May 22 23:04 osd-wal-eb5270dc-1110-420f-947e-aab7fae299c9 -> ../dm-15

I tried changing the ownership to ceph:ceph, but that made no difference.

I have also tried adding the following to lvm2 (using `systemctl edit
lvm2-monitor.service`), but that hasn't changed the behaviour either:

# cat /etc/systemd/system/lvm2-monitor.service.d/override.conf 
[Service]
ExecStartPre=/bin/sleep 60
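
Since the error complains about the symlink target rather than the link
itself, it may also be worth checking the ownership of the underlying dm
nodes; a hedged sketch (the chown at the end is a stop-gap only, until the
udev/LVM ordering is sorted out, and the device names are illustrative):

# Resolve each block.db/block.wal link to its /dev/dm-N node and show the owner.
for l in /var/lib/ceph/osd/ceph-*/block.db /var/lib/ceph/osd/ceph-*/block.wal; do
    [ -e "$l" ] && ls -l "$(readlink -f "$l")"
done
# Example stop-gap:
# chown ceph:ceph /dev/dm-11 /dev/dm-12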



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-22 Thread Xav Paice
** Tags added: canonical-bootstack



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-13 Thread David A. Desrosiers
This manifests itself as follows, as reported by lsblk(1). Note the
missing Ceph LVM volume on the sixth NVMe disk:

$ cat sos_commands/block/lsblk
NAME                                                 MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                                    8:0    0  1.8T  0 disk
|-sda1                                                 8:1    0  512M  0 part /boot/efi
`-sda2                                                 8:2    0  1.8T  0 part
  |-foobar--vg-root                                  253:0    0  1.8T  0 lvm  /
  `-foobar--vg-swap_1                                253:1    0  976M  0 lvm  [SWAP]
nvme0n1                                              259:0    0  1.8T  0 disk
`-ceph--c576f63e--dfd4--48f7--9d60--6a7708cbccf6-osd--block--9fdd78b2--0745--47ae--b8d4--04d9803ab448  253:6  0  1.8T  0 lvm
nvme1n1                                              259:1    0  1.8T  0 disk
`-ceph--6eb6565f--6392--44a8--9213--833b09f7c0bc-osd--block--a7d3629c--724f--4218--9d15--593ec64781da  253:5  0  1.8T  0 lvm
nvme2n1                                              259:2    0  1.8T  0 disk
`-ceph--c14f9ee5--90d0--4306--9b18--99576516f76a-osd--block--bbf5bc79--edea--4e43--8414--b5140b409397  253:4  0  1.8T  0 lvm
nvme3n1                                              259:3    0  1.8T  0 disk
`-ceph--a821146b--7674--4bcc--b5e9--0126c4bd5e3b-osd--block--b9371499--ff99--4d3e--ab3f--62ec3cf918c4  253:3  0  1.8T  0 lvm
nvme4n1                                              259:4    0  1.8T  0 disk
`-ceph--2e39f75a--5d2a--49ee--beb1--5d0a2991fd6c-osd--block--a1be083e--1fa7--4397--acfa--2ff3d3491572  253:2  0  1.8T  0 lvm
nvme5n1                                              259:5    0  1.8T  0 disk



[Touch-packages] [Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration

2019-05-13 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: systemd (Ubuntu)
   Status: New => Confirmed
