On Sat, Feb 24, 2018 at 1:26 PM, Oliver Freyermuth
<[email protected]> wrote:
> Dear Cephalopodians,
>
> when purging a single OSD on a host (created via ceph-deploy 2.0, i.e. using 
> ceph-volume lvm), I currently proceed as follows:
>
> On the OSD-host:
> $ systemctl stop [email protected]
> $ ls -la /var/lib/ceph/osd/ceph-4
> # Check block and block.db links:
> lrwxrwxrwx.  1 ceph ceph   93 23. Feb 01:28 block -> /dev/ceph-69b1fbe5-f084-4410-a99a-ab57417e7846/osd-block-cd273506-e805-40ac-b23d-c7b9ff45d874
> lrwxrwxrwx.  1 root root   43 23. Feb 01:28 block.db -> /dev/ceph-osd-blockdb-ssd-1/db-for-disk-sda
> # resolve actual underlying device:
> $ pvs | grep ceph-69b1fbe5-f084-4410-a99a-ab57417e7846
>   /dev/sda   ceph-69b1fbe5-f084-4410-a99a-ab57417e7846 lvm2 a--    <3,64t     0
> # Zap the device:
> $ ceph-volume lvm zap --destroy /dev/sda
>
> Now, on the mon:
> # purge the OSD:
> $ ceph osd purge osd.4 --yes-i-really-mean-it
>
> Then I re-deploy using:
> $ ceph-deploy --overwrite-conf osd create --bluestore --block-db ceph-osd-blockdb-ssd-1/db-for-disk-sda --data /dev/sda osd001
>
> from the admin-machine.
>
> This works just fine; however, it leaves a stray ceph-volume service behind:
> $ ls -la /etc/systemd/system/multi-user.target.wants/ -1 | grep ceph-volume@lvm-4
> lrwxrwxrwx.  1 root root   44 24. Feb 18:30 
> [email protected] -> 
> /usr/lib/systemd/system/[email protected]
> lrwxrwxrwx.  1 root root   44 23. Feb 01:28 
> [email protected] -> 
> /usr/lib/systemd/system/[email protected]
>
> This stray service then, after reboot of the machine, stays in activating 
> state (since the disk will of course never come back):
> -----------------------------------
> $ systemctl status 
> [email protected]
> ● [email protected] - Ceph 
> Volume activation: lvm-4-cd273506-e805-40ac-b23d-c7b9ff45d874
>    Loaded: loaded (/usr/lib/systemd/system/[email protected]; enabled; 
> vendor preset: disabled)
>    Active: activating (start) since Sa 2018-02-24 19:21:47 CET; 1min 12s ago
>  Main PID: 1866 (timeout)
>    CGroup: 
> /system.slice/system-ceph\x2dvolume.slice/[email protected]
>            ├─1866 timeout 10000 /usr/sbin/ceph-volume-systemd 
> lvm-4-cd273506-e805-40ac-b23d-c7b9ff45d874
>            └─1872 /usr/bin/python2.7 /usr/sbin/ceph-volume-systemd 
> lvm-4-cd273506-e805-40ac-b23d-c7b9ff45d874
>
> Feb 24 19:21:47 osd001.baf.physik.uni-bonn.de systemd[1]: Starting Ceph 
> Volume activation: lvm-4-cd273506-e805-40ac-b23d-c7b9ff45d874...
> -----------------------------------
> Manually, I can fix this by running:
> $ systemctl disable 
> [email protected]
>
> My question is: Should I really remove that manually?
> Should "ceph-volume lvm zap --destroy" have taken care of it (bug)?

You should remove it manually. The problem with zapping is that we
might not have the information we need to remove the systemd unit.
Since an OSD can be made out of different devices, ceph-volume might
be asked to "zap" a device without being able to work out which OSD it
belongs to. The systemd units are tied to the ID and UUID of the OSD.
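
Because the units are keyed on ID and UUID, the manual cleanup can be
scripted. What follows is only a rough sketch, not something ceph-volume
ships: the function name stray_ceph_volume_units is made up, and it
assumes the enabled units are named ceph-volume@lvm-<id>-<uuid>.service,
as the "lvm-4-cd273506-..." activation string in your status output
suggests. It only prints the "systemctl disable" commands, so you can
review them before running anything. The UUID to keep is the OSD fsid of
the re-deployed OSD, which "ceph-volume lvm list" reports.

```shell
#!/usr/bin/env bash
# Sketch: print "systemctl disable" commands for the ceph-volume units of
# one OSD ID whose UUID is not the one that should stay enabled.
# Assumption: units are named ceph-volume@lvm-<id>-<uuid>.service.
# stray_ceph_volume_units is a hypothetical helper, not part of Ceph.

stray_ceph_volume_units() {
    local osd_id="$1"      # numeric OSD ID, e.g. 4
    local keep_uuid="$2"   # OSD fsid of the OSD that currently exists
    local wants_dir="${3:-/etc/systemd/system/multi-user.target.wants}"
    local link unit uuid

    # The "-" after the ID keeps OSD 4 from also matching OSD 40, 41, ...
    for link in "$wants_dir"/ceph-volume@lvm-"$osd_id"-*.service; do
        # Skip the literal pattern when the glob matched nothing.
        [ -e "$link" ] || [ -L "$link" ] || continue

        unit="${link##*/}"                        # strip the directory
        uuid="${unit#ceph-volume@lvm-"$osd_id"-}" # strip the prefix
        uuid="${uuid%.service}"                   # strip the suffix

        if [ "$uuid" != "$keep_uuid" ]; then
            echo "systemctl disable $unit"
        fi
    done
}
```

For your case that would be called as
"stray_ceph_volume_units 4 <fsid of the new osd.4>", and the printed
command should match the disable you ran by hand.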



> Am I missing a step?
>
> Cheers,
>         Oliver
>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
