On Thursday, 15 July 2021 14:49:45 CEST Gianluca Cecchi wrote:
> On Fri, Apr 23, 2021 at 7:15 PM Nir Soffer <nsof...@redhat.com> wrote:
> > >> > 1) Is this the expected behavior?
> > >> 
> > >> yes, before removing multipath devices, you need to unzone the LUN on
> > >> the storage server. As oVirt doesn't manage the storage server in the
> > >> case of iSCSI, this has to be done by the storage server admin, and
> > >> therefore oVirt cannot manage the whole flow.
> > > 
> > > Thank you for the information. Perhaps you can expand then on how the
> > > volumes are picked up once mapped from the storage system? Traditionally,
> > > when mapping storage from an iSCSI or Fibre Channel storage system, we
> > > have to initiate a LIP or an iSCSI login. How is it that oVirt doesn't
> > > need to do this?
> > >> > 2) Are we supposed to go to each KVM host and manually remove the
> > >> > underlying multipath devices?
> > >> 
> > >> oVirt provides an Ansible playbook for it:
> > >> 
> > >> https://github.com/oVirt/ovirt-ansible-collection/blob/master/examples/remove_mpath_device.yml
> > >> 
> > >> Usage is as follows:
> > >> 
> > >> ansible-playbook --extra-vars "lun=<LUN_ID>" remove_mpath_device.yml
> 
> I had to decommission an iSCSI-based storage domain after adding a new
> iSCSI one (with another portal) and moving all the objects (VM disks,
> template disks, ISO disks, leases) into it.
> The environment is based on 4.4.6, with 3 hosts and an external engine.
> So I tried the Ansible playbook approach to verify it.
> 
> The initial situation is below; the storage domain to decommission is
> ovsd3750, based on the 5 TB LUN.
> 
> $ sudo multipath -l
> 364817197c52f98316900666e8c2b0b2b dm-13 EQLOGIC,100E-00
> size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
> `-+- policy='round-robin 0' prio=0 status=active
>   |- 16:0:0:0 sde 8:64 active undef running
>   `- 17:0:0:0 sdf 8:80 active undef running
> 36090a0d800851c9d2195d5b837c9e328 dm-2 EQLOGIC,100E-00
> size=5.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
> `-+- policy='round-robin 0' prio=0 status=active
>   |- 13:0:0:0 sdb 8:16 active undef running
>   `- 14:0:0:0 sdc 8:32 active undef running
> 
> Connections use iSCSI multipathing (iscsi1 and iscsi2 in the web admin
> GUI), so I have two paths to each LUN:
> 
> $ sudo iscsiadm -m node
> 10.10.100.7:3260,1
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> 10.10.100.7:3260,1
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> 10.10.100.9:3260,1
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
> 10.10.100.9:3260,1
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
> 
> $ sudo iscsiadm -m session
> tcp: [1] 10.10.100.7:3260,1
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> (non-flash)
> tcp: [2] 10.10.100.7:3260,1
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> (non-flash)
> tcp: [4] 10.10.100.9:3260,1
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
> (non-flash)
> tcp: [5] 10.10.100.9:3260,1
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
> (non-flash)
> 
> One point not taken into consideration in the previously opened bugs, in my
> opinion, is the deletion of the iSCSI connections and nodes on the host side
> (probably to be done by the OS admin, but it could also be taken care of by
> the Ansible playbook...).
> The bugs I'm referring to are:
> Bug 1310330 - [RFE] Provide a way to remove stale LUNs from hypervisors
> Bug 1928041 - Stale DM links after block SD removal
> 
> Actions done:
> put storage domain into maintenance
> detach storage domain
> remove storage domain
> remove access from the EqualLogic admin GUI
> 
> I have a group named ovirt in the Ansible inventory, composed of my 3
> hosts: ov200, ov300 and ov301, and executed:
> $ ansible-playbook -b -l ovirt --extra-vars
> "lun=36090a0d800851c9d2195d5b837c9e328" remove_mpath_device.yml
> 
> It all went OK with ov200 and ov300, but for ov301 I got:
> 
> fatal: [ov301]: FAILED! => {"changed": true, "cmd": "multipath -f
> \"36090a0d800851c9d2195d5b837c9e328\"", "delta": "0:00:00.009003", "end":
> "2021-07-15 11:17:37.340584", "msg": "non-zero return code", "rc": 1,
> "start": "2021-07-15 11:17:37.331581", "stderr": "Jul 15 11:17:37 |
> 36090a0d800851c9d2195d5b837c9e328: map in use", "stderr_lines": ["Jul 15
> 11:17:37 | 36090a0d800851c9d2195d5b837c9e328: map in use"], "stdout": "",
> "stdout_lines": []}
> 
> The complete output:
> 
> $ ansible-playbook -b -l ovirt --extra-vars
> "lun=36090a0d800851c9d2195d5b837c9e328" remove_mpath_device.yml
> 
> PLAY [Cleanly remove unzoned storage devices (LUNs)] **************************
> 
> TASK [Gathering Facts] ********************************************************
> ok: [ov200]
> ok: [ov300]
> ok: [ov301]
> 
> TASK [Get underlying disks (paths) for a multipath device and turn them into a list.] ***
> changed: [ov300]
> changed: [ov200]
> changed: [ov301]
> 
> TASK [Remove from multipath device.] ******************************************
> changed: [ov200]
> changed: [ov300]
> fatal: [ov301]: FAILED! => {"changed": true, "cmd": "multipath -f
> \"36090a0d800851c9d2195d5b837c9e328\"", "delta": "0:00:00.009003", "end":
> "2021-07-15 11:17:37.340584", "msg": "non-zero return code", "rc": 1,
> "start": "2021-07-15 11:17:37.331581", "stderr": "Jul 15 11:17:37 |
> 36090a0d800851c9d2195d5b837c9e328: map in use", "stderr_lines": ["Jul 15
> 11:17:37 | 36090a0d800851c9d2195d5b837c9e328: map in use"], "stdout": "",
> "stdout_lines": []}
> 
> TASK [Remove each path from the SCSI subsystem.] ******************************
> changed: [ov300] => (item=sdc)
> changed: [ov300] => (item=sdb)
> changed: [ov200] => (item=sdc)
> changed: [ov200] => (item=sdb)
> 
> PLAY RECAP ********************************************************************
> ov200 : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
> ov300 : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
> ov301 : ok=2    changed=1    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
> 
> Indeed, going to the server, I get:
> 
> [root@ov301 ~]# multipath -f 36090a0d800851c9d2195d5b837c9e328
> Jul 15 11:24:37 | 36090a0d800851c9d2195d5b837c9e328: map in use
> [root@ov301 ~]#
> 
> The dm device under the multipath one is dm-2, and:
> [root@ov301 ~]# ll /dev/dm-2
> brw-rw----. 1 root disk 253, 2 Jul 15 11:28 /dev/dm-2
> [root@ov301 ~]#
> 
> [root@ov301 ~]# lsof | grep "253,2"
> 
> I get no line for minor 2 itself, only other devices whose minor number
> begins with 2 (e.g. 24, 25, 27...):
> . . .
> qemu-kvm    10638   10653 vnc_worke            qemu   84u      BLK
>     253,24       0t0  112027277 /dev/dm-24
> qemu-kvm    11479                              qemu   43u      BLK
>     253,27       0t0  112135384 /dev/dm-27
> qemu-kvm    11479                              qemu  110u      BLK
>     253,25       0t0  112140523 /dev/dm-25
> 
> So nothing for dm-2.
> 
> What can I do to cross-check what is using the device and preventing the
> "-f" from completing?

Can you try

    dmsetup info /dev/mapper/36090a0d800851c9d2195d5b837c9e328

and check the "Open count" field to see if there is still anything open?

Also, you can try

    fuser /dev/dm-2

to see which processes are using the device.
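
If the open count is non-zero but fuser shows nothing, it may also be worth
checking whether another device-mapper device (for example a stale LVM LV
that was never deactivated) is still stacked on top of the map; a quick way
to check, assuming the map is still dm-2:

    ls -l /sys/block/dm-2/holders/
    dmsetup ls --tree

Anything listed under holders/ would typically need to be deactivated or
removed before "multipath -f" can succeed.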


> Now I get
> 
> # multipath -l
> 364817197c52f98316900666e8c2b0b2b dm-14 EQLOGIC,100E-00
> size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
> `-+- policy='round-robin 0' prio=0 status=active
>   |- 16:0:0:0 sde 8:64 active undef running
>   `- 17:0:0:0 sdf 8:80 active undef running
> 36090a0d800851c9d2195d5b837c9e328 dm-2 ##,##
> size=5.0T features='0' hwhandler='0' wp=rw
> 
> Another thing that could perhaps be improved in the Ansible playbook:
> usually, when I remove FC or iSCSI LUNs under multipath on a Linux system,
> after the "multipath -f" command and before the "echo 1 > ...
> /device/delete" one, I also run, for safety:
> 
> blockdev --flushbufs /dev/$i
> 
> where $i loops over the devices composing the multipath map.
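> 
> Something like this rough sketch of the manual sequence (the WWID here is
> the one from this example, and the way the underlying paths are collected
> is only approximate):
> 
> WWID=36090a0d800851c9d2195d5b837c9e328
> # collect the underlying SCSI devices while the map still exists
> PATHS=$(multipath -ll "$WWID" | grep -o 'sd[a-z]*' | sort -u)
> # flush and remove the multipath map
> multipath -f "$WWID"
> # flush buffers and delete each underlying path from the SCSI subsystem
> for i in $PATHS; do
>     blockdev --flushbufs "/dev/$i"
>     echo 1 > "/sys/block/$i/device/delete"
> done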
> 
> I see that in the web admin GUI, under Datacenter --> iSCSI Multipath
> (iscsi1 and iscsi2), there is no longer any connection to the removed SD.
> But on the host side nothing changed from the iSCSI point of view.
> So I executed:
> 
> log out from the sessions:
> [root@ov300 ~]# iscsiadm -m session -r 1 -u
> Logging out of session [sid: 1, target:
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750,
> portal: 10.10.100.7,3260]
> Logout of [sid: 1, target:
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750,
> portal: 10.10.100.7,3260] successful.
> [root@ov300 ~]# iscsiadm -m session -r 2 -u
> Logging out of session [sid: 2, target:
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750,
> portal: 10.10.100.7,3260]
> Logout of [sid: 2, target:
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750,
> portal: 10.10.100.7,3260] successful.
> [root@ov300 ~]#
> 
> and then removal of the node
> [root@ov300 ~]# iscsiadm -m node -T
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750 -o
> delete
> [root@ov300 ~]# ll /var/lib/iscsi/nodes/
> total 4
> drw-------. 3 root root 4096 Jul 13 11:18
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
> [root@ov300 ~]#
> 
> while previously I had:
> [root@ov300 ~]# ll /var/lib/iscsi/nodes/
> total 8
> drw-------. 3 root root 4096 Jan 12  2021
> iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> drw-------. 3 root root 4096 Jul 13 11:18
> iqn.2001-05.com.equallogic:4-771816-31982fc59-2b0b2b8c6e660069-ovsd3920
> [root@ov300 ~]#
> 
> Otherwise I think that at reboot the host would try to reconnect to the
> no-longer-existing portal...
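> 
> If this cleanup were added to the playbook, it could perhaps be something
> as simple as the following sketch, run on every host, using the IQN of the
> decommissioned target from above:
> 
> TARGET=iqn.2001-05.com.equallogic:0-8a0906-9d1c8500d-28e3c937b8d59521-ovsd3750
> # log out of all remaining sessions to the old target
> iscsiadm -m node -T "$TARGET" -u
> # remove its node records so the host does not reconnect at boot
> iscsiadm -m node -T "$TARGET" -o delete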
> 
> Comments welcome
> 
> Gianluca
