On Sun, Mar 5, 2017 at 2:40 PM, Pavel Gashev <p...@acronis.com> wrote:
> Please also consider the case where a single iSCSI target has several LUNs and
> you remove only one of them.
> In that case you should not log out.

Right. ovirt manages connections to storage. When you remove the last
usage of a connection, we should disconnect from the target.

If this is not happening, it is an ovirt-engine bug.
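
If you want to check this on a host, a rough sketch (the IQN below is just a
placeholder):

    # list active iSCSI sessions - the removed target should not appear anymore
    iscsiadm -m session

    # if a stale session is left and no other LUN uses that target,
    # log out and delete the node record
    iscsiadm -m node -T iqn.XXXXXXXX --logout
    iscsiadm -m node -T iqn.XXXXXXXX -o delete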

>
> -----Original Message-----
> From: <users-boun...@ovirt.org> on behalf of Nelson Lameiras 
> <nelson.lamei...@lyra-network.com>
> Date: Friday, 3 March 2017 at 20:55
> To: Nir Soffer <nsof...@redhat.com>
> Cc: users <users@ovirt.org>
> Subject: Re: [ovirt-users] best way to remove SAN lun
>
> Hello Nir,
>
> I think I was not clear in my explanations, so let me try again:
>
> We have an oVirt 4.0.5.5 cluster with multiple hosts (CentOS 7.2).
> In this cluster, we added a SAN volume (iSCSI) a few months ago, directly in
> the GUI.
> Later we had to remove a DATA volume (SAN iSCSI). Below are the steps we
> took:
>
> 1- we migrated all disks off the volume (oVirt)
> 2- we put the volume into maintenance (oVirt)
> 3- we detached the volume (oVirt)
> 4- we removed/destroyed the volume (oVirt)
>
> On the SAN:
> 5- we took the volume offline on the SAN
> 6- we deleted it from the SAN
>
> We thought this would be enough, but later we had a serious incident when the
> log partition went full (partially our fault):
> /var/log/messages was continuously logging that the host was still trying to
> reach the removed SAN volumes (we have since taken care of the log space
> issue => more aggressive logrotate, etc.)
>
> The real solution was to add two more steps, using a shell on ALL hosts:
> 4a - log out from the SAN: iscsiadm -m node --logout -T iqn.XXXXXXXX
> 4b - remove the iSCSI targets: rm -fr /var/lib/iscsi/nodes/iqn.XXXXXXXXX
>
> This effectively solved our problem, but it was tedious since we had to do it
> manually on all hosts (imagine if we had hundreds of hosts...) - a sketch of
> scripting this is just below.
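>
> To avoid repeating this by hand on every host, a rough sketch of what could be
> scripted (the host names are placeholders, and this assumes passwordless ssh
> from an admin machine to the hosts):
>
>     TARGET=iqn.XXXXXXXX
>     for host in host01 host02 host03; do
>         ssh root@"$host" "iscsiadm -m node --logout -T $TARGET && rm -fr /var/lib/iscsi/nodes/$TARGET"
>     done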
>
> So my question was: shouldn't it be oVirt's job to "logout" and "remove the
> iscsi targets" automatically when a volume is removed from oVirt? Maybe not,
> and I'm missing something?
>
> cordialement, regards,
>
> Nelson LAMEIRAS
> Ingénieur Systèmes et Réseaux / Systems and Networks engineer
> Tel: +33 5 32 09 09 70
> nelson.lamei...@lyra-network.com
>
> www.lyra-network.com | www.payzen.eu
>
>
>
>
>
> Lyra Network, 109 rue de l'innovation, 31670 Labège, FRANCE
>
> ----- Original Message -----
> From: "Nir Soffer" <nsof...@redhat.com>
> To: "Nelson Lameiras" <nelson.lamei...@lyra-network.com>
> Cc: "Gianluca Cecchi" <gianluca.cec...@gmail.com>, "Adam Litke" 
> <ali...@redhat.com>, "users" <users@ovirt.org>
> Sent: Wednesday, February 22, 2017 8:27:26 AM
> Subject: Re: [ovirt-users] best way to remove SAN lun
>
> On Wed, Feb 22, 2017 at 9:03 AM, Nelson Lameiras
> <nelson.lamei...@lyra-network.com> wrote:
>> Hello,
>>
>> Not sure it is the same issue, but we had a "major" issue recently in
>> our production system when removing an iSCSI volume from oVirt and then
>> removing it from the SAN.
>
> What version? OS version?
>
> The order must be:
>
> 1. remove the LUN from the storage domain
>     this will be available in the next 4.1 release; in older versions you
>     have to remove the whole storage domain
>
> 2. unzone the LUN on the storage server
>
> 3. remove the multipath devices and the paths on the nodes
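>
> For step 3, roughly what that looks like on each node (the names below are
> placeholders - the concrete commands are shown further down in the quoted
> messages):
>
>     # remove stale LV mappings left over from the removed storage domain
>     dmsetup status        # find the mappings that belong to the removed domain
>     dmsetup remove <device-name>
>
>     # then flush the multipath map and delete its paths
>     multipath -f <wwid-of-the-removed-lun>
>     echo 1 > /sys/block/<sdX>/device/delete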
>
>> The issue being that each host was still regularly trying to access the
>> SAN volume in spite of it not being completely removed from oVirt.
>
> What do you mean by "not being completely removed"?
>
> Who was accessing the volume?
>
>> This led to a massive increase in error logs, which completely filled the
>> /var/log partition,
>
> Which log was full with errors?
>
>> which snowballed into crashing vdsm and other nasty consequences.
>
> You should have a big enough /var/log to avoid such issues.
>
>>
>> Anyway, the solution was to manually log out from the SAN (on each host) with
>> iscsiadm and manually remove the iSCSI targets (again on each host). It was
>> not difficult once the problem was found because we currently have only 3
>> hosts in this cluster, but I'm wondering what would happen if we had hundreds
>> of hosts?
>>
>> Maybe I'm being naive, but shouldn't this be "oVirt's job"? Is there an RFE
>> already pending on this subject, or should I write one?
>
> We have RFE for this here:
> https://bugzilla.redhat.com/1310330
>
> But you must understand that ovirt does not control your storage server;
> you are responsible for adding devices on the storage server and removing
> them. We only consume the devices.
>
> Even if we provide a way to remove devices on all hosts, you will have
> to remove the device on the storage server before removing it from the
> hosts. If not, ovirt will find the removed devices again in the next
> SCSI rescan, and we do a lot of these to support automatic discovery of
> new or resized devices.
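>
> You can see this effect yourself by triggering the same kind of rescan that
> vdsm does (a rough illustration, not the exact vdsm code path):
>
>     # rescan all active iSCSI sessions
>     iscsiadm -m session --rescan
>
>     # a LUN that is still mapped on the storage server shows up again
>     multipath -ll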
>
> Nir
>
>>
>> cordialement, regards,
>>
>>
>> Nelson LAMEIRAS
>> Ingénieur Systèmes et Réseaux / Systems and Networks engineer
>> Tel: +33 5 32 09 09 70
>> nelson.lamei...@lyra-network.com
>>
>> www.lyra-network.com | www.payzen.eu
>>
>>
>>
>>
>>
>> Lyra Network, 109 rue de l'innovation, 31670 Labège, FRANCE
>>
>> ----- Original Message -----
>> From: "Nir Soffer" <nsof...@redhat.com>
>> To: "Gianluca Cecchi" <gianluca.cec...@gmail.com>, "Adam Litke" 
>> <ali...@redhat.com>
>> Cc: "users" <users@ovirt.org>
>> Sent: Tuesday, February 21, 2017 6:32:18 PM
>> Subject: Re: [ovirt-users] best way to remove SAN lun
>>
>> On Tue, Feb 21, 2017 at 7:25 PM, Gianluca Cecchi
>> <gianluca.cec...@gmail.com> wrote:
>>> On Tue, Feb 21, 2017 at 6:10 PM, Nir Soffer <nsof...@redhat.com> wrote:
>>>>
>>>> This is caused by active LVs on the removed storage domains that were not
>>>> deactivated during the removal. This is a very old known issue.
>>>>
>>>> You have to remove the stale device mapper entries - you can see the devices
>>>> using:
>>>>
>>>>     dmsetup status
>>>>
>>>> Then you can remove the mapping using:
>>>>
>>>>     dmsetup remove device-name
>>>>
>>>> Once you have removed the stale LVs, you will be able to remove the multipath
>>>> device and the underlying paths, and LVM will not complain about read
>>>> errors.
>>>>
>>>> Nir
>>>
>>>
>>> OK Nir, thanks for the advice.
>>>
>>> So this is what I ran successfully on the 2 hosts:
>>>
>>> [root@ovmsrv05 vdsm]# for dev in $(dmsetup status | grep
>>> 900b1853--e192--4661--a0f9--7c7c396f6f49 | cut -d ":" -f 1)
>>> do
>>>    dmsetup remove $dev
>>> done
>>> [root@ovmsrv05 vdsm]#
>>>
>>> and now I can run
>>>
>>> [root@ovmsrv05 vdsm]# multipath -f 3600a0b80002999020000cd3c5501458f
>>> [root@ovmsrv05 vdsm]#
>>>
>>> Also, with the exact names depending on the host, the previous maps to single
>>> path devices were, for example on ovmsrv05:
>>>
>>> 3600a0b80002999020000cd3c5501458f dm-4 IBM     ,1814      FAStT
>>> size=2.0T features='2 pg_init_retries 50' hwhandler='1 rdac' wp=rw
>>> |-+- policy='service-time 0' prio=0 status=enabled
>>> | |- 0:0:0:2 sdb        8:16  failed undef running
>>> | `- 1:0:0:2 sdh        8:112 failed undef running
>>> `-+- policy='service-time 0' prio=0 status=enabled
>>>   |- 0:0:1:2 sdg        8:96  failed undef running
>>>   `- 1:0:1:2 sdn        8:208 failed undef running
>>>
>>> And removal of single path devices:
>>>
>>> [root@ovmsrv05 root]# for dev in sdb sdh sdg sdn
>>> do
>>>   echo 1 > /sys/block/${dev}/device/delete
>>> done
>>> [root@ovmsrv05 vdsm]#
>>>
>>> All clean now... ;-)
>>
>> Great!
>>
>> I think we should have a script doing all these steps.
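>>
>> Something along these lines, perhaps - a rough, host-local sketch only (no
>> error handling; the arguments are the storage domain UUID, the multipath
>> WWID, and the SCSI path devices, and the name cleanup.sh is just for the
>> example):
>>
>>     #!/bin/bash
>>     # clean up a removed iSCSI storage domain on this host
>>     SD_UUID=$1
>>     WWID=$2
>>     shift 2
>>
>>     # 1. remove stale LV device-mapper entries of the removed domain
>>     #    (the dm names contain the domain UUID with dashes doubled)
>>     for dev in $(dmsetup status | grep "${SD_UUID//-/--}" | cut -d ":" -f 1); do
>>         dmsetup remove "$dev"
>>     done
>>
>>     # 2. flush the multipath map of the removed LUN
>>     multipath -f "$WWID"
>>
>>     # 3. delete the underlying SCSI path devices
>>     for path in "$@"; do
>>         echo 1 > /sys/block/"$path"/device/delete
>>     done
>>
>> E.g.: cleanup.sh 900b1853-e192-4661-a0f9-7c7c396f6f49 \
>>     3600a0b80002999020000cd3c5501458f sdb sdh sdg sdn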
>>
>> Nir
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
