On Wed, Feb 22, 2017 at 9:27 AM Nir Soffer <[email protected]> wrote:
> On Wed, Feb 22, 2017 at 9:03 AM, Nelson Lameiras
> <[email protected]> wrote:
> > Hello,
> >
> > Not sure it is the same issue, but we had a "major" issue recently in
> > our production system when removing an iSCSI volume from oVirt and then
> > removing it from the SAN.
>
> What version? OS version?
>
> The order must be:
>
> 1. remove the LUN from the storage domain
>    (will be available in the next 4.1 release; in older versions you
>    have to remove the storage domain)
>
> 2. unzone the LUN on the server
>
> 3. remove the multipath devices and the paths on the nodes
>
> > The issue was that each host was still trying to access the SAN volume
> > regularly, in spite of it not being completely removed from oVirt.
>
> What do you mean by "not being completely removed"?
>
> Who was accessing the volume?
>
> > This led to a massive increase in error logs, which completely filled
> > the /var/log partition,
>
> Which log was full of errors?
>
> > which snowballed into crashing vdsm and other nasty consequences.
>
> You should have a big enough /var/log to avoid such issues.

- Log rotation should be set up better, so that it does not consume
excessive amounts of space. I'm seeing /etc/vdsm/logrotate/vdsm - not sure
why it's not under /etc/logrotate.d. Looking at the file, there seems to
be a 15M limit and 100 files, which translates to 1.5GB - and it is
supposed to be compressed (not sure xz is a good choice - it's very CPU
intensive). Others (Gluster?) do not seem to have a size limit, just
weekly rotation. We need to look at other components as well.

- At least on ovirt-node, we'd like to separate some directories onto
different partitions, so that, for example, core dumps (which should be
limited as well) on /var/core do not fill the same partition /var/log is
on and thus render the host unusable. And again, looking at the file, we
have 'size 0' on /var/log/core/*.dump - and 'rotate 1' - not sure what
that means - but it should be in /var/core, not /var/log/core, I reckon.

Y.

> > Anyway, the solution was to manually log out from the SAN (on each
> > host) with iscsiadm and manually remove the iSCSI targets (again on
> > each host). It was not difficult once the problem was found, because
> > currently we only have 3 hosts in this cluster, but I'm wondering what
> > would happen if we had hundreds of hosts?
> >
> > Maybe I'm being naive, but shouldn't this be "oVirt's job"? Is there an
> > RFE still waiting to be included on this subject, or should I write one?
>
> We have an RFE for this here:
> https://bugzilla.redhat.com/1310330
>
> But you must understand that oVirt does not control your storage server;
> you are responsible for adding devices on the storage server and
> removing them. We are only consuming the devices.
>
> Even if we provide a way to remove devices on all hosts, you will have
> to remove the device on the storage server before removing it from the
> hosts. If not, oVirt will find the removed devices again in the next
> SCSI rescan, and we do a lot of these to support automatic discovery of
> new or resized devices.
>
> Nir
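For reference, the vdsm logrotate stanza being discussed is presumably
along these lines - a sketch reconstructed from the sizes mentioned above;
directives other than the size, count and xz compression are guesses, not
a copy of the shipped file:

/var/log/vdsm/*.log {
    rotate 100
    size 15M
    missingok
    copytruncate
    notifempty
    compress
    compresscmd /usr/bin/xz
    uncompresscmd /usr/bin/unxz
    compressext .xz
}

And the per-host manual workaround Nelson describes above would look
something like the following - the target IQN and portal here are
placeholders, not values from this thread:

# find the session that backs the removed LUN
iscsiadm -m session

# log out of that target on this host
iscsiadm -m node -T iqn.2002-09.com.example:target1 -p 10.0.0.10:3260 --logout

# delete the node record so the host does not log back in after a restart
iscsiadm -m node -T iqn.2002-09.com.example:target1 -p 10.0.0.10:3260 -o delete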
> > cordialement, regards,
> >
> > Nelson LAMEIRAS
> > Ingénieur Systèmes et Réseaux / Systems and Networks engineer
> > Tel: +33 5 32 09 09 70
> > [email protected]
> >
> > www.lyra-network.com | www.payzen.eu
> >
> > Lyra Network, 109 rue de l'innovation, 31670 Labège, FRANCE
> >
> > ----- Original Message -----
> > From: "Nir Soffer" <[email protected]>
> > To: "Gianluca Cecchi" <[email protected]>, "Adam Litke" <[email protected]>
> > Cc: "users" <[email protected]>
> > Sent: Tuesday, February 21, 2017 6:32:18 PM
> > Subject: Re: [ovirt-users] best way to remove SAN lun
> >
> > On Tue, Feb 21, 2017 at 7:25 PM, Gianluca Cecchi
> > <[email protected]> wrote:
> >> On Tue, Feb 21, 2017 at 6:10 PM, Nir Soffer <[email protected]> wrote:
> >>>
> >>> This is caused by active lvs on the removed storage domains that were
> >>> not deactivated during the removal. This is a very old known issue.
> >>>
> >>> You have to remove the stale device mapper entries - you can see the
> >>> devices using:
> >>>
> >>> dmsetup status
> >>>
> >>> Then you can remove the mapping using:
> >>>
> >>> dmsetup remove device-name
> >>>
> >>> Once you have removed the stale lvs, you will be able to remove the
> >>> multipath device and the underlying paths, and lvm will not complain
> >>> about read errors.
> >>>
> >>> Nir
> >>
> >>
> >> OK Nir, thanks for advising.
> >>
> >> So this is what I ran with success on the 2 hosts:
> >>
> >> [root@ovmsrv05 vdsm]# for dev in $(dmsetup status | grep
> >> 900b1853--e192--4661--a0f9--7c7c396f6f49 | cut -d ":" -f 1)
> >> do
> >> dmsetup remove $dev
> >> done
> >> [root@ovmsrv05 vdsm]#
> >>
> >> and now I can run
> >>
> >> [root@ovmsrv05 vdsm]# multipath -f 3600a0b80002999020000cd3c5501458f
> >> [root@ovmsrv05 vdsm]#
> >>
> >> Also, with related names depending on the host, the previous maps to
> >> single devices were, for example on ovmsrv05:
> >>
> >> 3600a0b80002999020000cd3c5501458f dm-4 IBM ,1814 FAStT
> >> size=2.0T features='2 pg_init_retries 50' hwhandler='1 rdac' wp=rw
> >> |-+- policy='service-time 0' prio=0 status=enabled
> >> | |- 0:0:0:2 sdb 8:16  failed undef running
> >> | `- 1:0:0:2 sdh 8:112 failed undef running
> >> `-+- policy='service-time 0' prio=0 status=enabled
> >>   |- 0:0:1:2 sdg 8:96  failed undef running
> >>   `- 1:0:1:2 sdn 8:208 failed undef running
> >>
> >> And removal of the single path devices:
> >>
> >> [root@ovmsrv05 root]# for dev in sdb sdh sdg sdn
> >> do
> >> echo 1 > /sys/block/${dev}/device/delete
> >> done
> >> [root@ovmsrv05 vdsm]#
> >>
> >> All clean now... ;-)
> >
> > Great!
> >
> > I think we should have a script doing all these steps.
> >
> > Nir
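A rough sketch of what such a script could look like, stitched together
from the commands above - the storage domain UUID and multipath WWID are
passed as arguments; the script name and everything else not shown in the
thread are made up for illustration:

#!/bin/bash
# cleanup-stale-lun.sh <storage-domain-uuid> <multipath-wwid>
# Sketch only - run on each host after the LUN has been removed from oVirt
# and unzoned on the storage server.

sd_uuid=$1
wwid=$2

# dmsetup reports the VG name with every '-' doubled, e.g.
# 900b1853--e192--4661--a0f9--7c7c396f6f49
dm_name=${sd_uuid//-/--}

# remember the SCSI paths behind the multipath map before flushing it
paths=$(multipath -ll "$wwid" | grep -oE '\bsd[a-z]+\b' | sort -u)

# 1. remove the stale lv mappings left over from the storage domain
for dev in $(dmsetup status | grep "$dm_name" | cut -d ":" -f 1); do
    dmsetup remove "$dev"
done

# 2. flush the now unused multipath map
multipath -f "$wwid"

# 3. delete the underlying SCSI devices so the kernel forgets the paths
for dev in $paths; do
    echo 1 > "/sys/block/${dev}/device/delete"
done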
_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users

