Marcy
I was getting ready to call this in to SuSE and figured I should put the latest
service on the system first (because that is the first thing they will ask).
Looks like that resolved the issue.
The system was at:
SLES 12 SP4: Linux lxnebnp2 4.12.14-95.16-default
#1 SMP Mon May 6 09:58:58 UTC 2019 (1da26c7) s390x s390x s390x GNU/Linux
Multipath Version:
multipath-tools-0.7.3+129+suse.e8ca031-2.8.1.s390x
The system is now:
SLES 12 SP4: Linux lxnebnp2 4.12.14-95.54-default #1 SMP Thu Jun 4
12:49:28 UTC 2020 (892ef1f) s390x s390x s390x GNU/Linux
multipath-tools-0.7.3+153+suse.80d9ed4-2.16.1.s390x
lvm2-2.02.180-9.34.8.s390x
Thanks everyone for helping out with this. I got a lot of good information and
learned a lot about the internals of LVM. Still far from an expert, but at
least I will panic less next time.
From what I read on a number of forums, there were a number of things fixed in
multipath and lvm on SLES 12 SP4.
Duane
-----Original Message-----
From: Linux on 390 Port <[email protected]> On Behalf Of Marcy Cortes
Sent: Sunday, December 6, 2020 9:08 PM
To: [email protected]
Subject: Re: LVM does not come online after reboot. PVMOVE used to migrate data.
When it's up, try running
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-install
dracut -f
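One reason regenerating the initrd can matter: dracut typically copies files
such as /etc/lvm/lvm.conf and /etc/multipath.conf into the initramfs, so the
early-boot copy can lag behind the live files. A minimal sketch of a staleness
check follows. The function name and the example paths (/boot/initrd in
particular) are assumptions for illustration, not taken from this thread, and
a newer config file only hints that the initrd may be out of date.

```shell
# Report which of the given files are newer than the initrd image.
# Usage: stale_initrd_inputs INITRD FILE...
stale_initrd_inputs() {
    initrd=$1
    shift
    for f in "$@"; do
        # -nt is true when $f has a later modification time than $initrd
        if [ -e "$f" ] && [ "$f" -nt "$initrd" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (paths are assumptions for a typical SLES layout):
# stale_initrd_inputs /boot/initrd /etc/lvm/lvm.conf /etc/multipath.conf
```

If the function prints anything, running dracut -f as suggested above would
refresh the copies inside the initrd.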
-----Original Message-----
From: Linux on 390 Port <[email protected]> On Behalf Of Duane Beyer
Sent: Sunday, December 6, 2020 8:14 AM
To: [email protected]
Subject: Re: [LINUX-390] LVM does not come online after reboot. PVMOVE used to
migrate data.
Thanks Mike,
That is the temporary plan as a work around, but I really need to fix the root
cause.
I did receive a message external to the list about checking the global_filter
in /etc/lvm/lvm.conf. That’s what I am going to look at today. I'll post my
results.
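For anyone following along, a quick way to see which filter lines are actually
in effect (not commented out) in lvm.conf is sketched below. The function name
is hypothetical, and the default path is the one mentioned above. A filter
that rejects the new device paths would be one possible cause, not a
confirmed one.

```shell
# Print the active (uncommented) filter and global_filter lines of an lvm.conf.
# Usage: show_lvm_filters [FILE]   (FILE defaults to /etc/lvm/lvm.conf)
show_lvm_filters() {
    grep -E '^[[:space:]]*(global_)?filter[[:space:]]*=' "${1:-/etc/lvm/lvm.conf}"
}
```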
Duane
-----Original Message-----
From: Linux on 390 Port <[email protected]> On Behalf Of Michael MacIsaac
Sent: Sunday, December 6, 2020 7:07 AM
To: [email protected]
Subject: Re: LVM does not come online after reboot. PVMOVE used to migrate data.
Duane,
> Issuing lvchange -ay /dev/xxx/xxx changes the status to available.
A bit kludgy perhaps, but could you define a service that does the lvchange
before /etc/fstab is read?
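A rough sketch of what such a service could look like is below, written as a
shell function that prints a systemd unit. Everything here is an assumption
for illustration: the function name, the /sbin/vgchange path, and the ordering
after multipathd.service would all need checking on the real system. It uses
vgchange -ay on whole volume groups rather than one lvchange per LV.

```shell
# Emit a oneshot systemd unit that activates the given volume groups
# before the local file systems from /etc/fstab are mounted.
# Usage: lvm_activate_unit VG...
lvm_activate_unit() {
    cat <<EOF
[Unit]
Description=Activate LVM volume groups early (workaround)
DefaultDependencies=no
After=multipathd.service systemd-udev-settle.service
Wants=systemd-udev-settle.service local-fs-pre.target
Before=local-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/vgchange -ay $*

[Install]
WantedBy=sysinit.target
EOF
}

# Example (the unit file name is hypothetical):
# lvm_activate_unit u01 u02 u034 u056 > /etc/systemd/system/lvm-activate-data.service
# systemctl enable lvm-activate-data.service
```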
-Mike M
On Sat, Dec 5, 2020 at 6:18 PM Duane Beyer <[email protected]> wrote:
> Marcio
>
> The FCP devices are online and active after a reboot and the LUNs are
> showing up in multipath. The system is dropping into emergency mode
> because the LVs are defined in /etc/fstab. If I remove them from
> fstab, the system IPLs with no issues. The LVs are just in NOT
> available status.
>
> Issuing lvchange -ay /dev/xxx/xxx changes the status to available.
>
> My guess is that something in the LV's metadata gets reread/reset
> when the lvchange command is issued. The problem is that it is not
> persistent, so on the next reboot we are in the same situation.
>
> Duane
>
> -----Original Message-----
> From: Linux on 390 Port <[email protected]> On Behalf Of Marcio
> da Silva Nunes
> Sent: Saturday, December 5, 2020 2:58 PM
> To: [email protected]
> Subject: Re: LVM does not come online after reboot. PVMOVE used to
> migrate data.
>
> Duane,
>
>
> Here is a possibility that can be verified.
>
> When you use the text-based interface yast zfcp: if cio_ignore is enabled,
> you might need to free the blacklisted FCP devices beforehand by using
> yast cio.
>
> Could this be the problem?
>
>
>
> Marcio da Silva Nunes
> Analyst
> Superintendência de Produtos e Serviços-Centro de Dados Diretoria de
> Operações
> +55 (61)2021-9099
> +55 (61) 99981-0376
>
> ----- Original Message -----
> From: "Duane Beyer" <[email protected]>
> To: "Linux on 390 Port" <[email protected]>
> Sent: Saturday, December 5, 2020 12:28:39
> Subject: Re: LVM does not come online after reboot. PVMOVE used to
> migrate data.
>
> Marcio,
>
> All of the disks are for data. The root file system is ECKD.
>
> The FCP LUNs were added via yast so all the configuration was done
> with yast. I do see the associated rules files and the correct fcp
> definitions in /dev/disk/....
> The system is a SLES 12 SP4: Linux lxnebnp2 4.12.14-95.16-default #1
> SMP Mon May 6 09:58:58 UTC 2019 (1da26c7) s390x s390x s390x GNU/Linux
> Multipath version: multipath-tools-0.7.3+129+suse.e8ca031-2.8.1.s390x
>
> /etc/udev/rules.d/41-zfcp-host-0.0.4000.rules
> /etc/udev/rules.d/41-zfcp-lun-0.0.5000.rules
> /etc/udev/rules.d/41-zfcp-lun-0.0.4000.rules
> /etc/udev/rules.d/41-zfcp-host-0.0.5000.rules
> /dev/disk/by-path/ccw-0.0.5000-zfcp-0x5005076810264e3c:0x0001000000000000
> /dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154da7:0x0002000000000000
> /dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154da7:0x0003000000000000
> .... cut to save space ....
> /dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154e3c:0x0001000000000000
> /dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154e3a:0x0000000000000000
>
> Duane
> -----Original Message-----
> From: Linux on 390 Port <[email protected]> On Behalf Of Marcio
> da Silva Nunes
> Sent: Saturday, December 5, 2020 9:27 AM
> To: [email protected]
> Subject: Re: LVM does not come online after reboot. PVMOVE used to
> migrate data.
>
> Duane,
>
> Let's see if it helps.
>
> Were the new FCP devices that are not associated with root included in
> the zfcp.conf file?
>
> If they are part of the root system disk, then they must be in zipl.conf.
>
> Maybe that helps.
>
>
> Marcio da Silva Nunes
> Analyst
> Superintendência de Produtos e Serviços-Centro de Dados Diretoria de
> Operações
> +55 (61)2021-9099
> +55 (61) 99981-0376
>
> ----- Original Message -----
> From: "Duane Beyer" <[email protected]>
> To: [email protected]
> Sent: Friday, December 4, 2020 18:01:08
> Subject: Re: LVM does not come online after reboot. PVMOVE used to
> migrate data.
>
> I should have included this: the LV status is NOT available.
>
> This is what LVDISPLAY displays for the LVM disks:
> --- Logical volume ---
> LV Path /dev/u01/u01
> LV Name u01
> VG Name u01
> LV UUID 6qzW2g-xJr4-osII-ULfn-d9nb-VscF-saGdBL
> LV Write Access read/write
> LV Creation host, time lxnebnp2, 2019-06-25 13:18:38 -0400
> LV Status NOT available
> LV Size 100.00 GiB
> Current LE 25599
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> Persistent major 254
> Persistent minor 123
>
>
> LVS output:
> LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
> home hometmp -wi-a----- 23.44g
> tmp hometmp -wi-a----- 23.44g
> u01 u01 -wim------ 100.00g
> u02 u02 -wim------ 299.99g
> u03 u034 -wim------ 50.00g
> u04 u034 -wim------ 49.00g
> u05 u056 -wim------ 49.00g
> u06 u056 -wim------ 49.00g
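The Attr column above carries the key detail. Per the lvs man page, the fourth
character is "m" when a fixed (persistent) minor number is set, and the fifth
is "a" when the LV is active. A small sketch that decodes just those two
positions follows. The function name is hypothetical, and other state
characters (suspended and so on) are simply reported as "not active".

```shell
# Summarize two of the lv_attr characters from lvs output:
#   character 4: "m" means a fixed (persistent) minor number is set
#   character 5: "a" means the LV is active
# Usage: lv_attr_summary ATTR
lv_attr_summary() {
    minor=$(printf '%s' "$1" | cut -c4)
    state=$(printf '%s' "$1" | cut -c5)
    if [ "$state" = "a" ]; then s="active"; else s="not active"; fi
    if [ "$minor" = "m" ]; then s="$s, fixed minor"; fi
    printf '%s\n' "$s"
}
```

Applied to the rows above, home and tmp decode as active, while the u01
through u06 volumes decode as not active with a fixed minor.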
>
> VGDISPLAY:
> --- Volume group ---
> VG Name u01
> System ID
> Format lvm2
> Metadata Areas 1
> Metadata Sequence No 8
> VG Access read/write
> VG Status resizable
> MAX LV 0
> Cur LV 1
> Open LV 0
> Max PV 0
> Cur PV 1
> Act PV 1
> VG Size 100.00 GiB
> PE Size 4.00 MiB
> Total PE 25599
> Alloc PE / Size 25599 / 100.00 GiB
> Free PE / Size 0 / 0
> VG UUID pzRsQG-o76z-SBw2-Xu9T-T2cC-eXCZ-NU6WcQ
>
>
> Duane
>
> -----Original Message-----
> From: Duane Beyer
> Sent: Friday, December 4, 2020 3:45 PM
> To: Linux on 390 Port <[email protected]>
> Subject: LVM does not come online after reboot. PVMOVE used to migrate
> data.
>
>
> We have installed a new IBM FS7200 and are in the process of moving
> systems off the old disks and on to the new 7200. For many systems, we
> are doing this while the service is up, so we are using pvmove.
>
> All of these systems have direct attached SCSI devices that are part
> of LVM's.
>
> Since the systems are production, we used pvmove to move the data from
> the old disks to the new ones. The following process was used.
>
> pvcreate the new disk
> vgextend
> pvmove to the new pv
> vgreduce to remove the old disk
> pvremove to remove the old scsi disk definition.
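The five steps above can be sketched as a dry-run helper that only prints the
commands for one volume group. The function name and the device paths in the
example are hypothetical. The real multipath device names are not given in
this message.

```shell
# Print (do not run) the LVM commands for moving one VG from OLD_PV to NEW_PV.
# Usage: pvmove_plan VG OLD_PV NEW_PV
pvmove_plan() {
    vg=$1
    old=$2
    new=$3
    printf 'pvcreate %s\n' "$new"           # label the new disk as a PV
    printf 'vgextend %s %s\n' "$vg" "$new"  # add it to the volume group
    printf 'pvmove %s %s\n' "$old" "$new"   # move extents while online
    printf 'vgreduce %s %s\n' "$vg" "$old"  # drop the old PV from the VG
    printf 'pvremove %s\n' "$old"           # wipe the PV label on the old disk
}

# Example (device names are placeholders):
# pvmove_plan u01 /dev/mapper/old_lun /dev/mapper/new_lun
```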
>
> The old disks and new ones used different FCP devices. After the
> pvmove, we cleaned up the SCSI disk information and verified all was
> good with multipath -ll and pvs. Everything looked good. The old FCP
> devices were taken offline with zfcp_host_configure 0.0.2000 0
> followed by vmcp det fcpdevice.
> Everything was working fine until we rebooted one of the systems using
> the new LUNS.
>
> The system went into emergency mode.
> The following action was taken to recover:
>
> The LVM disks were commented out of fstab and the system was rebooted.
> The system came up clean without the SCSI disks.
>
> At this point, I issued a lvchange to see if that would correct the
> issue.
>
> lvchange -ay /dev/u01/u01
> lvchange -ay /dev/u02/u02
> lvchange -ay /dev/u034/u03
> lvchange -ay /dev/u034/u04
> lvchange -ay /dev/u056/u05
> lvchange -ay /dev/u056/u06
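Rather than typing each lvchange by hand, the list of commands can be derived
from lvs output. This is a minimal sketch with a hypothetical function name.
It only prints the commands, and it treats any LV whose fifth lv_attr
character is not "a" as needing activation.

```shell
# Read "vg_name lv_name lv_attr" rows on stdin and print an lvchange
# command for every LV whose state character (5th in lv_attr) is not "a".
inactive_lv_commands() {
    awk 'NF >= 3 && substr($3, 5, 1) != "a" { print "lvchange -ay " $1 "/" $2 }'
}

# Example call on a live system (requires root and lvm2):
# lvs --noheadings -o vg_name,lv_name,lv_attr | inactive_lv_commands
```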
>
> It worked like a charm, but this is only a dynamic, temporary fix. Reboot
> the system again and we are back to square one.
>
> I next tried to make the change persistent with the following:
> lvchange -ay -M y --minor 145 u01
>
> This did not appear to do anything. I still get the NOT available
> status in lvdisplay.
>
> However, some of the VGs have two LVs, and now I get a duplicate
> minor number when I run lvchange on that set:
> lvchange -ay /dev/u056/u05
> lvchange -ay /dev/u056/u06
> (For now, let's focus on the original issue.)
>
> At this point I am at a loss. I am not sure why the LVs will not come
> online at boot. As a workaround, I have removed the LVM volumes
> from fstab; I manually issue the lvchange after the system comes up
> and then mount the LVs. This is not a valid fix, but at least we are
> not dead in the water.
>
> Debug time:
> I have turned on debug for lvm in lvm.conf and have captured a LVM
> debug log.
>
> In the log, I can see each of the SCSI disks are found:
> device/dev-cache.c:714 Found dev 8:48 /dev/disk/by-id/scsi-3600507681081026d3800000000000119 - exists.
>
> I also see this orphans message in the log that may be part of my issue.
> cache/lvmcache.c:2080 lvmcache /dev/sda: now in VG #orphans_lvm2 (#orphans_lvm2) with 0 mda(s).
> filters/filter-signature.c:31 filter signature deferred /dev/sdao
> filters/filter-md.c:99 filter md deferred /dev/sdao
> format_text/text_label.c:423 /dev/sda: PV header extension version 2 found
> filters/filter-persistent.c:346 filter caching good /dev/sdao
> format_text/format-text.c:331 Reading mda header sector from /dev/sda at 4096
>
> I found all the audit information in /etc/lvm/backup and
> /etc/lvm/archive and can see the changes that were made when I moved
> from the old SCSI disks to the new ones on the fs7200.
>
> What I don't understand is why LVM is not able to bring the logical
> volumes online when the PVs and VGs look good after reboot and
> simply issuing “lvchange -ay /dev/xxxx” dynamically fixes the issue.
>
> Any help would be greatly appreciated. Until I can resolve this, I
> have stopped the migration of any additional systems from the old
> disks to the new. In total, I have moved about 15 systems so far.
>
> I do have a backout plan, but that requires bringing down the
> production systems and doing a lot of manual work. If I can fix this
> dynamically, that would be great. If I need to issue a few commands
> and reboot the system to fix it, that would also be acceptable since the
> outage would be minimal.
> What I don't want to do is have to take the systems down for an
> extended amount of time to rebuild and copy to new disks.
>
> If additional information is needed, just let me know and I will
> provide it.
>
> Thanks in advance.
> Duane
>
> Duane Beyer
> Marist College
>
>
>
> ----------------------------------------------------------------------
> For LINUX-390 subscribe / signoff / archive access instructions, send
> email to [email protected] with the message: INFO LINUX-390 or
> visit
> http://www2.marist.edu/htbin/wlvindex?LINUX-390
>
>
--
-Mike MacIsaac
----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions, send email to
[email protected] with the message: INFO LINUX-390 or visit
http://www2.marist.edu/htbin/wlvindex?LINUX-390