Duane,

Here is one possibility that you can check.


When you use the text-based interface, yast zfcp, and cio_ignore is enabled, 
you might need to first free blacklisted FCP devices by using yast cio.


Could this be the problem?



Marcio da Silva Nunes 
Analista 
Superintendência de Produtos e Serviços-Centro de Dados 
Diretoria de Operações 
+55 (61)2021-9099 
+55 (61) 99981-0376

----- Original message -----
From: "Duane Beyer" <[email protected]>
To: "Linux on 390 Port" <[email protected]>
Sent: Saturday, December 5, 2020 12:28:39
Subject: Re: LVM does not come online after reboot. PVMOVE used to migrate data.

Marcio,

All of the disks are for data.  The root file system is ECKD. 

The FCP LUNs were added via yast so all the configuration was done with yast.  
I do see the associated rules files and the correct fcp definitions in 
/dev/disk/....
The system is SLES 12 SP4:
Linux lxnebnp2 4.12.14-95.16-default #1 SMP Mon May 6 09:58:58 UTC 2019 (1da26c7) s390x s390x s390x GNU/Linux
Multipath Version: multipath-tools-0.7.3+129+suse.e8ca031-2.8.1.s390x

/etc/udev/rules.d/41-zfcp-host-0.0.4000.rules
/etc/udev/rules.d/41-zfcp-lun-0.0.5000.rules
/etc/udev/rules.d/41-zfcp-lun-0.0.4000.rules
/etc/udev/rules.d/41-zfcp-host-0.0.5000.rules
/dev/disk/by-path/ccw-0.0.5000-zfcp-0x5005076810264e3c:0x0001000000000000
/dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154da7:0x0002000000000000
/dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154da7:0x0003000000000000
.... cut to save space ....
/dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154e3c:0x0001000000000000
/dev/disk/by-path/ccw-0.0.4000-zfcp-0x5005076810154e3a:0x0000000000000000

Duane
-----Original Message-----
From: Linux on 390 Port <[email protected]> On Behalf Of Marcio da Silva Nunes
Sent: Saturday, December 5, 2020 9:27 AM
To: [email protected]
Subject: Re: LVM does not come online after reboot. PVMOVE used to migrate data.

Duane, 

Let's see if this helps.

Were the new FCP devices that are not associated with the root file system 
included in the zfcp.conf file?

If they are part of the root system disk, then they must be in zipl.conf.

Maybe that will work.


Marcio da Silva Nunes
Analista
Superintendência de Produtos e Serviços-Centro de Dados Diretoria de Operações 
+55 (61)2021-9099
+55 (61) 99981-0376

----- Original message -----
From: "Duane Beyer" <[email protected]>
To: [email protected]
Sent: Friday, December 4, 2020 18:01:08
Subject: Re: LVM does not come online after reboot. PVMOVE used to migrate data.

I should have included this: the LV status is NOT available.

This is what lvdisplay shows for the LVM disks:
  --- Logical volume ---
  LV Path                /dev/u01/u01
  LV Name                u01
  VG Name                u01
  LV UUID                6qzW2g-xJr4-osII-ULfn-d9nb-VscF-saGdBL
  LV Write Access        read/write
  LV Creation host, time lxnebnp2, 2019-06-25 13:18:38 -0400
  LV Status              NOT available
  LV Size                100.00 GiB
  Current LE             25599
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  Persistent major       254
  Persistent minor       123


LVS output:
  LV   VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home hometmp -wi-a-----  23.44g
  tmp  hometmp -wi-a-----  23.44g
  u01  u01     -wim------ 100.00g
  u02  u02     -wim------ 299.99g
  u03  u034    -wim------  50.00g
  u04  u034    -wim------  49.00g
  u05  u056    -wim------  49.00g
  u06  u056    -wim------  49.00g
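For reading the Attr column above, here is a minimal sketch. As I read the lvs man page, the fourth character is "m" when the LV has a fixed (persistent) minor number and the fifth character is "a" when the LV is active. The function name lv_attr_summary is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarize characters 4 and 5 of an lvs Attr string.
# Assumption (from the lvs man page): char 4 = "m" means fixed minor,
# char 5 = "a" means active. Any other state character (suspended and so
# on) is simply reported as "not-active" here.
lv_attr_summary() {
    attr=$1
    minor_flag=$(printf '%s' "$attr" | cut -c4)
    state_flag=$(printf '%s' "$attr" | cut -c5)

    if [ "$minor_flag" = "m" ]; then
        fixed="fixed-minor"
    else
        fixed="dynamic-minor"
    fi

    if [ "$state_flag" = "a" ]; then
        active="active"
    else
        active="not-active"
    fi

    echo "$fixed $active"
}
```

Calling lv_attr_summary with the Attr strings from the listing shows that the hometmp volumes are active with dynamic minors, while u01 through u06 have a fixed minor and are not active.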

VGDISPLAY:
  --- Volume group ---
  VG Name               u01
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  8
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               100.00 GiB
  PE Size               4.00 MiB
  Total PE              25599
  Alloc PE / Size       25599 / 100.00 GiB
  Free  PE / Size       0 / 0
  VG UUID               pzRsQG-o76z-SBw2-Xu9T-T2cC-eXCZ-NU6WcQ


Duane

-----Original Message-----
From: Duane Beyer
Sent: Friday, December 4, 2020 3:45 PM
To: Linux on 390 Port <[email protected]>
Subject: LVM does not come online after reboot. PVMOVE used to migrate data.


We have installed a new IBM FS7200 and are in the process of moving systems off 
the old disks and onto the new 7200. For many systems, we are doing this while 
the service is up, so we are using pvmove. 

All of these systems have direct-attached SCSI devices that are part of LVM 
volume groups.  

Since the systems are production, we used pvmove to move the data from the old 
disks to the new ones.  The following process was used. 

pvcreate the new disk
vgextend
pvmove to the new pv
vgreduce to remove the old disk
pvremove to remove the old scsi disk definition. 
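The steps above can be sketched as a dry run. This is only an illustration: print_pv_migration_plan is a hypothetical helper that prints the commands in order, without running them, so they can be reviewed first. The device paths in the usage line are placeholders, not the real ones from these systems, and pvmove is shown with an explicit destination PV.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does NOT execute) the LVM commands
# for moving a volume group from an old PV to a new PV.
#   $1 = volume group name
#   $2 = old physical volume (placeholder path)
#   $3 = new physical volume (placeholder path)
print_pv_migration_plan() {
    vg=$1
    old_pv=$2
    new_pv=$3

    echo "pvcreate $new_pv"          # label the new disk as a PV
    echo "vgextend $vg $new_pv"      # add the new PV to the VG
    echo "pvmove $old_pv $new_pv"    # move extents off the old PV
    echo "vgreduce $vg $old_pv"      # drop the old PV from the VG
    echo "pvremove $old_pv"          # wipe the LVM label from the old PV
}

# Example with placeholder device names:
print_pv_migration_plan u01 /dev/mapper/old_lun /dev/mapper/new_lun
```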

The old disks and new ones used different FCP devices. After the pvmove, we 
cleaned up the SCSI disk information and verified all was good with 
multipath -ll and pvs.  Everything looked good. The old FCP devices were taken 
offline with zfcp_host_configure 0.0.2000 0 followed by vmcp det fcpdevice.  
Everything was working fine until we rebooted one of the systems using the new 
LUNs.  

The system went into emergency mode.   
The following action was taken to recover:

The LVM disks were commented out of fstab and the system was rebooted. The 
system came up clean without the SCSI disks. 

At this point, I issued an lvchange to see if that would correct the issue.  

lvchange -ay /dev/u01/u01
lvchange -ay /dev/u02/u02
lvchange -ay /dev/u034/u03
lvchange -ay /dev/u034/u04
lvchange -ay /dev/u056/u05
lvchange -ay /dev/u056/u06

It worked like a charm, but this is only a temporary, dynamic fix.  Reboot the 
system again and we are back to square one. 

I next tried to make the change persistent with the following:
lvchange -ay -M y --minor 145 u01

This did not appear to do anything.  I still get the NOT available status in 
lvdisplay. 

However, some of the VGs have two LVs, and now I get a duplicate minor number 
when I run lvchange on that set: 
lvchange -ay /dev/u056/u05
lvchange -ay /dev/u056/u06
(For now, let's focus on the original issue.)
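For when the duplicate minor question comes back up, here is a small sketch for spotting LVs that share a persistent minor number. It assumes input lines of the form "lv_name minor", which I believe can be produced with something like lvs --noheadings -o lv_name,lv_minor (that field name is my reading of the lvs report fields, so treat it as an assumption). The function name and the sample values in the comment are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: reads "lv_name minor" pairs on stdin and prints each
# minor number that is used by more than one LV, followed by the LV names.
# A minor of -1 is treated as "no persistent minor" and is skipped.
find_duplicate_minors() {
    awk '
        NF >= 2 && $2 != -1 {
            names[$2] = (names[$2] == "" ? $1 : names[$2] " " $1)
            count[$2]++
        }
        END {
            for (m in count)
                if (count[m] > 1)
                    print m ": " names[m]
        }
    '
}

# Example with made-up values:
#   printf 'u05 145\nu06 145\nu01 123\n' | find_duplicate_minors
```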

At this point I am at a loss.  I am not sure why the LVM volumes will not come 
online at boot. As a workaround, I have removed the LVM volumes from fstab and 
manually issue the lvchange after the system comes up and then mount the LVM 
volumes. This is not a valid fix, but at least we are not dead in the water. 

Debug time:
I have turned on debug for LVM in lvm.conf and have captured an LVM debug log.   

In the log, I can see each of the SCSI disks are found:
device/dev-cache.c:714     Found dev 8:48 /dev/disk/by-id/scsi-3600507681081026d3800000000000119 - exists.

I also see this orphans message in the log that may be part of my issue.
cache/lvmcache.c:2080    lvmcache /dev/sda: now in VG #orphans_lvm2 (#orphans_lvm2) with 0 mda(s).
filters/filter-signature.c:31      filter signature deferred /dev/sdao
filters/filter-md.c:99      filter md deferred /dev/sdao
format_text/text_label.c:423     /dev/sda: PV header extension version 2 found
filters/filter-persistent.c:346     filter caching good /dev/sdao
format_text/format-text.c:331     Reading mda header sector from /dev/sda at 4096
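To see at a glance which devices the debug log reports under #orphans_lvm2, a filter along these lines could be used. It is a sketch that assumes the message format shown above, with each message on a single line in the real log; the function name and the log file name in the comment are hypothetical. The message on its own may or may not indicate a problem.

```shell
#!/bin/sh
# Hypothetical helper: reads an LVM debug log on stdin and prints the unique
# device names that appear in "now in VG #orphans_lvm2" messages.
orphan_devices() {
    sed -n 's|.*lvmcache \(/dev/[^:]*\): now in VG #orphans_lvm2.*|\1|p' | sort -u
}

# Example (the log file name is a placeholder):
#   orphan_devices < lvm-debug.log
```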

I found all the audit information in /etc/lvm/backup and /etc/lvm/archive and 
can see the changes that were made when I moved from the old SCSI disks to the 
new ones on the FS7200. 

What I don't understand is why LVM is not able to bring the logical volumes 
online when the PVs and VGs look good after reboot and simply issuing 
“lvchange -ay /dev/xxxx” dynamically fixes the issue. 

Any help would be greatly appreciated.  Until I can resolve this, I have 
stopped the migration of any additional systems from the old disks to the new. 
In total, I have moved about 15 systems so far. 

I do have a backout plan, but that requires bringing down the production 
systems and doing a lot of manual work. If I can fix this dynamically, that 
would be great.  If I need to issue a few commands and reboot the system to fix 
it, that would also be acceptable since the outage would be minimal.  What I 
don't want to do is take the systems down for an extended amount of time to 
rebuild and copy to new disks. 

If additional information is needed, just let me know and I will provide it.  

Thanks in advance.
Duane

Duane Beyer
Marist College



----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO LINUX-390 or visit
http://www2.marist.edu/htbin/wlvindex?LINUX-390
