On 14 November 2016 at 11:49, bugproxy <[email protected]> wrote:
> ------- Comment From [email protected] 2016-11-14 06:39 EDT-------
> (In reply to comment #7)
>> > (In reply to comment #1)
>> > The installation was on FCP SCSI SAN volumes, each with two active paths.
>> > Multipath was involved. The system IPLed fine up to the point that we
>> > expanded the /root filesystem to span volumes. At boot time, the system
>> > was unable to locate the second segment of the /root filesystem. The
>> > error message indicated this was due to lvmetad not being active.
>> For the zfcp case, did you use the chzdev tool to activate the paths of your
>> new additional LVM physical volume (PV)?
>
> Initially, the paths to the second luns were brought online manually
> with "echo 0x4000400f00000000 > unit_add". Then I followed up by
> running the "chzdev zfcp-lun -e --online" command and verified they were
> online and persistent with the lszdev command.
After chzdev, one must run $ update-initramfs -u so that the udev rules
it generated (quoted below) are copied into the initramfs.
To recover this system: the Ubuntu initramfs should drop you into a
busybox shell. From there, navigate sysfs to bring the required device
online, then execute vgscan. At that point it should be sufficient to
exit the busybox shell, and the initramfs should continue to boot as
normal.
Once the boot is complete, run $ sudo update-initramfs -u, and reboot.
It should boot fine from now on.
>
> Below are the rules files for the 0.0.e100 and 0.0.e300 paths. Below
> that is the output of the lszdev command. The date on these files is
> 10/26/2016. The output from the lszdev command is also from 10/26/2016.
>
> cat 41-zfcp-lun-0.0.e100.rules
> # Generated by chzdev
> ACTION=="add", SUBSYSTEMS=="ccw", KERNELS=="0.0.e100", GOTO="start_zfcp_lun_0.0.e100"
> GOTO="end_zfcp_lun_0.0.e100"
>
> LABEL="start_zfcp_lun_0.0.e100"
> SUBSYSTEM=="fc_remote_ports", ATTR{port_name}=="0x5005076306135700", GOTO="cfg_fc_0.0.e100_0x5005076306135700"
> SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_device", KERNEL=="*:1074675712", KERNELS=="rport-*", ATTRS{fc_remote_ports/$id/port_name}=="0x5005076306135700", GOTO="cfg_scsi_0.0.e100_0x5005076306135700_0x4000400e00000000"
> SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_device", KERNEL=="*:1074741248", KERNELS=="rport-*", ATTRS{fc_remote_ports/$id/port_name}=="0x5005076306135700", GOTO="cfg_scsi_0.0.e100_0x5005076306135700_0x4000400f00000000"
> GOTO="end_zfcp_lun_0.0.e100"
>
> LABEL="cfg_fc_0.0.e100_0x5005076306135700"
> ATTR{[ccw/0.0.e100]0x5005076306135700/unit_add}="0x4000400e00000000"
> ATTR{[ccw/0.0.e100]0x5005076306135700/unit_add}="0x4000400f00000000"
> ATTR{[ccw/0.0.e100]0x5005076306135700/unit_add}="0x4000401200000000"
> ATTR{[ccw/0.0.e100]0x5005076306135700/unit_add}="0x4001400d00000000"
> ATTR{[ccw/0.0.e100]0x5005076306135700/unit_add}="0x4001401100000000"
> GOTO="end_zfcp_lun_0.0.e100"
>
> LABEL="cfg_scsi_0.0.e100_0x5005076306135700_0x4000400e00000000"
> ATTR{queue_depth}="32"
> GOTO="end_zfcp_lun_0.0.e100"
>
> LABEL="cfg_scsi_0.0.e100_0x5005076306135700_0x4000400f00000000"
> ATTR{queue_depth}="32"
> GOTO="end_zfcp_lun_0.0.e100"
>
> LABEL="end_zfcp_lun_0.0.e100"
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> cat 41-zfcp-lun-0.0.e300.rules
> # Generated by chzdev
> ACTION=="add", SUBSYSTEMS=="ccw", KERNELS=="0.0.e300", GOTO="start_zfcp_lun_0.0.e300"
> GOTO="end_zfcp_lun_0.0.e300"
>
> LABEL="start_zfcp_lun_0.0.e300"
> SUBSYSTEM=="fc_remote_ports", ATTR{port_name}=="0x500507630618d700", GOTO="cfg_fc_0.0.e300_0x500507630618d700"
> SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_device", KERNEL=="*:1074675712", KERNELS=="rport-*", ATTRS{fc_remote_ports/$id/port_name}=="0x500507630618d700", GOTO="cfg_scsi_0.0.e300_0x500507630618d700_0x4000400e00000000"
> SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_device", KERNEL=="*:1074741248", KERNELS=="rport-*", ATTRS{fc_remote_ports/$id/port_name}=="0x500507630618d700", GOTO="cfg_scsi_0.0.e300_0x500507630618d700_0x4000400f00000000"
> GOTO="end_zfcp_lun_0.0.e300"
>
> LABEL="cfg_fc_0.0.e300_0x500507630618d700"
> ATTR{[ccw/0.0.e300]0x500507630618d700/unit_add}="0x4000400e00000000"
> ATTR{[ccw/0.0.e300]0x500507630618d700/unit_add}="0x4000400f00000000"
> ATTR{[ccw/0.0.e300]0x500507630618d700/unit_add}="0x4000401200000000"
> ATTR{[ccw/0.0.e300]0x500507630618d700/unit_add}="0x4001400d00000000"
> ATTR{[ccw/0.0.e300]0x500507630618d700/unit_add}="0x4001401100000000"
> GOTO="end_zfcp_lun_0.0.e300"
>
> LABEL="cfg_scsi_0.0.e300_0x500507630618d700_0x4000400e00000000"
> ATTR{queue_depth}="32"
> GOTO="end_zfcp_lun_0.0.e300"
>
> LABEL="cfg_scsi_0.0.e300_0x500507630618d700_0x4000400f00000000"
> ATTR{queue_depth}="32"
> GOTO="end_zfcp_lun_0.0.e300"
>
> LABEL="end_zfcp_lun_0.0.e300"
>
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Output from lszdev
> zfcp-lun 0.0.e100:0x5005076306135700:0x4000400e00000000 yes yes sda sg0
> zfcp-lun 0.0.e100:0x5005076306135700:0x4000400f00000000 yes yes sdc sg2
> zfcp-lun 0.0.e300:0x500507630618d700:0x4000400e00000000 yes yes sdb sg1
> zfcp-lun 0.0.e300:0x500507630618d700:0x4000400f00000000 yes yes sdd sg3
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401300000000 yes yes sde sg4
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401400000000 yes yes sdf sg5
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401500000000 yes yes sdg sg6
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401600000000 yes yes sdh sg7
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401700000000 yes yes sdi sg8
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401800000000 yes yes sdj sg9
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401900000000 yes yes sdk sg10
> zfcp-lun 0.0.e500:0x500507630633d700:0x4000401a00000000 yes yes sdl sg11
> zfcp-lun 0.0.e500:0x500507630633d700:0x4001401200000000 yes yes sdm sg12
> zfcp-lun 0.0.e500:0x500507630633d700:0x4001401300000000 yes yes sdn sg13
> zfcp-lun 0.0.e500:0x500507630633d700:0x4001401400000000 yes yes sdo sg14
> zfcp-lun 0.0.e500:0x500507630633d700:0x4001401500000000 yes yes sdp sg15
> zfcp-lun 0.0.e500:0x500507630633d700:0x4001401600000000 yes yes sdq sg16
> zfcp-lun 0.0.e500:0x500507630633d700:0x4001401700000000 yes yes sdr sg17
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401300000000 yes yes sds sg18
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401400000000 yes yes sdt sg19
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401500000000 yes yes sdu sg20
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401600000000 yes yes sdv sg21
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401700000000 yes yes sdw sg22
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401800000000 yes yes sdx sg23
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401900000000 yes yes sdy sg24
> zfcp-lun 0.0.e700:0x5005076306389700:0x4000401a00000000 yes yes sdz sg25
> zfcp-lun 0.0.e700:0x5005076306389700:0x4001401200000000 yes yes sdaa sg26
> zfcp-lun 0.0.e700:0x5005076306389700:0x4001401300000000 yes yes sdab sg27
> zfcp-lun 0.0.e700:0x5005076306389700:0x4001401400000000 yes yes sdac sg28
> zfcp-lun 0.0.e700:0x5005076306389700:0x4001401500000000 yes yes sdad sg29
> zfcp-lun 0.0.e700:0x5005076306389700:0x4001401600000000 yes yes sdae sg30
> zfcp-lun 0.0.e700:0x5005076306389700:0x4001401700000000 yes yes sdaf sg31
>
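A quick way to double check the lszdev listing above is to group the
entries by LUN and see which FCP devices provide an online, persistent
path to each. The sketch below does that in Python, assuming the
whitespace-separated column layout exactly as pasted in this thread
(type, ID, online, persistent, names). paths_per_lun is a made-up helper
for illustration, not part of s390-tools.

```python
from collections import defaultdict
from typing import Dict, List


def paths_per_lun(lszdev_text: str) -> Dict[str, List[str]]:
    """Map each FCP LUN to the FCP devices with an online, persistent path.

    Assumes lines of the form pasted in this thread:
    zfcp-lun <fcp-device>:<wwpn>:<lun> <online> <persistent> <names...>
    """
    paths = defaultdict(list)
    for line in lszdev_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and other device types.
        if len(fields) < 4 or fields[0] != "zfcp-lun":
            continue
        fcp_device, _wwpn, lun = fields[1].split(":")
        online, persistent = fields[2], fields[3]
        if online == "yes" and persistent == "yes":
            paths[lun].append(fcp_device)
    return dict(paths)


# Four of the entries from the listing above.
sample = """\
zfcp-lun 0.0.e100:0x5005076306135700:0x4000400e00000000 yes yes sda sg0
zfcp-lun 0.0.e100:0x5005076306135700:0x4000400f00000000 yes yes sdc sg2
zfcp-lun 0.0.e300:0x500507630618d700:0x4000400e00000000 yes yes sdb sg1
zfcp-lun 0.0.e300:0x500507630618d700:0x4000400f00000000 yes yes sdd sg3
"""

for lun, devices in paths_per_lun(sample).items():
    print(lun, "via", ", ".join(devices))
```

Any LUN that lists fewer FCP devices than expected would be worth a
closer look. Note this only reflects the running system; it says nothing
about what is inside the initramfs.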
>> This is the only supported post-install method to (dynamically and)
>> persistently activate zfcp-attached FCP LUNs. See also
>> http://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.ludd/ludd_t_fcp_wrk_addu.html.
>>
>> > PV Volume information:
>> > physical_volumes {
>> >
>> > pv0 {
>> > device = "/dev/sdb5" # Hint only
>>
>> > pv1 {
>> > device = "/dev/sda" # Hint only
>>
>> This does not look very good, having single path scsi disk devices mentioned
>> by LVM. With zfcp-attached SCSI disks, LVM must be on top of multipathing.
>> Could you please double check if your installation with LVM and multipathing
>> does the correct layering? If not, this would be an independent bug. See
>> also [1, slide 28 "Multipathing for Disks - LVM on Top"].
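As a rough illustration of the check suggested above, the sketch below
flags PV device hints that name a plain SCSI disk node (/dev/sdX) rather
than a device-mapper node. The function name and the regular expressions
are my own assumptions. The hint in the LVM metadata is only a hint, so
treat a match as a prompt to look closer, not as proof of wrong layering.

```python
import re
from typing import List

# Plain SCSI disk nodes such as /dev/sda or /dev/sdb5, i.e. a single path.
SINGLE_PATH = re.compile(r"^/dev/sd[a-z]+[0-9]*$")


def single_path_hints(metadata: str) -> List[str]:
    """Return the device hints that look like single-path SCSI disks."""
    hints = re.findall(r'device\s*=\s*"([^"]+)"', metadata)
    return [dev for dev in hints if SINGLE_PATH.match(dev)]


# The fragment quoted in this thread.
metadata = '''
pv0 {
device = "/dev/sdb5" # Hint only
pv1 {
device = "/dev/sda" # Hint only
'''

print(single_path_hints(metadata))  # prints ['/dev/sdb5', '/dev/sda']
```

A hint under /dev/mapper/ or /dev/dm-N would not be flagged.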
>>
>> > Additional testing has been done with CKD volumes and we see the same
>> > behavior. Because of this behavior, I do not believe the problem is
>> > related to SAN disk or multipath. I think it is due to the system not
>> > being able to read the UUID on any PV in the VG other than the IPL disk.
>>
>> For any disk device type, the initrd must contain all information how to
>> enable/activate all paths of the entire block device dependency tree
>> required to mount the root file system. An example for a dependency tree is
>> in [1, slide 37] and such example is independent of any particular Linux
>> distribution.
>> I don't know how much automatic dependency tracking Ubuntu does for the
>> user, especially regarding additional z-specific device activation steps
>> ("setting online" as for DASD or zFCP). Potentially the user must take care
>> of the dependency tree himself and ensure the necessary information lands in
>> the initrd.
>>
>> Once the dependency tree of the root-fs has changed (such as adding a PV to
>> an LVM containing the root-fs as in your case), you must re-create the
>> initrd with the following command before any reboot:
>> $ update-initramfs -u
>
> The "update-initramfs -u" command was never explicitly run after the system
> was built.
> The second PV volume was added to VG on 10/26/2016. However, it was not
> until early November that the root FS was extended.
>
> Between 10/16/2016 and the date the root fs was extended, the second PV was
> always online and active in a VG and LV display after every reboot.
> I have a note in my runlog with the following from 10/26/2016
>>>>Rebooted the system and all is working. Both disks are there and everything
>>>>is online.
> lsscsi
> [0:0:0:1074675712]disk IBM 2107900 1.69 /dev/sdb <----- This would
> be 0x400E4000
> [0:0:0:1074741248]disk IBM 2107900 1.69 /dev/sdd <----- This would
> be 0x400F4000
> [1:0:0:1074675712]disk IBM 2107900 1.69 /dev/sda
> [1:0:0:1074741248]disk IBM 2107900 1.69 /dev/sdc
>
>>
>> On z Systems, this also contains the necessary step to re-write the boot
>> record (using the zipl bootloader management tool) so it correctly points to
>> the new initrd.
>> See also
>> http://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.ludd/ludd_t_fcp_wrk_on.html.
>>
>>
>> In your case on reboot, it only activated 2 paths to FCP LUN
>> 0x4000400e00000000 (I cannot determine the target port WWPN(s) from below
>> output because it does not convey this info) from two different FCP devices
>> 0.0.e300 and 0.0.e100.
>> From attachment 113696 [details]:
>> [ 6.666977] scsi host0: zfcp
>> [ 6.671670] random: nonblocking pool is initialized
>> [ 6.672622] qdio: 0.0.e300 ZFCP on SC 2cc5 using AI:1 QEBSM:0 PRI:1 TDD:1
>> SIGA: W AP
>> [ 6.722312] scsi host1: zfcp
>> [ 6.724547] scsi 0:0:0:1074675712: Direct-Access IBM 2107900
>> 1.69 PQ: 0 ANSI: 5
>> [ 6.725159] sd 0:0:0:1074675712: alua: supports implicit TPGS
>> [ 6.725164] sd 0:0:0:1074675712: alua: device
>> naa.6005076306ffd700000000000000000e port group 0 rel port 303
>> [ 6.725287] sd 0:0:0:1074675712: Attached scsi generic sg0 type 0
>> [ 6.728234] qdio: 0.0.e100 ZFCP on SC 2c85 using AI:1 QEBSM:0 PRI:1 TDD:1
>> SIGA: W AP
>> [ 6.747662] sd 0:0:0:1074675712: alua: transition timeout set to 60
>> seconds
>> [ 6.747667] sd 0:0:0:1074675712: alua: port group 00 state A preferred
>> supports tolusnA
>> [ 6.747801] sd 0:0:0:1074675712: [sda] 209715200 512-byte logical blocks:
>> (107 GB/100 GiB)
>> [ 6.748652] sd 0:0:0:1074675712: [sda] Write Protect is off
>> [ 6.749024] sd 0:0:0:1074675712: [sda] Write cache: enabled, read cache:
>> enabled, doesn't support DPO or FUA
>> [ 6.752076] sda: sda1 sda2 < sda5 >
>> [ 6.754107] sd 0:0:0:1074675712: [sda] Attached SCSI disk
>> [ 6.760935] scsi 1:0:0:1074675712: Direct-Access IBM 2107900
>> 1.69 PQ: 0 ANSI: 5
>> [ 6.761444] sd 1:0:0:1074675712: alua: supports implicit TPGS
>> [ 6.761448] sd 1:0:0:1074675712: alua: device
>> naa.6005076306ffd700000000000000000e port group 0 rel port 231
>> [ 6.761514] sd 1:0:0:1074675712: Attached scsi generic sg1 type 0
>> [ 6.787710] sd 1:0:0:1074675712: [sdb] 209715200 512-byte logical blocks:
>> (107 GB/100 GiB)
>> [ 6.787770] sd 1:0:0:1074675712: alua: port group 00 state A preferred
>> supports tolusnA
>> [ 6.788464] sd 1:0:0:1074675712: [sdb] Write Protect is off
>> [ 6.788728] sd 1:0:0:1074675712: [sdb] Write cache: enabled, read cache:
>> enabled, doesn't support DPO or FUA
>> [ 6.790829] sdb: sdb1 sdb2 < sdb5 >
>> [ 6.792535] sd 1:0:0:1074675712: [sdb] Attached SCSI disk
>
> I see what you are saying: only 1074675712 (0x400E4000) is coming
> online at boot. 1074741248 (0x400f4000) does not come online at boot.
> The second device must be coming online after boot has completed and
> that is why lsscsi shows it online. And since the boot partition is on
> the first segment, the system can read initrd and start the boot. But
> when it goes to mount root, it is not aware of the second segment. Do
> I have this right?
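For reference, here is a minimal sketch of how the decimal LUN numbers in
lsscsi and dmesg relate to the FCP LUNs given to unit_add. It assumes the
usual Linux conversion, which reads the 8-byte LUN two bytes at a time
and treats the first pair as the least significant 16 bits. The function
name is made up for illustration.

```python
def fcp_lun_to_scsi_lun(fcp_lun: str) -> int:
    """Convert a 64-bit FCP LUN (hex string) to the LUN number Linux prints."""
    raw = int(fcp_lun, 16).to_bytes(8, "big")
    lun = 0
    # Walk the four 16-bit levels; the first level ends up in the low bits.
    for i in range(0, 8, 2):
        level = (raw[i] << 8) | raw[i + 1]
        lun |= level << (i * 8)
    return lun


print(fcp_lun_to_scsi_lun("0x4000400e00000000"))  # prints 1074675712
print(fcp_lun_to_scsi_lun("0x4000400f00000000"))  # prints 1074741248
```

So 0x4000400e00000000 is the 1074675712 (0x400E4000) seen in the boot
log, and 0x4000400f00000000 is the 1074741248 (0x400F4000) that only
shows up in lsscsi.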
>
> If so, that brings me to the next question. If this is the case, do
> you have a procedure where I could bring up a rescue system, bring
> volumes 1074675712 (0x400E4000) & 1074741248 (0x400f4000) online, chroot
> and then update the initrd with the second volume? Or do I need to
> rebuild the system from scratch?
>
>>
>>
>> REFERENCE
>>
>> [1]
>> http://www-05.ibm.com/de/events/linux-on-z/pdf/day2/4_Steffen_Maier_zfcp-best-practices-2015.pdf
>
> --
> You received this bug notification because you are a bug assignee.
> Matching subscriptions: s390x
> https://bugs.launchpad.net/bugs/1641078
>
> Title:
> System cannot be booted up when root filesystem is on an LVM on two
> disks
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu-z-systems/+bug/1641078/+subscriptions
--
Regards,
Dimitri.