Thanks Rick. Yes, I think chzdev should not have to care about the UUIDs, but
it is possible that it needs to. I can't say.
However, the problem with chzdev is something that can be dealt with easily.
What is most concerning is that chzdev will at times prompt you to allow it to
call mkinitrd and dracut if it senses you have changed something to do with the
root device. This is where the bigger problem exists. If you allow chzdev to
rebuild the initrd, the system will fail at the next boot because it is
looking for devices that don't exist. I think this is not a chzdev problem but
a bug in dracut or mkinitrd itself.
Here is how you can recreate the problem:
1-Define the server with 150 as root, 151 as /usr and 152 as /var (anything
will work but this is my setup)
2-Define a temp disk as 159 (for example a 50 cyl tdisk, or anything)
3-Boot the system
4-chccwdev --online 0.0.0159
5-lsdasd (to see which disk is the temp disk 159, dasdf for me)
6-dasdfmt -b 4096 /dev/dasdf
7-fdasd -a /dev/dasdf (auto-create a single partition on the test disk)
8-mkfs.ext4 /dev/dasdf1
9-blkid (list current devices and their UUIDs, then copy the UUID of the
root device, 150 for me)
10-tune2fs -U "f4aeb050-c56a-4d15-87af-99a64fdd7f1d" /dev/dasdf1 (set the
temp disk UUID the same as the root device UUID)
11-blkid (confirm the root and temp disk UUIDs are the same)
12-chzdev dasd 0159 --enable --persistent (make device persistent, then
respond yes to update initrd)
ECKD DASD 0.0.0159 configured
Note: The initial RAM-disk must be updated for these changes to take effect:
- ECKD DASD 0.0.0159
Update initial RAM-disk now? (yes/no) yes
13-Now reboot. The system reboots and gets stuck at:
[ OK ] Reached target Initrd Root Device.
It's looking for a missing device now. After about 2 minutes of nothing (low
CPU utilization), you start seeing messages like this:
dracut-initqueue[320]: Warning: dracut-initqueue timeout - starting timeout scripts
And then you are in emergency mode. From here you can enter the dracut
emergency shell, or boot into a rescue system, mount the partitions, chroot,
and run mkinitrd/grub2-mkconfig/grub2-install to fix the boot problem. But if
you leave things as they are, the problem will return the next time you
rebuild the initrd on the system.
Also, in the above steps, starting with step 12, I used chzdev to activate the
disk. You can just as easily go into YaST, DASD, activate the disk (YaST will
call chzdev), then go into YaST, Bootloader and make a simple change (like the
boot timeout delay) to force a new initrd to be created. Reboot and you have a
problem.
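For anyone who wants to check for this situation before letting anything
rebuild the initrd, here is a minimal sketch. It is not part of s390-tools or
dracut. It reads default blkid output on stdin and lists every filesystem UUID
that shows up on more than one device. The function name find_dup_uuids is made
up for illustration, and the sketch assumes the usual
'/dev/name: KEY="value" ...' line format.

```shell
# Hypothetical helper: report filesystem UUIDs that appear on more than one
# device. Expects default blkid output on stdin, for example:
#   /dev/dasda1: UUID="f4ae..." TYPE="ext4"
# Typical use (as root):  blkid | find_dup_uuids
find_dup_uuids() {
    awk '
    {
        # First field is the device name followed by a colon.
        dev = $1
        sub(/:$/, "", dev)
        for (i = 2; i <= NF; i++) {
            # Match only the plain UUID tag, not PARTUUID or UUID_SUB.
            if ($i ~ /^UUID="/) {
                uuid = $i
                sub(/^UUID="/, "", uuid)
                sub(/"$/, "", uuid)
                if (uuid in devs)
                    devs[uuid] = devs[uuid] " " dev
                else
                    devs[uuid] = dev
                count[uuid]++
            }
        }
    }
    END {
        # Print one line per UUID that was seen on two or more devices.
        for (u in count)
            if (count[u] > 1)
                print u ": " devs[u]
    }'
}
```

No output means no duplicates were seen. If a scratch ext2/3/4 disk turns up
with a copied UUID and is not mounted, "tune2fs -U random /dev/dasdf1" should
give it a fresh UUID again (dasdf1 being the temp disk in my setup).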
I think these two problems are unrelated. I have not been able to get anyone
at SUSE to listen. I am hoping someone here will have more clout.
Thanks,
Aria
-----Original Message-----
From: Linux on 390 Port <[email protected]> On Behalf Of Rick Troth
Sent: Sunday, October 3, 2021 6:37 PM
To: [email protected]
Subject: Re: Warning if upgrading SLES12 to SLES15 SP3
Aria has found a bug where 'chzdev' appears to be tripping over matching
UUIDs.
The closest relation to z/VM would be matching volsers, which can
certainly be confusing for z/VM when the CP Dir defines minidisks by
volser, but not so much when CP Dir defines minidisks by DEVNO.
Attaching to SYSTEM can still be konfoozing.
In this case, Linux should be following the I/O subsystems (subchannels,
paths, and devnos), not the UUIDs.
Logical Volume Manager can get confused when different (unrelated) PVs
claim to be part of the same volume group. Like when you copy a full set
of PVs. Suddenly you've got two strings representing, say, "sysvg1". How
does the system know *which* set of PVs to actually bring active when
you bring up volume group "sysvg1"?
I found that in a 'chroot' you can get away with it. (You can have, for
example, "sysvg1" active in the host *and* in the 'chroot', even though
they're backed by different PV sets. Scary, but works.)
The "mounted" flag is just another bit stamped on the media (or not).
Agreed, it's mo betta to copy when filesystems are not mounted. But the
flag itself could be false.
-- R; <><
On 10/3/21 11:10 AM, Aria Bamdad wrote:
> Thanks Alan. In all cases, the cloning is done with server down so this is
> not caused because of that. Also, note that I said all you have to do is
> format a test empty disk and then simply set the UUID of the disk to one of
> the existing OS disks, even when not mounted, it is seen as 'in use' by
> chzdev.
>
> I am not too concerned about the chzdev behavior but more concerned with the
> fact that having a disk around with identical UUID could cause boot problems
> if you were to make changes to boot options.
>
> Thanks,
> Aria
>
> -----Original Message-----
> From: Linux on 390 Port <[email protected]> On Behalf Of Alan Altmark
> Sent: Sunday, October 3, 2021 11:00 AM
> To: [email protected]
> Subject: Re: Warning if upgrading SLES12 to SLES15 SP3
>
> On Friday, 10/01/2021 at 08:20 GMT, "Aria Bamdad" <[email protected]>
> wrote:
>> It seems that there may be a problem with the new s390-tools chzdev command
>> at the least. Consider the following situation. You have a Linux guest
>> with root file system on device 150, usr on 151, var on 152. Then you make
>> a copy of these disks (DDR/flashcopy/etc.), say you are cloning the guest.
>> The copied disks are now called F50, F51 and F52 respectively. Now consider
>> that you boot from your normal 15x disks while you have the F5x disks linked
>> and defined to the virtual machine. Then you use the chzdev command to enable
>> any of the cloned disks, say F50 (clone of root). You can do this either
>> using commands like chzdev or via the YaST DASD tool to activate. So far so
>> good. But if you then try to disable the same disk (F50) using chzdev (or
>> YaST DASD), chzdev complains with:
>>
>> Warning: ECKD DASD 0.0.0f50 is in use!
>> The following resources may be affected:
>> - Mount point /
>> Continue with operation? (yes/no)
>>
>> Well, it's neither in use (not mounted anywhere, just enabled), nor is it
>> the root mount point. The same will be true for F51 and F52. Chzdev states
>> that these are in use and mount points /usr and /var. This does not happen
>> when the disk is not a clone.
> Traditionally the file systems mark the disks as 'in use' when they are
> mounted so that that message can be issued, among other things.
> Consequently, if you clone it while it's mounted, the clone will have the
> same marking and appear to be in use. IMO, cloning should be done with
> the file system unmounted.
>
> Once the disk is cloned, it's no longer the same disk in terms of content,
> but if memory serves, the logical volume manager remembers UUIDs so that
> it can automatically form the logical volume. If it were to see a desired
> UUID on a different disk, it could grab the wrong disk, resulting in a
> corrupted file system. Timing is everything.
>
> If you're cloning and immediately giving it to a different server, then it
> wouldn't make any difference since the "universe" of UUIDs is limited to
> what the server has access to.
>
> Alan Altmark
>
> Senior Managing z/VM and Linux Consultant
> IBM Systems Lab Services
> IBM Z Delivery Practice
> ibm.com/systems/services/labservices
> office: 607.429.3323
> mobile: 607.321.7556
> [email protected]
> IBM Endicott
>
>
> ----------------------------------------------------------------------
> For LINUX-390 subscribe / signoff / archive access instructions,
> send email to [email protected] with the message: INFO LINUX-390 or
> visit
> http://www2.marist.edu/htbin/wlvindex?LINUX-390
--
-- R; <><