We were finally able to run another test with the Cirrus DMS device and I was 
able to capture the trace when it failed.  One point, the root file system is 
on ECKD disk, the problem we are having is accessing the LUNs that contain our 
other file systems.  Here is the trace, the only thing I can see is that the 
second path is coming up late.

[    1.201641] SCSI subsystem initialized
[    1.208476] qeth.87067b: loading core functions
[    1.226248] NET: Registered protocol family 10
[    1.228215] qeth.933eb7: register layer 2 discipline
[    1.231415] qdio: 0.0.0602 OSA on SC e using AI:1 QEBSM:0 PRI:1 TDD:1 
SIGA:RW A
[    1.235242] scsi0 : zfcp
[    1.235309] scsi_eh_0: sleeping
[    1.237109] qdio: 0.0.0400 ZFCP on SC 0 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: 
W A
[    1.248175] netif_napi_add() called with weight 128 on device eth%d
[    1.248737] qeth.cc0c57: 0.0.0600: MAC address 02:00:0b:00:00:12 
successfully registered on device eth0
[    1.248748] qeth.736dae: 0.0.0600: Device is a Guest LAN QDIO card (level: 
V642)
[    1.248751] with link type GuestLAN QDIO (portname: whatever)
Mounting debugfs at /sys/kernel/debug..done

No efivars filesystem driver available.Starting udevd: ..done

Loading drivers, configuring devices:
[    2.261765] scsi1 : zfcp
[    2.261768] scsi_eh_1: sleeping
[    2.262638] qdio: 0.0.0401 ZFCP on SC 1 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: 
W A
[    2.263937] scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
[    3.304488] scsi 1:0:0:0: scsi scan: INQUIRY pass 1 length 36
..done

[    3.318574] Adding 259956k swap on /dev/dasdb1.  Priority:-1 extents:1 
across:259956k
[    3.319113] Adding 519924k swap on /dev/dasdc1.  Priority:-2 extents:1 
across:519924k
[    3.352258] device-mapper: uevent: version 1.0.3
[    3.352314] device-mapper: ioctl: 4.25.0-ioctl (2012-07-25) initialised: 
dm-de...@redhat.com
[    3.386135] device-mapper: multipath: version 1.6.0 loaded
Activating swap-devices in /etc/fstab...

..doneSetting up the System Clock..done

..done

Activating device mapper...

..done

Loading required kernel modules

..doneCreating multipath targets: (waiting for udev)..done

Starting MD RAID ..unused

[    3.475733] scsi scan: INQUIRY successful with code 0x0
[    3.475741] scsi 0:0:0:0: scsi scan: INQUIRY pass 2 length 149
[    3.475978] scsi scan: INQUIRY successful with code 0x0
[    3.475985] scsi 0:0:0:0: Direct-Access     EMC      SYMMETRIX        5874 
PQ: 0 ANSI: 5
[    3.476061] scsi scan: Sending REPORT LUNS to host 0 channel 0 id 0 (try 0)
[    3.476513] scsi scan: REPORT LUNS successful (try 0) result 0x0
[    3.476519] scsi 0:0:0:0: scsi scan: REPORT LUN scan
[    3.476524] scsi scan: device exists on 0:0:0:0
[    3.477382] scsi 0:0:0:1: scsi scan: INQUIRY pass 1 length 36
[    3.477872] scsi scan: INQUIRY successful with code 0x0
[    3.477880] scsi 0:0:0:1: scsi scan: INQUIRY pass 2 length 149
[    3.478311] scsi scan: INQUIRY successful with code 0x0

[    3.478331] scsi 0:0:0:1: Direct-Access     EMC      SYMMETRIX        5874 
PQ: 0 ANSI: 5
[    3.486130] sd 0:0:0:0: Done: SUCCESS
[    3.486135] sd 0:0:0:0:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[    3.486140] sd 0:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
[    3.486158] sd 0:0:0:0:  Sense Key : Unit Attention [current]
[    3.486177] sd 0:0:0:0:  Add. Sense: I_T nexus loss occurred
[    3.486616] sd 0:0:0:1: Done: SUCCESS
[    3.486622] sd 0:0:0:1:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[    3.486629] sd 0:0:0:1: CDB: Test Unit Ready: 00 00 00 00 00 00
[    3.486644] sd 0:0:0:1:  Sense Key : Unit Attention [current]
[    3.486652] sd 0:0:0:1:  Add. Sense: I_T nexus loss occurred
[    3.487964] sd 0:0:0:1: [sdb] 141434880 512-byte logical blocks: (72.4 
GB/67.4 GiB)
[    3.488130] sd 0:0:0:0: [sda] 141434880 512-byte logical blocks: (72.4 
GB/67.4 GiB)
[    3.491496] sd 0:0:0:1: [sdb] Write Protect is off
[    3.491504] sd 0:0:0:1: [sdb] Mode Sense: 8b 00 00 08
[    3.491893] sd 0:0:0:0: [sda] Write Protect is off

[    3.491917] sd 0:0:0:0: [sda] Mode Sense: 8b 00 00 08
[    3.492273] sd 0:0:0:1: [sdb] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[    3.492750] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[    3.499550]  sdb: sdb1
[    3.500401]  sda: sda1
[    3.504773] sg_alloc: dev=0
[    3.504795] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    3.504802] sg_alloc: dev=1
[    3.504818] sd 0:0:0:1: Attached scsi generic sg1 type 0
[    3.504831] sd 0:0:0:1: [sdb] Attached SCSI disk
[    3.510375] sd 0:0:0:0: [sda] Attached SCSI disk
Waiting for udev to settle...

Scanning for LVM volume groups...

  Reading all physical volumes.  This may take a while...

  Found volume group "system-vg" using metadata type lvm2

Activating LVM volume groups...

  PARTIAL MODE. Incomplete logical volumes will be processed.

  5 logical volume(s) in volume group "system-vg" now active

..done

Waiting for /dev/user-vg/app-lv

/dev/user-vg/ibmitm-lv

/dev/user-vg/vantage-lv  no more events

Checking file systems...

fsck from util-linux 2.19.1

Checking all file systems.

/dev/user-vg/app-lv: nonexistent device ("nofail" fstab option may be used to 
skip this device)

/dev/user-vg/ibmitm-lv: nonexistent device ("nofail" fstab option may be used 
to skip this device)

/dev/user-vg/vantage-lv: nonexistent device ("nofail" fstab option may be used 
to skip this device)

[/sbin/fsck.ext3 (1) -- /home] fsck.ext3 -a -C0 /dev/mapper/system--vg-home--lv 

/dev/mapper/system--vg-home--lv: clean, 310/131072 files, 30559/524288 blocks

[/sbin/fsck.ext3 (1) -- /opt] fsck.ext3 -a -C0 /dev/mapper/system--vg-opt--lv 

/dev/mapper/system--vg-opt--lv: clean, 174/131072 files, 38932/524288 blocks

[/sbin/fsck.ext3 (1) -- /tmp] fsck.ext3 -a -C0 /dev/mapper/system--vg-tmp--lv 

/dev/mapper/system--vg-tmp--lv: clean, 87/131072 files, 25736/524288 blocks

[/sbin/fsck.ext3 (1) -- /usr] fsck.ext3 -a -C0 /dev/mapper/system--vg-usr--lv


/dev/mapper/system--vg-usr--lv: clean, 59550/212992 files, 491523/837632 blocks

[/sbin/fsck.ext3 (1) -- /var] fsck.ext3 -a -C0 /dev/mapper/system--vg-var--lv 

Give root password for maintenance


(or type Control-D to continue):
blogd: no message logging because /var file system is not accessible

/dev/mapper/system--vg-var--lv: clean, 3529/131072 files, 187603/524288 blocks

[/sbin/fsck.ext3 (1) -- /app] fsck.ext3 -a -C0 /dev/user-vg/app-lv 

fsck.ext3: No such file or directory while trying to open /dev/user-vg/app-lv


/dev/user-vg/app-lv: 

The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

    e2fsck -b 8193 <device>



[/sbin/fsck.ext3 (1) -- /app/IBM/ITM] fsck.ext3 -a -C0 /dev/user-vg/ibmitm-lv 

fsck.ext3: No such file or directory while trying to open /dev/user-vg/ibmitm-lv


/dev/user-vg/ibmitm-lv: 

The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate sup
erblock:

    e2fsck -b 8193 <device>



[/sbin/fsck.ext3 (1) -- /app/ecotools] fsck.ext3 -a -C0 /dev/user-vg/vantage-lv 

fsck.ext3: No such file or directory while trying to open 
/dev/user-vg/vantage-lv


/dev/user-vg/vantage-lv: 

The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

    e2fsck -b 8193 <device>



..failed

fsck failed for at least one filesystem (not /).

Please repair manually and reboot.

The root file system is already mounted read-write.



Attention: Only CONTROL-D will reboot the system in this

maintanance mode. shutdown or reboot will not work.



[    5.021680] scsi scan: INQUIRY successful with code 0x0
[    5.021688] scsi 1:0:0:0: scsi scan: INQUIRY pass 2 length 149
[    5.021899] scsi scan: INQUIRY successful with code 0x0
[    5.021906] scsi 1:0:0:0: Direct-Access     EMC      SYMMETRIX        5874 
PQ: 0 ANSI: 5
[    5.021985] sg_alloc: dev=2
[    5.022009] sd 1:0:0:0: Attached scsi generic sg2 type 0
[    5.022033] scsi scan: Sending REPORT LUNS to host 1 channel 0 id 0 (try 0)
[    5.022419] scsi scan: REPORT LUNS successful (try 0) result 0x0
[    5.022424] sd 1:0:0:0: scsi scan: REPORT LUN scan
[    5.022428] scsi scan: device exists on 1:0:0:0
[    5.023321] scsi 1:0:0:1: scsi scan: INQUIRY pass 1 length 36
[    5.023508] sd 1:0:0:0: Done: SUCCESS
[    5.023514] sd 1:0:0:0:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[    5.023522] sd 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
[    5.023542] sd 1:0:0:0:  Sense Key : Unit Attention [current]
[    5.023551] sd 1:0:0:0:  Add. Sense: I_T nexus loss occurred
[    5.024722] scsi scan: INQUIRY successful with code 0x0

[    5.024750] scsi 1:0:0:1: scsi scan: INQUIRY pass 2 length 149
[    5.024865] sd 1:0:0:0: [sdc] 141434880 512-byte logical blocks: (72.4 
GB/67.4 GiB)
[    5.024954] scsi scan: INQUIRY successful with code 0x0
[    5.024959] scsi 1:0:0:1: Direct-Access     EMC      SYMMETRIX        5874 
PQ: 0 ANSI: 5
[    5.025016] sg_alloc: dev=3
[    5.025042] sd 1:0:0:1: Attached scsi generic sg3 type 0
[    5.026523] sd 1:0:0:1: Done: SUCCESS
[    5.026529] sd 1:0:0:1:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[    5.026535] sd 1:0:0:1: CDB: Test Unit Ready: 00 00 00 00 00 00
[    5.026550] sd 1:0:0:1:  Sense Key : Unit Attention [current]
[    5.026558] sd 1:0:0:1:  Add. Sense: I_T nexus loss occurred
[    5.027998] sd 1:0:0:1: [sdd] 141434880 512-byte logical blocks: (72.4 
GB/67.4 GiB)
[    5.028987] sd 1:0:0:0: [sdc] Write Protect is off
[    5.028999] sd 1:0:0:0: [sdc] Mode Sense: 8b 00 00 08
[    5.029462] sd 1:0:0:0: [sdc] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[    5.030860] sd 1:0:0:1: [sdd] Write Protect is off

[    5.030872] sd 1:0:0:1: [sdd] Mode Sense: 8b 00 00 08
[    5.032427] sd 1:0:0:1: [sdd] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA
[    5.034279]  sdc: sdc1
[    5.037147]  sdd: sdd1
[    5.041471] sd 1:0:0:0: [sdc] Attached SCSI disk
[    5.043020] sd 1:0:0:1: [sdd] Attached SCSI disk

Chris Will
Enterprise Linux/UNIX (ELU)
(313) 549-9729 Cell
cw...@bcbsm.com

-----Original Message-----
From: Linux on 390 Port <LINUX-390@VM.MARIST.EDU> On Behalf Of Benjamin Block
Sent: Wednesday, February 20, 2019 5:06 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Question on auto lun scan

Hey Alan,

On Tue, Feb 19, 2019 at 01:06:17PM -0500, Alan Altmark wrote:
> On Tuesday, 02/19/2019 at 01:49 GMT, Benjamin Block <bbl...@linux.ibm.com> 
> wrote:
> > udev_timeout=30
> >
> > The default is 30s, you can add it to the kernel command line
> > (/etc/zipl.conf), and increase it (e.g.: udev_timeout=90).
> >
> > This will increase the timeout for for both udevadm and the step where
> > it waits for the root-volume to appear.
> 
> Does that mean ALL of the needed needed LUNs must be brought online within 
> 30 seconds?  Or that REPORT LUNS (LUN 0) or INQUIRY (LUN 0 and/or all 
> others) must EACH respond within 30 seconds?
> 

It regulates two things at once.

First, during the early phase of the INITRD, the init process will start
'udev', and then use 'udevadm settle --timeout=${udev_timeout}' to wait
till it udev processed all the events happening in the kernel (such as
adding devices, executing rules upon it saw specific devices being added
[such as loading drivers and so on], and so on). This timeout is
deceptive though, it will only wait up to this amount OR if there is no
events in the queue anymore.. if an event happens 1s after it already
stopped waiting, bad luck - although, because udev keeps running as
daemon, it should sill process the event if I am not mistaken, its just
that init will continue doing other things meanwhile (such as:).

The second mechanism it controls is the late phase during run of the
INITRD, where init will wait till it sees the root-volume appear. In
this phase it will wait up to ${udev_timeout} seconds till it sees it,
and after that it will fall back to ask the user for guidance (which is
what I think is happening here, although strictly spoken, this is only a
guess).

That is as far as I have read the code - mind you, I didn't write it, so
it may well be that I have miss-interpreted something.



But this all changes with SLES12, because SLES12 uses a different kind
of INITRD (dracut) and the code there probably doesn't even know this
parameter. So this is strictly SLES11 only.

-- 
With Best Regards, Benjamin Block      /      Linux on IBM Z Kernel Development
IBM Systems & Technology Group   /  IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Matthias Hartmann       /      Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390


The information contained in this communication is highly confidential and is 
intended solely for the use of the individual(s) to whom this communication is 
directed. If you are not the intended recipient, you are hereby notified that 
any viewing, copying, disclosure or distribution of this information is 
prohibited. Please notify the sender, by electronic mail or telephone, of any 
unintended receipt and delete the original message without making any copies.
 
 Blue Cross Blue Shield of Michigan and Blue Care Network of Michigan are 
nonprofit corporations and independent licensees of the Blue Cross and Blue 
Shield Association.

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390

Reply via email to