Re: Question on auto lun scan

Will, Chris Fri, 15 Feb 2019 06:35:16 -0800

Thanks for your input.  One question: for SLES 11, is there a timeout on waiting 
for the LUN information after REPORT LUNS is issued?  If it times out, does 
it retry?  Again, this is during the boot process.  I have included the output 
from one of the servers I updated with the zipl changes.  This one is without 
the Cirrus device and was successful.
 
<5>[  112.205720] SCSI subsystem initialized
<6>[  112.271921] qeth.87067b: loading core functions
<6>[  112.285690] qeth.933eb7: register layer 2 discipline
<6>[  112.287030] qdio: 0.0.0602 OSA on SC e using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA:RW A
<6>[  112.288191] NET: Registered protocol family 10
<4>[  112.291115] scsi_eh_0: sleeping
<6>[  112.291122] scsi0 : zfcp
<6>[  112.292088] qdio: 0.0.0400 ZFCP on SC 0 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W A
<3>[  112.300521] netif_napi_add() called with weight 128 on device eth%d
<6>[  112.300903] qeth.cc0c57: 0.0.0600: MAC address 02:00:0b:00:00:1c successfully registered on device eth0
<6>[  112.300912] qeth.736dae: 0.0.0600: Device is a Guest LAN QDIO card (level: V642)
<6>[  112.300914] with link type GuestLAN QDIO (portname: whatever)
<6>[  113.327487] scsi1 : zfcp
<4>[  113.327493] scsi_eh_1: sleeping
<6>[  113.328655] qdio: 0.0.0401 ZFCP on SC 1 using AI:1 QEBSM:1 PRI:1 TDD:1 
SIGA: W A 
<6>[  113.329271] scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
<6>[  113.333636] scsi scan: INQUIRY successful with code 0x0
<6>[  113.333646] scsi 0:0:0:0: scsi scan: INQUIRY pass 2 length 149
<6>[  113.333830] scsi scan: INQUIRY successful with code 0x0
<5>[  113.333839] scsi 0:0:0:0: Direct-Access     EMC      SYMMETRIX        5874 PQ: 0 ANSI: 5
<6>[  113.333917] scsi scan: Sending REPORT LUNS to host 0 channel 0 id 0 (try 0)
<6>[  113.334151] scsi scan: REPORT LUNS successful (try 0) result 0x0
<6>[  113.334157] scsi 0:0:0:0: scsi scan: REPORT LUN scan
<6>[  113.334162] scsi scan: device exists on 0:0:0:0
<6>[  113.334494] scsi 0:0:0:1: scsi scan: INQUIRY pass 1 length 36
<6>[  113.335371] scsi scan: INQUIRY successful with code 0x0
<6>[  113.335381] scsi 0:0:0:1: scsi scan: INQUIRY pass 2 length 149
<6>[  113.335558] scsi scan: INQUIRY successful with code 0x0
<5>[  113.335568] scsi 0:0:0:1: Direct-Access     EMC      SYMMETRIX        5874 PQ: 0 ANSI: 5
<6>[  113.354229] sd 0:0:0:0: Done: SUCCESS
<6>[  113.354234] sd 0:0:0:0:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
<6>[  113.354239] sd 0:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
<6>[  113.354247] sd 0:0:0:0:  Sense Key : Unit Attention [current] 
<6>[  113.354252] sd 0:0:0:0:  Add. Sense: I_T nexus loss occurred
<6>[  113.355153] sd 0:0:0:1: Done: SUCCESS
<6>[  113.355157] sd 0:0:0:1:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
<6>[  113.355161] sd 0:0:0:1: CDB: Test Unit Ready: 00 00 00 00 00 00
<6>[  113.355168] sd 0:0:0:1:  Sense Key : Unit Attention [current] 
<6>[  113.355172] sd 0:0:0:1:  Add. Sense: I_T nexus loss occurred
<5>[  113.356076] sd 0:0:0:0: [sda] 141434880 512-byte logical blocks: (72.4 GB/67.4 GiB)
<5>[  113.356341] sd 0:0:0:1: [sdb] 141434880 512-byte logical blocks: (72.4 GB/67.4 GiB)
<5>[  113.357164] sd 0:0:0:0: [sda] Write Protect is off
<7>[  113.357169] sd 0:0:0:0: [sda] Mode Sense: 8b 00 00 08
<5>[  113.357461] sd 0:0:0:1: [sdb] Write Protect is off
<5>[  113.357465] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
<7>[  113.357470] sd 0:0:0:1: [sdb] Mode Sense: 8b 00 00 08
<5>[  113.357749] sd 0:0:0:1: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
<6>[  113.361465]  sda: sda1
<6>[  113.368085]  sdb: sdb1
<5>[  113.368196] sd 0:0:0:0: [sda] Attached SCSI disk
<5>[  113.370166] sd 0:0:0:1: [sdb] Attached SCSI disk
<4>[  113.374833] sg_alloc: dev=0 
<5>[  113.374858] sd 0:0:0:0: Attached scsi generic sg0 type 0
<4>[  113.374863] sg_alloc: dev=1 
<5>[  113.374879] sd 0:0:0:1: Attached scsi generic sg1 type 0
<6>[  114.358225] scsi 1:0:0:0: scsi scan: INQUIRY pass 1 length 36
<6>[  114.364101] scsi scan: INQUIRY successful with code 0x0
<6>[  114.364110] scsi 1:0:0:0: scsi scan: INQUIRY pass 2 length 149
<6>[  114.364288] scsi scan: INQUIRY successful with code 0x0
<5>[  114.364297] scsi 1:0:0:0: Direct-Access     EMC      SYMMETRIX        5874 PQ: 0 ANSI: 5
<4>[  114.364432] sg_alloc: dev=2 
<5>[  114.364462] sd 1:0:0:0: Attached scsi generic sg2 type 0
<6>[  114.364489] scsi scan: Sending REPORT LUNS to host 1 channel 0 id 0 (try 0)
<6>[  114.365903] sd 1:0:0:0: Done: SUCCESS
<6>[  114.365910] sd 1:0:0:0:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
<6>[  114.365917] sd 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
<6>[  114.365936] sd 1:0:0:0:  Sense Key : Unit Attention [current] 
<6>[  114.365945] sd 1:0:0:0:  Add. Sense: I_T nexus loss occurred
<6>[  114.365981] scsi scan: REPORT LUNS successful (try 0) result 0x0
<6>[  114.365989] sd 1:0:0:0: scsi scan: REPORT LUN scan
<6>[  114.365995] scsi scan: device exists on 1:0:0:0
<6>[  114.366408] scsi 1:0:0:1: scsi scan: INQUIRY pass 1 length 36
<5>[  114.366607] sd 1:0:0:0: [sdc] 141434880 512-byte logical blocks: (72.4 GB/67.4 GiB)
<6>[  114.366617] scsi scan: INQUIRY successful with code 0x0
<6>[  114.366622] scsi 1:0:0:1: scsi scan: INQUIRY pass 2 length 149
<6>[  114.366775] scsi scan: INQUIRY successful with code 0x0
<5>[  114.366781] scsi 1:0:0:1: Direct-Access     EMC      SYMMETRIX        5874 PQ: 0 ANSI: 5
<4>[  114.366837] sg_alloc: dev=3 
<5>[  114.366857] sd 1:0:0:1: Attached scsi generic sg3 type 0
<6>[  114.367330] sd 1:0:0:1: Done: SUCCESS
<6>[  114.367334] sd 1:0:0:1:  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
<6>[  114.367338] sd 1:0:0:1: CDB: Test Unit Ready: 00 00 00 00 00 00
<6>[  114.367345] sd 1:0:0:1:  Sense Key : Unit Attention [current] 
<6>[  114.367349] sd 1:0:0:1:  Add. Sense: I_T nexus loss occurred
<5>[  114.368387] sd 1:0:0:1: [sdd] 141434880 512-byte logical blocks: (72.4 GB/67.4 GiB)
<5>[  114.368752] sd 1:0:0:0: [sdc] Write Protect is off
<7>[  114.368756] sd 1:0:0:0: [sdc] Mode Sense: 8b 00 00 08
<5>[  114.369051] sd 1:0:0:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
<5>[  114.369502] sd 1:0:0:1: [sdd] Write Protect is off
<7>[  114.369506] sd 1:0:0:1: [sdd] Mode Sense: 8b 00 00 08
<5>[  114.369810] sd 1:0:0:1: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
<6>[  114.371017]  sdc: sdc1
<6>[  114.371976]  sdd: sdd1
<5>[  114.374989] sd 1:0:0:0: [sdc] Attached SCSI disk
<5>[  114.375565] sd 1:0:0:1: [sdd] Attached SCSI disk
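
To put a number on how long REPORT LUNS takes in a log like the one above, a
small filter can pair each "Sending REPORT LUNS" line with the next
"REPORT LUNS successful" line. This is only a sketch: the function name
report_luns_latency is made up for illustration, and it assumes printk.time=1
timestamps plus the scan logging messages shown above. It says nothing about
the kernel's own timeout or retry behaviour.

```shell
#!/bin/sh
# report_luns_latency: read kernel log text on stdin and print, for each
# REPORT LUNS command, the host number and the seconds until it completed.
# Assumption: each "successful" line belongs to the most recent "Sending" line.
report_luns_latency() {
    LC_ALL=C awk '
        # Extract the numeric timestamp from a line starting "<6>[  113.333917] ".
        function stamp(line,    s) {
            s = line
            sub(/^[^[]*\[ */, "", s)
            sub(/\].*$/, "", s)
            return s + 0
        }
        /Sending REPORT LUNS to host/ {
            sent = stamp($0)
            host = $0
            sub(/.*to host /, "", host)
            sub(/ .*/, "", host)
            pending = 1
            next
        }
        /REPORT LUNS successful/ && pending {
            printf "host %s: %.6f s\n", host, stamp($0) - sent
            pending = 0
        }
    '
}
```

It can be fed from dmesg or from a saved console log, for example
"dmesg | report_luns_latency". For the log above, both hosts should come out
well under two milliseconds, which would be the baseline to compare against
a boot with the Cirrus device in the path.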


Chris Will
Enterprise Linux/UNIX (ELU)
(313) 549-9729 Cell
cw...@bcbsm.com

-----Original Message-----
From: Benjamin Block <bbl...@linux.ibm.com> 
Sent: Friday, February 08, 2019 11:35 AM
To: Will, Chris <cw...@bcbsm.com>
Cc: Linux on 390 Port <LINUX-390@vm.marist.edu>
Subject: Re: Question on auto lun scan

Hello Chris,

On Fri, Feb 08, 2019 at 01:18:49PM +0000, Will, Chris wrote:
> Thank you for all the information so far.  We had another meeting with 
> the Cirrus team and they said that when they get the initial login 
> they have to insert their wwpn so if they are delayed in doing this 
> they report back a "busy" status.  For the next shot at implementing 
> this they are going to pre-define all the npiv wwpns to prevent this 
> delay.  They could also delay up to 10 seconds reporting back a "busy"
> status to the server.  I did see a console log captured by z/VM which 
> showed the server going into emergency repair mode but the LUNs were 
> discovered later (not sure how much later since there were no
> timestamps) but by this point the server was already in trouble.  We 
> are using NPIV and have autolun scan enabled for the servers.  So far 
> none of the SLES12 SP3 guests have had issues but most of the SLES11
> SP4 guests go into this emergency repair mode.
>

Steffen and I suspect this is not really a problem with the Cirrus devices. By 
adding a biggish delay to the LUN scanning, they just expose a shortcoming in 
how the SLES11 initial ramdisk handles device discovery.

As you say, you see devices being discovered after all, so the LUN scan that 
Linux does in fact works as designed. It sounds like the initrd aborts its 
scanning - or rather: waiting - too early. This is a bit complicated, because 
there is no clear interface between the kernel and userspace that tells 
userspace that scanning is now over and nothing more will pop up. SLES12 
got better here.
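
To illustrate what "waiting longer" could look like in userspace, here is a
rough sketch that polls for an expected block device and, on every miss,
re-triggers the LUN scan through the scan attribute of the SCSI hosts. It is
not what the SLES11 initrd does; the function name wait_for_disk, its
arguments and the default glob /sys/class/scsi_host/host*/scan are choices
made for this example only.

```shell
#!/bin/sh
# wait_for_disk DEVICE TIMEOUT [SCAN_GLOB]
# Poll once per second until DEVICE exists or TIMEOUT seconds have passed.
# On each miss, write "- - -" to every writable scan attribute matching
# SCAN_GLOB to ask the SCSI layer for another LUN scan.
# Returns 0 if DEVICE showed up, 1 on timeout.
wait_for_disk() {
    device=$1
    timeout=$2
    scan_glob=${3:-/sys/class/scsi_host/host*/scan}
    waited=0
    while [ ! -e "$device" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        # Left unquoted on purpose so the shell expands the glob.
        for scan in $scan_glob; do
            if [ -w "$scan" ]; then
                echo "- - -" > "$scan"
            fi
        done
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 0
}
```

A call such as "wait_for_disk /dev/sda 30" (device name and timeout are
placeholders) would then wait up to 30 seconds. Where to hook something like
this into the boot process is the hard part and is not covered by the sketch.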

But anyway, Steffen wrote some advice on what additional debugging you could do 
by adding the kernel command line parameters he mentioned (at the 
bottom of this mail). With them it would probably become clearer which 
component is failing here. They also add relative timestamps and such.

You can also instruct z/VM to spool the console output and later print that 
spool space into a file, so it's easier to handle than with whatever 
console you use.
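
For reference when reading that output: the scsi_mod.scsi_logging_level value
in Steffen's parameter list is a packed number with one 3-bit level per
logging facility. The sketch below unpacks it. The function name is
hypothetical and the field order is an assumption taken from the kernel's
scsi_logging.h, so please double-check it against your kernel source.

```shell
#!/bin/sh
# decode_scsi_logging_level VALUE
# Split a scsi_logging_level value into its 3-bit per-facility levels and
# print every facility whose level is non-zero.
# Assumed field order, lowest bits first: error, timeout, scan, mlqueue,
# mlcomplete, llqueue, llcomplete, hlqueue, hlcomplete, ioctl.
decode_scsi_logging_level() {
    value=$1
    shift_bits=0
    for name in error timeout scan mlqueue mlcomplete \
                llqueue llcomplete hlqueue hlcomplete ioctl; do
        level=$(( (value >> shift_bits) & 7 ))
        if [ "$level" -gt 0 ]; then
            echo "$name=$level"
        fi
        shift_bits=$(( shift_bits + 3 ))
    done
    return 0
}

decode_scsi_logging_level 4605
```

Under that assumption, 4605 comes out as error=5, timeout=7, scan=7 and
mlcomplete=1, that is, full scan and timeout logging, which is where the
"scsi scan:" lines come from.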

> 
> -----Original Message-----
> From: Linux on 390 Port <LINUX-390@VM.MARIST.EDU> On Behalf Of Steffen 
> Maier
> Sent: Friday, February 08, 2019 6:36 AM
> To: LINUX-390@VM.MARIST.EDU
> Subject: Re: Question on auto lun scan
> 
> On 02/05/2019 12:02 PM, Benjamin Block wrote:
> > On Mon, Feb 04, 2019 at 04:37:52PM +0000, Will, Chris wrote:
> >> We have auto lun scan turned on for our SLES 11 SP4 hosts and it is 
> >> on by default for our SLES 12 SP3 hosts.  We are trying to insert a 
> >> Cirrus device which has the capability to discover LUNs and NPIV WWNs.
> >> My very limited understanding of the process is that during the initial 
> >> boot process (z/VM IPL) it initially reports back no LUNs to 
> >> the guest, logs into the storage device and then reports back to 
> >> the guest with the LUNs that are masked for the NPIV WWPN.
> 
> Is it this?: https://www.cdsi.us.com/technology/
> If so, it sounds as if the discovery on the appliance is done on 
> initializing the multi-step data migration. After that, I would assume 
> it persistently knows both ends and answers on behalf of its attached 
> opposite ends (host, storage).
> 
> Actually it has quite some commonality with SAN volume virtualizers 
> using "pass-through" mirrored "image" volumes. One of the main 
> differences being that it can be inserted into the data path without 
> having to reconfigure the host as the host still gets to see something 
> that looks like the old storage one is migrating from.
> 
> > Auto-LUN-Scan in Linux works roughly as follows:
> >   - in z/VM guests you need dedicated FCP devices that are on CHPIDs with
> >     NPIV enabled (I assume you have that already). Apart from this z/VM
> >     does not play any role in this, it doesn't help or intercept
> >     anything.
> >   - during boot - when the zFCP driver is loaded, usually by the initial
> >     RAM-Disk - we (the driver) scan the FC-Network for available remote
> >     ports (ports that are in the same zone as the initiator ports on
> >     your) and open them automatically
> >     - this happens completely transparent, regardless of whether NPIV
> >       and/or auto-LUN-scan is enabled or not
> >   - for each successfully opened remote port the Linux Kernel SCSI code
> >     will issue LUN-scanning. If you have auto-LUN-scan enabled and use a
> >     NPIV enabled FCP device the zFCP driver will allow the scan to
> >     happen - otherwise we intercept it.
> >   - such a scan entails sending the SCSI command REPORT LUNS (support
> >     for this command is mandatory for every SCSI device type), and for
> >     all reported devices the Linux kernel will create SCSI devices
> >   - note that "opening" a LUN does not actually generate any traffic in
> >     the network, only the remote port open and REPORT LUNS does generate
> >     traffic, and for all found LUNs Linux will also send some
> >     INQUIRY commands and such, but there is no "open LUN" command as
> >     such
> >   - Linux doesn't retry this LUN-scanning later without any reason (port
> >     recovery or such), so if the network stays quiet it doesn't
> >
> > Now, I have never worked with the devices you want to use, so it's 
> > guessing after this.
> >
> > If I understand you correctly the Cirrus device sits between your 
> > initiator FC-Ports (on the Z side) and the storage-server, somewhere 
> > in the network and intercepts your traffic? So Linux will open the 
> > port on this device and it will see the initial traffic by Linux, 
> > and forward it to whatever storage-server it thinks is correct.
> >
> > If it does not report back the correct list of LUNs for the initial 
> > REPORT LUNS command, you have a problem (and I'd consider this 
> > device broken). There are some fall-backs in the kernel for cases 
> > where the storage device is buggy and/or doesn't properly support 
> > modern SCSI standards, but that doesn't mean it'll help you.
> >
> >> This does not seem to give the SLES 12 guests issues but most of 
> >> the SLES 11 guests have issues.  Is there any way for the guest to 
> >> do a retry, otherwise it ends up in emergency repair mode (setting 
> >> we have in fstab).
> >
> > It's hard to give you proper advice without knowing in what state the 
> > system halts, and what exactly happened during the whole process I 
> > described above.
> >
> > But in the absence of this:
> >
> > Like Mark said, it might help to just trigger a scan manually with 
> > the script rescan-scsi-bus.sh, or you could just try writing into 
> > the scan attribute of the SCSI-Hosts in question (as root):
> >
> >                                    +-- Device-Bus-ID of your FCP device
> >                                    |
> >                                    v
> > echo "- - -" > /sys/devices/css*/*/0.0.1900/host*/scsi_host/host*/scan
> >
> > This should issue another rescan of all the attached remote ports, 
> > and you don't need any extra tools. You might also be able to script 
> > that and put it in your initial ram-disk (although that might not be 
> > as easy as it sounds; you'd have to find a proper trigger and time 
> > to issue the script during the boot-process). I don't know any way 
> > to activate something like this "out-of-the-box" with SLES11 (or 12 
> > for that matter).
> 
> Appending the following to the kernel parameters at IPL might help 
> you debug further at which point things break such that some 
> disk block device the boot depends on is not configured or ready. This 
> includes any initrd processing, and auto lun scan processing during 
> initrd and after initrd. Otherwise things can be quite silent by 
> default even for "error" cases:
> 
> linuxrc=trace scsi_mod.scsi_logging_level=4605 printk.time=1 ignore_loglevel
> 
> This provides a lot of output on the console.
> 
> Optionally also "shell=1", if you want to get a root shell at the end 
> of initrd processing to have a look what the device setup is at this 
> boot point in time before it attempts to mount the root file system.
> 
> linuxrc=trace and shell=1 are specific to SLES11 initrd [man 8 mkinitrd].
> Kernel parameters:
> https://www.ibm.com/support/knowledgecenter/linuxonibm/liaaf/lnz_r_s114.html
> Device Drivers, Features, and Commands on SUSE Linux Enterprise Server
> Chapter 3. Kernel and module parameters
> Chapter 38. Booting Linux
> 
> With SLES12, the initrd is from dracut and different.
> https://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.lhdd/lhdd_c_ipl_kernparm.html
> https://mirrors.edge.kernel.org/pub/linux/utils/boot/dracut/dracut.html#_description_7
> https://mirrors.edge.kernel.org/pub/linux/utils/boot/dracut/dracut.html#debugging-dracut
> https://mirrors.edge.kernel.org/pub/linux/utils/boot/dracut/dracut.html#dracutkerneldebug
> 
> For both SLES versions, the content of 
> /etc/udev/rules.d/51-zfcp-*.rules managed by "yast zfcp" and 
> zfcp_{host|disk}_configure is also relevant.
> 

-- 
With Best Regards, Benjamin Block      /      Linux on IBM Z Kernel Development
IBM Systems & Technology Group   /  IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Matthias Hartmann       /      Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294




----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
