On 09/16/2011 11:55 AM, Lu GL Gao wrote:
We use 4 FCP channels on the mainframe side. Every channel has NPIV enabled,
with 64 sub-channels whose WWPNs are available.
We use 4 HBA ports on the DS8000 side.
We will add 20 LUNs (disks) for the zLinux system. Every disk can be reached
via two paths (multipath is enabled).

Just curious: If you have 4 ports on the host side and 4 ports on the storage side, wouldn't you have at least 4 paths for each disk, or even 16 if you make use of "cross-over", i.e. all combinations of 4*4 ports?
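
To make the path arithmetic concrete, here is a minimal Python sketch. The port labels are hypothetical placeholders, not taken from the configuration described above, and the sketch only counts combinations; it says nothing about zoning or LUN masking.

```python
from itertools import product

# Hypothetical labels for illustration only; a real setup would use WWPNs.
host_ports = ["host0", "host1", "host2", "host3"]
storage_ports = ["stor0", "stor1", "stor2", "stor3"]

# One-to-one pairing: each host port talks to exactly one storage port.
straight_paths = list(zip(host_ports, storage_ports))

# "Cross-over": every host port may reach every storage port.
crossover_paths = list(product(host_ports, storage_ports))

print(len(straight_paths))   # 4
print(len(crossover_paths))  # 16
```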

However, in our system, I found that some sub-channels can get the WWPN of the
HBA port normally, while others cannot.
The strange thing is that these sub-channels belong to the same physical channel.

Is there a particular reason why you use more than one subchannel of the same FCP channel within the same virtual machine? I would have expected to see four active FCP subchannels, each on a different channel, in your logs below. This would also mean fewer host WWPNs to allow on the storage side.

I found some error messages on the zLinux guest console on the z/VM side:
-----------------------------------------------------------------------------------
zfcp.e78dec: 0.0.3f24: A QDIO problem occurred
zfcp.e78dec: 0.0.3f1e: A QDIO problem occurred
zfcp.e78dec: 0.0.3f18: A QDIO problem occurred
zfcp.e78dec: 0.0.3f00: A QDIO problem occurred
zfcp.e78dec: 0.0.3f36: A QDIO problem occurred
zfcp.e78dec: 0.0.3f0c: A QDIO problem occurred
zfcp.e78dec: 0.0.3f2a: A QDIO problem occurred
zfcp.e78dec: 0.0.3f06: A QDIO problem occurred
zfcp.e78dec: 0.0.3f12: A QDIO problem occurred
zfcp.3dff9c: 0.0.3f24: Setting up the QDIO connection to the FCP adapter failed

While talking to the FCP channel, the zfcp device driver got notified about errors from the communication mechanism QDIO. See also the corresponding distro-specific kernel messages book on http://www.ibm.com/developerworks/linux/linux390/distribution_hints.html to decode zfcp.e78dec or zfcp.3dff9c.
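
When looking up the messages in the kernel messages book, it can help to see which message IDs were reported for which device bus ID. The following is a minimal Python sketch, not an official tool: it assumes the console lines have the form "zfcp.<message ID>: <device bus ID>: <text>" as in the excerpt above, and it uses only a few of those lines (with the wrapped last line joined) as sample input. The function name group_by_device is made up for this illustration.

```python
import re
from collections import defaultdict

# A few console lines from the excerpt above, used as sample input.
console = """\
zfcp.e78dec: 0.0.3f24: A QDIO problem occurred
zfcp.e78dec: 0.0.3f1e: A QDIO problem occurred
zfcp.3dff9c: 0.0.3f24: Setting up the QDIO connection to the FCP adapter failed
"""

# Assumed layout: zfcp.<6 hex digit message ID>: <device bus ID>: <text>
pattern = re.compile(r"^zfcp\.([0-9a-f]{6}): (\d\.\d\.[0-9a-f]{4}): (.*)$")

def group_by_device(text):
    """Return {device bus ID: [message ID, ...]} for all matching zfcp lines."""
    result = defaultdict(list)
    for line in text.splitlines():
        match = pattern.match(line.strip())
        if match:
            message_id, bus_id, _text = match.groups()
            result[bus_id].append(message_id)
    return dict(result)

for bus_id, message_ids in sorted(group_by_device(console).items()):
    print(bus_id, message_ids)
```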

The zfcp error recovery could not reestablish the connection,
so it went into full adapter (meaning FCP subchannel) recovery,
which apparently helped reestablish the basic QDIO connection:

qdio: 0.0.3f24 ZFCP on SC 19 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f1e ZFCP on SC 18 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f06 ZFCP on SC 14 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f18 ZFCP on SC 17 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f12 ZFCP on SC 16 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f0c ZFCP on SC 15 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f00 ZFCP on SC 13 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f2a ZFCP on SC 1a using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f36 ZFCP on SC 1c using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO

However, you might get QDIO errors again as soon as zfcp sends requests to the channel. That seems to be the case with your setup.

INIT: Id "4" respawning too fast: disabled for 5 minutes

Don't know if that's related or what it means without context.

Based on your experience, what caused this error? Is it a hardware
error or a system error?

You seem to have an issue between the zfcp device driver and the FCP channel; that's where QDIO is used to communicate. In between there might be z/VM, in case you run under VM. If so, what's the z/VM version?
What version of FICON Express channels do you use?
Is this SLES11? SP1?
What's the output of "lszfcp -Ha"?
Maybe we also need the mapping of device bus IDs, subchannels, and CHPIDs: "lscss -t 1732/03". The content of /var/log/messages might also have additional information since the above logs that happen to appear on the console(s) are just a subset of kernel messages. You might also find log entries regarding the FCP channels on the service element.
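
Part of that mapping can already be read off the qdio lines quoted earlier, which pair each device bus ID with an "SC" value. The Python sketch below extracts those pairs. It assumes that "SC" is the subchannel number printed in hexadecimal, and it uses three of the quoted lines as sample input; the helper name bus_to_subchannel is made up for this illustration. The CHPIDs do not appear in these lines, so the lscss output is still needed for those.

```python
import re

# Three of the qdio lines quoted earlier, used as sample input.
qdio_lines = """\
qdio: 0.0.3f24 ZFCP on SC 19 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f00 ZFCP on SC 13 using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
qdio: 0.0.3f36 ZFCP on SC 1c using AI:1 QEBSM:1 PCI:1 TDD:1 SIGA: W AO
"""

# Assumed layout: qdio: <device bus ID> ZFCP on SC <hex subchannel> using ...
pattern = re.compile(r"^qdio: (\S+) ZFCP on SC ([0-9a-f]+) ")

def bus_to_subchannel(text):
    """Return {device bus ID: subchannel number} parsed from qdio lines."""
    mapping = {}
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            # Assumption: the SC value is hexadecimal.
            mapping[match.group(1)] = int(match.group(2), 16)
    return mapping

for bus_id, subchannel in sorted(bus_to_subchannel(qdio_lines).items()):
    print(f"{bus_id} -> subchannel 0x{subchannel:04x}")
```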

Steffen

Linux on System z Development

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

