<snip>
We're getting reports from RMF of excessively high I/O retries on some
systems. In one case, we've seen the number of retries go higher than the
actual number of I/Os!
</snip>

It certainly is possible for the number of retries to exceed the number of Start I/O commands. Data in one of the PDBs sent to me by a CPExpert user show that this particular site averaged between 5 and 12 retries per Start I/O instruction.
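
As a rough sketch of the arithmetic, the ratio is simply total retries divided by total Start I/Os for the interval. The function name and the sample counts below are made up for illustration; they do not come from any real system.

```python
def retries_per_ssch(retries, ssch_count):
    """Average number of retries per Start I/O over an interval."""
    if ssch_count == 0:
        return 0.0  # no I/O started, so no meaningful ratio
    return retries / ssch_count


# Hypothetical interval totals, not taken from any real system.
ratio = retries_per_ssch(retries=84_000, ssch_count=10_000)
print(f"{ratio:.1f} retries per Start I/O")
```

Any value above 1.0 means the interval saw more retries than Start I/Os.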



<snip>
- Reports do not detail the I/Os being retried by channel or device; just
a total number
</snip>

What release are you on? With z/OS V1R4, Type 78(3) records describe causes of retries. See:
      R783ICPB    Number of times an I/O was retried due to channel path busy.
      R783IDPB    Number of times an I/O was retried due to director port busy.
      R783ICUB    Number of times an I/O was retried due to control unit busy.
      R783IDVB    Number of times an I/O was retried due to device busy.

In most data that I've looked at, the retries have been due to path busy, but YMMV.
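
If you already have the four counters extracted for an interval, a quick way to see which cause dominates is to turn them into percentages. This is only a sketch: it assumes the values have been pulled out of the Type 78(3) records by whatever tool you use (no SMF parsing is shown), and the sample counts are invented.

```python
# Field names are the ones listed above; descriptions are abbreviated.
RETRY_FIELDS = {
    "R783ICPB": "channel path busy",
    "R783IDPB": "director port busy",
    "R783ICUB": "control unit busy",
    "R783IDVB": "device busy",
}


def retry_breakdown(counts):
    """Return each cause's percentage share of the total retries."""
    total = sum(counts.get(field, 0) for field in RETRY_FIELDS)
    if total == 0:
        return {field: 0.0 for field in RETRY_FIELDS}
    return {field: 100.0 * counts.get(field, 0) / total for field in RETRY_FIELDS}


# Made-up counts for illustration only.
sample = {"R783ICPB": 7200, "R783IDPB": 300, "R783ICUB": 1500, "R783IDVB": 1000}
for field, pct in retry_breakdown(sample).items():
    print(f"{field} ({RETRY_FIELDS[field]}): {pct:.1f}%")
```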



<snip>
- No channel looks excessively busy according to RMF
- We see this on more than one CEC (all z900)
- SAP/IOP processors are 'unbalanced'

This last point might be an issue. We have a fairly large number of now
unused ESCON channels whose function has moved over to FICON. It appears
that the three SAPs divide up all channels among themselves, but one of
them--which happens to have the most idle channels--is less than half as
busy as the other two. Maybe we need to physically remove the unused
channels in order to redistribute the SAPs more evenly across the active
channels.
</snip>


When I/O activity is distributed very unevenly across the SAPs and the channels, path busy becomes increasingly likely to cause a very large number of retries.

Have you read the IBM zSeries 900 Technical Guide (SG24-5975-01)? Section 2.6.6 ("Channels to SAP assignment") describes the assignment process on the z900. Note that this assignment is done at activation time and simply spreads the channels across the SAPs; it has no knowledge of the prospective activity on those channels.

I have SMF data from some of my users showing 70-85% of I/O activity going to a single SAP (with 3 SAPs available).
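
To illustrate how spreading channels by count alone can produce that kind of skew, here is a toy model. The round-robin rule is my assumption for illustration only and is not the documented z900 assignment algorithm; the channel names and I/O rates are made up.

```python
def assign_round_robin(channels, n_saps):
    """Spread channels across SAPs by count only (hypothetical rule)."""
    return {chpid: i % n_saps for i, chpid in enumerate(channels)}


def sap_shares(assignment, io_rate, n_saps):
    """Percentage of total I/O activity that lands on each SAP."""
    load = [0.0] * n_saps
    for chpid, sap in assignment.items():
        load[sap] += io_rate.get(chpid, 0.0)  # idle channels add nothing
    total = sum(load)
    return [100.0 * x / total if total else 0.0 for x in load]


# Made-up configuration: 12 channels, only 4 of them carrying any I/O.
channels = [f"CH{n:02d}" for n in range(12)]
io_rate = {"CH00": 4000.0, "CH01": 900.0, "CH03": 3500.0, "CH05": 600.0}

assignment = assign_round_robin(channels, n_saps=3)
for sap, share in enumerate(sap_shares(assignment, io_rate, n_saps=3)):
    print(f"SAP {sap}: {share:.1f}% of I/O activity")
```

Each SAP ends up owning four channels, yet the two busiest channels happen to land on the same SAP, so the even channel count says nothing about how the work is split.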

Don

******
Don Deese, Computer Management Sciences, Inc.
Voice: (703) 922-7027  Fax: (703) 922-7305
http://www.cpexpert.org
******


