NTAC:3NS-20

The performance impact was slow I/O to that group of devices, one at a
time, until the path was taken offline. For the users, transactions that
usually take less than a second were in some cases taking minutes, bad
enough to be considered an 'outage' from the user's perspective.
Console flooding was not an issue. We have been suppressing IOS050I from
the console for a long time.

Jim

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On
Behalf Of Jousma, David
Sent: Wednesday, December 28, 2016 8:03 AM
To: [email protected]
Subject: Re: Recommendations for RECOVERY options

What specifically was the performance impact?  The loss of the FICON
channel and reduced I/O bandwidth?  Or was it the console message
flooding?  If the latter, implementing Message Flood automation will
stop the flooding of messages.  It is pretty easy to implement.

Dave

_________________________________________________________________
Dave Jousma
Manager Mainframe Engineering, Assistant Vice President
[email protected]
1830 East Paris, Grand Rapids, MI  49546 MD RSCB2H p 616.653.8429 f
616.653.2717


-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On
Behalf Of James Peddycord
Sent: Wednesday, December 28, 2016 8:56 AM
To: [email protected]
Subject: Recommendations for RECOVERY options

NTAC:3NS-20
We had a situation with a bad cable that resulted in a huge performance
impact due to the default way that z/OS (we are at 1.13) handles error
recovery on FICON paths.
The symptoms were many (thousands) of IOS050I messages in the task's
joblog, followed by an IOS450E message, which took the path offline to a
single device.
This was happening for every device (around 3000) that the affected path
was attached to.
As soon as I saw the messages I configured the CHPID offline and the
problem stopped.
We have put in automation that will immediately configure a CHPID
offline as soon as a single IOS450E message is detected, and now I am
experimenting with RECOVERY options.
IBM recommended setting RECOVERY,PATH_SCOPE=CU with PATH_INTERVAL=1,
leaving PATH_THRESHOLD=10, and adjusting from there.
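
For illustration only, the decision the automation above has to make
(see the first IOS450E for a CHPID, issue one configure-offline command
for it) can be sketched in Python. This is not the automation actually
in use at this site. The message layout (device number, then CHPID,
right after the message ID) and the function name are assumptions made
for the sketch, so check the real IOS450E text on your system before
relying on the pattern.

```python
import re

# ASSUMPTION: the message text looks like "IOS450E dddd,cc,..." with the
# device number first and the two-digit hex CHPID second. Verify this
# against the actual message layout on your system.
IOS450E_PATTERN = re.compile(
    r"\bIOS450E\s+([0-9A-Fa-f]{4,5}),([0-9A-Fa-f]{2})\b"
)


def chpid_offline_command(message, already_handled):
    """Return a CONFIG command for the CHPID named in an IOS450E message.

    Returns None if the message is not an IOS450E, or if a command was
    already produced for that CHPID (tracked in the caller's set), so
    that thousands of per-device messages yield a single command.
    """
    match = IOS450E_PATTERN.search(message)
    if match is None:
        return None
    chpid = match.group(2).upper()
    if chpid in already_handled:
        return None
    already_handled.add(chpid)
    # CF is the short form of the z/OS CONFIG operator command.
    return f"CF CHP({chpid}),OFFLINE"
```

The caller keeps one set of handled CHPIDs for the life of the
automation task and passes every candidate message through the
function. Only the first IOS450E per CHPID produces a command, which
matches the "as soon as a single IOS450E message is detected" rule.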

Due to the paperwork involved with making any change in our environment,
I would like to implement this with a minimum of 'adjustment'.

Does anyone have any recommendations?
We are running on z13s, 16G FICON through Brocade switches to IBM DS88xx
DASD.

Thanks,
Jim


----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions, send
email to [email protected] with the message: INFO IBM-MAIN


