On Thu, 12 Sep 2013 15:45:54 +0000, KUMAR Anil <[email protected]> wrote:
>Hi All,
>
>Recently we had an issue in one LPARs where in a product STC (which starts,
>plants a hook and then ends) was looping. It consumed csa/ecsa and the LPAR
>was affected very much.
>
>We are looking at suggestions /inputs on how to handle these situations. If
>there are any automations put in place to handle such situations (for e.g. a
>particular STC should not start more than 5 times or a HOOK resource's status
>being captured), please provide details.
>

Some suggestions:

1) System Health Checker. Depending on how gradual or fast the (E)CSA
exhaustion took place, this most likely would have given you plenty of
time to react. The automation would be based on messages HC issues and
could do something like page / email someone or could cancel an offending
task etc.

2) Any other threshold monitor (Omegamon, TMON, Sysview, Mainview, etc).
Health Checker however has the right price if you don't have one of
those (free!).

3) PFA (Predictive Failure Analysis). A good tool also, but probably less
help for a runaway situation like this. For gradual growth it is more
helpful and it also understands that at 2 pm peak your CSA usage is
expected to be higher than at 5 a.m. before you start all your CICS
regions (just an example).

As far as the loop itself, you know "blank happens" and the best thing to
do is be prepared and have tools in place to monitor your system for
abnormalities, have a robust paging system etc.

YMMV

Mark
--
Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS
mailto:[email protected]
ITIL v3 Foundation Certified
Mark's MVS Utilities: http://www.mzelden.com/mvsutil.html
Systems Programming expert at http://search390.techtarget.com/ateExperts/
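
For the "should not start more than 5 times" idea in the original question, the counting logic an automation rule would need is small. The sketch below is only an illustration in Python, not something any of the products named above ship: the class name, the task name HOOKSTC, and the 5-starts-in-300-seconds limit are all assumptions, and on a real system the start events would come from whatever your automation product traps.

```python
from collections import defaultdict, deque


class StartRateMonitor:
    """Flag a task that starts more than max_starts times within window_seconds.

    Hypothetical helper for illustration; the names and limits are assumptions.
    """

    def __init__(self, max_starts=5, window_seconds=300.0):
        self.max_starts = max_starts
        self.window_seconds = window_seconds
        # Task name -> timestamps of its recent starts, oldest first.
        self._starts = defaultdict(deque)

    def record_start(self, task, timestamp):
        """Record one start; return True if the task is now over the limit."""
        recent = self._starts[task]
        recent.append(timestamp)
        # Drop starts that have aged out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_starts


if __name__ == "__main__":
    monitor = StartRateMonitor(max_starts=5, window_seconds=300)
    # Simulate a looping task that restarts every 10 seconds.
    for t in range(0, 60, 10):
        if monitor.record_start("HOOKSTC", t):
            print(f"HOOKSTC over limit at t={t}s: page someone or cancel it")
```

A sliding window is used rather than a plain counter so that a task which legitimately restarts a few times a day is not flagged; only a burst of starts inside the window trips the limit.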
