The plot thickens. Be sure to mention to management that you have a few thousand of your closest friends with literally millennia of cumulative experience working on the issue with you :-)
Of course, working with IBM is the better path, I think. Still, it might be fun to try to recreate the problem by running and cancelling that test job.

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of JE Thinnes
Sent: Thursday, January 08, 2009 8:58 AM
To: [email protected]
Subject: Re: MVS 4 minute 'outage'

Thanks for all the suggestions.

Update. We experienced another 'hang' lasting 2 minutes. LOGREC indicated SFTABN (S0222) on the same low priority batch job before both hangs. The programmer indicated the job was doing many inserts/updates to various DB2 tables before it was canceled (this was on a test DB2 system). We opened an incident with IBM - they indicated a DB2 rollback could not have caused the hang of the Console address space. We are still pursuing this with IBM.

Are there any z/OS 1.9 uniprocessor configs out there? Have any default settings changed in 1.9 that I should look at? I am reviewing a uniprocessor white paper from Dec 2006. Any other ideas??

Answers to questions from other posters:

LOGREC did not indicate any ABEND071.

Environment. 2096-R01 (1 CP with 1 zIIP and 2 IFL). 4 LPARs (2 z/OS, 2 LINUX). Only 2 are active (the z/OS 1.9 in question and 1 test LINUX). z/OS does not run under z/VM.

IBM reports no 'phone home' events. No messages on the HMC.

Problems happened at 10:47-10:51 and 13:28-13:30. SMF/RMF INTVAL(30) SYNCVAL(30) at top and bottom of hour. Nothing of consequence ended before the problem (except the DB2 test batch job mentioned above). The highest priority batch job we have is IMP=4.

Not sure about the DASD caching stats in RMF. The DASD is an EMC DMX4 (new as of Dec 08). It is shared with the other LPARs and some open systems. The new DASD is connected via 4 FICON. This replaced a DMX2 connected by 24 ESCON. The devices are mirrored to an off site device. However, there were no reported DASD hardware or communication problems.

No IPLs during the hangs.
No one hit the stop button.

The new DMX4 has a 4 minute MIH setting. The second problem was 2 minutes long.

There were TCPIP error messages after the hang - they were attributed to being the victim, not the cause.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html

