If you are running z/OS 2.3 and have been increasing ESQA, either because of messages about ESQA expanding into ECSA or because of sudden unexplained growth, check out APAR OA58438.
We had 3 system crashes after migrations to z/OS 2.3 in 2019, and one close call after ECSA got to 99% when ESQA expanded into it (only a vendor monitor crashed in that case, after a failed ECSA getmain). Stand-alone dumps didn't find the root cause, other than that we knew it was RPB pool growth related to SVC dumps from CICS. In one case a single SVC dump caused an 80M ESQA spike within one or two seconds, which crashed a system when it spilled into ECSA and filled ECSA as well (ECSA was typically at about 70% use, but "stable").

We worked with IBM all summer on this. We had different SLIPs and GTF traces put in place, but with the traces going the problem never happened. SVC dump processing did take over the CPU with the trace + GTF active, though! :-)

Meanwhile, we increased ESQA by about 80M, and ECSA a bit, on 30 LPARs via normal IPLs over the summer as a "work around". These are settings that hadn't been touched in god knows how long (certainly not since 64-bit usage (and HVCOMMON) has increased). So we had to lose about 100M of high private to do this. We also increased real storage on a couple of LPARs that really didn't warrant it (based on zero or close to zero demand paging during normal operations), but we knew real storage was also involved in the problem (there is no flash memory for SVC dumps on my client's mainframes).

The entire time, IBM has said we are the only ones reporting the problem. But since we had the problem in big sysplexes, small sysplexes, big LPARs, and small LPARs, I know that we can't be the only ones. I think other shops are ignoring the ESQA expansion into ECSA (since that in itself doesn't hurt) and / or they have more "white space". The RPB control blocks are freed after about 10 minutes, so anyone looking at their current ESQA (and ECSA) usage wouldn't notice the spikes, or would just say "oh well, looks good now".
Anyway, IBM was getting close to figuring this out not too long ago and partially re-created the problem in the lab some weeks ago. They just got back to us today with the root cause and the APAR that was opened. It is related to being real storage constrained at the time of the SVC dumps (I think all of the crashes were during CICS startup in the wee hours of the morning). I really wanted to post something about this earlier, but didn't, since IBM said they had no other reported problems. So if you have seen this problem since migrating to z/OS 2.3, now you know you aren't the only ones.