It sounds very similar in symptom to my minidisk cache overcommitment problem that resulted in CP thrashing (and an APAR).
-----Original Message-----
From: The IBM z/VM Operating System [mailto:ib...@listserv.uark.edu] On Behalf Of Bill Holder
Sent: Thursday, September 17, 2009 12:34 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: VM lockup due to storage typo

I should point out that this hang is likely being misunderstood here. While this scenario will indeed drive paging over the edge, that's not likely what happened. If paging had been driven to that point, the system would have quickly taken a PGT004 abend and restarted.

Instead, I believe what happened is likely a most difficult to solve variant on something that was mentioned before: that is, difficulty allocating CP structures required to represent the massive amount of storage. Page tables are only part of the problem. The upper level DAT tables (region and segment) can be up to 4 frames long, and once storage utilization becomes heavy enough, it becomes fragmented (PGMBK allocation being a factor here), making it very difficult for CP to allocate contiguous sets of 3s and 4s. We spent quite a bit of effort in z/VM 5.3.0 addressing the PGMBK side of this issue, but the harder problem of the upper level tables remains as a likely constraint point.

Occurrences of this sort of problem are likely to result in temporary or permanent hangs of both individual users and eventually the entire system, which supports the theory in this case. I'd really need to see a dump of the system in question to confirm this hypothesis, however.

Bill Holder
z/VM Development, Memory Management team lead, IBM
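
For anyone trying to picture the fragmentation point in the note above, here is a toy sketch in Python. It is not how CP manages frames; the frame count, the alternating allocation pattern, and the function names are all made up for illustration. It only shows that plenty of free frames can coexist with no contiguous run long enough for a 4-frame table.

```python
def longest_free_run(free):
    """Return the length of the longest run of consecutive free frames."""
    best = run = 0
    for is_free in free:
        run = run + 1 if is_free else 0
        best = max(best, run)
    return best


def can_allocate(free, n):
    """True if some run of n contiguous free frames exists."""
    return longest_free_run(free) >= n


# Hypothetical storage map: 1024 frames, True means free.
# Every other frame is held by a single-frame allocation, so half of
# storage is free but no two free frames are adjacent.
frames = [i % 2 == 1 for i in range(1024)]
free_count = sum(frames)

print("free frames:", free_count)
print("1-frame request satisfiable:", can_allocate(frames, 1))
print("4-frame request satisfiable:", can_allocate(frames, 4))
```

In this made-up layout a single-frame request always succeeds while a 4-frame contiguous request cannot, even though half the frames are free. That matches the shape of the problem described above, where the request waits rather than the system abending.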