For completeness I'm posting the 'Wait state action table (WSAT)', which
determines the DIAGxx action(s) to take for various wait states. I'm surprised
at how short this table is.
Entries are of the form frrrrwww, where
  f     represents flags
  rrrr  represents the reason code
  www   represents the wait state code
The '0010'b flag indicates that SADMP is to be IPLed.
The '0001'b flag indicates that z/OS® is to be IPLed.
Both flags on ('0011'b) indicates that SADMP is to be IPLed, followed by z/OS.
The '1000'b flag indicates that any reason code (for this wait state code)
should be considered a match.
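To make the layout concrete, here is a minimal Python sketch that splits one frrrrwww fullword into its fields. It follows only the description above; the function name and dictionary keys are my own invention, not anything from IBM's code or documentation.

```python
# Flag bits in the leading hex digit, per the description above.
FLAG_ANY_REASON = 0b1000   # any reason code matches for this wait state code
FLAG_SADMP = 0b0010        # IPL SADMP
FLAG_ZOS = 0b0001          # IPL z/OS


def decode_wsat_entry(entry):
    """Split an frrrrwww fullword into flags, reason code and wait state code.

    Hypothetical helper for illustration only.
    """
    flags = (entry >> 28) & 0xF
    return {
        "any_reason": bool(flags & FLAG_ANY_REASON),
        "ipl_sadmp": bool(flags & FLAG_SADMP),
        "ipl_zos": bool(flags & FLAG_ZOS),
        "reason": (entry >> 12) & 0xFFFF,   # rrrr
        "wait": entry & 0xFFF,              # www
    }


fields = decode_wsat_entry(0x301840A2)
print("wait=%03X reason=%04X sadmp=%s zos=%s any_reason=%s" % (
    fields["wait"], fields["reason"],
    fields["ipl_sadmp"], fields["ipl_zos"], fields["any_reason"]))
```

Read this way, X'301840A2' is wait state 0A2 with reason code 0184 and both IPL flags on.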
The entries coded into the WSAT as of this writing are as follows:
X'000040A2'
X'1017C0A2'
X'201800A2'
X'301840A2'
X'200010B5'
X'200020B5'
X'A0000007'
X'A0000009'
X'A0000037'
X'A0000039'
X'A0000056'
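A rough Python sketch of how a wait state code and reason code might be checked against the entries listed above. The first-match table scan and the function name are assumptions on my part; I have no idea how Loadwait actually walks the table.

```python
# WSAT entries exactly as listed above (frrrrwww).
WSAT = [
    0x000040A2, 0x1017C0A2, 0x201800A2, 0x301840A2,
    0x200010B5, 0x200020B5,
    0xA0000007, 0xA0000009, 0xA0000037, 0xA0000039, 0xA0000056,
]


def autoipl_actions(wait_code, reason_code, table=WSAT):
    """Return the IPL actions for the first matching entry, or [] if none.

    Hypothetical lookup for illustration; first-match order is assumed.
    """
    for entry in table:
        flags = (entry >> 28) & 0xF
        entry_reason = (entry >> 12) & 0xFFFF
        entry_wait = entry & 0xFFF
        if entry_wait != wait_code:
            continue
        # The '1000'b flag means any reason code matches this wait state.
        if not (flags & 0b1000) and entry_reason != reason_code:
            continue
        actions = []
        if flags & 0b0010:
            actions.append("SADMP")
        if flags & 0b0001:
            actions.append("z/OS")
        return actions
    return []


print(autoipl_actions(0x0A2, 0x0184))
print(autoipl_actions(0x007, 0x0000))
print(autoipl_actions(0xCCC, 0x0000))
```

With this table, wait state 0A2 reason 0184 yields SADMP followed by z/OS, wait state 007 yields SADMP for any reason code, and X'CCC' matches nothing, so no action is taken.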
.
.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-302-7535 Office
[email protected]
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf
Of Gerhard Adam
Sent: Tuesday, June 07, 2016 12:42 PM
To: [email protected]
Subject: (External):Re: Testing SFM policy
You can write a simple assembler program to load a WAIT PSW that is disabled
for interrupts, or you can modify the RESTART NEW PSW in the PSA so that it has
the wait bit on and is disabled for interrupts.
If you use the RESTART NEW PSW, then a PSW RESTART should result in your system
going into a wait state.
Both should kill your system.
Adam
-----------------------------------------
From: "Jesse 1 Robinson"
To:
Cc:
Sent: Tue, 7 Jun 2016 19:19:08 +0000
Subject: Re: Testing SFM policy
Testing was semi-successful. On one hand, QUIESCE stopped the system.
Missing heartbeat was detected by SFM and system was partitioned out.
However, I was also trying to test SAD and AutoIPL. Nothing happened in that
arena, so--as I do when all else fails--I RTFM. Found this:
"For restartable wait states, Loadwait will ignore the AutoIPL policy unless a
matching WSAT entry is found that has one or both flags on.
If a bit is found on, then the corresponding SAD or re-IPL will be performed.
As of this writing, the WSAT contains no entries matching any restartable wait
state and reason codes, so a restartable wait state request will not result in
any AutoIPL action."
Because the QUIESCE wait state (x'CCC') is restartable, AutoIPL and SAD are
ignored. I need to find another way to kill the system.
Ironically we've had a few AutoIPLs over the years, mostly (all?) involving
virtual storage exhaustion of some kind. None of them intentional.
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On
Behalf Of Jesse 1 Robinson
Sent: Tuesday, June 07, 2016 8:19 AM
To: [email protected]
Subject: (External):Re: Testing SFM policy
I will proceed with QUIESCE once our automation SME is ready to test.
Key goal is to capture the partitioning message on another member of the
sysplex and blast out some alerts. In a recent failure at oh-dark-thirty on a
Saturday morning, Ops did not notice the wait state failure (SQA exhausted). We
also have a new SAD volume to test: a single Mod-54. Thanks for all the
advice.
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On
Behalf Of Mike Myers
Sent: Tuesday, June 07, 2016 5:45 AM
To: [email protected]
Subject: (External):Re: Testing SFM policy
Mark and Skip:
I can assure you that QUIESCE will stop the XCF heartbeat. I have used it
often in training to demonstrate the loss of heartbeat and triggering the
sysplex to respond to the loss of a member. With an active SFM policy, it will
trigger your desired action.
The advantage to using QUIESCE is that you can back out and reactivate the
system with PSW restart if SFM is not active or you can beat it to the punch
and will not lose the member.
Mike Myers
Senior z/OS Systems Programmer and Instructor
Mentor Services Corporation
Goldsboro, NC
(919) 341-5210
On 06/06/2016 07:44 PM, Mark Jacobs - Listserv wrote:
> Since Quiesce will put the system in a restartable wait state, I'd think
> that the XCF heartbeat would stop too.
>
> Mark Jacobs
>
>> Jesse 1 Robinson
>> June 6, 2016 at 7:32 PM
>> I'd like to test my Sysplex Failure Management policy. The question is
>> how to make a system stop responding long enough to trigger expulsion
>> from the sysplex. I'm thinking of issuing QUIESCE on a member. Have not
>> used that in decades. Will it cause lack of XCF heartbeat? I can just
>> try it unless someone has a better suggestion.
>>
>> The last time this happened for reals was when a system ran clean out
>> of SQA on account of a bad dog product. That's pretty hard to recreate.
>> All I want is to go through the pain and agony of partitioning to test
>> message handling and auto SAD.