On Mon, 24 Apr 2006 07:53:35 -0700, Walter Marguccio
<[EMAIL PROTECTED]> wrote:
>Bill,
>I believe what you state; unfortunately I didn't understand that from
>reading the book.
>Yes, Barbara is correct about isolating a system from the plex, but my
>point was a different one.
>If you have a basic plex and you want to let SFM automate a system reset
>of an LPAR for a planned system shutdown,
>these stmts SYSTEM NAME(*) RESETTIME(20) in the SFM policy work fine.
>This is what I have experienced so far,
>and I would be grateful if you could confirm the following:
>
>1. V XCF,my_lpar,OFF
>2. R xx,SYSNAME=my_lpar
>... system waits cleanup time (defined in COUPLExx), then goes in 0A2 ...
>... system waits reset time (defined in SFM) then SFM resets the lpar ....
>3. msg IXC102A pops up; since SFM has reset the lpar I can safely reply
>   R xx,DOWN
>   (or automate the reply using my automation package)
>
>Sorry for being persistent on this, but I don't want to play with
>fire, nor explain rubbish to my operating staff.
Had an offline conversation with Walter. This may clarify the situation:
First, the SFM RESETTIME specification does not come into play for
explicit operator VARY commands. It applies only to status update missing
(SUM) situations - when a system fails to update its "heartbeat" for the
failure detection interval specified in the COUPLExx parmlib member
(INTERVAL).
Second, in SUM situations there is no guarantee that IXC102A will
hold off until after SFM has reset your LPAR. The timing
depends on what you specify for RESETTIME and for your OPNOTIFY parameter
in COUPLExx. Both the INTERVAL and OPNOTIFY intervals are measured
beginning when a system is observed to have missed a status update. After
the INTERVAL expires, the delinquent system is declared status update
missing and the RESETTIME timer begins running. Meanwhile, the OPNOTIFY
interval continues to count down in parallel with RESETTIME. Normally,
OPNOTIFY is only a few seconds longer than INTERVAL, so you would
expect to see IXC102A *before* the LPAR reset occurs. If you want the
reset to occur first, you would have to set OPNOTIFY greater than
INTERVAL plus RESETTIME. Even that might not guarantee the order,
since multiple tasks are involved and any number of timing glitches could
occur.
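To make the arithmetic concrete, here is a rough sketch in Python of the timeline described above. It is only a model of the description in this note, not of what XCF actually does internally. The function names are made up for illustration, and the INTERVAL and OPNOTIFY values in the usage lines are assumed sample numbers (only RESETTIME(20) comes from the policy statement quoted earlier). All times are in seconds, measured from the first missed status update.

```python
def sum_timeline(interval, opnotify, resettime):
    """Return (event, seconds) pairs ordered by expected time.

    Hypothetical helper: models only the timers described in the text.
    Real systems can deviate because several tasks are involved.
    """
    events = [
        ("status update missing declared", interval),
        ("IXC102A expected", opnotify),
        # RESETTIME starts running only once INTERVAL has expired.
        ("SFM reset expected", interval + resettime),
    ]
    return sorted(events, key=lambda event: event[1])


def reset_expected_first(interval, opnotify, resettime):
    """True if the reset would nominally precede IXC102A."""
    return opnotify > interval + resettime


if __name__ == "__main__":
    # Assumed sample values: OPNOTIFY a few seconds longer than INTERVAL.
    for name, seconds in sum_timeline(interval=85, opnotify=88, resettime=20):
        print(f"t+{seconds:>3}s  {name}")
    print(reset_expected_first(85, 88, 20))
    print(reset_expected_first(85, 110, 20))
```

With OPNOTIFY only slightly above INTERVAL, the message lands before the reset; raising OPNOTIFY past INTERVAL plus RESETTIME flips the nominal order, subject to the timing caveat above.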
The main point - SFM is useful for status update missing conditions,
not for explicit partitioning situations. If you have an image that
becomes unresponsive for some reason (dumping, looping, spinning, CEC
failure, whatever), and remains unresponsive longer than the failure
detection interval you've specified, SFM can automatically remove that
image from the sysplex. But it doesn't do anything for you on planned
shutdowns, when you issue a VARY XCF command to remove an image from the
plex.
Bill Neiman
z/OS Development