Jim,
I have seen the same problem, but only when our coupling facilities are
also shut down. If the couplers remain up, you should not experience data
loss or logstream corruption.
What I see in the log after a Z EOD is issued is
IFA705I HALT SMF PROCESS HAS SYNCHRONIZED THE BUFFERED LOGSTREAM RECORDS.
This tells me that the SMF buffers have been flushed to the logstream,
where they should survive across IPLs unless you shut down the couplers.
In my experience, this happens mostly when ICFs are used and the CECs the
ICFs live on are POR'd frequently. We have moved away from ICFs except for
our lab and GDPS environments for just this reason (and DB2 data sharing).
In both those cases, our SMF output is small enough that we can easily
duplex the logstreams and not lose any data. If this is an option for you,
try duplexing.
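If you want to confirm the flush message before the system is taken down,
a check along these lines could be scripted. This is only a minimal Python
sketch, assuming the console log has been captured as plain text with one
message per line; the function name and the sample input are hypothetical
and not part of any IBM tool.

```python
# Hypothetical helper: look for the IFA705I message that follows Z EOD.
# Assumes the log was captured as plain text, one message per line.
SMF_SYNC_MSGID = "IFA705I"


def smf_buffers_synchronized(log_lines):
    """Return True if any line contains the IFA705I message ID as a token."""
    return any(SMF_SYNC_MSGID in line.split() for line in log_lines)


# Sample input: the message text quoted above.
sample = [
    "IFA705I HALT SMF PROCESS HAS SYNCHRONIZED THE BUFFERED LOGSTREAM RECORDS.",
]
print(smf_buffers_synchronized(sample))
```

Note that this only confirms that SMF reported flushing its buffers to the
logstream. It says nothing about whether the couplers stay up afterwards.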
When it happened to us, we lost some DB2 SMF data and OPERLOG. This was
significant for us because it was during this failure that we discovered
that IBM now stops SYSLOG at two points: when Z EOD is issued or when JES2
is stopped. Thus I had no log data from either SYSLOG or OPERLOG. I pointed
out this inconsistency to IBM, and I was also directed to pursue a
Marketing Request via our local IBM support. I'm told that the Marketing
Request approach causes IBM to poll its user community and see how many
accounts are affected. Then they can use that data to make a decision
whether or not to fix the problem.
Jim Holloway - MetLife
> ------------------------------
>
> Date: Wed, 6 May 2009 06:59:00 -0500
> From: Jim Marshall <[email protected]>
> Subject: SMF LOGGER - Not Ready for Prime Time
>
> If anyone is in the throes of putting the SMF LOGGER into production, I
> would seriously hold up and wait a while. We have been up on it for about
> three months and have gone through one problem, hurdle, consequence,
> challenge, etc., after another.
>
> Today the issue is what happens when you shut down a system or, worse,
> need to bring down the entire sysplex. "Z EOD" does not interface to the
> SMF LOGGER, and the operators usually blow the systems away when they
> receive the message which says EOD is done. The significance is that the
> logger is still doing its thing and you now break it, causing all kinds
> of grief later.
>
> IBM has a temporary fix where you issue a set of commands per LPAR before
> the Halt EOD, but so far they have not worked. Besides, so far the
> commands do not tell you when the logger has completed doing its thing.
> When the set of commands does work, hopefully it will tell an operator
> something of its completion. Remember, the intent of this is for your
> operators to visit every LPAR shutting down and wait for each. This will
> bring new meaning to the idea of a fast sysplex-wide IPL.
>
> Everyone, including IBM, agrees "Z EOD" needs to tap everyone on the
> shoulder to have its logger do what it should do and then tell HALT it is
> now done, thus giving the message which is now real. So to do this, we
> are submitting a "Marketing Request" to IBM to get all the players to
> talk to each other to sort through this "situation" (not to be confused
> with a problem). Things are working as intended except no one thought
> this all the way through, and the cardinal word is "intended".
>
> Will keep you posted as things unravel.
>
> jim
> (WashDC)