Ahhhaaa, you named and shamed and I didn't have to *search the archives*!
A lot of the WebSphere portfolio doesn't port well to z/OS. I have
anecdotal evidence (I was told by an IBMer) that WebSphere Message
Broker runs like a stallion on AIX but sucks big time on z/OS.
Having said that, I still maintain that zUnix is reasonably solid. It
comes down to quality of implementation, and that means tailoring for
the z/OS platform.
Barbara Nitz wrote:
This past weekend I had the dubious honour of shutting down and IPLing 5
systems, two of them with USS work. The shutting down part was really bad
(now I know why our operators keep complaining).
One lpar has my favourite hate-application running (called WBIFN, for all you
European SWIFT customers). Around seven minutes into the shutdown
(another lpar with similar workload but not this application was already down
after 7 minutes) D A,L revealed that there were some DB2s plus WBIFN still
running plus the necessary system infrastructure. And one thing with the
jobname of a TSO user, but OMVSEX in the step info, so the userid belonged
to some USS process? thread? application? And they seemed to multiply while
I was looking at them. Canceling any of them didn't really help, never mind
that the duplicate jobname requires using the asid, which requires a list first.
By the time I got around to killing the pid, it was already gone.
Then I saw that *something* still had open DB2 threads, for which automation
has made provisions and forces things out. So I thought that this must be
related to this user. Given that I couldn't stop it from multiplying, much less
get out of the system (f bpxoinit,shutdown=forkinit was answered with 'shutdown
delayed'), I shut down the fork service.
That stopped the multiplication, but a few of those 'user asids with a number'
were still around. And it was a VERY bad idea to shut down the fork service, as
that effectively prevented WBIFN from terminating eventually (it never
terminates in a timely manner, anyway). I ended up canceling things, which
generated tons of coredumps which filled the directory, which eventually
prevented the startup of this application. (And no, these useless coredumps
cannot be prevented, believe me, I've tried.)
The good news was that after 20 minutes I had WBIFN and that userid shut
down, and then automation did the rest (in the case of our operators, they
never get automation to do 'the rest').
So how are other installations handling system shutdown when there are
active USS users (or at least their leftover processes)? For a 'pure' MVS, I can
shut down TSO and the Initiators, cancel any running batch jobs, and I am
done. But how do I stop the USS things from multiplying?
And this Tuesday, that user's leftover processes are back. I tried killing the top
one (right under ppid=1), but that only resulted in another process under
ppid=1 (that killed process was just dropped). superkill didn't help, either. Isn't
there any surefire way to get the whole tree stopped in one fell swoop? (and
no, I won't kill pid 1).
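To make the "whole tree" question concrete, here is a minimal sketch in Python, not a tested z/OS procedure. It assumes you have already captured a snapshot of (pid, ppid) pairs, for example from a ps listing. The function names are hypothetical. It works out which processes belong to the tree and orders them children first. It then freezes every member with SIGSTOP before killing any of them, so that no parent gets a chance to fork a replacement and no orphan slips away to ppid=1 in between. Whether the processes in question honour this on a given system is an assumption.

```python
import os
import signal
from collections import defaultdict


def descendants_leaves_first(pairs, root):
    """Return root plus all of its descendants, every child before its parent.

    pairs is an iterable of (pid, ppid) tuples taken from one snapshot.
    """
    if root == 1:
        raise ValueError("refusing to walk the tree from pid 1")
    children = defaultdict(list)
    for pid, ppid in pairs:
        children[ppid].append(pid)
    order, seen, stack = [], set(), [root]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        order.append(pid)  # pre-order: a parent is recorded before its children
        stack.extend(children.get(pid, []))
    order.reverse()  # reversed pre-order: every child precedes its parent
    return order


def stop_then_kill(pids):
    """Freeze every pid first, then kill them; pids that are gone are skipped."""
    for sig in (signal.SIGSTOP, signal.SIGKILL):
        for pid in pids:
            try:
                os.kill(pid, sig)
            except ProcessLookupError:
                pass  # the process ended between the snapshot and the signal
```

Typical use would be stop_then_kill(descendants_leaves_first(snapshot, top_pid)), run under a userid that is allowed to signal those processes. Processes that were already re-parented to pid 1 before the snapshot was taken are no longer in the tree and would have to be collected some other way, for instance by owning userid.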
(An OMVS ignoramus is asking this, so please be gentle with me)
Best regards, Barbara
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html