Another 09-09-09 Problem

David L. Craig Thu, 10 Sep 2009 10:04:43 -0700

I sent this to one of our account CEs.  I'm posting it
as a brain teaser/anecdote more than for help.  Enjoy.


Hey, Tom, I'm sorry to email this to you, but I'm not
sure how to call the Support Center about it.  Tell
me what you think I should do.

Yesterday morning around 5:00 AM SOMETHING unusual
was going on with C81 which was being used by the
DOSVSE (VSE/ESA 2.2) virtual machine for DASD
backups.  The VSE log after 4:58 was lost, so I'm not
clear on what Ops experienced.  The VM (VM/ESA 2.2)
console log shows the following:

05:02:30 Q C81
05:02:30 TAPE 0C81 ATTACHED TO DOSVSE   0C81 R/W
05:02:54 DET C81 DOSVSE
05:03:04 HCPDTR1120E The requested DETACH for device 0C81 did not complete in the allotted time.
05:03:15 VARY OFFLINE C81
05:03:15 HCPCPN140E Tape 0C81 attached to DOSVSE
05:03:15 1 device(s) specified; 0 device(s) successfully varied offline
NETVIEW : IST574E  START I/O TIMEOUT OCCURRED FOR VSELINE
NETVIEW : CNM039I AN IMPORTANT MESSAGE HAS BEEN LOGGED - PLEASE BROWSE THE NETVIEW LOG
05:04:11 HCPDPM1283I Path 20 to device 0C81 currently not responding
05:04:11 HCPMHT2153I TAPE  0C81 I/O CANCELLED DUE TO A MISSING INTERRUPT
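
For anyone who wants to line up the timestamps, here is a
minimal Python sketch that pulls the HCP message IDs out of
the excerpt above and measures the gap between two of them.
The names LOG, MSG, hcp_messages and seconds_between are
hypothetical, the regular expression is only an assumption
about the line shape seen in this excerpt (HH:MM:SS followed
by an HCP message ID), and the untimestamped NETVIEW lines
are left out.

```python
import re
from datetime import datetime

# Console excerpt copied from the message (wrapped lines rejoined,
# untimestamped NETVIEW lines omitted).
LOG = """\
05:02:30 Q C81
05:02:30 TAPE 0C81 ATTACHED TO DOSVSE   0C81 R/W
05:02:54 DET C81 DOSVSE
05:03:04 HCPDTR1120E The requested DETACH for device 0C81 did not complete in the allotted time.
05:03:15 VARY OFFLINE C81
05:03:15 HCPCPN140E Tape 0C81 attached to DOSVSE
05:03:15 1 device(s) specified; 0 device(s) successfully varied offline
05:04:11 HCPDPM1283I Path 20 to device 0C81 currently not responding
05:04:11 HCPMHT2153I TAPE  0C81 I/O CANCELLED DUE TO A MISSING INTERRUPT
"""

# Assumed line shape: time, then "HCP" + 3 letters + digits + 1 letter.
MSG = re.compile(r"^(\d\d:\d\d:\d\d) (HCP[A-Z]{3}\d+[A-Z])\b")


def hcp_messages(log):
    """Return (time, message_id) pairs for lines carrying an HCP message ID."""
    found = []
    for line in log.splitlines():
        match = MSG.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            found.append((stamp, match.group(2)))
    return found


def seconds_between(log, first_id, second_id):
    """Seconds elapsed from first_id to second_id (each appears once here)."""
    times = {msg_id: stamp for stamp, msg_id in hcp_messages(log)}
    return int((times[second_id] - times[first_id]).total_seconds())


if __name__ == "__main__":
    for stamp, msg_id in hcp_messages(LOG):
        print(stamp.strftime("%H:%M:%S"), msg_id)
    print(seconds_between(LOG, "HCPDTR1120E", "HCPMHT2153I"), "seconds")
```

On this excerpt the sketch puts the missing-interrupt message
a little over a minute after the DETACH timeout; it only
orders what CP logged and says nothing about cause.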

Essentially it appears the failed VARY OFF of the
drive, while still ATTACHed to the VSE system
(presumably not DVCDNed and perhaps in an I/O pending
state) and following a failed DETACH attempt, MAY
have spawned SOMETHING in the 7060 that induced
communications issues between VM and VSE VTAM as well
as missing interrupt detections on the tape drive.
Fortunately I was almost at work anyway when Ops
called me, so they stood by until I could arrive to
assess the situation. They had already attempted to
FORCE off the DOSVSE virtual machine which resulted
in it being placed in a zombie state, so I just
reIPLed VM at 5:30.  This cleared the MIH issue but
the VTAMs were unable to reinitialize the SNA
elements in the Microsoft HIS servers.  As we have
found in the past that a POR is the only solution to this,
I gracefully quiesced the VSE system, shut down VM again
(no reIPL this time), deactivated the 7060 (Basic
mode, no LPARs) via the Support Element, ran the POR,
and activated the 7060, resulting in a normal IPL of
VM et al, and nothing out of the ordinary has since
been observed.

All the Support Element shows is two Channel
Problems (I deleted the Disabled Wait record), and the
HMC did not phone home.  I have EREP SYSUM, EVENT,
and PRINT=PS listings ready to attach to an email as
well as the VM and VSE logs (such as the VSE log
is).  I know you're familiar with the odd issues
we've been having with those drives and/or CUs, so I
won't go into that.

You know, it would be really great if I could just
email this message directly to the Support Center to
open an incident.  Perhaps you could suggest to the
right person that IBM make this sort of approach
possible (or point me to him or her, if necessary--I
know customers are louder than IBMers sometimes).

-- 

May the LORD God bless you exceedingly abundantly!

Dave Craig

-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
"'So the universe is not quite as you thought it was.
 You'd better rearrange your beliefs, then.
 Because you certainly can't rearrange the universe.'"

--from _Nightfall_  by Asimov/Silverberg
