On Thursday, May 4, a line of moderate to severe thunderstorms moved through our area (southeast Texas). We lost commercial power. Nothing unusual about that; it happens all the time. Normally, the UPS carries us until the generator is up and stable. On any power glitch, we commit to the generator and stay there for about an hour past the last hit.
We do a full test every Friday. That is, we simulate dropping commercial power and actually run off the generator for an hour or so. So far so good. But this time two things went wrong. One, the generator did not start (the batteries had failed). Two, commercial power did not return for over four hours. As it turned out, the UPS batteries carried us for about 30 minutes, but we had no idea what to expect at the time. All we had left were the building emergency lights on their own separate batteries.

A while back, I asked this august group's opinion as to what I should do in just such an event. The consensus was to let the box run and hope for either the generator or commercial power. If neither happens, both the hardware and software know what to do and, left alone, will recover nicely once power is back. Chances are there would be no outage. I made that recommendation to management and they agreed. I did power off all unnecessary equipment, and I began to wonder whether I should go ahead and attempt a shutdown once all power was lost.

Three long hours passed. We got the generator started about 10 minutes before commercial power came back. The UPS, however, did not survive. Since more storms were in the area, management elected to override the automatic controls and remain on the generator.

Anyway, I executed my power-up script. Two slightly unexpected things happened. One, the 2086-0A4's HMC ran through a CHKDSK for over an hour, so I used a Support Element to do the POR, the initial activates, and eventually the IPLs. The other was a blinking green power light on the 2105-800 Shark and a status code 03 on the left controller. The Shark would not go ready. By then it had been about an hour since power was available, about four hours into the event, and panic was starting to color my thinking. We put IBM at SEV 1 and opened our DR plan. As we began the first DR steps, the blinking stopped, IBM arrived, and the Shark went ready (all within a few seconds).
We IPL'ed (using the Support Element) and the major subsystems (DB2, JES, MQ) hit the ground running. It was sweet. We had one little application startup sequence issue, but no deviation from the procedure was needed. We went very smoothly from the dark to doing business.

Meanwhile, the IBM CE researched the status code and blinking power light. He reported that the Shark was recharging its batteries and would not go ready until there was enough capacity to get through another total power loss. How cool, we thought. Exactly what we would want. He went on to say that the recharge could take up to 25 hours. Gulp. Did he say 25 hours? Yup. 25 hours. Hmmmm.

The bottom line (and moral) is that I would be willing to recommend exactly the same thing again. I now know that it may take well over an hour after power is restored, and that's OK. And what I am waiting for is well worth the wait. But there are two new players in the game: one, there is a shiny new DS8100 in both the primary and DR sites waiting for power whips, and two, a far more aggressive DR strategy in the pipeline.

Sorry about the word count. Hope this adds value to your operation. You folks have added much to mine. *Please* don't forget to trim this before replying or commenting.

Hal.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html

