I think you misinterpreted John's remarks. He didn't say anything about mainframe developers extensively writing and testing microcode instructions (which are not a documented user interface), but rather about their extensively writing and testing z/Architecture assembler code (with the High Level Assembler, using instructions documented in the z/Architecture Principles of Operation, which IS a documented user interface). Outside of IBM, microcode assemblers might be used in a limited educational setting, in conjunction with emulated execution, to demonstrate principles of microcode development; they would not be used to generate and modify microcode on an actual IBM mainframe processor.

As for bypassing "broken silicon", IBM over the years has put much effort into making its mainframe hardware able to detect and isolate failing hardware components. At least since the introduction of the 9672 CMOS mainframes, there have even been "spare" processors that can be used to replace a failed processor. This diagnostic and sparing capability is built into the existing microcode.

In the very unlikely event that some instructions documented in the PoOp are found not to perform as advertised, IBM would be compelled to fix the broken microcode so that they do. Eliminating a failing z/Architecture instruction from software is not an option, because non-IBM software outside IBM's control could be dependent on any behavior that is documented in the PoOp.

Users do not have access to IBM's architecture-defining microcode, and leasing/licensing agreements preclude user access to that microcode; but IBM can download microcode fixes fairly easily for installation by a local IBM Customer Engineer. Mainframe processors load the defining microcode into the processors at every Power-on Reset, so updating the microcode for fixes or enhancements is not as drastic as on a PC, where the microcode is typically frozen into a physical chip.

Software patches and enhancements to IBM's mainframe Operating System and support programs, which utilize the z/Architecture "machine instructions" that are documented in the Principles of Operation, are installed by the installation's own System Programmers. These patches have nothing to do with bugs at the microcode or physical hardware level, which are fixed by IBM Customer Engineers.

Over the last 25 years, many of the mainframe microcode patches I've seen have been for enhancements, and most of the remaining "bug" fixes were for failures that could only occur under an unlikely sequence of events. Over that same 25 years the number of mainframe system failures I've seen that directly relate to mainframe microcode bugs is fewer than five, and in such cases it was not uncommon that the bug exposure had been present for many months and "kazillions" of instruction executions before we just happened to hit a triggering sequence of events.

I would agree with John about IBM's responsiveness to such issues. IBM regards any failure that takes down mainframe hardware or mainframe operating systems as a serious event and has effective procedures in place for reporting, diagnosing, and resolving such issues. This is one of those areas in which mainframe support is orders of magnitude better than what one finds in the PC world. To be fair to PCs, they do have a much more difficult problem: there is only marginal centralized control over the hardware architecture or, considering the requirements for OEM device drivers, over even the core parts of the PC Operating Systems. With the relative chaos that reigns in both of these areas, it's amazing PCs run as well as they do.

Kuredjian, Michael wrote:
I didn't see much cynicism in my comment, although that may be the result of being jaded by my experience with PC manufacturers and their reluctance to admit and correct problems. I'm very used to both hardware and software manufacturers ignoring obvious problems in their products. I may have misused the term "cover-up." What I meant was that they [IBM] could release software patches that specifically avoid making use of broken circuits in silicon. However, I wasn't aware that mainframe developers routinely make use of micro-assembly instructions, thereby revealing hardware bugs quickly.
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED]] On Behalf Of john gilmore
Sent: Thursday, July 20, 2006 11:40 AM
To: [email protected]
Subject: z/Architecture design errors


Michael Kuredjian writes

| How do we know the number of hardware design errors? With IA32, it's easier to
| discover these problems because the CPU is used by many people under many
| operating systems. IBM designs the OS and CPU, making it much easier to cover up
| any problems that do exist.

IBM does design both, but many others write assembly-language code that exercises these instructions vigorously. Microprocessor assemblers are toys, designed (to quote myself) to discourage their use. The IBM HLASM is a very different, heavily used animal.

Mr. Kuredjian's sophomore-cynical comment is, however, wide of the mark in another way.

There are two ways to deal with errors in this business. One is to try to keep them secret, fixing them under the covers. The other is to call a shovel a spade as quickly as it has been identified in order to turn one's back upon it as quickly as possible.

IBM does the second. The trouble with the first approach is that when, inevitably, dissimulation is detected, it becomes a cause celebre. Even Microsoft learned, after a time, that candor about the security deficiencies of Windows was the only feasible approach. It now has its hand held in the fire much more briefly than used to be the case.

It is perhaps also worth noting that, while software errors are expensive, hardware errors are even more expensive and much more embarrassing. It is much cheaper to find them before they get into silicon.

John Gilmore
Ashland, MA 01721-1817
USA


--
Joel C. Ewing, Fort Smith, AR        [EMAIL PROTECTED]

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
