I think you misinterpreted John's remarks. He didn't say anything
about mainframe developers extensively writing and testing microcode
instructions (which is not a documented user interface), but instead
extensively writing and testing z/Architecture assembler code (with the
High Level Assembler, which uses instructions documented in the
z/Architecture Principles of Operation, which IS a documented user
interface). Outside
of IBM, microcode assemblers might be used in a limited educational
setting in conjunction with emulated execution to demonstrate principles
of microcode development - they would not be used to generate and modify
microcode on an actual IBM mainframe processor.
As for bypassing "broken silicon", IBM over the years has put much
effort into making their mainframe hardware able to detect and isolate
failing hardware components. At least since the introduction of the 9672 CMOS
mainframes, there are even "spare" processors that can be used to
replace a failed processor. This diagnostic and sparing capability is
built into the existing microcode.
In the very unlikely event that some instructions documented in the PoOp
are found not to perform as advertised, IBM would be compelled to fix
the broken microcode so that they do. Eliminating a failing
z/Architecture instruction from software is not an option, because
non-IBM software outside IBM's control could be dependent on any
behavior that is documented in the PoOp.
Users do not have access to IBM's architecture-defining microcode, and
leasing/licensing agreements preclude user access to that microcode;
but IBM can download microcode fixes fairly easily for installation by
a local IBM Customer Engineer. Mainframe processors load the defining
microcode into the processors at every Power-on Reset, so updating the
microcode for fixes or enhancements is not as drastic as on a PC, where
the microcode is typically frozen into a physical chip.
Software patches and enhancements to IBM's mainframe Operating System
and support programs, which utilize the z/Architecture "machine
instructions" that are documented in the Principles of Operation, are
installed by the installation's own System Programmers. These patches
have nothing to do with bugs at the microcode or physical hardware
level, which are fixed by IBM Customer Engineers.
Over the last 25 years, many of the mainframe microcode patches I've
seen have been for enhancements, and most of the remaining "bug" fixes
were for failures that could only occur under an unlikely sequence of
events. Over that same 25 years the number of mainframe system failures
I've seen that directly relate to mainframe microcode bugs is less than
five, and in such cases it was not uncommon that the bug exposure had
been present for many months and "kazillions" of instruction executions
before we just happened to hit a triggering sequence of events.
I would agree with John about IBM's responsiveness to such issues. IBM
regards any failure that takes down mainframe hardware or mainframe
operating systems as a serious event and has effective procedures in
place for reporting, diagnosing, and resolving such issues. This is one
of those areas in which mainframe support is orders of magnitude better
than what one finds in the PC world. To be fair to PCs, they do have a
much more difficult problem - there is only marginal centralized control
for hardware architecture; or, considering requirements for OEM device
drivers, for even the core parts of the PC Operating Systems. With the
relative chaos that reigns in both of these areas, it's amazing PCs run as
well as they do.
Kuredjian, Michael wrote:
I didn't see much cynicism in my comment, although that may be the result of being jaded by my experience with PC manufacturers and their reluctance to admit and correct problems. I'm very used to both hardware and software manufacturers ignoring obvious problems in their products.
I may have misused the term "cover-up." What I meant was that they [IBM] could release software patches that specifically avoid making use of broken circuits in silicon. However, I wasn't aware that mainframe developers routinely make use of micro-assembly instructions, thereby revealing hardware bugs quickly.
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of john gilmore
Sent: Thursday, July 20, 2006 11:40 AM
To: [email protected]
Subject: z/Architecture design errors
Michael Kuredjian writes
| How do we know the number of hardware design errors? With IA32, it's
| easier to discover these problems because the CPU is used by many people
| under many operating systems. IBM designs the OS and CPU, making it much
| easier to cover up any problems that do exist.
IBM does design both, but many others write assembly-language code that
exercises these instructions vigorously. Microprocessor assemblers are
toys, designed (to quote myself) to discourage their use. The IBM HLASM is
a very different, heavily used animal.
Mr. Kuredjian's sophomore-cynical comment is, however, wide of the mark in
another way.
There are two ways to deal with errors in this business. One is to try to
keep them secret, fixing them under the covers. The other is to call a
shovel a spade as quickly as it has been identified in order to turn one's
back upon it as quickly as possible.
IBM does the second. The trouble with the first approach is that when,
inevitably, dissimulation is detected, it becomes a cause celebre. Even
Microsoft learned, after a time, that candor about the security deficiencies
of Windows was the only feasible approach. It now has its hand held in the
fire much more briefly than used to be the case.
It is perhaps also worth noting that, while software errors are expensive,
hardware errors are even more expensive and much more embarrassing. It is
much cheaper to find them before they get into silicon.
John Gilmore
Ashland, MA 01721-1817
USA
--
Joel C. Ewing, Fort Smith, AR [EMAIL PROTECTED]