Re: z/Architecture design errors

Tom Marchant Thu, 20 Jul 2006 11:54:35 -0700

The mainframe is very different from the PC in this regard.  You might find
the chapter on Machine-Check Handling in the POO (Principles of Operation)
interesting.  Mainframes are designed with extensive error detection and
recovery capabilities.  z/OS records error information to LOGREC to assist
in fixing problems.


For several years, every CPU chip has been designed with two processors.
They both process the same instruction stream against the same data, and
the results are compared.  If there is a discrepancy anywhere along the
way, the instruction is aborted and retried.  If it fails again, the CPU is
switched out and a spare is switched in to restart the instruction stream.
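
As a rough illustration of the compare, retry, and spare scheme described
above, here is a toy model in Python.  It is only a sketch of the idea, not
how the hardware actually does it: the names (run_checked, execute,
good_add, stuck_add, make_transient_add) are invented for this example, a
"processor" is just a function of two integers, and the single retry
before switching to a spare is taken from the description above.

```python
from typing import Callable, List, Tuple

# Hypothetical model: an execution unit is just a function of two integers.
Unit = Callable[[int, int], int]


class NoSpareAvailable(Exception):
    """Raised when every CPU in the pool has failed its comparison checks."""


def run_checked(unit_a: Unit, unit_b: Unit, x: int, y: int,
                retries: int = 1) -> Tuple[bool, int]:
    """Run the same operation on both units and compare the results.

    Returns (True, result) on the first attempt where the units agree, or
    (False, 0) if they still disagree after the allowed retries.
    """
    for _ in range(1 + retries):
        a, b = unit_a(x, y), unit_b(x, y)
        if a == b:
            return True, a
    return False, 0


def execute(cpus: List[Tuple[Unit, Unit]], x: int, y: int) -> Tuple[int, int]:
    """Try each CPU in turn, moving to a spare after a failed retry.

    Returns (index of the CPU that completed the operation, result).
    """
    for index, (unit_a, unit_b) in enumerate(cpus):
        ok, result = run_checked(unit_a, unit_b, x, y)
        if ok:
            return index, result
    raise NoSpareAvailable("all CPUs failed comparison")


def good_add(x: int, y: int) -> int:
    return x + y


def stuck_add(x: int, y: int) -> int:
    return x + y + 1  # simulated permanent fault


def make_transient_add(failures: int) -> Unit:
    """Build an adder that is wrong for its first `failures` calls only."""
    remaining = [failures]

    def add(x: int, y: int) -> int:
        if remaining[0] > 0:
            remaining[0] -= 1
            return x + y + 1  # simulated transient fault
        return x + y

    return add
```

With one healthy CPU, execute([(good_add, good_add)], 2, 3) returns (0, 5).
If the first CPU pairs good_add with stuck_add and a healthy spare follows
it, the result comes from the spare at index 1.  A transient fault that
clears on the retry never leaves the first CPU, which is the sense in which
the program is shielded from the failure.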

In a way, this could be considered a "cover-up," but not, I think, in the
way you meant it.  The program is protected from most hardware failures.

Tom Marchant

On Thu, 20 Jul 2006 13:11:38 -0400, Kuredjian, Michael 
<[EMAIL PROTECTED]> wrote:

>I didn't see much cynicism in my comment, although that may be the
>result of being jaded by my experience with PC manufacturers and
>their reluctance to admit and correct problems. I'm very used to
>both hardware and software manufacturers ignoring obvious problems
>in their products.
>
>I may have mis-used the term "cover-up." What I meant was that
>they[IBM] could release software patches that specifically avoid
>making use of broken circuits in silicon. However, I wasn't aware
>that mainframe developers routinely make use of micro-assembly
>instructions, thereby revealing hardware bugs quickly.
>
>-----Original Message-----
>From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED]
>Behalf Of john gilmore
>Sent: Thursday, July 20, 2006 11:40 AM
>To: [email protected]
>Subject: z/Architecture design errors
>
>
>Michael Kuredjian writes
>
>| How do we know the number of hardware design errors? With IA32, it's
>| easier to discover these problems because the CPU is used by many
>| people under many operating systems. IBM designs the OS and CPU,
>| making it much easier to cover up any problems that do exist.
>
>IBM does design both, but many others write assembly-language code that
>exercises these instructions vigorously.  Microprocessor assemblers are
>toys, designed (to quote myself) to discourage their use.  The IBM HLASM is
>a very different, heavily used animal.
>
>Mr. Kuredjian's sophomore-cynical comment is, however, wide of the mark in
>another way.
>
>There are two ways to deal with errors in this business.  One is to try to
>keep them secret, fixing them under the covers.  The other is to call a
>shovel a spade as quickly as it has been identified in order to turn one's
>back upon it as quickly as possible.
>
>IBM does the second.  The trouble with the first approach is that when,
>inevitably, dissimulation is detected, it becomes a cause celebre.  Even
>Microsoft learned, after a time, that candor about the security
>deficiencies of Windows was the only feasible approach.  It now has its
>hand held in the fire much more briefly than used to be the case.
>
>It is perhaps also worth noting that, while software errors are expensive,
>hardware errors are even more expensive and much more embarrassing.  It is
>much cheaper to find them before they get into silicon.
>

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
