Re: Mainframe Makers.... WAS: Ars Technica: The IBM mainframe: How it runs and why it survives

Grant Taylor Thu, 03 Aug 2023 12:18:29 -0700

On 8/3/23 12:47 PM, Joel C. Ewing wrote:

The hardware is designed with redundancy to detect failures in components (processors, memory, I/O subsystems, interconnection cables), correct any resulting data errors where possible, retry a failed operation using different hardware components where appropriate, vary a failing component off line, and in many cases allow concurrent repair of failing components while production continues. Undetected hardware errors don't happen.

Save for retrying a failed operation, the rest of those statements weren't specific to IBM mainframes.

I remember reading about a Unix server being demonstrated at a trade show that was running applications interactively wherein the demonstrators removed all but one CPU book from the system, reinserted the removed CPU books, then removed the one they hadn't removed, and then reinserted it. At a later demonstration they took a cup of water and poured it into the top of the system. What was running continued to run in both demonstrations. The real time demo programs didn't even stutter. What was obvious was that other non-real-time programs running on the system slowed down as the OS reacted to hardware going offline and rescheduled tasks on the remaining online CPUs. Monitoring agents lit up like a Christmas tree as the CPU books were removed but became happier as the books were re-inserted.

My understanding was that this was a system that was shipping in the mid to late '90s and people were buying them. Thus not a demonstration special.

I don't remember if this was an HP SuperDome running HP-UX or a Sun Enterprise 10000 running Solaris.

RAS is not specific to IBM. Though I do think that IBM trademarked the name / phrase.

I'm not aware of any x86_64 servers being anywhere near this level of reliability.

Aside: I think much of the Unix industry decided to move complexity and cost out of the hardware and instead put it into software that runs on more commodity / inexpensive hardware.

Having a super reliable basket with all your eggs in it is still all your eggs in one basket.

z/OS not only coordinates with the hardware when resources visible to z/OS are affected by failures and concurrent maintenance, it is also designed with the philosophy that software failures may occur within parts of the operating system, either from a hardware failure or a system software bug. System recovery routines exist to clean up after such failures, limit what running address spaces are affected, and allow production to continue in unaffected address spaces.

I can't enumerate things, but I feel like non-mainframes have things that can speak to this.

Another important feature of z/OS that requires some hardware coordination is the System Measurement Facility that gathers measurement of system activity and resource usage at a level to support performance tuning or billing based on resource usage.


How much of SMF is hardware vs software?

System accounting -- originally for billing -- has been used for a long time to provide information for system scaling.

Aside from the fact that z/OS is closed-source and only licensed by IBM to specific hardware, if you could somehow succeed in running it under Linux or on non-z hardware, it would lose the reliability, availability, and serviceability it gets from that hardware/software synergy that makes it an ideal production platform for critical workloads.


There is an entire hobby genre doing exactly this.

I absolutely agree that it does not have anywhere near the same RAS that z Series has. But I also realize that not everybody needs, much less is willing to pay for, such RAS features.

It doesn't matter how reliable the single basket is if the network connectivity into the facility is cut. -- This is one of the places that having redundancy higher in the application stack and distributing load geographically starts to shine.
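A rough back-of-the-envelope sketch of that trade-off, in Python. The availability figures are made up purely for illustration, and the calculation assumes that site failures are independent, which a shared network provider or power grid would violate. The function name is likewise just for this example.

```python
def combined_availability(site_availability: float, sites: int) -> float:
    """Probability that at least one of `sites` sites is up.

    Assumes every site has the same availability and that failures are
    independent of one another (an idealisation, not a measured fact).
    """
    return 1.0 - (1.0 - site_availability) ** sites


# Hypothetical numbers, chosen only to illustrate the point:
# one very reliable basket vs. three ordinary baskets in different places.
single_basket = combined_availability(0.9999, 1)
three_sites = combined_availability(0.99, 3)

print(f"one site at 99.99%:   {single_basket:.6f}")
print(f"three sites at 99%:   {three_sites:.6f}")
```

Under those assumptions the three cheaper sites come out ahead of the single very reliable one, because the chance of all three being down at once is the product of the individual failure probabilities. It says nothing about the cost of making the application itself tolerate a site loss.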

An IBM mainframe is a very impressive system. A Cadillac is a very impressive car. But using an IBM mainframe to serve files in a small office is about as appropriate as using the Cadillac to deliver pizzas.




--
Grant. . . .

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
