Stupid question:
Does *anybody* build a fault-tolerant Intel machine? *Has* anyone
built such a beastie?
I recall working with such a machine based on Motorola 68Ks back in the
late 1970s which was reputedly fully fault tolerant. Of course one
co-worker at the place defined "fault tolerant" as a "satisfied Burroughs
XE-550 customer" (The XE-550 was based on the Convergent Technologies'
"MegaFrame"; Convergent used MC68K processors and their assemblers had a
problem with the ABCD instruction... and they built the 7300/3B1 for
AT&T.)
In an Intel-based SMP box I don't think there's any way to tell the OS
"this CPU is cooked, don't schedule for it and make sure it's halted".
I doubt anyone's tried to implement a "Tandem-ish" machine w/ Intels...
(Another difference 'tween architectures mentioned in Appendix "A" of
"Linux for the S/390" RedBook.)
--------------------
John R. Campbell, Speaker to Machines (GNUrd) {813-356|697}-5322
Adsumo ergo raptus sum
MacOS X: Because making Unix user-friendly was easier than debugging
Windows.
Red Hat Certified Engineer (#803004680310286)
IBM Certified: IBM AIX 4.3 System Administration, System Support
----- Forwarded by John Campbell/Tampa/IBM on 02/14/2005 11:16 AM -----
From: Joseph Temple/Poughkeepsie/[EMAIL PROTECTED]
Sent by: Linux on 390 Port <[EMAIL PROTECTED]ST.EDU>
To: [email protected]
cc:
Subject: Re: [LINUX-390] Why Zseries
Date: 02/10/2005 02:38 PM
Please respond to: Linux on 390 Port
Kielek is correct, but consider this.
1. Given the availability of the application, there is a small difference
between Linux on z and Linux on Intel simply because the zSeries
reliability takes the hardware multiplier on availability closer to 1.
2. Yes, we can configure the Intel side with redundant servers to mitigate
the reliability difference.
a. In the case of stateful applications, such failovers are small
outages because they take measurable time (up to minutes, or even hours in
thorny situations). In this case, reducing the number of outages through
better hardware availability still helps.
b. In the case of stateless applications, you need less redundant
hardware on z than on Intel because you can effectively run the z at higher
utilization than the Intel machine for many applications. This is because
many workloads cause the Intel boxes to saturate at a relatively low
utilization. When this happens, it takes more than n+1 boxes to deliver
n+1 availability; that is, the number of redundant boxes is greater than
one unless the Intel machine is run at very low utilization. Since the
zSeries solution is more likely to be CPU bound, the utilization at which
n+1 can be n+1 is higher.
c. Since the failing component is most likely the application or the
Linux, there is the opportunity to set the Linux on z farm up in such a
way that the remaining Linux images get the capacity that the failed Linux
had. In other words, a redundant Linux/application instance is provided, but
less redundant capacity is required. This depends on being able to
detect the failure and restart the failed Linux with reduced resources
until it is ready to accept the load.
That is, the failed Linux gets hard capped and the remaining soft capped
images grab the resulting whitespace.
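To put rough numbers on points 1 and 2b, here is a minimal Python sketch.
Everything in it is an assumption made for illustration: the function names,
the availability figures, the load and capacity units, and the saturation
levels (0.9 for a z-like machine, 0.4 for an Intel-like box) are invented,
not measurements. It treats availability as a simple product of independent
layers and sizes a farm so that the peak load still fits after one box fails.

```python
import math


def serial_availability(*components):
    """Availability of a stack in which every layer must be up.

    Assumes the layers fail independently, so the result is the product
    of the individual availabilities (each a fraction between 0 and 1).
    """
    result = 1.0
    for availability in components:
        result *= availability
    return result


def boxes_for_n_plus_1(peak_load, box_capacity, max_utilization):
    """Boxes needed so the peak load still fits after losing one box.

    peak_load and box_capacity are in the same arbitrary work units.
    max_utilization is the fraction of a box that is usable before the
    workload saturates it (hypothetical parameter, not a measured value).
    """
    usable_per_box = box_capacity * max_utilization
    boxes_to_carry_load = math.ceil(peak_load / usable_per_box)
    return boxes_to_carry_load + 1  # one spare to survive a single failure


if __name__ == "__main__":
    # Point 1: the closer the hardware term is to 1, the less it drags
    # down the application's own availability (figures are made up).
    app = 0.99
    print("hardware 0.999  :", serial_availability(0.999, app))
    print("hardware 0.99999:", serial_availability(0.99999, app))

    # Point 2b: same load, same nominal box size, different saturation point.
    load, capacity = 250, 100
    print("nominal boxes   :", math.ceil(load / capacity))
    print("saturates at 90%:", boxes_for_n_plus_1(load, capacity, 0.9))
    print("saturates at 40%:", boxes_for_n_plus_1(load, capacity, 0.4))
```

With these made-up inputs, a nominal sizing says 3 boxes carry the load. The
farm that can run at 90% needs 4 boxes in total (a true n+1), while the farm
that saturates at 40% needs 8, so its "+1" is really "+5".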
Joe Temple
Executive Architect
Sr. Certified IT Specialist
[EMAIL PROTECTED]
845-435-6301 295/6301 cell 914-706-5211
Home office 845-338-1448 Home 845-338-8794
----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390