Gabriel Sechan wrote:
I'm still finding myself skeptical: a system that needs 100% uptime that doesn't have physical redundancy?
Who said it doesn't?
And if it has redundancy, why not stagger the rollout, taking down 1 machine at a time and replacing it?
Because you then need extra resources everywhere you want to do a hot replace. For example, every redundant pair of telecom switches would need a third, completely idle switch in order to do an upgrade: the old and new switches both have to stay completely available so that they can take over for each other in standard redundancy while the third gets the offline code upgrade. In addition, you would need links among all three switches to do a full real-time migration of all active sessions. Bidirectional failover links among 3 nodes (3 links) are 3 times as expensive as, and more problematic than, the single link between 2 nodes.
All of that is much more expensive than having hot-code replacement on just the two switches.
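
The link arithmetic above can be checked with a small sketch. It assumes that "bidirectional failover links" means a full mesh, i.e. one link per pair of switches; the function names and the cost model (just counting switches and links) are made up for illustration and are not from any real telecom system.

```python
def full_mesh_links(nodes: int) -> int:
    """Bidirectional links needed so that every pair of nodes is connected."""
    return nodes * (nodes - 1) // 2


def site_resources(nodes: int) -> dict:
    """Hypothetical resource count for one redundant switch site."""
    return {"switches": nodes, "links": full_mesh_links(nodes)}


# Hot-code replacement: upgrade in place on the existing redundant pair.
hot_code = site_resources(2)

# Staggered rollout: a third, idle switch is needed during the upgrade.
staggered = site_resources(3)

print("hot-code replacement:", hot_code)
print("staggered rollout:   ", staggered)
print("link ratio:", staggered["links"] // hot_code["links"])
```

Under that assumption, going from 2 to 3 switches adds one switch per site and triples the number of failover links, which is the cost being compared against hot-code replacement.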
Go look at the rationale behind Erlang.

-a

--
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg
