On 9/29/2014 12:09 PM, Sean Kelly wrote:
I'm not saying the errors should be ignored, but rather that
there are other approaches to handling errors besides (or in
addition to) terminating the process.

Those "other methods" are not robust and are not acceptable for a system claiming to be robust.


Remember the Apollo 11 lunar landing, when the descent computer software
started showing self-detected faults? Armstrong turned it off and landed
manually. He wasn't going to bet his ass that the faults could be ignored. You
and I wouldn't, either.

And this is great if there's a human available to take over.  But
what if this were a space probe?

A space probe would have either:

1. an independent backup system.

   -- or --

2. a "can't fail" system; when it fails, the probe is lost.


http://www.drdobbs.com/architecture-and-design/safe-systems-from-unreliable-parts/228701716


Any system designed around software that "cannot fail" is doomed from the start. You cannot write such software, and nobody else can, either. Attempting to do so will only reap expensive and bitter lessons :-(
