I like this idea for some applications, though a window manager is not one of them.
I have unknowingly implemented this concept in a render farm I manage. The
farm continuously transcodes media submitted by any user on the web, so
processes crash often. Then they restart and move right along.

-lee

On Tue, Feb 3, 2009 at 4:33 PM, Marcin Cieslak <[email protected]> wrote:
> markus schnalke wrote:
>
>> This is just a thought, because I stumbled upon the concept and think
>> it's a quite interesting approach.
>>
>> See: http://en.wikipedia.org/wiki/Crash-only_software
>
> I don't like this approach. I have always preferred software that "fails
> fast". As soon as something is wrong - just abort with debugging information
> what went wrong.
>
> I see some issues with the approach described in the paper. It assumes that
> the state saved is okay - I think that crashes occur _because_ internal
> state is inconsistent or wrong. Sure, you can dump internal state regularly
> for recovery - but it's like with backups - you never know which one is
> really clean and okay until you try to restore.
>
> Software bugs will sometimes create incorrect data. This may go unnoticed
> for some longer time.
>
> I think that authors unnecessarily assume that software components are
> "black boxes" that need to be kept up at all costs. This is not the right
> approach for availability I think. Most issues will occur when the component
> is upgraded and needs to use/migrate old data or sometimes to cooperate with
> still not upgraded components. If something goes wrong, the rollback becomes
> the issue also - if I have new, badly-behaving components that dumped its
> state in a new format, how do I go back?
>
> Sweeping problems under the carpet is not going to help much...
>
> --Marcin
