On Sat, 23 Apr 2011 23:42:23 -0400
Ken Wesson <kwess...@gmail.com> wrote:

> On Sat, Apr 23, 2011 at 11:35 PM, Mike Meyer <m...@mired.org> wrote:
> > On Sat, 23 Apr 2011 23:19:53 -0400
> > Ken Wesson <kwess...@gmail.com> wrote:
> >
> >> On Sat, Apr 23, 2011 at 8:13 PM, Mike Meyer <m...@mired.org> wrote:
> >> > On Sat, 23 Apr 2011 19:41:28 -0400
> >> > Ken Wesson <kwess...@gmail.com> wrote:
> >> > or you live in a universe where cosmic rays can flip bits and other
> >> > sources of hardware hiccups exist.
> >> Software crashes caused by non-software-bug-triggered memory
> >> corruption seem to me to be exceedingly rare, and they could as easily
> >> strike critical parts of the operating system as a multithreaded
> >> server program (and a large batch of independent C jobs will occupy
> >> more memory and have a correspondingly larger cross section as a
> >> target for such things).
> >> The best recourse if the server gets hit by something like that is
> >> going to be to reboot it.
> >
> > While such crashes might be exceedingly rare on a per-CPU-second
> > basis, if your application runs 24x7 on enough CPUs, you can expect
> > to see them at regular intervals. In which case the best recourse -
> > if you want a stable, robust application - is to restart the smallest
> > set of processes that might have been affected by the problem.
> 
> In other words, all of them, since the operating system might have
> been affected by such a problem and if it was, everything else is
> probably affected too.

Let me guess - you're one of those people who reboot their systems
every couple of days "just in case"?

Sure, a hardware glitch that affects the OS means you should reboot
the system. And yes, a glitch that hits some user process may also
have hit the OS without leaving evidence of doing so. Then again, it
may not have. But a glitch can just as well hit the OS without leaving
evidence in any process at all, so if you reboot everything "just in
case" after a process failure, by the same logic you should reboot
"just in case" even when nothing has visibly gone wrong.

Nah, hardware glitches are either localized, in which case restarting
just the address spaces that failed is sufficient (and has proven so
in practice for years), or they're systemic, in which case you'll have
failures throughout the system. It's pretty easy to tell the
difference between the two and deal with them appropriately.
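
To make that concrete, here is a rough sketch in Python of the kind of
triage I mean. It is an illustration under stated assumptions, not
anyone's production supervisor: the names `classify`, `supervise` and
`SYSTEMIC_FRACTION`, and the 50% threshold, are all made up for the
example. Each worker runs in its own process (its own address space), a
failed worker is restarted by itself, and failures across a large share
of the workers are reported as systemic so that a bigger hammer (such as
a reboot) can be applied.

```python
import subprocess

# Hypothetical threshold (an assumption for this sketch): if at least this
# fraction of workers fail in one pass, call the fault systemic.
SYSTEMIC_FRACTION = 0.5


def classify(exit_codes, systemic_fraction=SYSTEMIC_FRACTION):
    """Return (verdict, failed_indices); verdict is ok/localized/systemic."""
    failed = [i for i, code in enumerate(exit_codes) if code != 0]
    if not failed:
        return "ok", failed
    if len(failed) / len(exit_codes) >= systemic_fraction:
        return "systemic", failed
    return "localized", failed


def run_worker(args):
    """Run one worker as a separate OS process and return its exit code."""
    return subprocess.run(args).returncode


def supervise(commands, run=run_worker, max_restarts=3):
    """Run every command once, then restart only the ones that failed."""
    codes = [run(cmd) for cmd in commands]
    for _ in range(max_restarts):
        verdict, failed = classify(codes)
        if verdict != "localized":
            # Either everything succeeded, or the failures are widespread
            # enough that restarting individual workers is the wrong tool.
            return verdict, codes
        for i in failed:
            # Localized failure: restart just the affected address space.
            codes[i] = run(commands[i])
    return classify(codes)[0], codes
```

Calling `supervise` with a list of argument vectors (for example
`[["worker", "job1"], ["worker", "job2"]]`, where `worker` is a
placeholder program name) returns the verdict together with the final
exit codes. The `run` parameter exists only so the restart logic can be
exercised without spawning real processes.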

        <mike
-- 
Mike Meyer <m...@mired.org>             http://www.mired.org/consulting.html
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
