On Sat, 23 Apr 2011 23:42:23 -0400 Ken Wesson <kwess...@gmail.com> wrote:
> On Sat, Apr 23, 2011 at 11:35 PM, Mike Meyer <m...@mired.org> wrote:
> > On Sat, 23 Apr 2011 23:19:53 -0400
> > Ken Wesson <kwess...@gmail.com> wrote:
> >
> >> On Sat, Apr 23, 2011 at 8:13 PM, Mike Meyer <m...@mired.org> wrote:
> >> > On Sat, 23 Apr 2011 19:41:28 -0400
> >> > Ken Wesson <kwess...@gmail.com> wrote:
> >> > or you live in a universe where cosmic rays can flip bits and other
> >> > sources of hardware hiccups exist.
> >> Software crashes caused by non-software-bug-triggered memory
> >> corruption seem to me to be exceedingly rare, and they could as easily
> >> strike critical parts of the operating system as a multithreaded
> >> server program (and a large batch of independent C jobs will occupy
> >> more memory and have a correspondingly larger cross section as a
> >> target for such things).
> >> The best recourse if the server gets hit by something like that is
> >> going to be to reboot it.
> >
> > While it might be exceedingly rare on a per-cpu-second basis, if your
> > application runs 7x24 on enough cpus, you can expect to see them at
> > regular intervals. In which case the best recourse - if you want a
> > stable, robust application - is to restart the smallest set of
> > processes that might have been affected by the problem.
>
> In other words, all of them, since the operating system might have
> been affected by such a problem and if it was, everything else is
> probably affected too.

Let me guess - you're one of these people who reboots systems every
couple of days "just in case"?

Sure, a hardware glitch that affects the OS means you should reboot
the system. Of course, if it affects some user process, it may have
affected the OS without leaving evidence of doing so. Then again, it
may not have. While you could reboot everything "just in case", you
could also have a hardware glitch affect the OS without leaving
evidence in any process, so you might as well reboot even though
nothing is wrong "just in case."

Nah, hardware glitches are either localized, in which case restarting
just the address spaces that failed is sufficient (and has proven so
in practice for years), or they're systemic, in which case you'll have
failures throughout the system. It's pretty easy to tell the
difference between the two and deal with them appropriately.

      <mike
-- 
Mike Meyer <m...@mired.org>    http://www.mired.org/consulting.html
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
http://groups.google.com/group/clojure?hl=en