I can write a simple script that detects when a process goes crazy and kills it or restarts the service; that's not the issue.
The issue is investigating a machine that was rebooted after a huge load. For example: let's say it's not your own well-maintained machine but your friend's small web server; he had to reboot it, and now he wants to know why it happened.

Hetz

On Fri, Sep 18, 2009 at 12:47 AM, sammy ominsky <[email protected]> wrote:
> You can make it do other things too, like kill or restart processes.
>
> --sambo
>
> On 18/09/2009, at 00:29, Hetz Ben Hamo wrote:
>
>> Sammy,
>> Watch is good and nice, but Watchdog main purpose is to reboot the
>> server if something wrong happens. Thats not what I'm looking for.
>>
>> Hetz
>>
>> On Thu, Sep 17, 2009 at 10:40 PM, sammy ominsky <[email protected]> wrote:
>>>
>>> On 17/09/2009, at 22:33, Hetz Ben Hamo wrote:
>>>
>>>> I tried to ssh it today.. no go. I could ping it, but none of the
>>>> services were accessible: http, ssh, etc..
>>>
>>>> Sep 17 12:58:11 hetz sendmail[2707]: rejecting connections on daemon
>>>> MTA: load average: 140
>>>>
>>>> So my question: What do you do in case you have the same scenario?
>>>> what steps do you take to prevent things like that from happening?
>>>
>>> Watchdog?
>>>
>>> --sambo

--
Skepticism is the lazy person's default position.
my blog (hebrew): http://benhamo.org
_______________________________________________
Linux-il mailing list
[email protected]
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
