I would use collectd instead; it has much better resolution and scales up (which munin doesn't).
my 2cents,
Ohad

On 9/18/09, Shachar Shemesh <[email protected]> wrote:
>
> Hetz Ben Hamo wrote:
>
> So my question: What do you do in case you have the same scenario?
> What steps do you take to prevent things like that from happening?
>
> I would focus less on prevention, and more on diagnostics. I usually
> use munin (you can see a live example at http://www.hamakor.org.il/munin).
> It's great in that it gives you complete history of almost all relevant
> parameters, and you can (fairly easily) add your own.
>
> As for the specific problem you are describing, assuming it repeats itself,
> it really depends. For example, if you look at the munin history and see the
> load average slowly ascending, then I would run ps and check for runaway
> zombies or processes. If the load average jumps suddenly, I would run cron
> with something that logs the top ten active processes.
>
> Shachar
>
> --
> Shachar Shemesh
> Lingnu Open Source Consulting Ltd.
> http://www.lingnu.com
>
> _______________________________________________
> Linux-il mailing list
> [email protected]
> http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
>
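For the "cron with something that logs the top ten active processes" idea in the quoted message, something along these lines might do as a starting point. It is only a rough sketch, not what Shachar actually runs: the function names are made up for illustration, Python 3.7+ is assumed, and it relies on `ps -eo pid,pcpu,comm`. Note that `ps` reports CPU use averaged over each process's lifetime, so the ranking is only an approximation of what `top` shows.

```python
import os
import subprocess
import time


def top_processes(ps_output, n=10):
    """Parse `ps -eo pid,pcpu,comm` output and return the n rows with the
    highest CPU percentage as (pcpu, pid, command) tuples."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # the command itself may contain spaces
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            rows.append((float(pcpu), int(pid), comm.strip()))
        except ValueError:
            continue  # ignore lines that do not parse cleanly
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:n]


def format_snapshot(ps_output, load, stamp, n=10):
    """Build one log entry: a timestamp and load-average line, followed by
    one line per busy process."""
    lines = ["%s load average: %.2f %.2f %.2f" % ((stamp,) + tuple(load))]
    for pcpu, pid, comm in top_processes(ps_output, n):
        lines.append("%6d %5.1f%% %s" % (pid, pcpu, comm))
    return "\n".join(lines)


def collect_snapshot(n=10):
    """Gather live data from ps and the kernel load average (Unix only)."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return format_snapshot(ps_output, os.getloadavg(), stamp, n)
```

To use it, add `print(collect_snapshot())` at the bottom of the script and have cron run it every minute with output appended to a log file. When the load jumps, the entries around that timestamp should show which processes were busy.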
