tags 332285 + fixed-upstream
thanks

On Mon, 10 Mar 2008 12:01:09 +0100, Petter Reinholdtsen <[EMAIL PROTECTED]> said:

> The consequence on a machine with insufficient resources, is that
> munin will spend all resources and bring the machine to complete
> standstill. The first job starts, then the next job starts before
> the first job is finished and slow down the machine even more, and
> then the third job starts before the second and perhaps also the
> first job is finished, and so on.

What should have happened was:

  munin-cron(1) starts, and runs a sequence of munin components:
  munin-update, munin-limits, munin-graph, munin-html

  munin-cron(2) starts 5 minutes later, and runs the same sequence.

If any of the components is already running, munin-cron will just try
to run the next one in the sequence.

If (an extreme example) each component uses more than 5 minutes to
run, you should have no more than 3 munin-cron processes, each running
one component. (munin-limits is very quick, so I don't count that
one.)

What you are experiencing sounds like the locking system not working
as intended, rather than there being no locking system: it looks like
it ignores the locks and runs the munin components in parallel.

By the way: are you still running 1.2.3-1, or have you upgraded to
1.2.5-2 since filing the bug? I've yet to experience this on an etch
install, which has 1.2.5, but I seem to remember seeing this in the
past on servers with very high load.

> This is the failure mode I suggest to implement a guard against.
> Yes, in any case munin will not work properly, but at least it will
> be possible to try to fix it. :)

I'll mark this bug as "fixed-upstream", since the whole locking system
was rewritten some time ago in the 1.3 branch, and get munin 1.3.4 out
as soon as possible.

-- 
Stig Sandbeck Mathisen, Linpro
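
As a rough illustration of the intended behaviour described above (a munin-cron run skips any component that is already running and moves on to the next one), here is a minimal Python sketch. It is not munin's code, which is written in Perl and locks differently. The `run_sequence` helper, the lock directory, the `.lock` file names and the use of `flock` are all assumptions made for the example.

```python
import fcntl
import os
import tempfile

# Component order as listed in the message above.
COMPONENTS = ["munin-update", "munin-limits", "munin-graph", "munin-html"]


def run_sequence(lock_dir, run_component):
    """Run each component in order, skipping any that is already locked.

    lock_dir and run_component are hypothetical names for this sketch;
    munin's real locking is implemented differently.
    Returns a list of (component, status), status being "ran" or "skipped".
    """
    results = []
    for name in COMPONENTS:
        lock_path = os.path.join(lock_dir, name + ".lock")
        with open(lock_path, "w") as lock_file:
            try:
                # Non-blocking exclusive lock: fails at once if another
                # run already holds the lock for this component.
                fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                results.append((name, "skipped"))
                continue
            try:
                run_component(name)
                results.append((name, "ran"))
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)
    return results


if __name__ == "__main__":
    # Simulate an earlier run that is still busy in munin-graph.
    lock_dir = tempfile.mkdtemp()
    busy = open(os.path.join(lock_dir, "munin-graph.lock"), "w")
    fcntl.flock(busy, fcntl.LOCK_EX | fcntl.LOCK_NB)
    for component, status in run_sequence(lock_dir, lambda name: None):
        print(component, status)
    busy.close()
```

With a guard of this kind, overlapping runs can pile up only as far as there are slow components, which matches the "no more than 3 munin-cron processes" estimate above. The bug report describes the opposite: components running in parallel as if the locks were ignored.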

