Re: mon forking problem

2008-02-19 Thread Nicolas KOWALSKI
Nicolas KOWALSKI [EMAIL PROTECTED] writes:

 For information, after modifying the scheduler as described above,
 our monitoring server did not have any fork problem anymore: last
 week it launched 2.5M forks (monitors and alerts) happily.

We tracked down the source of the problem. In our logfiles, we
sometimes see the following:

2008-02-19 03:56:56 err: call_alert: could not exec alert /apps/Minotaure/lib/alert.d/wh-stat.alert: Argument list too long
2008-02-19 03:56:56 err: call_alert: could not exec alert /apps/Minotaure/lib/alert.d/wh-sendtrap.alert: Argument list too long
2008-02-19 03:56:56 err: call_alert: could not exec alert /apps/Minotaure/lib/alert.d/wh-kpi.alert: Argument list too long

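This explains why we ended up with several mon processes: each time
exec failed this way, the forked child returned from call_alert
instead of exiting, and kept running as another copy of mon.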

With our patch (exit if exec fails during call_alert), this does not
happen anymore.
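
For readers who want to reproduce the "Argument list too long" failure
outside mon, here is a minimal Perl sketch. It is not mon's code: the
helper name exec_errno is hypothetical. It forks, attempts the exec in
the child, and passes the errno of a failed exec back to the parent
through a pipe. It assumes a POSIX system and relies on Perl marking
descriptors above $^F close-on-exec, so a successful exec shows up in
the parent as end-of-file on the pipe.

```perl
use strict;
use warnings;
use Errno qw(E2BIG);
use POSIX ();

# Hypothetical helper, not part of mon: try to exec $cmd with @args in a
# child process. Returns the numeric errno if exec failed, 0 if it succeeded.
sub exec_errno {
    my ($cmd, @args) = @_;

    pipe(my $rd, my $wr) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child process.
        close $rd;
        no warnings 'exec';          # the failure is reported through the pipe
        exec { $cmd } $cmd, @args;   # only returns if the exec failed
        my $errno = $! + 0;
        syswrite $wr, "$errno\n";
        POSIX::_exit(1);             # leave the child; do not fall through
    }

    # Parent process.
    close $wr;
    my $line = <$rd>;                # undef (EOF) means the exec succeeded
    close $rd;
    waitpid($pid, 0);
    return defined $line ? $line + 0 : 0;
}
```

Calling it with an argument list well beyond the kernel limit, for
example exec_errno('/bin/sh', ('x' x 100_000) x 400), should return
E2BIG on a typical Linux system, while a small argument list such as
exec_errno('/bin/sh', '-c', 'exit 0') returns 0.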

-- 
Nicolas

___
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon


Re: mon forking problem

2008-01-31 Thread Nicolas KOWALSKI
Nicolas KOWALSKI [EMAIL PROTECTED] writes:

 We have been hit recently by a mon fork problem: after several days
 running mon, our monitoring server was running several mon processes.

 After looking at the code, we think we have found the problem here:
 http://mon.cvs.sourceforge.net/mon/mon/mon?revision=1.25&view=markup ,
 lines 5080-5084:

  5080   if (!exec @execargs) {
  5081       syslog ('err', "could not exec alert $alert: $!");
  5082       return undef;
  5083   }
  5084   exit;


 The return statement looks buggy to us: at that point mon is in the
 child process, so execution continues as if it were the parent, thus
 creating a new mon master process.

 Shouldn't there be an exit instead of the return statement?

For information, after modifying the scheduler as described above, our
monitoring server did not have any fork problem anymore: last week it
launched 2.5M forks (monitors and alerts) happily.
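
The change described above can be sketched in Perl as follows. This is
an illustration under stated assumptions, not the actual patch: the sub
name spawn_alert is hypothetical, the error goes to STDERR instead of
syslog, and the exit code 127 is an arbitrary choice. The point it
shows is that the child leaves the process when exec fails, so it can
never return into the parent's scheduler code.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for the fork/exec part of call_alert; not mon's code.
# Returns the child's pid in the parent, or undef if fork itself failed.
sub spawn_alert {
    my ($alert, @execargs) = @_;

    my $pid = fork();
    return undef unless defined $pid;   # fork failed: we are still the parent

    if ($pid == 0) {
        # Child process. exec only returns if it failed.
        no warnings 'exec';             # we print our own message below
        exec { $alert } $alert, @execargs;
        print STDERR "could not exec alert $alert: $!\n";
        POSIX::_exit(127);              # exit here instead of returning
    }

    return $pid;                        # parent process
}
```

The sketch uses POSIX::_exit where the patch described in this thread
uses a plain exit. _exit skips END blocks and buffer flushing inherited
from the parent, which is a common precaution in a forked child; a
plain exit also stops the child from continuing as a second master.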

-- 
Nicolas
