Have you thought about putting the programs in groups? I use this method to stop and start groups of apps with monit without any issues, and I am starting between 10 and 16 processes.
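
Roughly what I mean, as a sketch only: the app names, pidfile paths, and init scripts here are made up for illustration, and "apps" is just an example group name. The idea is that each service entry carries a group statement, and you then send monit one command for the whole group instead of one command per app.

```
# Hypothetical service entries; adjust names and paths to your own setup.
check process app1 with pidfile /var/run/app1.pid
    start program = "/etc/init.d/app1 start"
    stop program  = "/etc/init.d/app1 stop"
    group apps

check process app2 with pidfile /var/run/app2.pid
    start program = "/etc/init.d/app2 start"
    stop program  = "/etc/init.d/app2 stop"
    group apps

# Then, from the shell, act on the whole group with a single command:
#   monit -g apps stop
#   monit -g apps start
#   monit -g apps restart
```

I can't promise this removes the "action already in progress" message in your environment, but it does mean the launch script issues one request to monit rather than ten in a row.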

Sent from my iPhone

On 3 Feb 2012, at 20:45, Christopher Johnston <[email protected]> wrote:

> Okie, we switched our central 'launch' script which essentially takes the
> list of apps from 'monit summary' stops them (some of them), then does a
> start on the list that matches the regex. If 10 sequential commands get sent
> to monit it will fail to start 1 or 2 of them and I see this error in my
> logs. Does monit have issues receiving multiple commands all at once? Seems
> like an issue to me that monit can't scale to handle requests like this.
> This is a multi-user environment where app owners stop and start their apps
> at their leisure.
>
> <27> Feb 3 12:41:00.441595 -08:00 dev001 monit[25592]: monit: action failed
> -- Other action already in progress -- please try again later
>
>
> On Thu, Feb 2, 2012 at 12:53 PM, Christopher Johnston <[email protected]>
> wrote:
> Ok - I grokked the script that handles the restart. I think this could be the
> cause, it is essentially doing a 'stop && start' so the initiating start is
> producing that message since there is already another action going (to stop
> the app). We will modify this to use 'restart' instead.
>
>
> On Thu, Feb 2, 2012 at 10:55 AM, Christopher Johnston <[email protected]>
> wrote:
> I am a little confused on why I am seeing this. I have 4 applications on my
> host (in some cases up to 10) where we need to do a daily/weekly rolling
> restart of all the apps on the host. If I signal 4 monit restart commands
> to the apps in sequence I will end up in a situation where only 2 or 3 out of
> the apps come up and monit complains that an action is already in progress
> (assuming it's from the other commands). Monit can't handle getting signaled
> 4x to take down apps and restart them? This creates some issues for us when
> we are doing a mass code roll out to 100s of applications. We end up having
> to go and clean up things manually and the driver behind using monit is to
> provide an automated framework for managing apps and guaranteeing uptime.
>
> Is there any way to remedy this? We are using a very low timeout in monit
> since we can't risk having apps down for long periods could this have
> something to do with it?
>
> <27> Feb 2 07:48:43.202228 -08:00 dev001 monit[3263]: monit: action failed
> -- Other action already in progress -- please try again later
>
>
>
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general
