On 26.05.2010 at 01:15, Pascal Robert wrote:

>
> On 2010-05-25 at 13:28, Chuck Hill wrote:
>
>> Hi Pascal,
>>
>> On May 24, 2010, at 5:14 PM, Pascal Robert wrote:
>>
>>> Hi everyone,
>>>
>>> I'm working on a Nagios plugin that will use the /admin/info direct action
>>> that was added in the Wonder variant of JavaMonitor. Right now, the plugin
>>> is doing the following:
>>>
>>> - for the "state" key, if the state is DEAD, CRASHING or STOPPING, it
>>> sends a CRITICAL signal, if it's UNKNOWN or STARTING, it will send a
>>> WARNING signal, if the state is ALIVE, it will be OK.
>>
>> I think that STOPPING would indicate a manual shutdown or a scheduled
>> restart. Do you really want to send a notification in that case?
>
> I think a notification should be sent; unless you're doing something big at
> shutdown, you might never get a notification. For example, our Nagios setup
> is sending notifications if the service is having a problem after 5 minutes
> (regular interval is 3 minutes, interval when a problem is detected is 1
> minute, so 3 + 1 + 1). If your app is in a STOPPING state for more than 4
> minutes, it might be a problem. But a warning signal would be better than a
> critical one.
>
>> UNKNOWN probably represents a timeout talking to wotaskd which would
>> indicate one of:
>> - deadlocked / backlogged instance
>> - wotaskd stopped on some machine
>> - network problems reaching some machine
>>
>> Those might warrant a CRITICAL signal.
>
> Good catch.
>
>>
>>> - for the "deaths", "transactions", "activeSessions",
>>> "averageIdlePeriod" and "avgTransactionTime" keys, it will check against
>>> the warning and critical values, if the actual values are higher than the
>>> params, it will send the appropriate signals.
>>
>> The last time I looked, these are not cleared when an application is
>> unscheduled and stopped. If these values were outside of the limits when
>> this happened, that could result in a lot of notifications.
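To make the state and threshold rules discussed so far concrete, here is a minimal sketch. It is written in Python for illustration only (the actual plugin is a Perl script), and the mapping reflects the suggestions made in this thread (STOPPING as WARNING, UNKNOWN as CRITICAL), not necessarily what the plugin ends up doing. The function names are hypothetical; only the numeric exit codes are the standard Nagios plugin ones.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# ASSUMPTION: mapping as proposed in this thread, not taken from the plugin itself.
STATE_TO_STATUS = {
    "ALIVE": OK,
    "STARTING": WARNING,
    "STOPPING": WARNING,   # downgraded from CRITICAL, as suggested above
    "UNKNOWN": CRITICAL,   # likely a wotaskd timeout, as suggested above
    "DEAD": CRITICAL,
    "CRASHING": CRITICAL,
}


def status_for_state(state):
    """Return the Nagios status for an instance state.

    States not listed in the mapping are reported as Nagios UNKNOWN.
    """
    return STATE_TO_STATUS.get(state.upper(), UNKNOWN)


def status_for_value(value, warning, critical):
    """Compare a numeric key (e.g. "deaths") against warning/critical limits.

    A value strictly higher than a limit triggers the matching status.
    """
    if value > critical:
        return CRITICAL
    if value > warning:
        return WARNING
    return OK
```

A plugin would exit with the worst (highest) of the statuses it computed for the keys it was asked to check.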
>>
>>
>>> - for refusingNewSessions, scheduled and autoRecover, it will send a
>>> WARNING signal if the response is "false"
>>
>> I often (usually) have configured, but not scheduled instances for use when
>> upgrading to a new version, handling server failover, higher loads etc. Is
>> there a way to define only the instances that are expected to be scheduled?
>
> Frankly, I don't know how to handle this. Right now, the plugin works on a
> specific instance. I guess we can pass an array of instance IDs, and if the
> specified key to check is reaching a warning or critical level for all
> instances, it will send a notification.

You could specify the number of instances that must be running. Using the
direct action

  .../admin/running?type=app&name=<appname>&num=<count>

your script checks whether this evaluates to true. If not, you fetch a list of
all instances of that app and check them one by one to detect the faulty ones.

jw

>
>> Warning on refusingNewSessions is going to send notifications for scheduled
>> restarts, probably not what you want.
>
> Will add a condition for this, if scheduled is true, refusingNewSessions will
> be ok if set to false.
>
>>
>>> Any opinions on this? The only thing that I need to work on is the help
>>> output, so if you want to try the plugin (it's a Perl script), send me a
>>> note. You need any version of Nagios and the Wonder variant of JavaMonitor.
>>
>>
>> Sounds like it could be useful.
>>
>> One other thing is that sending passwords on the URL is insecure and
>> passwords that contain non-URL friendly characters are a problem (they don't
>> seem to get decoded in JavaMonitor, not sure about that, I just changed the
>> password).
>
> Will try that.
>
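A rough sketch of the "running" check suggested above, again in Python for illustration only. Several things here are assumptions and not confirmed by this thread: that the action answers with a plain body reading "true" or "yes", that `base_url` stands in for the part of the URL elided above, and that the per-instance data arrives as dicts with "name" and "state" keys. All function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_running_url(base_url, app_name, count):
    """Build the 'running' direct-action URL.

    base_url is the prefix elided as '...' in the thread (an assumption).
    """
    query = urlencode({"type": "app", "name": app_name, "num": count})
    return f"{base_url}/admin/running?{query}"


def fetch_text(url, timeout=10):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def enough_running(base_url, app_name, count, fetch=fetch_text):
    """Return True if the action reports at least `count` running instances.

    ASSUMPTION: the body reads "true" or "yes" on success; adjust this to
    whatever the action really returns.
    """
    body = fetch(build_running_url(base_url, app_name, count))
    return body.strip().lower() in ("true", "yes")


def faulty_instances(instances):
    """Fallback: list the names of instances whose state is not ALIVE.

    ASSUMPTION: each instance is a dict with "name" and "state" keys.
    """
    return [inst["name"] for inst in instances if inst.get("state") != "ALIVE"]
```

The `fetch` parameter is there so the check can be exercised without a live JavaMonitor; the script only needs to walk the individual instances when `enough_running` comes back false.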
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list ([email protected])
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to [email protected]
