Another condition that currently shows as down, but would fit into the category of "possible down", is "method~ of object ~". It happens occasionally here on the new build.
I have NOT reread the earlier discussion on that. I also have not had time to see if there is a newer beta that fixes this issue. Please do not consider this a report of the issue: I do not have time to troubleshoot it right now, which is why I have not yet reported it. Just wanted to throw it into this discussion.

Jason Passow
Mississippi Welders Supply
[EMAIL PROTECTED]
ph: (507) 494-5178
fax: (507) 454-8104

--------------------------------------------------------------------------------
From: Dirk [mailto:[EMAIL PROTECTED]
To: Servers Alive Discussion List [mailto:[EMAIL PROTECTED]
Sent: Thu, 22 May 2008 11:15:21 -0500
Subject: RE: [SA-list] SA possible enhancements

Not that simple. For example, on a URL check, what is a down?
* the content doesn't match the content you gave?
* a 404 error (page not found)
* a 500 error (server error)

Defining ONE string per checktype that is a timeout seems more logical, and then you add a new status "TIMEOUT". But I suppose you also want to alert on that then too? A timeout can also be a real problem. If you get one of those timeout errors, that isn't a big deal, but if your process check gives you that timeout all the time, then it might show a real issue with the (remote) server. So it's not just adding a new status (and having rules for it), it's also changing the alerting engine too...

Dirk Bulinckx.

From: Servers Alive Discussion List [mailto:[email protected]] On Behalf Of Nathan Groom
Sent: Thursday, May 22, 2008 3:26 PM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements

Could it be set in a way that you could define a string that it explicitly matches for a down, and then everything else could possibly be an error in connection? Example: a check is set to make sure that a certain process is running, the server is busy, so it returns a "timed out" error, not "0 processes running" (not really sure on the verbiage on that one).
You set an if [not]-then-else statement like the following: if the return is "timed out" then place the check in error (color this orange), else the check is down (red).

Thanks!

Nathan Groom
Information Services Administrator
East Central Iowa REC
Urbana, IA
Phone: 319-443-4343
Fax: 319-443-4359

--------------------------------------------------------------------------------
From: Servers Alive Discussion List [mailto:[email protected]] On Behalf Of Dirk
Sent: Thursday, May 22, 2008 8:01 AM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements

Just trying to understand. So it would be an alert that uses the same WHEN part as the current alerts, but where the action is that the COLOR of the entry in the GUI changes?

Dirk Bulinckx.

From: Servers Alive Discussion List [mailto:[email protected]] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, May 22, 2008 1:51 PM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements

That's a really interesting idea, and has a lot of potential... It goes beyond what I was looking for, but might be a lot more flexible.

The only danger there is that it could complicate things horrendously when setting up new checks, if you had to add in alerts to change the status for every check depending on how often it had failed. I suppose one could get around that danger by making that an optional override in each check (i.e. a check uses the existing behaviour by default, but tick a box and it only changes status according to alert settings), OR one could combine this idea with the concept of predefined alerts. Or both!

Hmmm... if it was accepted, these ideas could lead to some radical changes in how one sets up checks. That might be a lot of work for Dirk, and probably for us as users to reconfigure things, but I could see huge flexibility benefits here.
Ian
_________________________________
Ian K Gray
OEL IS - European Infrastructure Support
Tel: +44 1236 502661
Mob: +44 7881 518854
Ad eundum quo nemo ante iit

"Vogl, Tom" <[EMAIL PROTECTED]>
Sent by: Servers Alive Discussion List <[email protected]>
21/05/2008 22:45
Please respond to: Servers Alive Discussion List <[email protected]>
To: Servers Alive Discussion List <[email protected]>
cc:
Subject: RE: [SA-list] SA possible enhancements

On item "2" - What I think Ian is asking for is that the display somehow allow for [ALERT ISSUED] as a status beyond UP/DOWN.

That is the fundamental difference in the way the application is designed and the way it is used. A specific CHECK can FAIL - but that does equate to a DOWN status. Currently the GUI only displays the status of the CHECK, and not whether an actual ALERT was issued.

Maybe the feature requested for item "2" is a new alert option called "Set GUI Color to: ". This way a site may reconfigure the default DOWN GUI color to be yellow, and on a known failure (determined by the alert) then make it RED. Parameters would be a palette of specific colors, as well as the system UP, DOWN, UNKNOWN, UNAVAILABLE, etc. settings. This way you could have an alert set it to one color on first failure, a different color on third failure, etc.

-Tom

From: Servers Alive Discussion List [mailto:[email protected]] On Behalf Of Dirk
Sent: Tuesday, May 20, 2008 12:21 PM
To: Servers Alive Discussion List
Subject: RE: [SA-list] SA possible enhancements

1) Predefined alerts: looks like something useful (I'll add that to our to-look-at list)

2) Failed check "down": well, a DOWN is the status you get when SA can't say for sure that it's UP.
If you want to know the reason for the down, then use the checkresponse (this can be viewed in the interface, used in the alerts and used within the HTML output)

3) XML output: correct, this can't be done each cycle. What I can see as a possible option is to add that to the alerts - Execute Command - Internal Servers Alive command (something for the TODO list)

4) On Call: if On-Call were enabled by default, then sending the alert to that person would not work, as "just" enabling isn't enough; you would also need to set the dates when that person is on call.

Dirk Bulinckx.

From: Servers Alive Discussion List [mailto:[email protected]] On Behalf Of [EMAIL PROTECTED]
Sent: Tuesday, May 20, 2008 6:06 PM
To: Servers Alive Discussion List
Subject: [SA-list] SA possible enhancements

Hi Dirk (et al for info),

We had an internal service review today on our monitoring services (of which SA forms the backbone). A number of things came up as a result of that, which I would like to pass on as enhancement requests:

* We need to do some significant restructuring of alerts, and to do this check by check is going to be a huge piece of work. What would be really great would be to have a number of predefined alerts (e.g. Alert A is an alert set up to send SMS to engineer team X immediately; Alert B does the same but to engineer team Y; Alert C is set up to send an email to management group Z after 3 downs, etc). My idea is that you would then, in each check, be able to say "use predefined alerts A, B and D", as well as being able to create additional alerts for that specific check. I could imagine this being done with tick boxes - i.e. have (say) 10 predefined alert types which you can select within a check.
The point of all this is that, if I need to make changes such as changing who gets the alerts, or what the wording of the alerts is, or when they get sent, or even add a new alert to a number of checks, one can simply change a single predefined alert, and/or tick an additional box in each check that is to be affected. Do you follow me?

* I can adjust when an alert is sent (e.g. after x downs), and I can adjust how often a check is done (e.g. every x cycles). However, what I can't do is determine when a failed check should be considered a "down". Example: as mentioned in the past, we have a COM check that looks at an SQL db on a server, which quite often fails with a timeout. I have adjusted the alert to only go out after 2 downs (and in fact not to go out at all if the response includes "Timeout"), but that doesn't stop that check from going red on our screens. (To be absolutely accurate, therefore, the issue is when a failed check should be presented as a "down" on the HTML outputs, but that's probably getting too complicated...)

* XML output (that favourite topic of the discussion group) - I can manually export to XML, but I can't (I don't think) have SA do that automatically every check cycle. Hey - I don't understand XML at all, but my colleagues tell me that they can do something clever with it...

* I think I've asked this before, but I'll double check... The on-call schedule for people defaults to "Not on call". Would it be possible (as standard or as an option) to change this to defaulting to "On call"?

Thoughts?

Many thanks as ever,

Ian
_________________________________
Ian K Gray
OEL IS - European Infrastructure Support
Tel: +44 1236 502661
Mob: +44 7881 518854
Ad eundum quo nemo ante iit

______________________________________________________________________________
Any opinions expressed in this email are those of the individual and not necessarily of the Company.
This email and any files transmitted with it, including replies and forwarded copies (which may contain alterations) subsequently transmitted from the Company are confidential and solely for the use of the intended recipient. It may contain material protected by legal privilege. If you are not the intended recipient or the person responsible for delivering to the intended recipient, be advised that you have received this email in error and that any use is strictly prohibited. Please notify the sender immediately of the error and delete any copies of this message.

Warning: Although the Company has taken reasonable precautions to ensure that no viruses are present in this e-mail, the Company cannot accept responsibility for any loss or damage arising from the use of this e-mail or attachments.

To unsubscribe send a message with UNSUBSCRIBE in the subject line to [email protected] If you use auto-responders (like out-of-the-office messages), make sure that they are not sent to the list nor to individual members. Doing so will cause you to be automatically removed from the list.
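
For anyone following the thread, here is a rough sketch of the idea discussed above: one timeout string per check type, a separate TIMEOUT status shown in orange, and alerting only when timeouts keep repeating. It is not Servers Alive code and does not reflect how SA works internally. The names Status, CheckRule, classify and should_alert, and the threshold of 3 consecutive timeouts, are all made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Values are the display colours suggested in the thread (hypothetical mapping).
    UP = "green"
    TIMEOUT = "orange"
    DOWN = "red"


@dataclass
class CheckRule:
    # Hypothetical per-check-type setting: the ONE string that marks a timeout.
    timeout_marker: str = "timed out"


def classify(succeeded: bool, response: str, rule: CheckRule) -> Status:
    """Map a check result to UP, TIMEOUT or DOWN based on the response text."""
    if succeeded:
        return Status.UP
    # Case-insensitive substring match against the configured timeout string.
    if rule.timeout_marker.lower() in response.lower():
        return Status.TIMEOUT
    return Status.DOWN


def should_alert(history: list[Status], max_timeouts: int = 3) -> bool:
    """Alert on a DOWN, or when the last max_timeouts results are all TIMEOUT."""
    if not history:
        return False
    if history[-1] is Status.DOWN:
        return True
    recent = history[-max_timeouts:]
    return len(recent) == max_timeouts and all(s is Status.TIMEOUT for s in recent)
```

With this sketch a single "timed out" response turns the entry orange without alerting, while "0 processes running" goes straight to red. The repeated-timeout rule is there to cover the point that a check timing out all the time may indicate a real problem with the remote server.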
