Re: [Nagios-users] problems with Distributed alerting
Please always respond on list. More below...

On Feb 8, 2010, at 4:48 PM, Ron Wilson wrote:

> Thanks for reply. I am a little confused between Retain_Status and
> Retain_NonStatus. The help screen is not very helpful.

It's in the documentation.

> If I want to make sure that any host or service on the master server
> never gets changed when distributed servers force a reload etc., which
> one do I need to set?

This is where my confusion lies. There should be no way that a distributed server can make that kind of change on the central server, whether by reload, restart or anything else. Assuming you've followed the distributed monitoring documentation, the only thing the distributed servers can do is submit check results, unless you've created some special program that propagates other changes.

I think you need to describe more clearly how you're set up, how your configs are created and distributed, and how you're sending distributed results back to the central server.

--
Marc

--
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] problems with Distributed alerting
On Feb 8, 2010, at 4:08 PM, Ron Wilson wrote:

> We have set up a distributed Nagios 3.02 system using NagiosQL with several
> slaves and one master. The master is responsible for all alerting. However,
> when we disable notifications for a service on the master Nagios and then do
> a reload of any of the slave servers, it overwrites the status of the
> disabled services. I am looking for some ideas how to avoid this. Is it
> possible to, say, extract the status flags somehow before we do a slave to
> master update so that we can then re-apply the status flags immediately
> after the update? Or is there an easier way to handle this situation? I am
> aware of the caveat of not restarting Nagios but just reloading, but the
> disabled notifications seem to get replaced regardless of restart or reload.

I'm not certain I fully understand what you mean by 'reload of any of the slaves overwrites the status of the disabled services' on the host. It sounds, though, like you don't have retain_nonstatus_information enabled...

--
Marc
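For readers following along, the retention settings Marc refers to live in two places. The fragment below is only a minimal sketch, assuming a stock Nagios 3.x layout; the template name retained-service is hypothetical and does not come from the thread.

```cfg
# nagios.cfg (main configuration file)
# State retention must be enabled globally; the per-object
# directives below are only useful when this is set.
retain_state_information=1

# Object configuration: a service template (the name is illustrative only)
define service{
        name                            retained-service
        retain_status_information       1   ; keep status-related information across restarts
        retain_nonstatus_information    1   ; keep non-status information across restarts
        register                        0   ; template only, not a real service
        }
```

Whether this explains Ron's symptom depends on how his configuration is generated and pushed, which the thread does not settle.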
[Nagios-users] problems with Distributed alerting
We have set up a distributed Nagios 3.02 system using NagiosQL with several slaves and one master. The master is responsible for all alerting. However, when we disable notifications for a service on the master Nagios and then do a reload of any of the slave servers, it overwrites the status of the disabled services.

I am looking for some ideas how to avoid this. Is it possible to, say, extract the status flags somehow before we do a slave to master update so that we can then re-apply the status flags immediately after the update? Or is there an easier way to handle this situation?

I am aware of the caveat of not restarting Nagios but just reloading, but the disabled notifications seem to get replaced regardless of restart or reload.

--
Ron Wilson
Systems Engineer
Television New Zealand
Re: [Nagios-users] How to access user-defined service variables in a command object
Sorry to have bothered the list. I was making the problem too hard because I was confused by what I'd read about on-demand macros in Barth's book (p. 632). Using $_SERVICE_ALARM_NUMBER$ works in the command definition. I don't know why I didn't try that first. For some reason I thought you had to specify the host name and service description to get the value of the variable.

Paul Dubuc
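The working form Paul describes can be sketched as follows. It reuses the command name and arguments from his earlier messages in this thread and only swaps the on-demand form for the plain custom-variable macro, which is resolved for whichever service triggered the notification.

```cfg
# Service custom variable __ALARM_NUMBER is exposed as $_SERVICE_ALARM_NUMBER$
define command{
        command_name    notify-service-by-alarm
        command_line    /usr/local/bin/sendalarm $HOSTALIAS$ $_SERVICE_ALARM_NUMBER$ $SERVICESTATE$ $SERVICEDESC$ $SERVICEOUTPUT$
        }
```

The arguments are left unquoted as in the thread. Since $SERVICEDESC$ and $SERVICEOUTPUT$ can contain spaces, quoting them is probably worth considering, depending on how sendalarm parses its arguments.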
Re: [Nagios-users] Scheduled downtime for 1 host and its services
On Tue, Feb 2, 2010 at 4:44 PM, Jelle Smet wrote:

> Hi List,
>
> I'm using Nagios 3.2.0 and have a question about scheduled downtimes which
> I can't find in the docs.
>
> If I schedule downtime for a host, does this automatically schedule
> downtime for all the host services too?

No, but notifications are suppressed for services whose hosts are in scheduled downtime.

> If so, why isn't there the ZZzzz icon next to these services?

Because they are not technically in scheduled downtime.

> Will scheduled downtime for a host also make sure this doesn't impact the
> availability report of the services?

No.

> Thanks in advance!
>
> Jelle

Scheduling downtime in vanilla Nagios can be a major pain. It is much easier if you define host and service groups, though, because then you can use extinfo.cgi to, for example, "schedule downtime for all services in this servicegroup, and all their hosts too".

--
Martin Melin
op5 AB
http://www.op5.com
http://www.op5.org/
http://www.op5.com/op5/products/network-monitor/nagios/
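As an illustration of the grouping Martin suggests, a service group could look roughly like the sketch below. The group name, host name and service descriptions are hypothetical; only the shape of the members list (host,service pairs) matters.

```cfg
define servicegroup{
        servicegroup_name       web-maintenance
        alias                   Services taken down together for maintenance
        members                 web01,HTTP,web01,SSL Certificate   ; host1,service1,host2,service2,...
        }
```

With a group like this in place, downtime can be scheduled for the whole group from the web interface instead of service by service.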
Re: [Nagios-users] How to access user-defined service variables in a command object
I should have made clearer what I am trying to do below. I know I can access the service's __ALARM_NUMBER from the command definition by giving the literal host_name and service description like this (I've updated the service definition in my previous example to illustrate):

    $_SERVICE_ALARM_NUMBER:localhost:DUMMY$

but I would like the command definition to be able to do this using the macro names $HOSTNAME$ and $SERVICEDESC$, so that one command definition works for all services that use it for notification. Is there a way to do this? I would not like to have to define a separate command and contact group for every alarm number.

Also, I'm using Nagios 3.2.0.

Thanks,
Paul Dubuc

Paul M. Dubuc wrote:

> I'm trying to integrate the use of an internally developed alarm
> generation command into our Nagios configuration. So I want to define
> a Nagios command object that calls this command with arguments specific
> to the service that is generating the status condition that generates
> the alarm. One of the arguments is an alarm number. I can set this
> number in the service definition as a user-defined variable:
>
> define service{
>     host_name             localhost
>     service_description   DUMMY
>     ...
>     __ALARM_NUMBER        123
> }
>
> Is it possible to access this variable in the command definition using
> on-demand macros? I tried to do this in the following way, but it
> doesn't seem to work:
>
> define command{
>     command_name   notify-service-by-alarm
>     command_line   /usr/local/bin/sendalarm $HOSTALIAS$ $_SERVICE_ALARM_NUMBER:HOSTNAME:SERVICEDESC$ $SERVICESTATE$ $SERVICEDESC$ $SERVICEOUTPUT$
> }
>
> Is there an alternative?
>
> Thanks,
>
> Paul M. Dubuc
[Nagios-users] How to access user-defined service variables in a command object
I'm trying to integrate the use of an internally developed alarm generation command into our Nagios configuration. So I want to define a Nagios command object that calls this command with arguments specific to the service that is generating the status condition that generates the alarm. One of the arguments is an alarm number. I can set this number in the service definition as a user-defined variable:

    define service{
        ...
        __ALARM_NUMBER   123
    }

Is it possible to access this variable in the command definition using on-demand macros? I tried to do this in the following way, but it doesn't seem to work:

    define command{
        command_name   notify-service-by-alarm
        command_line   /usr/local/bin/sendalarm $HOSTALIAS$ $_SERVICE_ALARM_NUMBER:HOSTNAME:SERVICEDESC$ $SERVICESTATE$ $SERVICEDESC$ $SERVICEOUTPUT$
    }

Is there an alternative?

Thanks,

Paul M. Dubuc
Re: [Nagios-users] Nagios web GUI, contact groups, hosts and services
Hi Shadhin Rahman,

Thanks for your reply. I hadn't mentioned it in my e-mail, but adding the notes_url is indeed the option I had in mind as a last resort. I like your suggestion of linking to a wiki. If no other solution comes up, I guess that's what we'll do.

Regards,

Lennart Karssen.
Re: [Nagios-users] Hyperlink in Acknowledgement Comment
I've tried all sorts of combinations, but what I have now is this:

http://xxx.xxx.xxx.xxx/nagios/cgi-bin/cmd.cgi?cmd_typ=&cmd_mod=2&host=&service=&sticky_ack=on&send_notification=on&persistent=on&com_data=""&btnSubmit=Commit

I know it can be done via the nagios.cmd file, but since these are two separate machines, I'm trying to do it via an HTTP request. The above command results in the following comment:

"a href='http://xxx.xxx.xxx.xxx/otrs/index.pl?Action=AgentZoom
Re: [Nagios-users] Hyperlink in Acknowledgement Comment
fevin Kagen wrote:

> Thanks, Patrick. I saw that, but it doesn't seem to make any
> difference. I did notice that using a named pipe does write a
> hyperlink by default. Since the "http" option does not, I'm wondering
> if it is possible. Does anyone have this working via http?
> Thanks!

Seeing the command you're doing this in might help. I suspect that you're not so much getting hit by a character stripping issue as by a quoting one, and maybe the "<" character is being interpreted as an input redirection.

I'm not too familiar with OTRS, but it should definitely be possible to put that link in a Nagios notification command (and, in fact, we do that here).
Re: [Nagios-users] Hyperlink in Acknowledgement Comment
Thanks, Patrick. I saw that, but it doesn't seem to make any difference. I did notice that using a named pipe does write a hyperlink by default. Since the "http" option does not, I'm wondering if it is possible. Does anyone have this working via http?

Thanks!

On Sat, Feb 6, 2010 at 11:10 AM, Morris, Patrick wrote:

> fevin Kagen wrote:
>
>> Hi-
>> I'm using Nagios in conjunction w/ OTRS. All in all, it works great.
>> However, we would like to replace the simple ticket number in the
>> acknowledgement comment with a hyperlink to the actual ticket. Any ideas on
>> how to do this? I've found the "Nagios::Acknowledge::HTTP::URL: " variable
>> in the OTRS settings, but I can't seem to add a hyperlink since the "<"
>> character is automatically removed. Thanks!
>> fevin
>
> http://nagios.sourceforge.net/docs/3_0/configcgi.html#escape_html_tags
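The option behind Patrick's link is set in the CGI configuration file. A minimal sketch is shown below. Note that the documentation describes it in terms of host and service plugin output, and fevin reports it made no difference for acknowledgement comments, so this is something to rule out rather than a confirmed fix.

```cfg
# cgi.cfg
# 1 = escape HTML tags in output displayed by the CGIs
# 0 = do not escape them, so embedded links can render
escape_html_tags=0
```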
Re: [Nagios-users] How to set hard state from passive service check?
On Feb 8, 2010, at 8:49 AM, Peter Klausner wrote:

> I want passive checks to set a HARD state immediately. Active checks should
> apply the max_check_attempts setting.
>
> Is there a way to achieve this?

Not that I know of without creating a separate service definition. Passive service checks are treated exactly the same as active service checks in this regard.

> I found passive_host_checks_are_soft, but it applies only to hosts.

By default, any passive host check results in an immediate hard state. This option makes them more service-like for those who want or need that functionality. There are historical reasons why host checks are treated this way that never applied to services...

--
Marc
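The "separate service definition" workaround Marc mentions could look roughly like this. It is a sketch under assumptions: the host name, service descriptions and check_app command are hypothetical, and generic-service is assumed to be a template that supplies the remaining required directives.

```cfg
# Actively checked service: becomes HARD only after 3 consecutive non-OK results.
define service{
        use                     generic-service     ; assumed template
        host_name               apphost             ; hypothetical host
        service_description     App Status Active
        check_command           check_app           ; hypothetical command
        max_check_attempts      3
        }

# Passive-only twin: the first non-OK result is already HARD.
define service{
        use                     generic-service
        host_name               apphost
        service_description     App Status Passive
        check_command           check_app           ; required by the config, not run while active checks are off
        active_checks_enabled   0
        passive_checks_enabled  1
        max_check_attempts      1
        }
```

Passive results then have to be submitted under the second service description, which means two entries per monitored item in the web interface.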
Re: [Nagios-users] Nagios web GUI, contact groups, hosts and services
Karssen,

Your problem appears to be the process, not the tool. I am not suggesting how to conduct business in your organization, but here is my suggestion.

I would reach out to NOC management and put together a wiki or how-to for each unique possible critical alert scenario. Then I would add the "notes_url" parameter of Nagios to point to that particular wiki page.

The only thing the NOC has to do is click on the notes_url link, and they will know exactly what to do with the alert.

Thanks

--
Cordially,
Shadhin Rahman
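A notes_url of the kind Shadhin suggests might be attached to a service as sketched below. The wiki URL is a placeholder, the host group name is a stand-in for the 'all unix hosts' group mentioned in the thread, and generic-service is an assumed template for the other required directives.

```cfg
define service{
        use                     generic-service                          ; assumed template
        hostgroup_name          all-unix-hosts                           ; stand-in name
        service_description     check_swap
        check_command           check_swap
        notes_url               http://wiki.example.org/noc/check_swap   ; placeholder wiki page
        }
```

This gives NOC staff a per-service link to instructions. It does not hide the service from their view, which is the part of the original question that remains open.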
[Nagios-users] How to set hard state from passive service check?
According to the docs (and in my set-up) a passive service check result of non-OK sets a SOFT state. So you need max_check_attempts passive and/or active checks until it changes to HARD.

I want passive checks to set a HARD state immediately. Active checks should apply the max_check_attempts setting.

Is there a way to achieve this? I found passive_host_checks_are_soft, but it applies only to hosts.

Thanks,
Peter Klausner
Re: [Nagios-users] Nagios 3.2.0 process dies silently - help!
Tony Johansson wrote on 05.02.2010 19:39:47:

> [pid 32731] write(6, "1265393559||AHS||C: Drive Space||c:\\ - total: 15.86 Gb - used: 7.60 Gb (48%) - free 8.26 Gb (52%)||c:\\ Used Space=7.60Gb;14.27;15.54;0.00;15.86\n", 144) = -1 EFBIG (File too large)
> [pid 32731] --- SIGXFSZ (File size limit exceeded) @ 0 (0) ---
> [pid 32732] +++ killed by SIGXFSZ +++
>
> "File size limit exceeded" seems to be the cause.
> Disk space is plenty:
>
> df -h
> Filesystem                       Size  Used Avail Use% Mounted on
> /dev/mapper/VolGroup00-LogVol00   68G   28G   38G  43% /
> /dev/sda1                         99M   30M   65M  32% /boot
> tmpfs                            506M     0  506M   0% /dev/shm
>
> Also, I did try renaming retention.dat, status.dat and moving files out
> of checkresults earlier with no result.
>
> Seems like /var/spool/nagios/perfdata.log is 2G while
> /var/spool/nagios/perfdata.log is a mere 11K
> I've tried renaming the file and started nagios which now seems to run ok.
> Looks like I need to set up log rotation or what is the best way to
> handle perfdata.log?

2 GiB is the maximum file size for a program built without large file support (ext3 itself allows much larger files).

There is no need for Nagios to "handle" perfdata.log. Nagios only writes perfdata if you ask it to. This data is _only_ interpreted by external tools, like PNP. They take care of truncating the perfdata once they have parsed it.

Since you obviously don't use any perfdata tool, why do you write perfdata logs at all? ;)

S
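If nothing consumes the performance data, the simplest way to act on the advice above is to switch it off in the main configuration file. A minimal sketch, assuming no external perfdata tool is in use:

```cfg
# nagios.cfg
# Stop processing and writing performance data altogether.
process_performance_data=0
```

If a tool such as PNP is added later, the option can be turned back on and the tool left to process and clear the file.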
[Nagios-users] Nagios web GUI, contact groups, hosts and services
Dear list,

I'm presently working on a Nagios 3.2.0 setup that monitors approximately 1000 hosts and about 5000 services. The setup doesn't make use of Nagios' notification system; instead, people at a control center (NOC) use the Nagios web interface to alert the appropriate people in case of an alert.

The Nagios configuration is based on a set of host (group) templates, where services are assigned to host groups. For example: the 'check_swap' service definition is associated with the host group 'all unix hosts'.

The problem I'm confronted with is that the people at the NOC don't need to see all services on a given host. For example, they shouldn't call the sysadmin at night if an SSL certificate check goes into critical state because the certificate is only valid for ten more days. So we want to remove that service from their view. In the present situation contact groups (used to determine which servers are visible to which department) are added to each specific host, but according to the Nagios docs (http://nagios.sourceforge.net/docs/3_0/cgiauth.html) a contact group can see _all_ services on a given host if it is listed as a contact group for the host. So I decided to remove the NOC contact group from the individual host definitions and to assign the NOC contact group only to specific services.

This leads me to another problem. Some service checks (i.e. host groups) are used in one department only. This works fine. However, some other service checks (like check_swap for the 'all unix hosts' hostgroup) are shared by all departments, but some departments don't want the NOC to see check_swap alerts whereas others do want to pass these alerts to the NOC. It would be possible to make services with slightly different names (e.g. check_swap_dept1, check_swap_dept2), each with the correct contact group. However, that seems to be a needless increase of complexity.

Another approach would be to make host groups for each department and somehow change the service contact group for each host group. Unfortunately I haven't been able to get that to work.

Any suggestions would be highly appreciated.

Lennart Karssen.