Re: [Nagios-users-br] Nagios on a BIG, REALLY BIG network.

2010-08-23 Thread Marcel
With 2 services per host, you are presumably just pinging your network
to determine reachability, right?

Which plugin do you use? Which version of the plugins? Which version of
Nagios? Which distro?

That information would allow a deeper investigation, but some points can
be tackled without anything further.

1) Your check latency is rather high: the maximum is 402.97 seconds. How
often are you checking your hosts? Try lengthening the check interval for
all checks (from 5 to 10 minutes) and watch how it behaves.
2) Check for more than one parent process (PPID=1). If there is more than
one parent process, they can interfere with each other, since they will
share objects.cache, retention.dat and status.dat, and that is always a
source of problems.
3) If my assumption is right that the 2 services per host are pings, try
replacing the check_ping plugin with check_icmp.
4) If none of the alternatives above points to a root cause, upgrade
Nagios and apply the tuning recommendations:
http://nagios.sourceforge.net/docs/3_0/tuning.html

I hope this helps,
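
Item 2 can be checked with a small script. The sketch below is only an
illustration and was not part of the original reply. It assumes the
daemon's command name is `nagios` and that the input is text in the
format printed by `ps -eo pid,ppid,comm`; the function name
`find_parent_daemons` is hypothetical.

```python
def find_parent_daemons(ps_output, name="nagios"):
    """Return PIDs of processes called `name` whose parent is init (PPID 1).

    `ps_output` is text in the format printed by `ps -eo pid,ppid,comm`.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # ignore blank or malformed lines
        pid, ppid, comm = fields
        if comm.strip() == name and ppid == "1":
            pids.append(int(pid))
    return pids
```

If the returned list has more than one entry, more than one Nagios
parent is running and the extra ones should be stopped before looking
any further.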


2010/8/21 Everton Pestana evertonpest...@gmail.com

 Dear all,

 I work at a large company and have a large fleet of servers and
 services to monitor.

 I need some help, because Nagios is behaving very strangely.

 Today I run Nagios on a single processing node with 2 GB of RAM,
 monitoring approximately 3000 hosts and 6000 services.

 Statistics:

 Services Actively Checked:

 Time Frame            Services Checked
 <= 1 minute:          147 (2.6%)
 <= 5 minutes:         5574 (99.5%)
 <= 15 minutes:        5574 (99.5%)
 <= 1 hour:            5574 (99.5%)
 Since program start:  5574 (99.5%)

 Metric                 Min.      Max.        Average
 Check Execution Time:  0.00 sec  23.26 sec   0.402 sec
 Check Latency:         0.00 sec  402.97 sec  0.872 sec
 Percent State Change:  0.00%     6.12%       0.01%

 Check Statistics:

 Type                             Last 1 Min  Last 5 Min  Last 15 Min
 Active Scheduled Service Checks  2252        6008        18041

 What has been happening is that, at certain moments, the machine seems
 to go completely idle: traffic on the interfaces drops sharply (almost
 to zero) and the load consequently drops too.

 At those moments I observed that Nagios is still running but no longer
 spawns any child processes; the machine looks dead.
 If I reload Nagios, everything returns to normal, but a few hours later
 the same problem happens again. The times I noticed it, it usually
 happened around 4 a.m.

 I looked through every Nagios and system log I could think of and found
 no error, nothing that could explain this behavior.

 Thank you very much in advance for your help.

 Regards,

 Everton Pestana

 --
 This SF.net email is sponsored by

 Make an app they can't live without
 Enter the BlackBerry Developer Challenge
 http://p.sf.net/sfu/RIM-dev2dev
 --
 Nagios-users-br@lists.sourceforge.net mailing list
 https://lists.sourceforge.net/lists/listinfo/nagios-users-br
 Wiki: http://nagios-br.sf.net/wiki

--
Sell apps to millions through the Intel(R) Atom(Tm) Developer Program
Be part of this innovative community and reach millions of netbook users 
worldwide. Take advantage of special opportunities to increase revenue and 
speed time-to-market. Join now, and jumpstart your future.
http://p.sf.net/sfu/intel-atom-d2d


[Nagios-users-br] Nagios on a BIG, REALLY BIG network.

2010-08-23 Thread Everton Pestana
Dear all,

I work at a large company and have a large fleet of servers and
services to monitor.

I need some help, because Nagios is behaving very strangely.

Today I run Nagios on a single processing node with 2 GB of RAM,
monitoring approximately 3000 hosts and 6000 services.

Statistics:

Services Actively Checked:

Time Frame            Services Checked
<= 1 minute:          147 (2.6%)
<= 5 minutes:         5574 (99.5%)
<= 15 minutes:        5574 (99.5%)
<= 1 hour:            5574 (99.5%)
Since program start:  5574 (99.5%)

Metric                 Min.      Max.        Average
Check Execution Time:  0.00 sec  23.26 sec   0.402 sec
Check Latency:         0.00 sec  402.97 sec  0.872 sec
Percent State Change:  0.00%     6.12%       0.01%

Check Statistics:

Type                             Last 1 Min  Last 5 Min  Last 15 Min
Active Scheduled Service Checks  2252        6008        18041

What has been happening is that, at certain moments, the machine seems
to go completely idle: traffic on the interfaces drops sharply (almost
to zero) and the load consequently drops too.

At those moments I observed that Nagios is still running but no longer
spawns any child processes; the machine looks dead.
If I reload Nagios, everything returns to normal, but a few hours later
the same problem happens again. The times I noticed it, it usually
happened around 4 a.m.

I looked through every Nagios and system log I could think of and found
no error, nothing that could explain this behavior.

Thank you very much in advance for your help.

Regards,

Everton Pestana


Re: [Nagios-users-br] Nagios on a BIG, REALLY BIG network.

2010-08-23 Thread Leonardo Carneiro

Hello Everton, on the international Nagios list there is a discussion
about a problem very similar to yours: stability and scalability
problems in very large Nagios instances.

I suggest you take a look at the archive, because people had quite a
long discussion with many, many tips on how to solve the problem.

As far as I remember, no single action fixed this kind of problem;
rather, several actions together improved the efficiency with which
Nagios processes services and hosts.


[Nagios-users] How can we configure into nagios dynamic thresholds depending on timeframes

2010-08-23 Thread Alex Peeters

Dear Sir,

How can we configure dynamic thresholds in Nagios that depend on the
time frame?

Example: -w = 80 -c = 90 during business hours, but -w = 90 -c = 95
outside business hours.

How can we configure dynamic thresholds in Nagios that depend on the
time frame: Part II

define service{
 use                  local-service ; Name of service template to use
 host_name            localhost
 service_description  Current Users
 check_command        check_local_users!20!50
 check_period         nonworkhours
 notification_period  nonworkhours
 }

define service{
 use                  local-service ; Name of service template to use
 host_name            localhost
 service_description  Current Users
 check_command        check_local_users!40!60
 check_period         workhours
 notification_period  workhours
 }

In the example above I configured the same test twice. The two
timeframes 'nonworkhours' and 'workhours' together cover 24x7.

Is this way of configuring allowed? It solves my problem.

1) How will the Nagios scheduling react to this configuration?

2) How will the display react to this configuration?


# 'workhours' timeperiod definition
define timeperiod{
 timeperiod_name workhours
 alias           Normal Working Hours
 monday          09:00-17:00
 tuesday         09:00-17:00
 wednesday       09:00-17:00
 thursday        09:00-17:00
 friday          09:00-17:00
 }


# 'nonworkhours' timeperiod definition
define timeperiod{
 timeperiod_name nonworkhours
 alias           Non-Work Hours
 sunday          00:00-24:00
 monday          00:00-09:00,17:00-24:00
 tuesday         00:00-09:00,17:00-24:00
 wednesday       00:00-09:00,17:00-24:00
 thursday        00:00-09:00,17:00-24:00
 friday          00:00-09:00,17:00-24:00
 saturday        00:00-24:00
 }
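
A different way to get time-dependent thresholds, sketched here as an
assumption rather than something tested in this message, is to keep a
single service and let a wrapper choose the thresholds from the current
time. The hours mirror the 'workhours' timeperiod above and the values
come from the two check_local_users commands; `pick_thresholds` is a
hypothetical helper, not a Nagios function.

```python
from datetime import datetime

# Thresholds taken from the two service definitions in this message.
WORKHOURS = (40, 60)      # check_local_users!40!60
NONWORKHOURS = (20, 50)   # check_local_users!20!50


def pick_thresholds(now):
    """Return (warning, critical) for the given datetime.

    Work hours follow the 'workhours' timeperiod: Monday to Friday,
    09:00 up to (but not including) 17:00.
    """
    is_weekday = now.weekday() < 5   # Monday is 0, Sunday is 6
    in_hours = 9 <= now.hour < 17
    return WORKHOURS if is_weekday and in_hours else NONWORKHOURS
```

A wrapper plugin could call `pick_thresholds(datetime.now())` and pass
the two values on to the real check as its -w and -c arguments.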


check_period: This directive is used to specify the short name of the
time period during which active checks of this host can be made.

check_period: This directive is used to specify the short name of the
time period during which active checks of this service can be made.

If you do not use the check_period directive to specify a timeperiod,
Nagios will be able to schedule active checks of the host or service
anytime it needs to. This is essentially a 24x7 monitoring scenario.

Specifying a timeperiod in the check_period directive allows you to
restrict the time that Nagios performs regularly scheduled, active
checks of the host or service. When Nagios attempts to reschedule a host
or service check, it will make sure that the next check falls within a
valid time range within the defined timeperiod. If it doesn’t, Nagios
will adjust the next check time to coincide with the next valid time in
the specified timeperiod. This means that the host or service may not
get checked again for another hour, day, or week, etc.
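
The rescheduling rule described above can be sketched in a few lines.
This illustrates the documented behaviour only and is not Nagios's
actual code; the dictionary layout and the name `next_valid_time` are
assumptions. Ranges are minutes since midnight, keyed by weekday
(Monday = 0).

```python
from datetime import datetime, timedelta

# The 'workhours' timeperiod from this message: Monday-Friday, 09:00-17:00.
WORKHOURS = {day: [(9 * 60, 17 * 60)] for day in range(5)}


def next_valid_time(when, period):
    """Return `when` if it lies inside `period`, else the next valid start.

    Returns None if the period has no valid time in the following week.
    """
    midnight = when.replace(hour=0, minute=0, second=0, microsecond=0)
    for offset in range(8):  # today plus the next seven days
        day = midnight + timedelta(days=offset)
        for start, end in sorted(period.get(day.weekday(), [])):
            range_start = day + timedelta(minutes=start)
            range_end = day + timedelta(minutes=end)
            if range_start <= when < range_end:
                return when            # already inside a valid range
            if when < range_start:
                return range_start     # jump to the next valid range
    return None
```

With this sketch, a check that comes due on a Friday at 18:00 under
'workhours' is pushed to the following Monday at 09:00, which is the
"may not get checked again for another ... day" effect.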

Timeperiods:
Exclusions and Host/Service Checks - There is a bug in the service/host
check scheduling logic that rears its head when you use timeperiod
definitions that use the exclude directive. The problem occurs when
Nagios Core tries to re-schedule the next check. In this case, the
scheduling logic may incorrectly schedule the next check further out in
the future than it should. In essence, it skips over the (missing) logic
where it could determine an earlier possible time using the exception
times. Imperfect Solution: Don’t use timeperiod definitions that exclude
other timeperiods for your host/service check periods. A fix is being
worked on, and will hopefully make it into a 3.4.x release.


Kind regards,

-- Alex Peeters


___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


[Nagios-users] (no subject)

2010-08-23 Thread Alex Peeters


define service{
host_name   host_name
hostgroup_name  hostgroup_name
service_description service_description
display_name   display_name
servicegroups   servicegroup_names
is_volatile [0/1]
check_command   command_name
initial_state   [o,w,u,c]
max_check_attempts  #
check_interval  #
retry_interval  #
active_checks_enabled   [0/1]
passive_checks_enabled  [0/1]
check_period   timeperiod_name
obsess_over_service [0/1]
check_freshness [0/1]
freshness_threshold #
event_handler   command_name
event_handler_enabled   [0/1]
low_flap_threshold  #
high_flap_threshold #
flap_detection_enabled  [0/1]
flap_detection_options  [o,w,c,u]
process_perf_data 

Re: [Nagios-users] contactgroup definition

2010-08-23 Thread Assaf Flatto
You should get at least a warning for the missing contactgroups


jm+nagios-us...@roth.lu wrote:
 Is it normal that there is NO error on startup when I
 - in a contact definition
 - add a contactgroups directive
 - for a contactgroup that has not been defined?

 I only get an error when I try to use a non-existing contact group in a
 service definition. This makes me wonder what is going on.

 [FEATURE REQUEST] No matter how it finally works, it would be nice to be
 able to generate the groups on-the-fly as mentioned above, but with the
 possibility to view the resulting groups.

 Thanks
 JM
   


-- 
Never,Ever Cut A Deal With a Dragon 


Next year I will be doing the London to Paris bike ride to 
raise money for the DogTrust (www.dogtrust.co.uk) .
Please Sponsor me at http://www.justgiving.com/Assaf-Flatto




Re: [Nagios-users] Can NRPE Output Be Graphed

2010-08-23 Thread Assaf Flatto
Robert Jackson wrote:

 I’m in the process of setting up NRPE daemons on remote Linux servers 
 to enable me to monitor them. I’ve set-up a couple of checks (zombie 
 and total processes) as per the documentation. Everything is working 
 fine and Nagios reports correctly the numbers involved. I’m now 
 wondering if these checks can/will output to PNP4Nagios to enable them 
 to be graphed?

 Also what else can I use NRPE for to monitor Linux servers?


Hello Robert

To answer your first question: yes, PNP4Nagios will be able to graph the
data.

As for the second question, the short answer is: anything you want. NRPE
is only the middleman between the Nagios server and the plugins executed
on the remote machines, so it can check anything you write a plugin for
and pass the results back to Nagios.
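
A note on the first answer: PNP4Nagios graphs the performance data that
a plugin prints after the `|` character, so a custom plugin run through
NRPE has to emit it. Below is a minimal sketch of a plugin result in
that format; the function name, the label and the thresholds are made up
for illustration.

```python
OK, WARNING, CRITICAL = 0, 1, 2
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def format_result(label, value, warn, crit):
    """Build (exit_code, output_line) for a simple 'higher is worse' check.

    The text after '|' is performance data in the form
    label=value;warn;crit, which graphing add-ons can pick up.
    """
    if value >= crit:
        state = CRITICAL
    elif value >= warn:
        state = WARNING
    else:
        state = OK
    text = f"{STATE_NAMES[state]} - {value} {label}"
    perfdata = f"{label}={value};{warn};{crit}"
    return state, f"{text} | {perfdata}"
```

A real plugin would print the returned line and exit with the returned
code, so that Nagios gets both the state and the value to graph.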

Assaf



Re: [Nagios-users] Send_nsca problem

2010-08-23 Thread Assaf Flatto
Eric Anderson wrote:
 Hello,

 I have what I think is a very basic problem but I cannot seem to locate it. I 
 believe it to be a permissions issue.
 I'm attempting to use send_nsca to forward received syslog traffic to the 
 Nagios process.

 This article describes using syslog to forward to send_nsca and then to 
 a Nagios server running nsca:
 http://exchange.nagios.org/directory/Addons/Log-File-Management/Syslog%252Dng-Integration-Tool/details

 I'm attempting something similar except with a twist; the basic idea is this:
 1. Syslog receives messages from clients.
 2. A script parses syslog and sends the info to send_nsca process on the same 
 host.
 3. Send_nsca sends to nsca running on this host
 4. NSCA forwards to Nagios.

 I've successfully got Nagios and NSCA running. At this point I want to test 
 send_nsca with the following command:
 send_nsca locahost -c /usr/local/nagios/nsca-2.7.2/sample-config/  
 /home/nagios/test

 The test file contains:
 localhost<tab>TestMessage<tab>0<tab>This is a test message.<cr>

 After running this file, I get this in my /var/log/messages file:
 Aug 12 22:16:54 . nsca[25712]: Handling the connection...
 Aug 12 22:16:54 . nsca[25712]: SERVICE CHECK - Host name: 'localhost', 
 Service Description: 'Test Message', Return Code: '0', Output: 'This is a 
 test message.'
 Aug 12 22:16:54 . nsca[25712]: Command file 
 '/var/spool/nagios/cmd/nagios.cmd' does not exist, attempt to use alternate 
 dump file '/var/spool/nagios/cmd/nsca.dump' for output
 Aug 12 22:16:54 . nsca[25712]: Could not open alternate dump file 
 '/var/spool/nagios/cmd/nsca.dump' for appending
 Aug 12 22:16:54 . nsca[25712]: End of connection...

 Can anyone suggest where I may be going wrong here?

 NSCA.CFG
 nsca_user=nagios
 nsca_group=nagcmd
 nsca_chroot=/var/spool/nagios/cmd
 command_file=/var/spool/nagios/cmd/nagios.cmd
 alternate_dump_file=/var/spool/nagios/cmd/nsca.dump

 NAGIOS.CFG
 nagios_user=nagios
 nagios_group=nagios
 command_file=/var/spool/nagios/cmd/nagios.cmd

 If I ls -haltr on /var/spool/nagios/cmd I get the following:
 drwxr-xr-x 3 nagios nagios 4.0k 2009-11-10 11:09 ..
 prw-rw-r-- 1 nagios nagcmd 0 2010-08-12 15:16 nagios.cmd
 prw-rw-r-- 1 nagios nagcmd 0 2010-08-12 15:25 nsca.dump
   
Is nagios part of the nagcmd group?

Are you executing as root or as the nagios user?
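
For reference, send_nsca reads service check results from standard input
as one tab-separated line per result: host name, service description,
return code, plugin output. The helper below is a hypothetical sketch
that builds such a line, which makes it easier to generate a test file
like the one quoted above without stray characters.

```python
def nsca_service_line(host, service, return_code, output):
    """Build one tab-separated service check result for send_nsca's stdin."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1, 2 or 3")
    for field in (host, service, output):
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return f"{host}\t{service}\t{return_code}\t{output}\n"
```

Writing the returned string to a file and feeding that file to send_nsca
reproduces the test described in the quoted message.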



Re: [Nagios-users] single email alert to multiple contacts?

2010-08-23 Thread Parish, Brent
Agree totally!  

All alerts from Nagios go to the same post-processing script we built
and that's where they get shuffled off where they need to go, based on
user preferences.

We built a database and simple CGI interface (within Nagios pages).
Users click on the preferences link and subscribe to systems they are
interested in receiving alerts from.  They can then decide what email
address to send to, based on time of day, hostname, alert level, etc.

That takes virtually all the alert management off the Nagios maintainer
(me!) and allows people to modify their own contact information (e.g. at
work, send to instant messenger.  on vacation, send to phone.  At home,
send to home email. etc)

There are fall through rules that can optionally send to an admin
group mailbox (with appropriate verbiage in the alert message indicating
the fall through) if no one is subscribed to get the alert.

Finally, we built a quick & dirty reporting page that lists all
contacts for all services for all hosts, so we can glance through and
pin down gaps.
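
The routing described above might look roughly like the sketch below.
It is not the script from this message, which was not posted; the
subscription layout, the field names and the fallback address are all
hypothetical. Collecting recipients in a set also means that a person
who matches two subscriptions receives a single message.

```python
ADMIN_FALLBACK = "nagios-admins@example.com"  # hypothetical group mailbox


def recipients_for(alert, subscriptions):
    """Return (addresses, fell_through) for one alert.

    `alert` is a dict with 'host', 'level' and 'hour' (0-23).
    `subscriptions` is a list of dicts with 'host', 'levels',
    'hours' (start, end) and 'address'.
    """
    matched = set()
    for sub in subscriptions:
        start, end = sub["hours"]
        if (
            sub["host"] == alert["host"]
            and alert["level"] in sub["levels"]
            and start <= alert["hour"] < end
        ):
            matched.add(sub["address"])
    if matched:
        return sorted(matched), False
    # Nobody subscribed: fall through to the admin mailbox and flag it
    # so the message text can say so.
    return [ADMIN_FALLBACK], True
```

The `fell_through` flag corresponds to the "appropriate verbiage" in the
alert message when no one is subscribed.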




-Original Message-
From: Herb J. [mailto:nag...@herb-j.com] 
Sent: Friday, August 20, 2010 12:35 PM
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] single email alert to multiple contacts?

Issues like this are just one of the reasons why we had to abstract all
notifications out of Nagios into an external script. We have servers in
a number of different locations, different platform groupings,
escalation tiers, etc., as well as notifications sent by Jabber. They
had to go to different people, with different escalation tiers, in
different locations, who manage different groups of servers. With such a
variety of users receiving different emails, mailing lists were out of
the question. It got to the point where the processing of service check
data would be delayed by several seconds every time a notification
needed to be sent. If an entire rack of machines or a whole platform
went down, the check latency went through the roof due to all of the
delays.

The new system I put in place allows a single notification to be
generated by Nagios, and regardless of how many people are configured to
receive it (be it 1 or 50), there is no delay in Nagios and no need to
use distribution lists.

Of course, the down side of this method is that this system isn't 
possible without a fairly complex management interface (the same one we 
use to build all of the config files).


On 08/20/2010 11:49 AM, Charlie Reddington wrote:
 On Aug 20, 2010, at 10:23 AM, Scott Nottingham wrote:


 Does anyone know how (or if it is even possible) to configure nagios
 to send a single email to all contacts associated with the host/
 service/etc as opposed to a separate email to each contact?

 The problem I'm facing is with emailing distribution lists.  If both
 distribution_list_A and B contain user_A, said user ends up getting
 2 email for the same event.  If nagios could be configured to send a
 single email to both distribution lists, our exchange server would
 recognize that user_A is a member of both lists and send only 1
 email to him.

 Thanks in advance for any insight you can provide!
  
 Think of your Exchange server's mailing lists as buckets. Bucket A is
 list A with user A in it. Bucket B is list B with user A in it.

 Each bucket is going to get an email, and that email is going to get
 copied to its users.

 I don't think this way is going to be possible, unless you make
 another group, and put your groups in there. But I will bet that user
 A still gets 2 emails. But I can't say for certain, since it's been
 about 5 years since I used an Exchange server.

 I would probably pull user a out, and let him get contacted separately
 with nagios, instead of depending on a group list if it's a big deal.
 The down side is this doesn't scale very well.

 Charlie








Re: [Nagios-users] single email alert to multiple contacts?

2010-08-23 Thread Sean McAfee
Parish, Brent wrote:
 Agree totally!  
 
 All alerts from Nagios go to the same post-processing script we built
 and that's where they get shuffled off where they need to go, based on
 user preferences.

This sounds amazing!

Is there any chance this could be released to the community?

-- 
Sean McAfee
Senior Systems Engineer



Re: [Nagios-users] single email alert to multiple contacts?

2010-08-23 Thread Julian Hein



On 23.08.10 21:22, Sean McAfee (smca...@collaborativefusion.com) wrote:

 Parish, Brent wrote:
 Agree totally!  
 
 All alerts from Nagios go to the same post-processing script we built
 and that's where they get shuffled off where they need to go, based on
 user preferences.
 
 This sounds amazing!
 
 Is there any chance this could be released to the community?

You could look at NoMa (Notification Manager), which does the same:
https://www.netways.org/projects/noma/files

Julian

