Re: [Nagios-users-br] RES: Nagios em rede GRANDE, BEM GRANDE.
TCP is more reliable than UDP simply because it is connection-oriented. Put simply, TCP recovers lost packets within the protocol itself, whereas with UDP it is up to the application to detect the loss and recover the data. Packet loss happens with either protocol, though, and depending on the scale of the monitoring, using TCP to work around a network deficiency can bring other problems.

SNMP checks can be unstable not only because of the network; they can also fail if the agent itself is having problems. There are suitable techniques for each case, but if you do not have a reliable connection to the remote network, it is better to monitor from an agent on the local network and relay the results. Of course, we are talking here about a monitoring event (a poll), not an SNMP alarm (notification, trap).

So, for proper monitoring, we first need to find out whether the network is what causes the unwanted results, and then either fix the network or change the topology of the Nagios agent to work around it, for example by using NSCA. If the problem is more a matter of the SNMP application on the monitored host being slow, adjusting the timeout parameters can help.

Another point to consider is how many threads can run simultaneously. With an application that does not respond immediately, such as SNMP, it makes sense to allow as many simultaneous threads as possible, since an exchange with the monitored host takes much longer than, say, a ping. Since you are using the default value, there is no limit... but it is worth checking. It has to be zero. ;)

Regards,
Edgar

On 5 May 2010 12:19, Marcel mits...@gmail.com wrote:

Just change the snmpd configuration to listen on both UDP and TCP. TCP is more expensive, but no packets are lost!

2010/5/5 benedito.ra...@caixa.gov.br wrote:

Alexandre,

Thanks for the reply.
The problem is that, under the policy imposed by the company's security department, installing any file on the monitored client is not allowed. As far as I can tell, NRPE requires installing and configuring something on the client, right? Or am I wrong?

Diramos

-----Original Message-----
From: Alexandre Gorges [mailto:algor...@gmail.com]
Sent: Wednesday, 5 May 2010 09:53
To: Unofficial Brazilian (Portuguese) Nagios Users List
Subject: Re: [Nagios-users-br] Nagios em rede GRANDE, BEM GRANDE.

Benedito,

I used to have these timeout problems with SNMP too. Because it uses UDP, SNMP is very sensitive to small fluctuations in the network. I switched from SNMP to NRPE. That solved the problems completely, and it also enabled other kinds of checks on the systems and the use of event handlers to restart processes.

Regards,
Alexandre Gorges
http://www.google.com.br/profiles/algorges
MSN/Gtalk/iCHAT/Skype/Buzz: algor...@gmail.com
ICQ: 2031408

From: benedito.ra...@caixa.gov.br
Reply-To: Unofficial Brazilian (Portuguese) Nagios Users List nagios-users-br@lists.sourceforge.net
Date: Tue, 4 May 2010 18:23:13 -0300
To: nagios-users-br@lists.sourceforge.net
Subject: [Nagios-users-br] Nagios em rede GRANDE, BEM GRANDE.

Dear all,

I work at a VERY LARGE state-owned company, in terms of the number of servers and network devices. I need tips on tuning the Nagios parameters to monitor hosts and services at large scale.

So far I have used Nagios to monitor 700 servers and 2000 services at the branch where I work. I use Nagios Core 3.20, which has worked well for these numbers, on a machine with 4 processors and 4 GB of memory. All checks are done via SNMP, using the manubulon plugins from Nagios Exchange.

Now there is a demand to deploy Nagios at the other branches, some of which have many more hosts and services than mine. The largest has 2000 hosts and 6000 services. Note that there will be one Nagios server per branch.
At the largest branch, I added all 2000 hosts and 6000 services. Host checks are working fine, but in many cases the service checks return the message "Nagios check time-out". The machine has 16 processors and 16 GB of memory, so I do not believe the problem is insufficient hardware. I left the nagios.cfg parameters at their defaults.

In short:
- I need tips on how to improve Nagios performance.
- Are there specific parameters in nagios.cfg that should be changed to improve performance on large networks like mine?

I have been reading about host escalations and service escalations, but I did not understand much. Would they solve my problem?

Thanks in advance for any help.

Diramos

--
Nagios-users-br@lists.sourceforge.net mailing list
https://lists.sourceforge.net/lists/listinfo/nagios-users-br
Wiki: http://nagios-br.sf.net/wiki
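Edgar's two tuning points (no cap on parallel checks, and more generous timeouts for slow SNMP agents) map onto directives in nagios.cfg. He does not name the option that "has to be zero"; max_concurrent_checks is the likely candidate, since 0 there means no limit. What follows is a minimal sketch assuming Nagios Core 3.x. The values are illustrative starting points, not settings tested on this installation.

```
# nagios.cfg (Nagios Core 3.x), illustrative values only

# 0 = no limit on the number of service checks running in parallel
max_concurrent_checks=0

# seconds before a service check is killed and reported as timed out
# (raising it above the sample config's 60 is an assumption, not advice from the thread)
service_check_timeout=90

# how often, in seconds, finished check results are collected
check_result_reaper_frequency=5

# trades some features for lower per-check overhead on large installations;
# read the documentation for its side effects before enabling
use_large_installation_tweaks=1
```

Raising the timeout only hides slow agents, so it is probably worth running one of the failing SNMP plugins by hand first to see how long it really takes.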
Re: [Nagios-users-br] Ndo2db
Colleague, try changing your configuration to the settings below. If your database is on another server, change db_host=localhost to db_host=ipdasuabase (the IP address of your database); db_pass takes your password without quotes.

Good luck!
Cleiton Souza

*ndomod.cfg*
instance_name=default
output_type=tcpsocket
output=127.0.0.1
tcp_port=5668
output_buffer_items=5000
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=900
data_processing_options=-1
config_output_options=3

*ndo2db.cfg*
ndo2db_user=nagios
ndo2db_group=nagios
socket_type=tcp
socket_name=/var/run/ndo.sock
tcp_port=5668
db_servertype=mysql
db_host=localhost
db_name=nagios
db_port=3306
db_prefix=nagios_
db_user=nagios
db_pass=suasenhasemaspas
max_timedevents_age=1440
max_systemcommands_age=1440
max_servicechecks_age=1440
max_hostchecks_age=1440
max_eventhandlers_age=1440

2010/5/4 Maycon Sanches maycon...@gmail.com wrote:

Folks, I am having serious problems with NDO. I migrated my server and now NDO can no longer access the database (which is on another server). It returns the error:

[1272997227] ndomod: Successfully reconnected to data sink! 192 items lost, 5000 queued items to flush.
[1272997227] ndomod: Error writing to data sink! Some output may get lost. 4785 queued items to flush.

Here is my configuration (ndoutils-1.4b9):

*ndomod.cfg*
instance_name=default
output_type=unixsocket
output=/usr/local/nagios/var/ndo.sock
tcp_port=5668
use_ssl=1
output_buffer_items=5000
buffer_file=/usr/local/nagios/var/ndomod.tmp
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=15
data_processing_options=-1
config_output_options=2

*ndo2db.cfg*
lock_file=/usr/local/nagios/var/ndo2db.lock
ndo2db_user=nagios
ndo2db_group=nagios
socket_type=unix
socket_name=/usr/local/nagios/var/ndo.sock
tcp_port=5668
use_ssl=1
db_servertype=mysql
db_host=172.21.0.117
db_port=3306
db_name=nagios
db_prefix=nagios_
db_user=nagios
db_pass='senha'
max_timedevents_age=1440
max_systemcommands_age=10080
max_servicechecks_age=10080
max_hostchecks_age=10080
max_eventhandlers_age=44640
max_externalcommands_age=44640
debug_level=-1
debug_verbosity=1
debug_file=/usr/local/nagios/var/ndo2db.debug
max_debug_file_size=100

The Nagios server can reach MySQL normally with the command mysql -u nagios -h x.x.x.x -p

I ran several tests and searched the net, but without success. Does anyone have a suggestion?

Maycon Sanches Amaro
Re: [Nagios-users-br] Nagios e E-mail
Leandro,

Does the e-mail service allow connections via POP? You could develop a plugin that reads the e-mails.

Regards,
Rudolfo

2010/5/5 Leandro Prazeres Coelho leandro.coe...@sesisc.org.br wrote:

Gentlemen, good morning.

I have an e-mail account where I receive alerts from another monitoring system, Epicenter. Do you know of any plugin that would let Nagios read those e-mails, which contain information about the status of routers, and report that status in Nagios?

Thanks,
LEANDRO PRAZERES COELHO
Sistema FIESC
TIC - Unidade Integrada de Tecnologia da Informação e Comunicação
Rod. Admar Gonzaga, 2765 - Itacorubi - 88034-001 - Florianópolis - SC
Fone (48) 3231-4699 - Fax (48) 3231-4170
e-mail: leandro.coe...@sesisc.org.br - site: http://www.sistemafiesc.org.br
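Rudolfo's idea of a plugin that reads the mailbox over POP could look roughly like the sketch below. It is only an illustration under several assumptions: the server speaks POP3 over SSL and supports the optional TOP command, the newest message is the relevant one, and the alert state can be guessed from keywords in the subject. The keywords, the EPICENTER label and the function names are hypothetical, since the thread does not show what an Epicenter alert looks like.

```python
import poplib
import re
from email import message_from_bytes

# Standard Nagios plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = ("OK", "WARNING", "CRITICAL", "UNKNOWN")


def status_from_subject(subject):
    """Map an alert subject to a Nagios state (keywords are hypothetical)."""
    words = set(re.findall(r"[a-z]+", subject.lower()))
    if words & {"down", "critical"}:
        return CRITICAL
    if "warning" in words:
        return WARNING
    if words & {"up", "ok"}:
        return OK
    return UNKNOWN


def latest_subject(host, user, password):
    """Return the Subject of the newest message in the mailbox, or None if empty."""
    box = poplib.POP3_SSL(host)
    try:
        box.user(user)
        box.pass_(password)
        count, _size = box.stat()
        if count == 0:
            return None
        # TOP n 0 fetches only the headers of message n
        _resp, lines, _octets = box.top(count, 0)
        return str(message_from_bytes(b"\r\n".join(lines)).get("Subject", ""))
    finally:
        box.quit()


def main(host, user, password):
    """Print one line of plugin output and return the exit code."""
    subject = latest_subject(host, user, password)
    state = UNKNOWN if subject is None else status_from_subject(subject)
    print("EPICENTER %s - %s" % (LABELS[state], subject or "no messages"))
    return state
```

A wrapper script would call sys.exit(main(host, user, password)) with real credentials. Mapping one mailbox to many routers would need the router name parsed out of each message as well, which depends entirely on the alert format.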
Re: [Nagios-users-br] Ndo2db
Thanks for the tip, friend, but unfortunately it made no difference.

Does anyone know how to enable debugging? I added this to the configuration:

debug_verbosity=1
debug_file=/tmp/ndo2db.debug

It creates the file but writes nothing to it. The error continues:

[1273175217] ndomod: Successfully reconnected to data sink! 0 items lost, 4832 queued items to flush.
[1273175217] ndomod: Error writing to data sink! Some output may get lost. 4301 queued items to flush.

Maycon Sanches Amaro
Analista de Suporte Jr. Redes e Telecom
Informática de Municípios Associados S.A.
maycon.am...@ima.sp.gov.br - www.ima.sp.gov.br
Fone: (19) 19 3755 6550
Re: [Nagios-users-br] Ndo2db
Colleague, type in the shell:

/etc/init.d/ndo2db status

or

/etc/init.d/ndodaemon status

and send us the output.
[Nagios-users] Check for the IBM ds4000 series totalstorage cabinett - FIX
Hi,

after the latest firmware upgrade to an IBM DS4700, the check for the IBM DS4000 series TotalStorage cabinet stopped working. I made a small fix to the Perl plugin, just 2 lines. To get the plugin working with the latest CLI, make these substitutions:

Line 164. Change

if($line=~/Array status:\s*([^\s]*)/i) {

to

if(($line=~/Array status:\s*([^\s]*)/i) || ($line=~/Status:\s*([^\s]*)/i)) {

Line 167. Change

if(!defined($array_status) || !($array_status=~/online/i)) {

to

if(!defined($array_status) || !(($array_status=~/online/i) || ($array_status=~/optimal/i))) {

Done, now it should work. The latest upgrade changes the output of the CLI messages, which is why the script stopped working. The fix accepts both the old and the new status messages.

Hope it helps.

Cheers,
Giorgio Zarrelli

--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
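For anyone who wants to check the two patterns against their own CLI output before patching the plugin, the same matching logic can be sketched in Python. This is not part of the plugin (which is Perl); the function name and the sample lines in the usage are hypothetical, modelled only on the patterns in Giorgio's fix.

```python
import re

# Old firmware prints "Array status: ...", newer firmware "Status: ..."
STATUS_RE = re.compile(r"(?:Array status|Status):\s*(\S*)", re.IGNORECASE)
# Old healthy value "online", new healthy value "optimal"
HEALTHY_RE = re.compile(r"online|optimal", re.IGNORECASE)


def array_is_healthy(line):
    """True if the line carries a status field whose value looks healthy."""
    match = STATUS_RE.search(line)
    return bool(match and HEALTHY_RE.search(match.group(1)))


for sample in ("Array status: Online", "Status: Optimal", "Status: Degraded"):
    print(sample, "->", array_is_healthy(sample))
```

Note that the patched Perl condition is the negation of this function: it fires when the status is missing or not healthy.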
Re: [Nagios-users] check_disk plugin
With or without quotes, I get the same result :(

Dave

Date: Wed, 5 May 2010 16:12:49 +0200
From: Richard Lynch richard.ly...@rasmussen.edu
To: Nagios Users List nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] check_disk plugin

I would have wrapped the wildcard paths in quotes, but I have no idea if that's right or not...

-I /my/fist/.* -I /second/.*

On 5/5/10 5:49 AM, Davide Blasi davide.bl...@infracom.it wrote:

Hi list,

I have a question about the check_disk plugin. Running check_disk -h, I read:

[...]
-I, --ignore-eregi-path=PATH, --ignore-eregi-partition=PARTITION
    Regular expression to ignore selected path/partition (case insensitive) (may be repeated)
[...]

Good, it works fine :) But now I have to add another path to ignore. The help says "may be repeated", but if I write my check like this:

-I /my/fist/.* -I /second/.*

only the first occurrence works. I tried a single -I with a comma, colon, semicolon or space separating the paths, but nothing works :(

How can I specify more than one path to ignore?

Thank you in advance, and sorry for my bad English ;)

Dave
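One workaround that might be worth trying, since -I takes a regular expression, is a single -I whose pattern joins the paths with the alternation operator |, quoted so the shell does not interpret it. Whether check_disk accepts this is an untested assumption here, and nothing in the thread confirms it. The sketch below only shows how such a combined pattern behaves, using Python's re module as a stand-in for the plugin's case-insensitive matching; the helper name is hypothetical.

```python
import re


def combine_ignore_patterns(patterns):
    """Join several path regexes into one case-insensitive alternation."""
    return re.compile("|".join("(%s)" % p for p in patterns), re.IGNORECASE)


ignore = combine_ignore_patterns(["/my/fist/.*", "/second/.*"])
print("combined pattern:", ignore.pattern)

for path in ("/my/fist/data", "/SECOND/logs", "/var"):
    print(path, "ignored" if ignore.search(path) else "checked")
```

The combined string printed by the sketch is what would go inside the quotes after a single -I.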
Re: [Nagios-users] Check for the IBM ds4000 series totalstorage cabinett - FIX
Hi all,

I'm a Nagios newbie and I'm writing to ask for suggestions about how to structure my monitoring setup.

I have to monitor Linux servers for about 15 to 20 customers, with 1 to 5 servers per customer. We have no VPN to the customers, so these servers are all behind NAT. That isn't a problem, because we administer the firewall (another Linux server) and can manage any kind of DNAT and filter rule.

I read that the official documentation suggests the NSCA addon for distributed monitoring, but we chose the NRPE addon for several reasons:
- the customers force us to do so
- the number of monitored servers per customer will never grow
- the services to monitor are the same on every server (hardware/software RAID, disk usage, etc.)
- we need a completely centralized monitoring structure

For the last point, I plan to use the arguments option of NRPE (yes, I read the SECURITY document). To solve the NAT problem with NRPE, I'll do DNAT on the firewall and use the port parameter of the check_nrpe plugin. (Are there any problems with that? I did a few small tests, but I'd prefer confirmation.)

To manage this, I need a well-organized config file structure on the Nagios server. I thought of structuring it like this:

obj
|-- templatelinuxserversgeneral.cfg
|-- customer_1_directory
|   |-- templateserver.cfg
|   |-- server1.cfg
|   |-- server2.cfg
|   |-- servern.cfg
|-- customer_2_directory
    |-- templateserver.cfg
    |-- server1.cfg
    |-- servern.cfg

where:
- templatelinuxserversgeneral.cfg is a very basic template for a server
- customer_1_directory contains one file for each of the customer's servers
- templateserver.cfg uses templatelinuxserversgeneral and adds more specific variables common to that customer's servers, such as the public IP address, which is the same for all of that customer's servers
- servern.cfg holds the server-specific variables, such as the NRPE port (see above)

What do you think? How can I organize that service-server combination?

Thanks so much.

P.S. Sorry for my bad English.

--
Enrico Zimol
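The three-level layout described above could be expressed with object inheritance and a custom host variable for the per-server NRPE port. This is a sketch assuming Nagios Core 3.x; every object name, the address and the port number are hypothetical, and required directives such as contacts, check periods and notification settings are left out for brevity.

```
# obj/templatelinuxserversgeneral.cfg: base template
define host {
    name                    linux-server-general
    check_command           check-host-alive
    max_check_attempts      3
    register                0               ; template only, not a real host
}

# obj/customer_1_directory/templateserver.cfg: per-customer template
define host {
    name                    customer1-server
    use                     linux-server-general
    address                 203.0.113.10    ; customer's public IP (example value)
    register                0
}

# obj/customer_1_directory/server1.cfg: one real server
define host {
    use                     customer1-server
    host_name               customer1-server1
    _NRPEPORT               5661            ; port the firewall forwards to this server
}

# check_nrpe command that reads the port from the host's custom variable
define command {
    command_name            check_nrpe_port
    command_line            $USER1$/check_nrpe -H $HOSTADDRESS$ -p $_HOSTNRPEPORT$ -c $ARG1$
}
```

With a cfg_dir entry in nagios.cfg pointing at the obj directory, the .cfg files in the customer subdirectories should be picked up without listing each one.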
[Nagios-users] Distributed monitoring
Wrong thread before, sorry.

On 6 May 2010 11:55, Enrico Zimol lomiz.m...@gmail.com wrote:

[...]
[Nagios-users] High latency when 15% hosts offline
Hi,

I'm running Nagios Core 3.2.1.

Currently we have a network switch down, which means all hosts beneath that switch are unreachable: 42 out of a total of 336. In Nagios, the switch is set up as their parent. I have put the switch in scheduled downtime until we get a replacement, to prevent notifications from being sent.

I am finding that the service check latency is enormous and the scheduling queue is slipping behind. For example, it is now 11:04am, and the next check at the top of the scheduling queue should have run at 9:52am.

Here are the service metrics from the Performance Info page:

Metric                 Min        Max            Average
Check Execution Time   0.00 sec   30.19 sec      2.170 sec
Check Latency          0.00 sec   13612.54 sec   7025.395 sec
Percent State Change   0.00%      17.37%         0.50%

Are there any ways I can reduce this latency, other than disabling active checks on all the unreachable hosts? Or are there any 'parallel' check settings I may have misconfigured?

I'm happy to provide any other info.

Thanks for any help,
Kristian
Re: [Nagios-users] Check for the IBM ds4000 series totalstorage cabinett - FIX
Hi,

yes, you can. I split all the major files into smaller files and nested directories, so it's easier for me to manage all the services and hosts. And I assure you, my configuration is heavily split.
Re: [Nagios-users] Different contacts for services on same host
Sebastian Gosenheimer wrote:

Hi everybody,

I'm having some problems with my contacts and contactgroups. Let's say I have a contactgroup xy and a contact ab. I set up a host with the contactgroup xy, and some services on this host also with the contactgroup xy. Now I have a special service on this host for which I want notifications sent only to the contact ab. Nagios sends the notification e-mails to the contact ab correctly, but it also sends them to the contactgroup xy. Can someone tell me what I'm missing?

Thank you for your help!

Kind regards,
--sg

Is the service using any templates? Does the template have a contact_groups directive in it? You need to override any template or wider definition that covers the service, and put this directive in the service-specific definition:

contacts ab

This tells the service to notify only that specific contact.

Assaf

--
Never, Ever Cut A Deal With a Dragon
I am doing a charity bike ride on the 27th of June for the Capital to Coast Charity. Please help by donating: http://www.justgiving.com/Lovefilm-capital-to-coast
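Assaf's advice could look like the service definition below. It is a sketch assuming Nagios Core 3.x, where a string directive inherited from a template can be cancelled by setting it to null. The template name, host name, service description and check command are hypothetical.

```
define service {
    use                     generic-service   ; hypothetical template that carries contact_groups xy
    host_name               host1
    service_description     Special Service
    check_command           check_special
    contacts                ab                ; notify only this contact
    contact_groups          null              ; cancel the contact group inherited from the template
}
```

Whether the null line is needed depends on where xy actually comes from, so it is worth checking the resolved service definition after reloading Nagios.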
[Nagios-users] nagios.cmd does not seem to work properly
Hello,

We'd like to use the nagios.cmd pipe to send some commands. According to the examples in the documentation, we can do something as simple as this:

/bin/printf "[%lu] ACKNOWLEDGE_HOST_PROBLEM;host1;1;1;1;Some One;Some Acknowledgement Comment\n" $now > $commandfile

The problem is that it does not seem to be taken into account: nothing is logged in nagios.log, no message is printed in the shell, and, moreover, Nagios doesn't seem to do anything.

I tested with an error inside the command, and it does seem to be parsed:

[1273156151] Warning: Unrecognized external command - SCHEDULE_HOST_SVC_CHECKs;fqdn;1273156151

but nothing else happens. Is there a way to be sure that Nagios does what we ask?

The main points are:
- we're using the NSCA plugin, so we have about 80 servers, each with its own Nagios, sending results and status to a single host
- we tried to run this command only on the remote hosts; maybe we have to run it on the NSCA server?

Any help is welcome. Thank you in advance!

Best regards,
C.

--
Cédric Jeanneret | System Administrator
021 619 10 32 | Camptocamp SA
cedric.jeanne...@camptocamp.com | PSE-A / EPFL
Re: [Nagios-users] nagios.cmd does not seem to work properly
Enable the Nagios debug log (see the bottom of the nagios.cfg file) and test your command again. Be aware that the debug file fills up and grows quite fast, so make sure you have enough disk space to capture the input you want.

From what I know, external commands are logged in the Nagios log; again, check the option in nagios.cfg.

Assaf
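To rule out a malformed command line while testing, the line written to the command file can be built programmatically. The sketch below follows the format of the ACKNOWLEDGE_HOST_PROBLEM example quoted in the thread; the helper names are hypothetical, and the path of the command file depends on the installation. It also assumes that check_external_commands=1 is set in nagios.cfg, and that log_external_commands=1 is set if the commands should show up in nagios.log.

```python
import time


def format_external_command(name, *args, now=None):
    """Build one command-file line: [timestamp] NAME;arg1;arg2;..."""
    timestamp = int(time.time() if now is None else now)
    fields = [name] + [str(arg) for arg in args]
    return "[%d] %s\n" % (timestamp, ";".join(fields))


def submit(command_file, line):
    """Write one command line to the Nagios command pipe."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


line = format_external_command(
    "ACKNOWLEDGE_HOST_PROBLEM", "host1", 1, 1, 1,
    "Some One", "Some Acknowledgement Comment",
    now=1273156151,  # fixed timestamp so the output is reproducible
)
print(line, end="")
```

A call such as submit("/usr/local/nagios/var/rw/nagios.cmd", line) would then send it, with the path adjusted to the local install. In a distributed setup the command presumably has to reach the Nagios instance whose state should change, which matches Cedric's guess about the central server.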
Re: [Nagios-users] high host latency on nagios master
Try lowering the max_check_result_reaper value. I had good luck playing with that value.

Thanks

On Tue, May 4, 2010 at 8:13 PM, Trisha Hoang tri...@rockyou.com wrote:

Hi,

The nagios *master* got really high host latency and I'm not sure how to tweak it. I ran the check_ping plugin on a handful of hosts and the rta averaged at 0.2 second, so it's not the network.

*Environment:*
- 565 hosts
- 6790 passive checks from the slaves
- not using event broker
- master server *actively* executes the host checks every 5 minutes and *passively* processes checks every 1 minute
- not doing performance data

*Nagiostats*

Nagios Stats 3.2.1
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 03-09-2010
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File:                            /var/log/nagios/status.dat
Status File Age:                        0d 0h 0m 23s
Status File Version:                    3.2.1

Program Running Time:                   0d 1h 32m 19s
Nagios PID:                             28282
Used/High/Total Command Buffers:        1316 / 3066 / 4096

Total Services:                         7745
Services Checked:                       7745
Services Scheduled:                     1381
Services Actively Checked:              955
Services Passively Checked:             6790
Total Service State Change:             0.000 / 9.740 / 0.007 %
Active Service Latency:                 18.948 / 205.144 / 165.751 sec
Active Service Execution Time:          0.007 / 9.051 / 0.055 sec
Active Service State Change:            0.000 / 5.460 / 0.006 %
Active Services Last 1/5/15/60 min:     0 / 0 / 0 / 0
Passive Service Latency:                34.359 / 190.247 / 76.739 sec
Passive Service State Change:           0.000 / 9.740 / 0.008 %
Passive Services Last 1/5/15/60 min:    0 / 3054 / 6774 / 6784
Services Ok/Warn/Unk/Crit:              7720 / 1 / 0 / 24
Services Flapping:                      27
Services In Downtime:                   0

Total Hosts:                            566
Hosts Checked:                          566
Hosts Scheduled:                        566
Hosts Actively Checked:                 566
Host Passively Checked:                 0
Total Host State Change:                0.000 / 0.000 / 0.000 %
Active Host Latency:                    0.000 / 3410.087 / 2413.051 sec
Active Host Execution Time:             0.007 / 10.010 / 0.063 sec
Active Host State Change:               0.000 / 0.000 / 0.000 %
Active Hosts Last 1/5/15/60 min:        0 / 8 / 10 / 565
Passive Host Latency:                   0.000 / 0.000 / 0.000 sec
Passive Host State Change:              0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min:       0 / 0 / 0 / 0
Hosts Up/Down/Unreach:                  563 / 3 / 0
Hosts Flapping:                         1
Hosts In Downtime:                      0

Active Host Checks Last 1/5/15 min:     5 / 32 / 75
   Scheduled:                           0 / 0 / 0
   On-demand:                           5 / 32 / 75
   Parallel:                            1 / 11 / 23
   Serial:                              0 / 0 / 0
   Cached:                              4 / 21 / 52
Passive Host Checks Last 1/5/15 min:    0 / 0 / 0
Active Service Checks Last 1/5/15 min:  0 / 0 / 0
   Scheduled:                           0 / 0 / 0
   On-demand:                           0 / 0 / 0
   Cached:                              0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 2 / 1455 / 1455

External Commands Last 1/5/15 min:      1302 / 6063 / 20253

*Nagios.cfg*

# EXTERNAL COMMAND CHECK INTERVAL
# This is the interval at which Nagios should check for external commands.
# This value works of the interval_length you specify later. If you leave
# that at its default value of 60 (seconds), a value of 1 here will cause
# Nagios to check for external commands every minute. If you specify a
# number followed by an "s" (i.e. 15s), this will be interpreted to mean
# actual seconds rather than a multiple of the interval_length variable.
# Note: In addition to reading the external command file at regularly
# scheduled intervals, Nagios will also check for external commands after
# event handlers are executed.
# NOTE: Setting this value to -1 causes Nagios to check the external
# command file as often as possible.

#command_check_interval=15s
command_check_interval=-1

# SERVICE INTER-CHECK DELAY METHOD
# This is the method that Nagios should use when initially
# "spreading out" service checks when it starts monitoring. The
# default is to use smart delay calculation, which will try to
# space all service checks out evenly to minimize CPU load.
# Using the "dumb" setting will cause all checks to be scheduled
# at the same time (with no delay between them)! This is not a
# good thing for production, but is useful when testing the
# parallelization functionality.
# n =
Re: [Nagios-users] nagios.cmd does not seem to work properly
Cedric Jeanneret wrote:

Hello,

We'd like to use the nagios.cmd pipe to send some signals. According to the examples in the documentation, we can do a simple thing like this:

/bin/printf "[%lu] ACKNOWLEDGE_HOST_PROBLEM;host1;1;1;1;Some One;Some Acknowledgement Comment\n" $now > $commandfile

The problem is that it doesn't seem to be taken into account: nothing is logged in nagios.log, no message is printed in the shell, and moreover Nagios doesn't seem to do anything.

I tested with an error inside the command, and it seems to be parsed:

[1273156151] Warning: Unrecognized external command - SCHEDULE_HOST_SVC_CHECKs;fqdn;1273156151

but nothing else happens. Is there a way to be sure that Nagios does what we ask?

The main thing is:
- we're using the NSCA plugin, so we have about 80 servers, each with its own Nagios, sending results and status to a single host
- we tried to run this command only on the remote hosts
- maybe we have to run it on the NSCA server?

You do, of course, have to run this on the Nagios server, since that's where the pipe is. The message in your log seems to indicate you typo'd the SCHEDULE_HOST_SVC_CHECKS command.

If you're using NSCA, your pipe works: it submits commands to Nagios the same way.
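As a rough illustration of the point above (the command has to be written to the command file on the Nagios server itself), here is a small shell sketch. The function name submit_command and the NAGIOS_CMD_FILE variable are hypothetical, and the default path is only an assumption; use whatever command_file is set to in your nagios.cfg.

```shell
#!/bin/sh
# Sketch only: submit_command and NAGIOS_CMD_FILE are made-up names.
# The default path is an assumption; check command_file in nagios.cfg.
commandfile="${NAGIOS_CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Usage: submit_command 'COMMAND_NAME;arg1;arg2;...'
submit_command() {
    now=$(date +%s)
    # The command must be passed as ONE quoted argument: an unquoted ';'
    # is a shell command separator and would cut the line short.
    printf '[%s] %s\n' "$now" "$1" > "$commandfile"
}
```

Run on the Nagios server, a call such as submit_command 'SCHEDULE_HOST_SVC_CHECKS;host1;1273156151' writes one timestamped line to the pipe. A misspelled command name should then show up in nagios.log as an "Unrecognized external command" warning, as in the log excerpt quoted above.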
Re: [Nagios-users] check_disk plugin
Davide Blasi wrote:

with or without quotes give me the same result :(

Try using single quotes, e.g. -I '/my/fist/.*' -I '/second/.*'
Re: [Nagios-users] check_disk plugin
Aidan Anderson wrote:

Davide Blasi wrote:

with or without quotes give me the same result :(

Try using single quotes, e.g. -I '/my/fist/.*' -I '/second/.*'

No, it doesn't work :(

But I don't think it is a quoting problem. If I invert the path order, the check correctly accepts the first argument but ignores the others.

Dave
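One way to rule quoting in or out, sketched here with a hypothetical helper called show_args, is to print exactly what the shell hands over to a command. If each pattern arrives intact, the shell is not the culprit and the behaviour lies in how the plugin handles repeated -I options.

```shell
#!/bin/sh
# Sketch only: show_args is a made-up debugging helper.
# It prints every argument it receives on its own line, in brackets,
# so word splitting or glob expansion by the shell becomes visible.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}
```

Calling show_args -I '/my/fist/.*' -I '/second/.*' prints four lines: [-I], [/my/fist/.*], [-I] and [/second/.*]. Repeating the call with the same quoting used in the Nagios command definition shows whether the patterns reach the plugin unchanged.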
[Nagios-users] Meaningful subject lines in e-mail alert
Greetings,

We are sending e-mail alerts on host/service state changes. I was just wondering what you guys use as subject lines; I'm just looking around for ideas.

Thank you in advance.