Re: [Nagios-users-br] RES: Nagios on a LARGE, VERY LARGE network.

2010-05-06 Thread Shine
TCP is more reliable than UDP simply because it is a connection-oriented
protocol. Put simply, TCP recovers lost packets within the protocol
itself, whereas with UDP it is up to the application to detect the loss
and recover the information.
Packet loss still happens with either protocol, though. And depending on
the scale of the monitoring, using TCP to work around a network
deficiency can bring other problems.

SNMP can be flaky not only because of the network; it can also fail when
the agent itself has problems. There are suitable techniques for each
case, but if you do not have a reliable connection over the remote
network, it is better to run the checks from an agent on the local
network and relay the results. Of course, we are talking here about a
monitoring event (a check), not an SNMP alarm (notification, trap).

So, for proper monitoring, we first need to find out whether the network
is what causes the unwanted results, and then either fix the network or
change the topology of the Nagios agent to work around it, for example by
using NSCA. If the problem is instead a slow SNMP agent on the monitored
host, adjusting the timeout parameters may help.

Another point to consider is how many threads can run simultaneously.
With an application that does not respond very quickly, such as SNMP, it
makes sense to allow as many simultaneous threads as possible, since the
interaction with the monitored host takes much longer than, say, a ping.
Since you use the default value, there should be no limit... but it is
worth checking. It has to be zero. ;)
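
A quick way to check both points, sketched in shell. This assumes a
plain key=value nagios.cfg; the helper name cfg_value and the path in
the usage comment are illustrative and do not come from this thread.
max_concurrent_checks and service_check_timeout are the nagios.cfg
directives for the parallel-check limit and the service check timeout.

```shell
#!/bin/sh
# Print the value of a "key=value" directive from a Nagios-style config
# file. Comment lines are skipped; if the key repeats, the last one wins.
cfg_value() {
    key=$1
    file=$2
    awk -v k="$key" '
        /^[[:space:]]*#/ { next }
        {
            pos = index($0, "=")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            val  = substr($0, pos + 1)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", name)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", val)
            if (name == k) found = val
        }
        END { if (found != "") print found }
    ' "$file"
}

# Usage (the path is an assumption; adjust it to your installation):
#   cfg_value max_concurrent_checks /usr/local/nagios/etc/nagios.cfg
#   cfg_value service_check_timeout /usr/local/nagios/etc/nagios.cfg
```

If the first call prints 0, Nagios is not capping the number of
parallel checks; the second shows the timeout that applies to slow
SNMP answers.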

sd,
Edgar

On 5 May 2010 at 12:19, Marcel mits...@gmail.com wrote:
 Just change the snmpd configuration to listen on both UDP and TCP; it is
 more expensive, but there is no packet loss!

 2010/5/5 benedito.ra...@caixa.gov.br

 Alexandre,

 Thank you for the reply.
 The problem is that the policy imposed by the company's security
 department does not allow any file to be installed on the monitored client.
 As far as I can tell, for NRPE to work the client has to be installed and
 configured, right?
 Or am I wrong?

 Diramos

 -Original message-
 From: Alexandre Gorges [mailto:algor...@gmail.com]
 Sent: Wednesday, 5 May 2010 09:53
 To: Unofficial Brazilian (Portuguese) Nagios Users List
 Subject: Re: [Nagios-users-br] Nagios on a LARGE, VERY LARGE network.

 Benedito,

 I used to have these timeout problems with SNMP too. Because it uses UDP,
 SNMP is very sensitive to small fluctuations in the network.

 I switched from SNMP to NRPE. The problems were completely solved, and it
 also allowed other kinds of checks on the systems and the use of event
 handlers to restart processes.


 []'s
 Alexandre Gorges
 http://www.google.com.br/profiles/algorges
 MSN/Gtalk/iCHAT/Skype/Buzz: algor...@gmail.com
 ICQ: 2031408




  From: benedito.ra...@caixa.gov.br
  Reply-To: Unofficial Brazilian (Portuguese) Nagios Users List
  nagios-users-br@lists.sourceforge.net
  Date: Tue, 4 May 2010 18:23:13 -0300
  To: nagios-users-br@lists.sourceforge.net
  Subject: [Nagios-users-br] Nagios on a LARGE, VERY LARGE network.
 
  Dear all,

  I work at a VERY LARGE state-owned company in terms of the number of
  servers and network devices.
  I need tips on tuning the Nagios parameters to monitor hosts and
  services at large scale.
  Until now I have used Nagios to monitor 700 servers and 2000 services at
  the branch where I work.
  I use Nagios Core 3.20, which has worked well for these numbers.
  The machine has 4 processors and 4 GB of memory.
  All checks are done via SNMP, using the manubulon plugins from
  nagiosexchange.

  Now there is a demand to deploy Nagios at the other branches, some of
  which have many more hosts and services than mine.
  The largest has 2000 hosts and 6000 services.
  Note that there will be one Nagios server per branch.

  At the largest branch, I added all 2000 hosts and 6000 services.
  Host checks are working OK.
  But the service checks show the message "Nagios check time-out" in many
  cases.
  The machine has 16 processors and 16 GB of memory.
  So I do not believe the hardware is insufficient.

  I left the nagios.cfg parameters at their defaults.

  In short:

  - I need tips on how to improve Nagios performance.
  - Are there specific parameters in nagios.cfg that should be changed to
  improve performance on large networks like mine?

  I have been reading about host escalations and service escalations, but
  I did not understand much.
  Would they solve my problem?

  Thanks in advance for any help.
 
 
  Diramos
 
 
 
 
 --
  --
  Nagios-users-br@lists.sourceforge.net mailing list
  https://lists.sourceforge.net/lists/listinfo/nagios-users-br
  Wiki: http://nagios-br.sf.net/wiki




 

Re: [Nagios-users-br] Ndo2db

2010-05-06 Thread Cleiton Souza
Colleague, try replacing your settings with the ones below. If your
database is on another server, change db_host=localhost to
db_host=ip_of_your_database

Good luck!

Cleiton Souza

*ndomod.cfg*
instance_name=default
output_type=tcpsocket
output=127.0.0.1
tcp_port=5668
output_buffer_items=5000
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=900
data_processing_options=-1
config_output_options=3



*ndo2db.cfg*
ndo2db_user=nagios
ndo2db_group=nagios
socket_type=tcp
socket_name=/var/run/ndo.sock
tcp_port=5668
db_servertype=mysql
db_host=localhost
db_name=nagios
db_port=3306
db_prefix=nagios_
db_user=nagios
db_pass=yourpasswordwithoutquotes
max_timedevents_age=1440
max_systemcommands_age=1440
max_servicechecks_age=1440
max_hostchecks_age=1440
max_eventhandlers_age=1440
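
Whichever transport is chosen, the two files have to agree with each
other: output_type=tcpsocket goes with socket_type=tcp and the same
tcp_port, and output_type=unixsocket goes with socket_type=unix and the
same socket path. A minimal shell sketch of that sanity check follows.
The function names get and ndo_endpoints_match are made up for
illustration, and the check only compares the settings shown in this
thread; it does not talk to ndo2db or to the database.

```shell
#!/bin/sh
# Print the value of "key=value" from a config file (last occurrence wins).
get() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# $1 = ndomod.cfg, $2 = ndo2db.cfg
# Succeeds only if both files describe the same kind of socket and the
# same endpoint (socket file or TCP port).
ndo_endpoints_match() {
    out=$(get output_type "$1")
    sock=$(get socket_type "$2")
    case "$out:$sock" in
        unixsocket:unix)
            # Both sides must name the same socket file.
            [ "$(get output "$1")" = "$(get socket_name "$2")" ] ;;
        tcpsocket:tcp)
            # Both sides must use the same TCP port.
            [ "$(get tcp_port "$1")" = "$(get tcp_port "$2")" ] ;;
        *)
            echo "mismatch: output_type=$out, socket_type=$sock" >&2
            return 1 ;;
    esac
}

# Usage (paths are assumptions):
#   ndo_endpoints_match /usr/local/nagios/etc/ndomod.cfg \
#                       /usr/local/nagios/etc/ndo2db.cfg && echo consistent
```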

2010/5/4 Maycon Sanches maycon...@gmail.com

 Folks, I am having serious problems with NDO. I migrated my server and
 now NDO can no longer access the database (it is on another server); it
 returns this error:

 [1272997227] ndomod: Successfully reconnected to data sink!  192 items lost, 5000 queued items to flush.
 [1272997227] ndomod: Error writing to data sink!  Some output may get lost. 4785 queued items to flush.

 Here are my configs:

 ndoutils-1.4b9

 *ndomod.cfg*
 instance_name=default
 output_type=unixsocket
 output=/usr/local/nagios/var/ndo.sock
 tcp_port=5668
 use_ssl=1
 output_buffer_items=5000
 buffer_file=/usr/local/nagios/var/ndomod.tmp
 file_rotation_interval=14400
 file_rotation_timeout=60
 reconnect_interval=15
 reconnect_warning_interval=15
 data_processing_options=-1
 config_output_options=2

 *ndo2db.cfg*
 lock_file=/usr/local/nagios/var/ndo2db.lock
 ndo2db_user=nagios
 ndo2db_group=nagios
 socket_type=unix
 socket_name=/usr/local/nagios/var/ndo.sock
 tcp_port=5668
 use_ssl=1
 db_servertype=mysql
 db_host=172.21.0.117
 db_port=3306
 db_name=nagios
 db_prefix=nagios_
 db_user=nagios
 db_pass='senha'
 max_timedevents_age=1440
 max_systemcommands_age=10080
 max_servicechecks_age=10080
 max_hostchecks_age=10080
 max_eventhandlers_age=44640
 max_externalcommands_age=44640
 debug_level=-1
 debug_verbosity=1
 debug_file=/usr/local/nagios/var/ndo2db.debug
 max_debug_file_size=100

 The Nagios server can reach MySQL normally with the command
 mysql -u nagios -h x.x.x.x -p

 I have run several tests and searched the net, but without success.
 Does anyone have a suggestion?

 _
 Maycon Sanches Amaro

 Before printing, remember your commitment to the environment and the
 cost you can avoid.



Re: [Nagios-users-br] Nagios and E-mail

2010-05-06 Thread Rudolfo Rosa
Leandro,
Does the e-mail service allow connections via POP?

You can develop a plugin that reads the e-mails.

Regards, Rudolfo.
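
The core of such a plugin is small: turn the text of an alert into one
of the Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL,
3 UNKNOWN). The sketch below shows only that step. The keywords are
placeholders, because the actual format of the Epicenter e-mails is not
shown in this thread, and fetching the message over POP is left out
entirely; status_from_alert is a hypothetical name.

```shell
#!/bin/sh
# Map one line of alert text to a Nagios plugin status line and exit code.
# The keywords below are assumptions; adapt them to the real e-mails.
status_from_alert() {
    case $(printf '%s' "$1" | tr '[:lower:]' '[:upper:]') in
        *DOWN*|*CRITICAL*) echo "CRITICAL - $1"; return 2 ;;
        *WARNING*)         echo "WARNING - $1";  return 1 ;;
        *UP*|*OK*)         echo "OK - $1";       return 0 ;;
        *)                 echo "UNKNOWN - $1";  return 3 ;;
    esac
}

# Usage inside a plugin, once the subject line has been fetched:
#   status_from_alert "$subject"; exit $?
```

Substring matching this loose will misfire on real text (for example
"UP" inside another word), so a production plugin would want patterns
tied to the actual message layout.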

2010/5/5 Leandro Prazeres Coelho leandro.coe...@sesisc.org.br

 Gentlemen, good morning.

 I have an e-mail account where I receive alerts from another monitoring
 system, Epicenter. Do you know of any plugin that would let Nagios read
 those e-mails, which contain information about the status of routers,
 and report that status in Nagios?

 Thanks,


 
 LEANDRO PRAZERES COELHO
 Sistema FIESC
 TIC - Unidade Integrada de Tecnologia da Informação e Comunicação
 Rod. Admar Gonzaga, 2765 - Itacorubi - 88034-001 - Florianópolis - SC
 Fone (48) 3231-4699 - Fax (48) 3231-4170
 e-mail: leandro.coe...@sesisc.org.br - site:
 http://www.sistemafiesc.org.br








Re: [Nagios-users-br] Ndo2db

2010-05-06 Thread Maycon Sanches
Thanks for the tip, friend, but unfortunately no luck. Does anyone know
how to enable debugging?
I added this to the configuration:
debug_verbosity=1
debug_file=/tmp/ndo2db.debug
It creates the file but writes nothing to it.


The error continues:
[1273175217] ndomod: Successfully reconnected to data sink!  0 items lost, 4832 queued items to flush.
[1273175217] ndomod: Error writing to data sink!  Some output may get lost. 4301 queued items to flush.
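
One thing worth checking about the empty debug file: the two lines
quoted above do not include debug_level. To the best of my knowledge
ndo2db writes debug output only when debug_level is non-zero (-1 means
everything), so a missing or zero debug_level would explain an empty
file. That reading is an assumption about ndo2db, not something
confirmed in this thread. A small shell sketch to test for it
(ndo2db_debug_enabled is a made-up name):

```shell
#!/bin/sh
# $1 = path to ndo2db.cfg
# Succeeds if debug_level is set to something other than 0.
ndo2db_debug_enabled() {
    level=$(sed -n 's/^[[:space:]]*debug_level[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1)
    case "${level:-0}" in
        0) echo "debug_level is 0 or unset: no debug output expected"
           return 1 ;;
        *) echo "debug_level=$level"
           return 0 ;;
    esac
}

# Usage (the path is an assumption):
#   ndo2db_debug_enabled /usr/local/nagios/etc/ndo2db.cfg
```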

_
Maycon Sanches Amaro
Junior Support Analyst
Networks and Telecom
Informática de Municípios Associados S.A.
Your smarter government
maycon.am...@ima.sp.gov.br  - www.ima.sp.gov.br
Phone: (19) 19 3755 6550

Before printing, remember your commitment to the environment and the
cost you can avoid.


2010/5/5 Cleiton Souza cleiton.bra...@gmail.com

 [...]


Re: [Nagios-users-br] Ndo2db

2010-05-06 Thread Cleiton Souza
Colleague, type this in the shell:

/etc/init.d/ndo2db status
or
/etc/init.d/ndodaemon status
and send us what it shows.
2010/5/6 Maycon Sanches maycon...@gmail.com

 [...]


[Nagios-users] Check for the IBM ds4000 series totalstorage cabinett - FIX

2010-05-06 Thread Giorgio Zarrelli
Hi,

After the latest firmware upgrade to an IBM DS4700, the check for the IBM
DS4000 series TotalStorage cabinet stopped working.

I made a small fix to the Perl plugin, just 2 lines. To make the plugin
work with the latest CLI, make these substitutions:

Line 164

Replace

if($line=~/Array status:\s*([^\s]*)/i) {

with

if(($line=~/Array status:\s*([^\s]*)/i) || ($line=~/Status:\s*([^\s]*)/i)) {


Line 167

Replace

if(!defined($array_status) || !($array_status=~/online/i)) {

with

if(!defined($array_status) || !(($array_status=~/online/i) ||
($array_status=~/optimal/i))) {

Done, now it should work. The latest upgrade changes the output of the
CLI messages, which is why the script stopped working. The fix handles
both the old and the new status messages.
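
To try the idea without the storage CLI at hand, the same match can be
approximated in shell. The two sample lines in the usage comment are
inferred from the regular expressions above rather than copied from
real CLI output, and array_is_healthy is a made-up name. Note that this
variant is slightly stricter than the Perl: it requires "online" or
"optimal" to come right after "status:".

```shell
#!/bin/sh
# Succeed when a CLI output line reports a healthy array, in either the
# old wording ("Array status: Online") or the new one ("Status: Optimal").
array_is_healthy() {
    printf '%s\n' "$1" | grep -Eiq 'status:[[:space:]]*(online|optimal)'
}

# Usage:
#   array_is_healthy "Array status: Online" && echo OK
#   array_is_healthy "Status: Optimal"      && echo OK
```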

Hope it helps

Cheers,

Giorgio Zarrelli


--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] check_disk plugin

2010-05-06 Thread Davide Blasi

With or without quotes, I get the same result :(

Dave
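
A possible workaround, untested against this plugin version: pass a
single -I with both paths joined by alternation. This assumes check_disk
compiles the -I pattern as a POSIX extended regular expression, which I
have not verified for the version in question. The shell sketch below
only demonstrates that such a combined pattern matches both paths;
ignored is a made-up helper name.

```shell
#!/bin/sh
# Succeed if a path matches the combined, case-insensitive ignore pattern.
ignored() {
    printf '%s\n' "$1" | grep -Eiq '^(/my/fist|/second)/.*'
}

# The corresponding plugin call would look like this (thresholds are
# placeholders; the single quotes keep the shell away from | and *):
#   check_disk -w 10% -c 5% -I '^(/my/fist|/second)/.*'
```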

Date: Wed, 5 May 2010 16:12:49 +0200
From: Richard Lynch richard.ly...@rasmussen.edu
To: Nagios Users List nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] check_disk plugin


I would have wrapped the wildcard paths in quotes, but I have no idea if
that's right or not...

-I /my/fist/.* -I /second/.*



On 5/5/10 5:49 AM, Davide Blasi davide.bl...@infracom.it wrote:

 
 Hi list,
 
 I have a question about check_disk plugin.
 Running check_disk -h I read :
 
 [...]
  -I, --ignore-eregi-path=PATH, --ignore-eregi-partition=PARTITION
 Regular expression to ignore selected path/partition (case insensitive)
 (may be repeated)
 [...]
 
 Good, it is working fine :)
 
 But now I have to add another path to ignore.
 The help says "may be repeated", but if I run my check like this:  -I
 /my/fist/.* -I /second/.*  only the first occurrence works.
 I tried a single -I with a comma, colon, semicolon or space separating
 the paths, but nothing works :(
 
 How can I specify more than one path to ignore?
 
 Thank you in advance and sorry for my bad English ;)
 
 Dave
 
 


Re: [Nagios-users] Check for the IBM ds4000 series totalstorage cabinett - FIX

2010-05-06 Thread Enrico Zimol
Hi all,
I'm a Nagios newbie and I'm writing to ask for suggestions about how to
structure my monitoring setup.
I have to monitor Linux servers for about 15 to 20 customers, with 1 to 5
servers per customer.
We have no VPN to the customers, so these servers are all behind NAT.
That isn't a problem, because we administer the firewalls (other Linux
servers), so we can manage any kind of DNAT and filter rule.

I read that the official documentation suggests the NSCA addon for
distributed monitoring, but we chose the NRPE addon for several
reasons:
-customers force us to do that
-the number of monitored servers per customer will never grow
-the services to monitor are the same on every server (hardware/software
RAID, disk usage, etc.)
-we need a completely centralized monitoring structure

For the last point, I thought of using the arguments option in NRPE (yes,
I read the SECURITY document).
Besides, to solve the NAT problem with NRPE, I'll do DNAT on the
firewall and use the port parameter of the check_nrpe plugin (are there
any problems with doing that? I ran a few small tests, but I would prefer
confirmation).


To manage this, I need a well-organized config file structure on the
Nagios server.

I thought of structuring it like this:

obj--|
     |--templatelinuxserversgeneral.cfg
     |
     |--customer_1_directory|-templateserver.cfg
     |                      |-server1.cfg
     |                      |-server2.cfg
     |                      |-servern.cfg
     |
     |--customer_2_directory|-templateserver.cfg
                            |-server1.cfg
                            |-servern.cfg


Where:
-templatelinuxserversgeneral.cfg is a very basic template for servers
-customer_1_directory holds one file for each of that customer's servers
-templateserver.cfg uses templatelinuxserversgeneral and adds more
specific variables common to that customer's servers, such as the public
IP address, which is the same for all of that customer's servers
-servern.cfg holds the variables specific to one server, such as the
NRPE port (see above)

What do you think?
How can I organize that service-server combination?
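
To make the DNAT idea concrete: every server of one customer shares
the firewall's public IP and differs only in the forwarded port, so the
check reduces to check_nrpe with -H, -p and -c. The sketch below just
builds that command line. The IP address, ports, command name and the
plugin path are placeholders, and nrpe_check_cmd is a made-up helper.

```shell
#!/bin/sh
# Build the check_nrpe command line for one server behind a customer's
# firewall.
# $1 = public IP of the firewall, $2 = DNAT'ed port, $3 = NRPE command
# CHECK_NRPE may override the plugin path (the default is an assumption).
nrpe_check_cmd() {
    printf '%s -H %s -p %s -c %s\n' \
        "${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}" "$1" "$2" "$3"
}

# Usage: two servers of the same customer, told apart by port only.
#   nrpe_check_cmd 203.0.113.10 5666 check_disk
#   nrpe_check_cmd 203.0.113.10 5667 check_disk
```

In the layout above, the shared IP would sit in the customer's
templateserver.cfg and the port in each servern.cfg.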


Thanks so much

P.S. sorry for my bad English

-- 
Enrico Zimol



[Nagios-users] Distributed monitoring

2010-05-06 Thread Enrico Zimol
Wrong thread before, sorry.

On 6 May 2010 11:55, Enrico Zimol lomiz.m...@gmail.com wrote:
 [...]


[Nagios-users] High latency when 15% hosts offline

2010-05-06 Thread kristian
Hi

I'm running Nagios Core 3.2.1

Currently we have a network switch down, meaning all hosts beneath that
switch are unreachable, 42 in number (from a total of 336) . In Nagios I
have the switch set up as the parent. The switch I have set to be in
scheduled downtime until we get a replacement, to prevent notifications
being sent out.

I am finding that the service check latency is enormous and the scheduling
queue is slipping behind in time. For example, it is now 11:04am and the
next check at the top of the scheduling queue should have run at 9:52am.

Here are the service metrics from the Perf. Info page;

Metric                  Min.       Max.           Average
Check Execution Time:   0.00 sec   30.19 sec      2.170 sec
Check Latency:          0.00 sec   13612.54 sec   7025.395 sec
Percent State Change:   0.00%      17.37%         0.50%

Are there any ways I can reduce this latency, other than disabling active
checks on all the unreachable hosts? Or any 'parallel' check tweaks I may
have mis-configured?

I'm happy to provide any other info

Thanks for any help
Kristian

 

Re: [Nagios-users] Check for the IBM ds4000 series totalstorage cabinett - FIX

2010-05-06 Thread Giorgio Zarrelli
Hi,

Yes, you can. I split all the major files into smaller files and nested
directories, so it is easier for me to manage all the services and hosts.

And I assure you, my configuration is heavily split.

Enrico Zimol wrote:
 [...]


Re: [Nagios-users] Different contacts for services on same host

2010-05-06 Thread Assaf Flatto
Sebastian Gosenheimer wrote:
 Hi everybody,

 I'm having some problems with my contacts and contactgroups.

 Let's say I have a contactgroup xy and a contact ab. I set up a host with
 the contactgroup xy and some services on this host with the contactgroup xy.
 But now I have a special service on this host for which I want
 notifications sent only to the contact ab. Nagios correctly sends the
 notification e-mails to the contact ab, but it also sends them to the
 contactgroup xy.

 Can someone tell me what I'm missing? Thank you for your help!

 Kind regards,
 --sg

   
Is the service using any templates?
Does the template have a contact_groups directive in it?

You need to override any template or wider definitions that cover the
service by putting this directive in the service-specific definition:
contacts ab

This tells the service to notify only that specific contact.

Assaf

-- 
Never,Ever Cut A Deal With a Dragon 


I am doing a Charity Bike ride On the 27 of June for the
Capital to Coast Charity. Please help by Donating
http://www.justgiving.com/Lovefilm-capital-to-coast





[Nagios-users] nagios.cmd does not seem to work properly

2010-05-06 Thread Cedric Jeanneret
Hello,

We'd like to use the nagios.cmd pipe to send some commands. According to the 
examples in the documentation, we can do something simple like this:

/bin/printf "[%lu] ACKNOWLEDGE_HOST_PROBLEM;host1;1;1;1;Some One;Some Acknowledgement Comment\n" $now > $commandfile

The problem is that it doesn't seem to be taken into account: nothing appears 
in nagios.log, no message is printed in the shell, and moreover Nagios doesn't 
seem to do anything.

I tested with an error inside the command, and it seems to be parsed:
[1273156151] Warning: Unrecognized external command - 
SCHEDULE_HOST_SVC_CHECKs;fqdn;1273156151

but nothing else happens.

Is there a way to be sure that Nagios does what we ask?

The main points are:
- we're using the NSCA plugin: about 80 servers each run their own 
Nagios and send results and status to a single host
- we tried to run this command only on the remote hosts; maybe we have to run 
it on the NSCA server?

Any help is welcome.

Thank you in advance !

Best regards,

C.

-- 
Cédric Jeanneret |  System Administrator
021 619 10 32|  Camptocamp SA
cedric.jeanne...@camptocamp.com  |  PSE-A / EPFL


Re: [Nagios-users] nagios.cmd does not seem to work properly

2010-05-06 Thread Assaf Flatto
Cedric Jeanneret wrote:
 Hello,

 We'd like to use the nagios.cmd pipe to send some commands. According to the 
 examples in the documentation, we can do something simple like this:

 /bin/printf "[%lu] ACKNOWLEDGE_HOST_PROBLEM;host1;1;1;1;Some One;Some Acknowledgement Comment\n" $now > $commandfile

 The problem is that it doesn't seem to be taken into account: nothing appears 
 in nagios.log, no message is printed in the shell, and moreover Nagios doesn't 
 seem to do anything.

 I tested with an error inside the command, and it seems to be parsed:
 [1273156151] Warning: Unrecognized external command - 
 SCHEDULE_HOST_SVC_CHECKs;fqdn;1273156151

 but nothing else happens.

 Is there a way to be sure that Nagios does what we ask?

 The main points are:
 - we're using the NSCA plugin: about 80 servers each run their own 
 Nagios and send results and status to a single host
 - we tried to run this command only on the remote hosts; maybe we have to run 
 it on the NSCA server?

 Any help is welcome.

 Thank you in advance !

 Best regards,

 C.

   
Enable the Nagios debug log (see the bottom of the nagios.cfg file) and test 
again with the command you are submitting.
Note that the Nagios debug file grows quite fast, so make sure you have 
enough disk space to capture the output you want.

From what I know, external commands are logged in the Nagios log; 
again, check the option in nagios.cfg.
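
A minimal sketch for checking those settings before testing again. It assumes a POSIX shell; the function name show_cmd_settings is hypothetical, while the five option names are standard nagios.cfg directives:

```shell
#!/bin/sh
# show_cmd_settings CFG
# Print the active (uncommented) external-command and debug settings in CFG.
show_cmd_settings() {
    grep -E '^[[:space:]]*(check_external_commands|command_file|log_external_commands|debug_level|debug_file)[[:space:]]*=' "$1"
}
```

For example, `show_cmd_settings /etc/nagios/nagios.cfg` (the path is an assumption; use your own). With log_external_commands=1, each command Nagios reads from the pipe should show up in nagios.log as an EXTERNAL COMMAND line.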

Assaf

-- 
Never,Ever Cut A Deal With a Dragon 


I am doing a Charity Bike ride On the 27 of June for the
Capital to Coast Charity. Please help by Donating
http://www.justgiving.com/Lovefilm-capital-to-coast





Re: [Nagios-users] high host latency on nagios master

2010-05-06 Thread shadih rahman
Try lowering the max_check_result_reaper value; I had good luck playing with
that value. Thanks
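
Nagios 3 has two reaper settings in nagios.cfg, check_result_reaper_frequency and max_check_result_reaper_time; the value mentioned above is presumably the latter. A minimal sketch for previewing a lower value, assuming a POSIX shell and a sed that supports -E. set_cfg_value is a hypothetical helper, and it only prints the modified config to stdout rather than editing the file:

```shell
#!/bin/sh
# set_cfg_value FILE KEY VALUE
# Print FILE with every "KEY=..." line replaced by "KEY=VALUE".
# The file itself is left untouched; lines for other keys pass through as-is.
set_cfg_value() {
    sed -E "s|^[[:space:]]*$2[[:space:]]*=.*|$2=$3|" "$1"
}
```

For example, `set_cfg_value nagios.cfg max_check_result_reaper_time 15` shows the config with the reaper limited to 15 seconds; the number is only an illustration, not a recommendation.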

On Tue, May 4, 2010 at 8:13 PM, Trisha Hoang tri...@rockyou.com wrote:

 Hi,
 The Nagios *master* has very high host latency and I'm not sure how to
 tweak it. I ran the check_ping plugin on a handful of hosts and the RTA
 averaged 0.2 seconds, so it's not the network.

 *Environment:*
 - 565 hosts
 - 6790 passive checks from the slaves
 - not using event broker
 - master server *actively* executes the host checks every 5 minutes and
   *passively* processes checks every 1 minute
 - not collecting performance data

 *Nagiostats*

 Nagios Stats 3.2.1
 Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
 Last Modified: 03-09-2010
 License: GPL

 CURRENT STATUS DATA
 ------------------------------------------------------
 Status File:                            /var/log/nagios/status.dat
 Status File Age:                        0d 0h 0m 23s
 Status File Version:                    3.2.1

 Program Running Time:                   0d 1h 32m 19s
 Nagios PID:                             28282
 Used/High/Total Command Buffers:        1316 / 3066 / 4096

 Total Services:                         7745
 Services Checked:                       7745
 Services Scheduled:                     1381
 Services Actively Checked:              955
 Services Passively Checked:             6790
 Total Service State Change:             0.000 / 9.740 / 0.007 %
 Active Service Latency:                 18.948 / 205.144 / 165.751 sec
 Active Service Execution Time:          0.007 / 9.051 / 0.055 sec
 Active Service State Change:            0.000 / 5.460 / 0.006 %
 Active Services Last 1/5/15/60 min:     0 / 0 / 0 / 0
 Passive Service Latency:                34.359 / 190.247 / 76.739 sec
 Passive Service State Change:           0.000 / 9.740 / 0.008 %
 Passive Services Last 1/5/15/60 min:    0 / 3054 / 6774 / 6784
 Services Ok/Warn/Unk/Crit:              7720 / 1 / 0 / 24
 Services Flapping:                      27
 Services In Downtime:                   0

 Total Hosts:                            566
 Hosts Checked:                          566
 Hosts Scheduled:                        566
 Hosts Actively Checked:                 566
 Host Passively Checked:                 0
 Total Host State Change:                0.000 / 0.000 / 0.000 %
 Active Host Latency:                    0.000 / 3410.087 / 2413.051 sec
 Active Host Execution Time:             0.007 / 10.010 / 0.063 sec
 Active Host State Change:               0.000 / 0.000 / 0.000 %
 Active Hosts Last 1/5/15/60 min:        0 / 8 / 10 / 565
 Passive Host Latency:                   0.000 / 0.000 / 0.000 sec
 Passive Host State Change:              0.000 / 0.000 / 0.000 %
 Passive Hosts Last 1/5/15/60 min:       0 / 0 / 0 / 0
 Hosts Up/Down/Unreach:                  563 / 3 / 0
 Hosts Flapping:                         1
 Hosts In Downtime:                      0

 Active Host Checks Last 1/5/15 min:     5 / 32 / 75
    Scheduled:                           0 / 0 / 0
    On-demand:                           5 / 32 / 75
    Parallel:                            1 / 11 / 23
    Serial:                              0 / 0 / 0
    Cached:                              4 / 21 / 52
 Passive Host Checks Last 1/5/15 min:    0 / 0 / 0
 Active Service Checks Last 1/5/15 min:  0 / 0 / 0
    Scheduled:                           0 / 0 / 0
    On-demand:                           0 / 0 / 0
    Cached:                              0 / 0 / 0
 Passive Service Checks Last 1/5/15 min: 2 / 1455 / 1455

 External Commands Last 1/5/15 min:      1302 / 6063 / 20253


 *Nagios.cfg*

 # EXTERNAL COMMAND CHECK INTERVAL
 # This is the interval at which Nagios should check for external commands.
 # This value works of the interval_length you specify later.  If you leave
 # that at its default value of 60 (seconds), a value of 1 here will cause
 # Nagios to check for external commands every minute.  If you specify a
 # number followed by an s (i.e. 15s), this will be interpreted to mean
 # actual seconds rather than a multiple of the interval_length variable.
 # Note: In addition to reading the external command file at regularly
 # scheduled intervals, Nagios will also check for external commands after
 # event handlers are executed.
 # NOTE: Setting this value to -1 causes Nagios to check the external
 # command file as often as possible.

 #command_check_interval=15s
 command_check_interval=-1

 # SERVICE INTER-CHECK DELAY METHOD
 # This is the method that Nagios should use when initially
 # spreading out service checks when it starts monitoring.  The
 # default is to use smart delay calculation, which will try to
 # space all service checks out evenly to minimize CPU load.
 # Using the dumb setting will cause all checks to be scheduled
 # at the same time (with no delay between them)!  This is not a
 # good thing for production, but is useful when testing the
 # parallelization functionality.
 #   n   = 

Re: [Nagios-users] nagios.cmd does not seem to work properly

2010-05-06 Thread Morris, Patrick
Cedric Jeanneret wrote:
 Hello,

 We'd like to use the nagios.cmd pipe to send some commands. According to the 
 examples in the documentation, we can do something simple like this:

 /bin/printf "[%lu] ACKNOWLEDGE_HOST_PROBLEM;host1;1;1;1;Some One;Some Acknowledgement Comment\n" $now > $commandfile

 The problem is that it doesn't seem to be taken into account: nothing appears 
 in nagios.log, no message is printed in the shell, and moreover Nagios doesn't 
 seem to do anything.

 I tested with an error inside the command, and it seems to be parsed:
 [1273156151] Warning: Unrecognized external command - 
 SCHEDULE_HOST_SVC_CHECKs;fqdn;1273156151

 but nothing else happens.

 Is there a way to be sure that Nagios does what we ask?

 The main points are:
 - we're using the NSCA plugin: about 80 servers each run their own 
 Nagios and send results and status to a single host
 - we tried to run this command only on the remote hosts; maybe we have to run 
 it on the NSCA server?

You do, of course, have to run this on the Nagios server, since that's 
where the pipe is. The message in your log seems to indicate you typo'd 
the SCHEDULE_HOST_SVC_CHECKS command.

If you're using NSCA, your pipe works: it submits commands to Nagios the 
same way.
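
A minimal sketch of submitting the correctly spelled command on the Nagios server itself, assuming a POSIX shell. The default command file path is an assumption (use the command_file value from nagios.cfg), and schedule_host_svc_checks is a hypothetical helper name:

```shell
#!/bin/sh
# Assumed default path; override by setting commandfile before calling.
commandfile="${commandfile:-/usr/local/nagios/var/rw/nagios.cmd}"

# schedule_host_svc_checks HOST
# Ask Nagios to check all services on HOST at the current time.
schedule_host_svc_checks() {
    now=$(date +%s)
    # External command format: [timestamp] COMMAND;host_name;check_time
    printf '[%s] SCHEDULE_HOST_SVC_CHECKS;%s;%s\n' "$now" "$1" "$now" > "$commandfile"
}
```

Calling `schedule_host_svc_checks fqdn` on the server writes one command line to the pipe. The command name is case-sensitive, which is why the variant ending in a lowercase "s" was rejected as unrecognized.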



Re: [Nagios-users] check_disk plugin

2010-05-06 Thread Aidan Anderson
Davide Blasi wrote:
 with or without quotes gives me the same result :(

   

Try using single quotes, e.g.

-I '/my/fist/.*' -I '/second/.*'





Re: [Nagios-users] check_disk plugin

2010-05-06 Thread Davide Blasi


Aidan Anderson wrote :

Davide Blasi wrote:
 with or without quotes gives me the same result :(

   

 Try using single quotes, e.g.

 -I '/my/fist/.*' -I '/second/.*'

No, it doesn't work :(
But I don't think it's a quoting problem. 
If I swap the order of the paths, the check correctly accepts the first 
argument but ignores the others.

Dave




[Nagios-users] Meaningful subject lines in e-mail alert

2010-05-06 Thread Kumar, Ashish
Greetings,

We are sending e-mail alerts on host/service state changes. I was just
wondering what you guys use as subject lines; just looking around for
ideas.

Thank you in advance.