On Mar 25, 2009, at 2:30 PM, RijilV wrote:

> 2009/3/24 Christopher McAtackney <[email protected]>:
>> Hi all,
>>
>> I was wondering if someone could give a brief overview of the pros /
>> cons of using NRPE to monitor my remote hosts versus using the
>> check_by_ssh command?
>>
>> I'm aware that check_by_ssh increases the CPU overhead, but I'm not
>> clear on the level of impact here - does this increase the load on
>> the monitoring machine in direct relation to the number of hosts being
>> monitored? For example, if I was using check_by_ssh to monitor, say,
>> 2000 services spread across 200 hosts, would I experience significant
>> slowdown on my monitoring machine?
>>
>> Cheers for any info,
>>
>> Chris
>>
>
> SSH is going to slow it down on both sides of the communication. SSH
> does quite a bit more in terms of setting up the connection, which
> involves using asymmetric encryption to set up a shared secret for
> symmetric encryption, verifying keys for the asymmetric part,
> verifying access, and allocating a session. Whereas NRPE, even with
> encryption, just does a simple pre-shared secret for the symmetric
> encryption, much faster even if using the same encryption algorithm.
>
> One thing you could do with SSH to speed it up (and I would argue make
> it faster than NRPE, depending on the stability of your network) would
> be to use ControlMaster. ControlMaster is an SSH v2 feature, where you
> create a connection and can open up multiple sessions with that
> ControlMaster for other SSH processes. This saves you not only the
> key-exchange heavy lifting, but you're also not opening up a new socket
> on the remote host. In order to really make it worth it you'd have to
> spawn a process that was continuously connected. I wrote an ugly
> check_by_ssh that would spawn a ControlMaster if one didn't exist and
> use it if it did. Reduced the load/latency quite a bit for SSH
> checks. Though if I had to do it again I'd use 'ControlMaster auto'
> (man 5 ssh_config) and create a separate check that was responsible
> for maintaining the ControlMaster, then you could use the stock
> check_by_ssh without any modifications.
>
> That all being said, you might want to think about a distributed setup
> anyhow, if nothing more for redundancy. 200 servers and 2,000 checks
> is a lot of responsibility for a singleton; you could break it 50/50
> between two servers that could take over for the other one if it
> fails.
>
> .r'
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Nagios-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null

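The "separate check that maintains the ControlMaster" idea in the quoted message could look roughly like the Python sketch below. It is only an illustration, not the script RijilV wrote: the socket path `/tmp/nagios-ssh-%r@%h:%p` and the function names are assumptions made up for this example, while `-M`, `-N`, `-f`, `-S` and `-O check` are standard OpenSSH client options.

```python
import subprocess

# Hypothetical socket location (an assumption for this sketch).
# %r, %h and %p are ssh tokens for remote user, host and port.
CONTROL_PATH = "/tmp/nagios-ssh-%r@%h:%p"


def master_check_cmd(host, control_path=CONTROL_PATH):
    """argv that asks an existing master connection whether it is alive."""
    return ["ssh", "-S", control_path, "-O", "check", host]


def master_start_cmd(host, control_path=CONTROL_PATH):
    """argv that starts a background master running no remote command."""
    return ["ssh", "-M", "-N", "-f", "-S", control_path, host]


def ensure_master(host, run=subprocess.run):
    """Return a Nagios-style (exit_code, message) for the master connection.

    `run` is injectable so the logic can be exercised without a real host.
    """
    if run(master_check_cmd(host)).returncode == 0:
        return 0, f"OK - master for {host} already running"
    if run(master_start_cmd(host)).returncode == 0:
        return 0, f"OK - started master for {host}"
    return 2, f"CRITICAL - could not start master for {host}"
```

For the stock check_by_ssh to reuse the connection, the ssh_config of the user running Nagios would need `ControlMaster auto` and a `ControlPath` that matches the socket path used by the maintaining check.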
+1 on the control master. We have about 1000 checks over 300 hosts, and
using control master made the box much more stable and, quite frankly,
usable. Saved a lot of plugin timeouts as well.

Think about 1000 checks every 5 or 10 minutes. That's 1000 encrypted
tunnels that are going up and down. That's a lot of overhead for a quick
check, let alone if your server is checking, say, 5 or 10 things back to
back.

http://www.torchbox.com/blog/ssh_tips_2.html

Charlie
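To put rough numbers on the figures above, the small Python sketch below works out the average rate of new SSH connections when every check opens its own tunnel. The function name is made up for this example, and it assumes checks are spread evenly over the interval, which a real scheduler will not do exactly.

```python
def new_connections_per_second(checks, interval_minutes):
    """Average SSH connection setups per second with one connection per check."""
    return checks / (interval_minutes * 60)


# The 1000 checks and the 5 and 10 minute intervals come from the message above.
for minutes in (5, 10):
    rate = new_connections_per_second(1000, minutes)
    print(f"{minutes}-minute interval: {rate:.2f} new SSH connections per second")
```

With a control master per host, the steady state would instead be roughly one long-lived connection for each of the 300 hosts, with new handshakes needed only when a master has to be restarted.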
