Guys,

Currently we have a pretty large Opsview distributed setup at 3.5.0 and we want to upgrade to 3.7.0 soon. Some of the threads about performance issues on reload are making me question whether we should attempt the upgrade.
Currently we have 855 hosts/3093 services spread across 22 sites with slave servers. Our current reload time is up to 5 minutes, which is already longer than we would like but acceptable. In the discussion I've seen going on, a large number of hostgroups and contacts seems to impact the reload time. Is there anything I can check on my current setup to gauge what effect the upgrade might have on our system? If our reload time went from 5 minutes to 15-30 minutes, the negative side effects of the upgrade would outweigh the benefits of the new features.

I would appreciate any guidance (or experiences) the group may provide with 3.7.0 and distributed setups.

James Whittington
VC3, Inc.

From: opsview-users-boun...@lists.opsview.org [mailto:opsview-users-boun...@lists.opsview.org] On Behalf Of Ton Voon
Sent: Tuesday, May 25, 2010 4:09 AM
To: Opsview Users
Subject: Re: [opsview-users] Extreme slowdown in reload after upgrade to 3.7.0

On 22 May 2010, at 15:46, Toni Van Remortel wrote:

On 20 May 2010, at 14:19, Toni Van Remortel wrote:

I had a quick look at the contacts.cfg file, and it shows me a possible source of the problem: a large number of host groups (163) and service groups (143) associated with various contacts. I have many groups that are, e.g., about Servers, but for every site I create a separate group. So I have 9 hostgroups for Servers, 9 for Switches, etc.

Stepping back a bit, why are there so many host groups and service groups?

Well, I have different sites, but they have local administrators. So I need to have a good separation in the hosts and services for every site. That's why I have so many groups. Every site thus has a full hierarchy tree, but this means I have to maintain several of these trees here on the top. A nice feature would be to have another property on each host: site or company. And the option to link this to a user. Would make it much easier for me :)

What would a "site" property look like? Does site == slave? Currently, role selection is via host group in hierarchy UNION slave selection. Would UNION keyword selection help (could then have a keyword = services on a site)?

Does this take > 6 minutes?

Nope:

nag...@node5:~$ time /usr/local/nagios/bin/rc.opsview check
Checking configuration for /usr/local/nagios/etc/nagios.cfg... okay

real 2m16.410s
user 2m11.240s
sys 0m1.090s

Just to let you know that I've reproduced the problem and Nagios stalls in the conf.d directory. Must be a bug somewhere in Nagios for reading subdirectories. Haven't had a chance to look at this yet, but will try and get this into 3.7.1.

Ton
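
As a rough way to check the group and contact counts discussed above on an existing setup, the following is a minimal Python sketch that tallies Nagios object definitions by type across the generated .cfg files. It relies only on the standard "define <type> {" object syntax; the function names are hypothetical, and whether these counts actually predict reload time on 3.7.0 is an assumption, not something established in this thread.

```python
import re
from collections import Counter
from pathlib import Path

# Matches the opening line of a Nagios object definition,
# e.g. "define hostgroup {" or "define contact{".
DEFINE_RE = re.compile(r"^\s*define\s+(\w+)\s*\{")


def count_definitions(text):
    """Count object definitions by type in the text of one config file."""
    counts = Counter()
    for line in text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_in_directory(config_dir):
    """Sum definition counts over every *.cfg file below config_dir."""
    totals = Counter()
    for path in sorted(Path(config_dir).rglob("*.cfg")):
        # Tolerate odd encodings rather than aborting the whole scan.
        totals += count_definitions(path.read_text(errors="replace"))
    return totals
```

Calling count_in_directory("/usr/local/nagios/etc") (the directory of the nagios.cfg quoted above; adjust if yours differs) returns a Counter keyed by object type, so the hostgroup, servicegroup and contact totals can be compared with the 163/143 figures mentioned earlier.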
_______________________________________________
Opsview-users mailing list
Opsview-users@lists.opsview.org
http://lists.opsview.org/lists/listinfo/opsview-users