Guys,
Currently we have a pretty large distributed Opsview setup at 3.5.0, and we 
want to upgrade to 3.7.0 soon.
Some of the threads about reload performance issues are making me question 
whether we should attempt the upgrade.

Currently we have 855 hosts/3093 services spread across 22 sites with slave 
servers.
Our current reload time is up to 5 minutes, which is already longer than we 
would like but acceptable.

In the ongoing discussion, I've seen that a large number of hostgroups and 
contacts seems to increase reload time.
Is there anything I can check on my current setup to gauge what effect the 
upgrade might have on our system?
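
One rough way to get numbers for such a comparison is to count the Nagios object definitions (hosts, services, hostgroups, contacts, and so on) in the generated config files. The following is only a minimal sketch, assuming the standard Nagios `define <type> {` object syntax; `count_definitions` is a hypothetical helper, not something Opsview ships, and the counts do not predict reload time by themselves.

```python
import re
from collections import Counter
from pathlib import Path

# Matches the opening line of a Nagios object definition, e.g. "define host {".
DEFINE_RE = re.compile(r"^\s*define\s+(\w+)\s*\{")


def count_definitions(config_dir):
    """Count Nagios object definitions by type in all *.cfg files under config_dir."""
    counts = Counter()
    # rglob also descends into subdirectories such as conf.d.
    for cfg in Path(config_dir).rglob("*.cfg"):
        with cfg.open(errors="replace") as handle:
            for line in handle:
                match = DEFINE_RE.match(line)
                if match:
                    counts[match.group(1)] += 1
    return counts
```

Calling `count_definitions("/usr/local/nagios/etc")` would return something like a per-type tally; the directory is an assumption based on the nagios.cfg path quoted later in this thread, so adjust it to wherever your object files live.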

If our reload time went from 5 minutes to 15-30 minutes then the negative 
side-effects of the upgrade would outweigh the benefits from the new features.

I would appreciate any guidance (or experiences) the group may provide with 
3.7.0 and distributed setups.

James Whittington
VC3, Inc.



From: opsview-users-boun...@lists.opsview.org 
[mailto:opsview-users-boun...@lists.opsview.org] On Behalf Of Ton Voon
Sent: Tuesday, May 25, 2010 4:09 AM
To: Opsview Users
Subject: Re: [opsview-users] Extreme slowdown in reload after upgrade to 3.7.0


On 22 May 2010, at 15:46, Toni Van Remortel wrote:


On 20 May 2010, at 14:19, Toni Van Remortel wrote:



I had a quick look at the contacts.cfg file, and it shows me a possible source 
of the problem: a large number of host groups (163) and service groups (143) 
associated with various contacts.
I have many groups that are e.g. about Servers, but for every site I create a 
separate group. So I have 9 hostgroups for Servers, 9 for Switches, etc.

Stepping back a bit, why are there so many host groups and service groups?

Well, I have different sites, but they have local administrators. So I need to 
have a good separation in the hosts and services for every site. That's why I 
have so many groups.
Every site thus has a full hierarchy tree, but this means I have to maintain 
several of these trees here on the top.
A nice feature would be to have another property on each host: site or company. 
And the option to link this to a user. Would make it much easier for me :)

What would a "site" property look like? Does site == slave?

Currently, role selection is via host group in hierarchy UNION slave selection. 
Would UNION keyword selection help (could then have a keyword = services on a 
site)?

Does this take > 6 minutes?

Nope:

nag...@node5:~$ time /usr/local/nagios/bin/rc.opsview check
Checking configuration for /usr/local/nagios/etc/nagios.cfg... okay

real    2m16.410s
user    2m11.240s
sys     0m1.090s
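
To compare this check time before and after an upgrade, the same wall-clock measurement can be repeated from a script. This is a minimal sketch only; `time_command` is a hypothetical helper, and it reports just the elapsed ("real") time, not the user/sys split that the shell's `time` prints.

```python
import subprocess
import time


def time_command(argv):
    """Run argv, wait for it to finish, and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    # check=False: a non-zero exit code is returned rather than raised.
    completed = subprocess.run(argv, check=False)
    return completed.returncode, time.perf_counter() - start
```

For the command shown above, that would be `time_command(["/usr/local/nagios/bin/rc.opsview", "check"])`, run as the same user as in the shell session.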

Just to let you know that I've reproduced the problem and Nagios stalls in the 
conf.d directory. Must be a bug somewhere in Nagios for reading subdirectories. 
Haven't had a chance to look at this yet, but will try and get this into 3.7.1.

Ton

_______________________________________________
Opsview-users mailing list
Opsview-users@lists.opsview.org
http://lists.opsview.org/lists/listinfo/opsview-users
