A 10x increase is very extreme. Can you send the output of /usr/local/nagios/var/log/create_and_send_configs.debug?
I attached it as it is a bit too large to post here.

This debug tells me it is not the scp that is taking the most time: the longest scp is to node9, which took 1.14 seconds. So I don't think an rsync is the way to go. It looks like the verification step is the main time consumer: node5_verify took 12 minutes. So Nagios is taking a long time validating the configurations (which explains your 100% Nagios CPU load). Did you make other changes? Maybe adding service check dependencies for NRPE?

No changes made, and the dependencies count up to 20 for that slave server. But those are not NRPE related. In total there are 145 service dependencies. I had a quick look at the contacts.cfg file, and it shows me a possible source of the problem: a large number of host groups (163) and service groups (143) associated with various contacts. I have many groups that are, e.g., about Servers, but for every site I create a separate group. So I have 9 hostgroups for Servers, 9 for Switches, etc.

In the meantime, it is possible to increase the number of parallel jobs, which may reduce your reload time. I've added some documentation at:
* http://docs.opsview.com/doku.php?id=opsview-community:configuration_files

OK, I raised it to 16 (4 per CPU) and now the reload is brought down to 10 minutes. I also removed some contacts, hosts and services that are no longer required. Attached the latest create_and_send_configs.debug file. Still a long verify for some nodes ...

Toni
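As a rough illustration of why more parallel jobs only help up to a point, here is a small Python sketch that estimates the total wall time when per-node verify jobs are spread over a fixed number of workers. It is not how create_and_send_configs schedules its jobs; the function name `estimate_wall_time` and every duration except the 12 minutes for node5_verify are made up for the example.

```python
import heapq


def estimate_wall_time(durations, workers):
    """Greedy estimate: each job goes to the worker that frees up first."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # One finish time per worker; never more workers than jobs.
    finish_times = [0.0] * min(workers, max(len(durations), 1))
    heapq.heapify(finish_times)
    for duration in durations:
        start = heapq.heappop(finish_times)  # earliest free worker
        heapq.heappush(finish_times, start + duration)
    return max(finish_times)


# Hypothetical per-node verify times in minutes. Only the 12-minute
# value (node5_verify) comes from the thread; the rest are invented.
verify_minutes = [12.0, 3.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0]

for workers in (1, 4, 16):
    print(workers, estimate_wall_time(verify_minutes, workers))
```

With these assumed numbers the estimate drops from 23 minutes with one worker to 12 minutes with four, and stays at 12 with sixteen: the total can never be shorter than the slowest single verify, so past that point only making the verification itself cheaper (fewer groups, contacts or dependencies) would help.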
Attachment: create_and_send_configs.debug
_______________________________________________
Opsview-users mailing list
Opsview-users@lists.opsview.org
http://lists.opsview.org/lists/listinfo/opsview-users