Hello fellow Opsview users.
My company has been using Opsview for a little over a year now, and I think we
are starting to see some growing pains as more hosts and services get added.
We are currently running version 3.5.0 on Ubuntu 8.04, in a reverse tunnel
master/slave setup.
MySQL is running on the master server.
We only have 466 hosts and 1557 services, but those checks are distributed
across 19 slave servers.
The master server is running on a VMware ESX 3.5 host.
ESX server CPU usage hovers between 20% and 35%.
ESX server RAM usage is around 80%.
The master Opsview server currently has 2 GB RAM, and it appears MySQL is
using about half of it.
Here is a partial listing from top:
top - 12:17:25 up 1 day, 2:46, 2 users, load average: 0.25, 0.88, 1.08
Tasks: 186 total, 3 running, 183 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.6%us, 1.7%sy, 0.0%ni, 77.5%id, 19.1%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 2112044k total, 1928028k used, 184016k free, 103416k buffers
Swap: 1341388k total, 776816k used, 564572k free, 466224k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4557 mysql 20 0 1596m 1.1g 4336 S 1 53.7 44:47.91 mysqld
4849 root 20 0 2564 920 772 S 1 0.0 5:54.77 vmware-guestd
25640 nagios 20 0 5764 2948 1596 S 1 0.1 0:00.03 update_snmptrap
2480 root 15 -5 0 0 0 S 1 0.0 4:46.58 kjournald
5066 nagios 20 0 23336 12m 1328 S 1 0.6 18:21.65 nagios
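As a rough cross-check of the memory figures in that listing, here is a small Python sketch. The numbers are copied by hand from the Mem: and Swap: lines above; the variable names and the choice to subtract buffers and cache are my own assumptions about how to read them, not anything Opsview reports.

```python
# Values copied by hand from the Mem:/Swap: lines of the top listing,
# in the "k" units that top reports.
MEM_TOTAL = 2112044
MEM_USED = 1928028
BUFFERS = 103416
CACHED = 466224
SWAP_TOTAL = 1341388
SWAP_USED = 776816


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Memory held by processes, excluding kernel buffers and page cache
# (an assumption about which figure is the useful one here).
app_used = MEM_USED - BUFFERS - CACHED
app_used_pct = percent(app_used, MEM_TOTAL)
swap_used_pct = percent(SWAP_USED, SWAP_TOTAL)

print(f"process memory: {app_used}k ({app_used_pct:.1f}% of RAM)")
print(f"swap in use:    {SWAP_USED}k ({swap_used_pct:.1f}% of swap)")
```

More than half of the swap is in use while the CPU shows 19.1% I/O wait, which may mean the box is short on memory, but that is a guess from one snapshot.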
- One core issue we see, other than general slowness, is that some of the
CGIs just time out. We have some people who still like the old statusmap.cgi,
but the full map never renders and the CPU spikes to 100%.
- Also, on the new Events View we always get script timeout errors in
the browser, and events are pretty slow to load.
- Also, I see a fair number of timeout errors in the Opsview log:
[2010/01/20 13:56:46] [exec_and_log] [WARN] ssh_exchange_identification:
Timeout waiting for version information.
[2010/01/20 15:32:22] [import_ndologsd] [WARN] Import of 1264019531.129820,
size=2685022, took 11.43 seconds > 5 seconds
[2010/01/20 15:33:00] [import_ndologsd] [WARN] Import of 1264019546.119956,
size=13666, took 34.00 seconds > 5 seconds
[2010/01/20 15:36:15] [import_ndologsd] [WARN] Import of 1264019759.068638,
size=2692963, took 15.83 seconds > 5 seconds
[2010/01/20 15:36:45] [import_ndologsd] [WARN] Import of 1264019774.998388,
size=18311, took 29.89 seconds > 5 seconds
[2010/01/21 09:28:08] [import_ndologsd] [WARN] Import of 1264084077.245904,
size=2696437, took 11.33 seconds > 5 seconds
[2010/01/21 09:28:46] [import_ndologsd] [WARN] Import of 1264084092.143017,
size=8507, took 34.52 seconds > 5 seconds
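To make the import_ndologsd warnings easier to compare, here is a minimal Python sketch that pulls the size and duration out of each entry and sorts by duration. The regular expression and the function name slow_imports are my own and only assume the line layout shown above (including the wrap after the comma); they are not part of Opsview.

```python
import re

# Warnings pasted from the log excerpt above, wrapped as they appear there.
LOG = """\
[2010/01/20 15:32:22] [import_ndologsd] [WARN] Import of 1264019531.129820,
size=2685022, took 11.43 seconds > 5 seconds
[2010/01/20 15:33:00] [import_ndologsd] [WARN] Import of 1264019546.119956,
size=13666, took 34.00 seconds > 5 seconds
[2010/01/20 15:36:15] [import_ndologsd] [WARN] Import of 1264019759.068638,
size=2692963, took 15.83 seconds > 5 seconds
[2010/01/20 15:36:45] [import_ndologsd] [WARN] Import of 1264019774.998388,
size=18311, took 29.89 seconds > 5 seconds
[2010/01/21 09:28:08] [import_ndologsd] [WARN] Import of 1264084077.245904,
size=2696437, took 11.33 seconds > 5 seconds
[2010/01/21 09:28:46] [import_ndologsd] [WARN] Import of 1264084092.143017,
size=8507, took 34.52 seconds > 5 seconds
"""

# \s+ after the comma tolerates the line wrap between the name and size.
PATTERN = re.compile(
    r"\[(?P<when>[^\]]+)\] \[import_ndologsd\] \[WARN\] "
    r"Import of (?P<name>[\d.]+),\s+size=(?P<size>\d+), "
    r"took (?P<secs>[\d.]+) seconds"
)


def slow_imports(text):
    """Return (timestamp, size_in_bytes, seconds) for each warning found."""
    return [
        (m["when"], int(m["size"]), float(m["secs"]))
        for m in PATTERN.finditer(text)
    ]


entries = slow_imports(LOG)
# Slowest imports first.
for when, size, secs in sorted(entries, key=lambda e: e[2], reverse=True):
    print(f"{when}  size={size:>8}  took={secs:6.2f}s")
```

In this small sample the slowest imports are the smallest files, so import time does not seem to track file size, though six entries are not much to go on.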
I'm looking for ideas on what the core bottleneck might be. I could add more
memory to the server, move MySQL to a different server, or maybe move some
files to a RAM disk.
I'm willing to make upgrades where needed; I just want to make sure I use
resources wisely.
Thanks for any advice you all may have.
James Whittington
VC3, Inc.