The problem seems to be the large number of collectd processes running. Try killing all "collectd-client.rb" processes. There should be only one running per host.
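To check how many of them are running on a host before killing anything, something along these lines may help. It is only a rough sketch, not part of OpenNebula: the function names are made up for illustration, and it assumes the usual 'ps -ef' column layout (UID PID PPID C STIME TTY TIME CMD).

```python
import subprocess

# Probe name as it appears in the 'ps -ef' output quoted in this thread.
PROBE_NAME = "collectd-client.rb"


def collectd_pids(ps_output):
    """Return the PIDs of collectd-client.rb processes in `ps -ef` text.

    Hypothetical helper: assumes the columns UID PID PPID C STIME TTY TIME CMD.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # the 8th field is the full command line
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        if PROBE_NAME in fields[7]:
            pids.append(int(fields[1]))
    return pids


def running_collectd_pids():
    """Run `ps -ef` on the local host and return the matching PIDs."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return collectd_pids(result.stdout)
```

If running_collectd_pids() returns more than one PID on a host, the extra processes are the ones to stop, for example with 'pkill -f collectd-client.rb' run as the oneadmin user.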

In case you want to use the old method of monitoring, you can follow this guide:

http://docs.opennebula.org/stable/administration/monitoring/imsshpullg.html#imsshpullg

On Mon, Jan 20, 2014 at 2:17 PM, Gerry O'Brien <[email protected]> wrote:
> Hi Ruben,
>
> Below is the output of 'ps -ef | grep one' on a host that has been
> disabled, rebooted and enabled. There are multiple instances of
> collectd-client.rb kvm running.
>
> We have discovered today a serious issue that is having an adverse
> effect on our DNS system. When the machine below was enabled, our DNS
> server was immediately flooded with requests from the host (see a
> sample below). Our logs show that this has only started happening
> since the upgrade to 4.4. If we don't get a fix for this we will have
> to go back to 4.2, which is something I really don't want to do.
>
> Regards,
> Gerry
>
> oneadmin 3628 1 0 13:04 ? 00:00:00 ruby /var/tmp/one/im/kvm.d/collectd-client.rb kvm /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 4600 1 0 13:05 ? 00:00:00 ruby /var/tmp/one/im/kvm.d/collectd-client.rb kvm /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 6400 1 0 13:07 ? 00:00:00 ruby /var/tmp/one/im/kvm.d/collectd-client.rb kvm /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 9003 1 0 13:08 ? 00:00:00 ruby /var/tmp/one/im/kvm.d/collectd-client.rb kvm /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 12953 3628 0 13:10 ? 00:00:00 /bin/bash /var/tmp/one/im/kvm.d/../run_probes kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 12955 6400 0 13:10 ? 00:00:00 /bin/bash /var/tmp/one/im/kvm.d/../run_probes kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 12969 12953 0 13:10 ? 00:00:00 /bin/bash /var/tmp/one/im/kvm.d/../run_probes kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 12970 12969 0 13:10 ? 00:00:00 /bin/bash /var/tmp/one/im/kvm.d/../run_probes kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 12972 12955 0 13:10 ? 00:00:00 /bin/bash /var/tmp/one/im/kvm.d/../run_probes kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 12973 12972 0 13:10 ? 00:00:00 /bin/bash /var/tmp/one/im/kvm.d/../run_probes kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 13029 12973 0 13:10 ? 00:00:00 /bin/bash ./monitor_ds.sh kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
> oneadmin 13030 12970 0 13:10 ? 00:00:00 /bin/bash ./monitor_ds.sh kvm-probes /var/lib/one//datastores 4124 20 0 host101.scss.tcd.ie
>
> -2014 13:14:26.675 client 134.226.59.101#52314: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:26.680 client 134.226.59.101#51356: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:26.680 client 134.226.59.101#51356: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:26.822 client 134.226.59.101#47870: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:26.822 client 134.226.59.101#47870: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:26.824 client 134.226.59.101#58734: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:26.825 client 134.226.59.101#58734: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:26.952 client 134.226.59.101#39659: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:26.952 client 134.226.59.101#39659: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:26.952 client 134.226.59.101#53975: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:26.953 client 134.226.59.101#53975: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:27.108 client 134.226.59.101#36294: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:27.108 client 134.226.59.101#36294: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:27.109 client 134.226.59.101#59277: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:27.109 client 134.226.59.101#59277: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:27.347 client 134.226.59.101#49614: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:27.348 client 134.226.59.101#49614: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:27.350 client 134.226.59.101#44058: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:27.357 client 134.226.59.101#44058: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:27.458 client 134.226.59.101#51830: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:27.458 client 134.226.59.101#51830: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:27.461 client 134.226.59.101#38419: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:27.461 client 134.226.59.101#38419: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:31.184 client 134.226.59.101#38617: query: host101.scss.tcd.ie IN A + (134.226.32.57)
> 20-Jan-2014 13:14:31.184 client 134.226.59.101#38617: query: host101.scss.tcd.ie IN AAAA + (134.226.32.57)
> 20-Jan-2014 13:14:31.302 client 134.226
>
> On 17/01/2014 17:45, Ruben S. Montero wrote:
>>
>> Hi Gerry
>>
>> Just to check, are you using 4.4 Final? We've seen this in the betas
>> and "thought" we fixed it for the final version.
>> Also, could you check that there is just one monitoring process on
>> each host (collectd-client.sh, or equivalent, should be the name of
>> the process)?
>>
>> Also, could you send us the lines from oned.log between Thu Jan 16
>> 16:56:25 2014 and Thu Jan 16 17:25:43 2014, plus the first lines that
>> include your oned.conf values (we are especially interested in those
>> related to the monitoring interval)?
>>
>> Cheers
>>
>> Ruben
>>
>> On Fri, Jan 17, 2014 at 2:27 PM, Gerry O'Brien <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Below is a truncated log file for a VM. The monitor continually
>>> cycles through finding the machine RUNNING and state UNKNOWN. This
>>> occurs for many, many machines at the same time. All machines were
>>> created by a script.
>>>
>>> The VMs are Microsoft Windows 7 64bit Enterprise. Individual context
>>> is created by a startup script. They run fine but eventually
>>> /var/log/one is going to overflow.
>>>
>>> Restarting oned seems to fix the problem but this is hardly a long
>>> term solution.
>>>
>>> Any suggestions on what could be causing this?
>>>
>>> Regards,
>>> Gerry
>>>
>>> Thu Jan 16 16:56:21 2014 [DiM][I]: New VM state is ACTIVE.
>>> Thu Jan 16 16:56:22 2014 [LCM][I]: New VM state is PROLOG.
>>> Thu Jan 16 16:56:22 2014 [VM][I]: Virtual Machine has no context
>>> Thu Jan 16 16:56:22 2014 [LCM][I]: New VM state is BOOT
>>> Thu Jan 16 16:56:22 2014 [VMM][I]: Generating deployment file: /var/lib/one/vms/1788/deployment.0
>>> Thu Jan 16 16:56:23 2014 [VMM][I]: ExitCode: 0
>>> Thu Jan 16 16:56:23 2014 [VMM][I]: Successfully execute network driver operation: pre.
>>> Thu Jan 16 16:56:25 2014 [VMM][I]: ExitCode: 0
>>> Thu Jan 16 16:56:25 2014 [VMM][I]: Successfully execute virtualization driver operation: deploy.
>>> Thu Jan 16 16:56:25 2014 [VMM][I]: ExitCode: 0
>>> Thu Jan 16 16:56:25 2014 [VMM][I]: Successfully execute network driver operation: post.
>>> Thu Jan 16 16:56:25 2014 [LCM][I]: New VM state is RUNNING
>>> Thu Jan 16 16:56:51 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 16:59:01 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 16:59:23 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:01:41 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:01:58 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:04:18 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:04:39 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:06:55 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:07:06 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:09:31 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:09:31 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:12:22 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:12:27 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:15:11 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:15:22 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:17:49 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:18:00 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:20:27 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:20:34 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:23:04 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:23:08 2014 [LCM][I]: New VM state is UNKNOWN
>>> Thu Jan 16 17:25:41 2014 [VMM][I]: VM found again, state is RUNNING
>>> Thu Jan 16 17:25:43 2014 [LCM][I]: New VM state is UNKNOWN
>>>
>>> --
>>> Gerry O'Brien
>>>
>>> Systems Manager
>>> School of Computer Science and Statistics
>>> Trinity College Dublin
>>> Dublin 2
>>> IRELAND
>>>
>>> 00 353 1 896 1341
>>>
>>> _______________________________________________
>>> Users mailing list
>>> [email protected]
>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org

--
Javier Fontán Muiños
Developer
OpenNebula - The Open Source Toolkit for Data Center Virtualization
www.OpenNebula.org | @OpenNebula | github.com/jfontan
