See if this bug might be related to your problem... https://bugs.launchpad.net/nova/+bug/1060363
Byron

Begin forwarded message "[Openstack] Base images removed in upgrade essex -> folsom and other stories":

> We also came across an issue where some compute nodes were reporting bogus
> resource stats. Eg:
>
> 2012-11-13 05:04:38 INFO nova.compute.manager [-] Updating host status
> 2012-11-13 05:06:14 AUDIT nova.compute.resource_tracker [-] Free ram (MB): -739665
> 2012-11-13 05:06:14 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 12654
> 2012-11-13 05:06:14 AUDIT nova.compute.resource_tracker [-] Free VCPUS: -188
> 2012-11-13 05:06:14 INFO nova.compute.resource_tracker [-] Compute_service record updated for np-rcc6
>
> This happened to be addressed by the following bug, it turns out it does a
> regex for the db filter.
> https://bugs.launchpad.net/nova/+bug/1060363
>
> So a compute node of np-rcc5 would also pull in np-rcc50, np-rcc51.. and so
> on and so on.

On Jan 7, 2013, at 9:50 AM, Jonathan Proulx <[email protected]> wrote:

> Hi All,
>
> I have a growing problem in which compute nodes are puzzlingly over reporting
> their resource utilization and thus appearing to be over utilized when they
> are in fact empty.
>
> System is Ubuntu 12.04 using cloud archive Folsom
> (2012.2-0ubuntu5~cloud0). The problem appeared on a single node after the
> upgrade from Essex some months ago and has now grown to 5 nodes (the lowest
> numbered 5 nodes, both by IP and lexically by name).
>
> For example on the compute node "nova-1":
>
> 2013-01-07 10:39:43 INFO nova.compute.manager [-] Updating host status
> 2013-01-07 10:41:02 AUDIT nova.compute.resource_tracker [-] Free ram (MB): -397134
> 2013-01-07 10:41:02 AUDIT nova.compute.resource_tracker [-] Free disk (GB): -3318
> 2013-01-07 10:41:02 AUDIT nova.compute.resource_tracker [-] Free VCPUS: -215
> 2013-01-07 10:41:02 INFO nova.compute.resource_tracker [-] Compute_service record updated for nova-1
>
> Oddly, even though no instances are scheduled, the resource utilization does
> vary, for example in the last 5 hours:
>
> root@nova-1:~# grep 'Free VCPUS:' /var/log/nova/nova-compute.log|awk '{print $NF}'|sort -n |uniq -c
>     156 -218
>       3 -216
>       5 -215
>       2 -214
>       2 -212
>       1 -211
>       1 -210
>       5 -209
>       5 -208
>
> # but no instances are running
> root@nova-1:~# virsh list
>  Id Name                 State
> ----------------------------------------------------
>
> root@nova-1:~#
>
> # nor does OpenStack seem to *think* any instances are running or reserved by any projects
> # as seen by nova-manage service describe_resource nova-1
>
> HOST      PROJECT       cpu   mem(mb)   hdd
> nova-1    (total)        24     48295   602
> nova-1    (used_now)    233    433141  3740
> nova-1    (used_max)      0         0     0
> # note lack of a list of tenants here
>
> I can't replicate the issue intentionally, but I also can't clear the apparent
> resource utilization. I tried direct manipulation of the database, but that
> gets reset by compute node reports; I also tried rebooting the nodes. I can
> always fall back to just reinstalling them, but since this is still a
> pre-production cluster I'd like to understand what is happening.
>
> Anyone have an insight into why nova.compute.resource_tracker is so confused
> or how I can force it to understand what resources are in use? Operationally
> it isn't painful to reinstall, but it does hurt a bit not knowing what's
> going on here.
>
> Thanks,
> -Jon
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : [email protected]
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
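
To illustrate the hostname matching described in the forwarded message, here is a minimal sketch. It assumes the host filter behaves like an unanchored regex search, which is how the forwarded message describes the bug. This is not nova's actual code: the function names and the per-host vCPU numbers are hypothetical and only show how usage from np-rcc50 and np-rcc51 could end up counted against np-rcc5.

```python
import re

# Hypothetical per-host vCPU usage; the numbers are made up for illustration.
USED_VCPUS = {"np-rcc5": 4, "np-rcc50": 16, "np-rcc51": 12, "np-rcc6": 8}


def used_vcpus_regex(host, usage=USED_VCPUS):
    """Sum usage for every host whose name matches `host` as an unanchored regex."""
    pattern = re.compile(host)
    return sum(v for name, v in usage.items() if pattern.search(name))


def used_vcpus_exact(host, usage=USED_VCPUS):
    """Sum usage only for the host whose name equals `host` exactly."""
    return sum(v for name, v in usage.items() if name == host)


if __name__ == "__main__":
    # The regex variant also counts np-rcc50 and np-rcc51.
    print(used_vcpus_regex("np-rcc5"))
    # The exact variant counts np-rcc5 only.
    print(used_vcpus_exact("np-rcc5"))
```

If the same kind of matching applies to Jon's cluster, a host named nova-1 would also pick up nova-10, nova-11 and so on, which would be consistent with only the lowest-numbered nodes being affected. That is a guess from the symptoms, not something confirmed in this thread.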

