[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
Fix proposed to branch: master Review: https://review.openstack.org/518520 ** Changed in: nova Status: New => In Progress -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1719770 Title: hypervisor stats aggregates resources from deleted and existing services if they share the same hostname To manage notifications about this bug go to: https://bugs.launchpad.net/charm-nova-compute/+bug/1719770/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
** Also affects: nova Importance: Undecided Status: New
** Changed in: nova Assignee: (unassigned) => Edward Hope-Morley (hopem)
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
Bug confirmed: http://paste.ubuntu.com/25880271/

Deploying nova-compute to a host that previously had nova-compute deployed to it (i.e. the hostname is recycled) results in nova hypervisor stats reporting stats from both the deleted and the active entries of that service.

As for the nova-compute charm, I think the question of how it behaves when removing units has come up before, i.e. that it should somehow mark the service/host as deleted when a nova-compute unit is removed. The problem with doing this is that the compute service would need to be given admin credentials, since it no longer has direct access to the db. This has been raised previously and the bug is still pending: https://bugs.launchpad.net/charms/+source/nova-compute/+bug/1317560

** Changed in: nova (Ubuntu) Status: Incomplete => Confirmed
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
Hi @afreiberger, I've run a test using our charms and Xenial Mitaka, and here are the results: http://pastebin.ubuntu.com/25879833/

A few points for clarification:

When you do juju remove-unit nova-compute/0, the charm does not perform any cleanup at all, i.e. it will not notify the cloud controller that the node is to be disabled or deleted. So what you are left with after a remove-unit is a compute node whose database state represents it as "State:down", "disabled:0" and "deleted:0". Nova is therefore correctly still counting the compute resources associated with that node as available.

If, after removing the unit, you then issue an 'openstack compute service delete ', the node is marked as "deleted:" and its resources are no longer counted as available (as can be clearly seen in the pastebin output I provided above).

I am still verifying whether entries with the same hostname but different deleted status are skewing the stats, and will report back once I've confirmed. In any case, the findings so far hopefully show that you always need to manually delete a compute host using the API after removing it with juju.
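The counting behaviour described above can be sketched in Python. This is only an illustration, assuming that the disabled and deleted flags alone decide whether a node's resources are counted and that the up/down state plays no part; ComputeRecord and counted_in_stats are hypothetical names, not Nova code.

```python
from dataclasses import dataclass


@dataclass
class ComputeRecord:
    """Hypothetical, simplified view of a compute host's database state."""
    host: str
    state: str     # "up" or "down"; liveness only, assumed not to affect counting
    disabled: int  # 0 means enabled
    deleted: int   # 0 means not deleted; non-zero means soft-deleted


def counted_in_stats(rec: ComputeRecord) -> bool:
    # Assumption: a record is counted as long as it is neither disabled nor deleted.
    return rec.disabled == 0 and rec.deleted == 0


# State left behind by 'juju remove-unit' with no further cleanup.
after_remove_unit = ComputeRecord("node3", "down", 0, 0)

# State after the service is also deleted through the API
# (the value 10 is an arbitrary non-zero marker for this sketch).
after_service_delete = ComputeRecord("node3", "down", 0, 10)

print(counted_in_stats(after_remove_unit))
print(counted_in_stats(after_service_delete))
```

Under these assumptions the first record is still counted even though it is down, and only the explicit delete takes it out of the totals.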
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
I'm still working on reproducing. While attempting reproduction, I had an environment with 3 hosts, with dummy ubuntu charms on each, and then added nova-compute to all 3. I removed the nova-compute unit from the third host and still saw stats for it in hypervisor-stats. There may be some cleanup missing in the charm-nova-compute relation-depart hooks to disable/remove the service. http://pastebin.ubuntu.com/25824195/

I think reproducing might require a full remove-machine, add-unit ubuntu --to machine-name, add-unit nova-compute --to . Ed (~dosaboy) may be working on this reproduction.
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
Marking 'Incomplete' for now until we get a reproducer figured out.

** Changed in: nova (Ubuntu) Status: Confirmed => Incomplete
** Changed in: nova (Ubuntu) Importance: Undecided => Medium
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: nova (Ubuntu) Status: New => Confirmed
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
I cannot reproduce the problem. I see that the SQL uses 'compute_nodes.deleted = 0' to filter deleted services, as the following debug info shows (mitaka).

(Pdb)
(Pdb)
'SELECT count(compute_nodes.id) AS count_1, sum(compute_nodes.vcpus) AS sum_1, sum(compute_nodes.memory_mb) AS sum_2, sum(compute_nodes.local_gb) AS sum_3, sum(compute_nodes.vcpus_used) AS sum_4, sum(compute_nodes.memory_mb_used) AS sum_5, sum(compute_nodes.local_gb_used) AS sum_6, sum(compute_nodes.free_ram_mb) AS sum_7, sum(compute_nodes.free_disk_gb) AS sum_8, sum(compute_nodes.current_workload) AS sum_9, sum(compute_nodes.running_vms) AS sum_10, sum(compute_nodes.disk_available_least) AS sum_11 \nFROM compute_nodes, services \nWHERE compute_nodes.deleted = :deleted_1 AND services.disabled = 0 AND services."binary" = :binary_1 AND (services.host = compute_nodes.host OR services.id = compute_nodes.service_id)'

This is the result of running the above SQL in mysql directly; all values are OK.

mysql> SELECT count(compute_nodes.id) AS count_1,
    ->        sum(compute_nodes.vcpus) AS sum_1,
    ->        sum(compute_nodes.memory_mb) AS sum_2,
    ->        sum(compute_nodes.local_gb) AS sum_3,
    ->        sum(compute_nodes.vcpus_used) AS sum_4,
    ->        sum(compute_nodes.memory_mb_used) AS sum_5,
    ->        sum(compute_nodes.local_gb_used) AS sum_6,
    ->        sum(compute_nodes.free_ram_mb) AS sum_7,
    ->        sum(compute_nodes.free_disk_gb) AS sum_8,
    ->        sum(compute_nodes.current_workload) AS sum_9,
    ->        sum(compute_nodes.running_vms) AS sum_10,
    ->        sum(compute_nodes.disk_available_least) AS sum_11
    -> FROM compute_nodes, services
    -> WHERE compute_nodes.deleted = 0
    ->   AND services.disabled = 0
    ->   AND services.binary = 'nova-compute'
    ->   AND (services.host = compute_nodes.host OR services.id = compute_nodes.service_id);
+---------+-------+-------+-------+-------+-------+-------+-------+-------+-------+--------+--------+
| count_1 | sum_1 | sum_2 | sum_3 | sum_4 | sum_5 | sum_6 | sum_7 | sum_8 | sum_9 | sum_10 | sum_11 |
+---------+-------+-------+-------+-------+-------+-------+-------+-------+-------+--------+--------+
|       2 |     4 |  7902 |    76 |     0 |  1024 |     0 |  6878 |    76 |     0 |      0 |     72 |
+---------+-------+-------+-------+-------+-------+-------+-------+-------+-------+--------+--------+
1 row in set (0.00 sec)

mysql> SELECT sum(compute_nodes.vcpus)
    -> FROM compute_nodes, services
    -> WHERE compute_nodes.deleted = 0
    ->   AND services.disabled = 0
    ->   AND services.binary = 'nova-compute'
    ->   AND (services.host = compute_nodes.host OR services.id = compute_nodes.service_id);
+--------------------------+
| sum(compute_nodes.vcpus) |
+--------------------------+
|                        4 |
+--------------------------+
1 row in set (0.00 sec)

Below are the steps I used to create the test env:

1. There are 3 nova-compute nodes initially.
2. Then use 'openstack compute service delete 10' to delete one compute service.
3. 'select * from services where id=10' will show that the deleted field of this record is no longer 0.
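Note that the test steps above delete a service but do not redeploy one under the same hostname, and that the quoted query filters on compute_nodes.deleted but has no filter on services.deleted. The following Python sketch runs a cut-down version of that query against an in-memory SQLite database to show what could happen when the hostname is recycled. The schema is a simplified assumption (only a few columns, not Nova's real tables), the host names and ids are made up, and the non-zero deleted values are arbitrary markers for soft-deleted rows.

```python
import sqlite3

# Assumed, simplified schema: only the columns the stats query touches.
SCHEMA = """
CREATE TABLE services (
    id INTEGER PRIMARY KEY, host TEXT, "binary" TEXT,
    disabled INTEGER, deleted INTEGER
);
CREATE TABLE compute_nodes (
    id INTEGER PRIMARY KEY, host TEXT, service_id INTEGER,
    vcpus INTEGER, deleted INTEGER
);
"""

# Cut-down version of the query quoted in this bug (count and vcpus only).
STATS_QUERY = """
SELECT count(compute_nodes.id), sum(compute_nodes.vcpus)
FROM compute_nodes, services
WHERE compute_nodes.deleted = 0
  AND services.disabled = 0
  AND services."binary" = 'nova-compute'
  AND (services.host = compute_nodes.host
       OR services.id = compute_nodes.service_id)
"""


def recycled_hostname_stats():
    """Return (count, vcpus) for three active nodes, one on a recycled hostname."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO services VALUES (?, ?, ?, ?, ?)", [
        (1, "node1", "nova-compute", 0, 0),
        (2, "node2", "nova-compute", 0, 0),
        (10, "node3", "nova-compute", 0, 10),  # old service, soft-deleted
        (11, "node3", "nova-compute", 0, 0),   # redeployed on the same hostname
    ])
    conn.executemany("INSERT INTO compute_nodes VALUES (?, ?, ?, ?, ?)", [
        (1, "node1", 1, 2, 0),
        (2, "node2", 2, 2, 0),
        (3, "node3", 10, 2, 3),  # old compute node, soft-deleted
        (4, "node3", 11, 2, 0),  # active compute node
    ])
    result = conn.execute(STATS_QUERY).fetchone()
    conn.close()
    return result


print(recycled_hostname_stats())  # (4, 8), although only 3 nodes / 6 vcpus are active
```

In this sketch the active node3 row matches both the soft-deleted and the live service through the host comparison, so it is counted twice.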
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
In models.py, both in Mitaka and in master, I've found that the relation between ComputeNode and Service uses the following join in the Service context:

primaryjoin='and_(Service.host == Instance.host,'
            'Service.binary == "nova-compute",'
            'Instance.deleted == 0)',

In my case, I've redeployed a deleted node under the same hostname (Service.host), so this join relates a deleted ComputeNode.host entry to the non-deleted Service.host entry.

Looking at both my compute_nodes and services tables, it appears they should potentially be joined on the "id" field rather than the "host" field, at least for this specific query, but this potentially breaks the Service object relation model for other query contexts, such as instances running on a hypervisor.
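To make the host-versus-id comparison concrete, here is a small Python sketch using SQLite. It is an illustration under stated assumptions, not Nova's models or a proposed patch: the schema is cut down to a handful of columns, the data is invented, and build_db, JOIN_ON_HOST and JOIN_ON_ID are hypothetical names.

```python
import sqlite3


def build_db():
    """Build an assumed, simplified schema with one recycled hostname (node3)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE services (
            id INTEGER PRIMARY KEY, host TEXT, "binary" TEXT,
            disabled INTEGER, deleted INTEGER
        );
        CREATE TABLE compute_nodes (
            id INTEGER PRIMARY KEY, host TEXT, service_id INTEGER,
            vcpus INTEGER, deleted INTEGER
        );
    """)
    conn.executemany("INSERT INTO services VALUES (?, ?, ?, ?, ?)", [
        (1, "node1", "nova-compute", 0, 0),
        (2, "node2", "nova-compute", 0, 0),
        (10, "node3", "nova-compute", 0, 10),  # soft-deleted service
        (11, "node3", "nova-compute", 0, 0),   # redeployed service, same host
    ])
    conn.executemany("INSERT INTO compute_nodes VALUES (?, ?, ?, ?, ?)", [
        (1, "node1", 1, 2, 0),
        (2, "node2", 2, 2, 0),
        (3, "node3", 10, 2, 3),  # soft-deleted compute node
        (4, "node3", 11, 2, 0),  # active compute node
    ])
    return conn


BASE = (
    "SELECT count(cn.id), sum(cn.vcpus) "
    "FROM compute_nodes AS cn, services AS s "
    "WHERE cn.deleted = 0 AND s.disabled = 0 "
    "AND s.\"binary\" = 'nova-compute' AND "
)
JOIN_ON_HOST = BASE + "s.host = cn.host"      # matches every service sharing the hostname
JOIN_ON_ID = BASE + "s.id = cn.service_id"    # matches only the owning service

conn = build_db()
by_host = conn.execute(JOIN_ON_HOST).fetchone()
by_id = conn.execute(JOIN_ON_ID).fetchone()
conn.close()

print(by_host)  # (4, 8): node3 is counted once per service row with that hostname
print(by_id)    # (3, 6): each active node is counted once
```

This only shows the effect on an aggregate query in the assumed schema; it says nothing about whether an id-based join is safe for the other query contexts mentioned above.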
[Bug 1719770] Re: hypervisor stats issue after charm removal if nova-compute service not disabled first
This code has had some significant refactoring since mitaka, so it's possible this only impacts older openstack releases. Either way, this is not a charm problem but a Nova issue AFAICT; raising distro task.

** Also affects: nova (Ubuntu) Importance: Undecided Status: New
** Changed in: charm-nova-compute Status: New => Invalid