Public bug reported:

Description
===========
According to https://bugs.launchpad.net/nova/+bug/1756179, deleting a nova-compute service currently also deletes the corresponding compute_node records, resource provider records and host mapping records in the DB.

I found that if the service is deleted while the nova-compute process is still active, the compute_node and resource_provider records are deleted from the DB without problems, but nova-compute continues to report the old resource_provider uuid. So when we then restart nova-compute to recover the service, it raises ResourceProviderCreationFailed.

Steps to reproduce
==================
1. Check the environment and the resource_providers table.

# nova service-list | grep 'nova-compute'
| 3d9092b0-e164-4094-8672-1c855971218d | nova-compute | devstack-q | nova | enabled | up |

MariaDB [placement]> select uuid,name from resource_providers;
+--------------------------------------+------------+
| uuid                                 | name       |
+--------------------------------------+------------+
| edfff022-c19f-4720-85f9-fd947ae36b07 | devstack-q |
+--------------------------------------+------------+

2. Delete the compute service while the nova-compute process is running, then check the resource_providers table.

# nova service-delete 3d9092b0-e164-4094-8672-1c855971218d

MariaDB [placement]> select * from resource_providers;
Empty set (0.00 sec)

3. Wait a minute, then restart the nova-compute process.

# systemctl restart devstack@n-cpu

Expected result
===============
nova-compute works properly and reports a resource_provider with a new uuid.

Actual result
=============
nova-compute gets a 409 when creating a resource_provider with a new uuid, and reports 'No resource provider with uuid 52943fd2-d700-416f-9e16-7fe4744979b3 found'.

I found that while nova-compute is running, it restores the old uuid in resource_providers once that uuid is gone. So the resource_provider uuid currently in the DB is still 'edfff022-c19f-4720-85f9-fd947ae36b07'. After the restart, nova-compute tries to create a new resource provider with the name 'devstack-q'. Unfortunately, the name column in the table is unique, so the creation fails.

So I think we should check that the compute_id exists first, and only then update the resource_provider_tree. If it does not exist, raise ComputeHostNotFound instead of reporting.

** Affects: nova
     Importance: Undecided
     Assignee: xulei (605423512-j)
         Status: New

** Tags: placement

** Changed in: nova
     Assignee: (unassigned) => xulei (605423512-j)

** Tags added: placement

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1817833

Title:
  Check compute_id existence when nova-compute reports info to placement

Status in OpenStack Compute (nova):
  New
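
The failure sequence and the check proposed in this report can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions: FakePlacement, report() and simulate() are hypothetical names invented for the illustration, the exception classes merely stand in for the 409 response and for ComputeHostNotFound, and none of this is nova or placement code.

```python
import uuid


class NameConflict(Exception):
    """Stands in for the HTTP 409 on a duplicate provider name."""


class ComputeHostNotFound(Exception):
    """Stands in for the exception the report proposes to raise."""


class FakePlacement:
    """Toy resource_providers table: unique uuid, unique name (hypothetical)."""

    def __init__(self):
        self.providers = {}  # uuid -> name

    def ensure_provider(self, rp_uuid, name):
        if rp_uuid in self.providers:
            return
        if name in self.providers.values():
            raise NameConflict(name)
        self.providers[rp_uuid] = name

    def delete_provider(self, rp_uuid):
        self.providers.pop(rp_uuid, None)


def report(placement, compute_nodes, rp_uuid, host, check_first):
    """One periodic report from a compute process (hypothetical helper).

    compute_nodes models the uuids present in the compute_nodes table.
    """
    if check_first and rp_uuid not in compute_nodes:
        raise ComputeHostNotFound(host)
    placement.ensure_provider(rp_uuid, host)


def simulate(check_first):
    placement = FakePlacement()
    old_uuid = str(uuid.uuid4())
    compute_nodes = {old_uuid}
    report(placement, compute_nodes, old_uuid, "devstack-q", check_first)

    # "nova service-delete": both records disappear.
    compute_nodes.discard(old_uuid)
    placement.delete_provider(old_uuid)

    # The still-running process reports again with its cached old uuid.
    try:
        report(placement, compute_nodes, old_uuid, "devstack-q", check_first)
    except ComputeHostNotFound:
        pass  # with the check, the stale provider is not re-created

    # Restart: a new compute node record with a new uuid, same host name.
    new_uuid = str(uuid.uuid4())
    compute_nodes.add(new_uuid)
    try:
        report(placement, compute_nodes, new_uuid, "devstack-q", check_first)
    except NameConflict:
        return "conflict"
    return "ok"


if __name__ == "__main__":
    print(simulate(check_first=False))
    print(simulate(check_first=True))
```

Without the existence check, the stale provider is re-created under the old uuid and the restarted process hits the unique-name conflict. With the check, the stale report is rejected and the new provider can be created.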