[Yahoo-eng-team] [Bug 1811870] Re: libvirt reporting incorrect value of 4k (small) pages
** Changed in: nova
       Status: In Progress => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1811870

Title:
  libvirt reporting incorrect value of 4k (small) pages

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  libvirt < 4.3.0 had an issue whereby assigning more than 4 GB of huge
  pages would result in an incorrect value for the number of 4k (small)
  pages. This was tracked and fixed via rhbz#1569678 [1] and the fixes
  appear to have been backported to the libvirt versions for RHEL 7.4+.
  However, this is still an issue with the versions of libvirt available
  on Ubuntu 16.04, 18.04 and possibly other distros. We should either
  alert the user that the bug exists or, better yet, work around the
  issue using the rest of the (correct) values for the different page
  sizes.

  # Incorrect value (Ubuntu 16.04, libvirt 4.0.0)

  $ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
  <cell id='0'>
    <memory unit='KiB'>16298528</memory>
    <pages unit='KiB' size='4'>3075208</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>

  (3075208 * 4) + (4000 * 2048) != 16298528

  # Correct values (Fedora ??, libvirt 4.10)

  $ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
  <cell id='0'>
    <memory unit='KiB'>32359908</memory>
    <pages unit='KiB' size='4'>8038777</pages>
    <pages unit='KiB' size='2048'>100</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>

  (8038777 * 4) + (100 * 2048) == 32359908

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1569678

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1811870/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
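The workaround suggested in the description, recomputing the small-page count
from the total cell memory and the (correct) huge page counts, can be sketched
as follows. This is an illustrative sketch, not nova's actual code; the
function name and data shapes are made up:

```python
# Sketch of the suggested workaround: derive the 4k (small) page count for a
# NUMA cell from the total memory and the (correct) huge page counts, rather
# than trusting the small-page value reported by affected libvirt versions.
# The function name and argument shapes are illustrative, not nova's API.

def corrected_small_pages(total_kib, huge_pages):
    """Recompute the number of 4 KiB pages for a NUMA cell.

    :param total_kib: total cell memory in KiB (the <memory> value)
    :param huge_pages: mapping of huge page size in KiB to page count,
        e.g. {2048: 4000, 1048576: 0}
    """
    huge_kib = sum(size * count for size, count in huge_pages.items())
    return (total_kib - huge_kib) // 4


# The "correct" Fedora example above checks out:
#   (32359908 - 100 * 2048) // 4 == 8038777
print(corrected_small_pages(32359908, {2048: 100, 1048576: 0}))
```

Applied to the buggy Ubuntu 16.04 values, the same arithmetic yields 2026632
small pages rather than the 3075208 that libvirt reports.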
[Yahoo-eng-team] [Bug 1821088] Re: Virtual Interface creation failed due to duplicate entry
** Changed in: nova/wallaby
       Status: New => Won't Fix

** Changed in: nova/victoria
       Status: New => Won't Fix

** Changed in: nova/train
       Status: New => Won't Fix

** Changed in: nova/ussuri
       Status: New => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1821088

Title:
  Virtual Interface creation failed due to duplicate entry

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  Won't Fix
Status in OpenStack Compute (nova) ussuri series:
  Won't Fix
Status in OpenStack Compute (nova) victoria series:
  Won't Fix
Status in OpenStack Compute (nova) wallaby series:
  Won't Fix
Status in OpenStack Compute (nova) xena series:
  Fix Released

Bug description:
  Seen once in a test on stable/rocky:

  http://logs.openstack.org/48/638348/1/gate/heat-functional-convg-mysql-lbaasv2-py35/9d70590/logs/screen-n-api.txt.gz?level=ERROR

  The traceback appears to be similar to the one reported in bug 1602357
  (which raises the possibility that
  https://bugs.launchpad.net/nova/+bug/1602357/comments/8 is relevant
  here):

  ERROR nova.api.openstack.wsgi [None req-e05ce059-71c4-437d-91e0-e4bc896acca6 demo demo] Unexpected exception in API method: nova.exception_Remote.VirtualInterfaceCreateException_Remote: Virtual Interface creation failed

  pymysql.err.IntegrityError: (1062, "Duplicate entry 'fa:16:3e:9d:18:a6/aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16-0' for key 'uniq_virtual_interfaces0address0deleted'")

  oslo_db.exception.DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'fa:16:3e:9d:18:a6/aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16-0' for key 'uniq_virtual_interfaces0address0deleted'") [SQL: 'INSERT INTO virtual_interfaces (created_at, updated_at, deleted_at, deleted, address, network_id, instance_uuid, uuid, tag) VALUES (%(created_at)s, %(updated_at)s, %(deleted_at)s, %(deleted)s, %(address)s, %(network_id)s, %(instance_uuid)s, %(uuid)s, %(tag)s)'] [parameters: {'created_at': datetime.datetime(2019, 3, 20, 16, 11, 27, 753079), 'tag': None, 'uuid': 'aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16', 'deleted_at': None, 'deleted': 0, 'address': 'fa:16:3e:9d:18:a6/aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16', 'network_id': None, 'instance_uuid': '890675f9-3a1e-4a07-8bed-8648cea9fbb9', 'updated_at': None}] (Background on this error at: http://sqlalche.me/e/gkpj)

  (This sequence of exceptions occurs 3 times; I assume that is because
  retrying is normally sufficient to fix a duplicate entry problem.)

  The test was
  heat_integrationtests.functional.test_cancel_update.CancelUpdateTest.test_cancel_update_server_with_port

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1821088/+subscriptions
[Yahoo-eng-team] [Bug 1441419] Re: port 'binding:host_id' can't be removed when VM is deleted
This was fixed in neutron. There's no bug against nova here.

** Changed in: nova
       Status: Confirmed => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1441419

Title:
  port 'binding:host_id' can't be removed when VM is deleted

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  To reproduce this problem:

  1. create a neutron port
  2. use this port to boot a VM
  3. delete this VM
  4. the port still exists, but 'binding:host_id' has not been removed

  The reason is that in _unbind_ports, when nova updates the port, it
  sets port_req_body['port']['binding:host_id'] = None. However, when
  neutron updates a port, it will not change any attribute whose value
  is None:

      def _unbind_ports(self, context, ports, neutron, port_client=None):
          port_binding = self._has_port_binding_extension(
              context, refresh_cache=True, neutron=neutron)
          if port_client is None:
              # Requires admin creds to set port bindings
              port_client = (neutron if not port_binding
                             else get_client(context, admin=True))
          for port_id in ports:
              port_req_body = {'port': {'device_id': '',
                                        'device_owner': ''}}
              if port_binding:
                  port_req_body['port']['binding:host_id'] = None

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1441419/+subscriptions
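The neutron behaviour described in the bug, where a port update silently skips
any attribute whose value is None, can be illustrated with a minimal sketch
(this is not neutron's actual code, just a model of the semantics described
above):

```python
# Minimal illustration (not neutron's actual code) of the update semantics
# described in the bug: attributes whose value is None are treated as
# "not provided" and left unchanged, so sending 'binding:host_id': None
# does not clear the binding.

def update_port(port, port_req_body):
    """Apply an update request body to a stored port, neutron-style."""
    for attr, value in port_req_body['port'].items():
        if value is None:
            continue  # None means "not provided": the field is skipped
        port[attr] = value
    return port


port = {'device_id': 'abc', 'binding:host_id': 'compute-1'}
update_port(port, {'port': {'device_id': '', 'binding:host_id': None}})
# device_id is cleared to '', but the stale host binding survives
```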
[Yahoo-eng-team] [Bug 1974173] [NEW] Remaining ports are not unbound if one port is missing
Public bug reported:

As part of the instance deletion process, we must unbind ports
associated with said instance. To do this, we loop over all ports
currently attached to an instance. However, if neutron returns HTTP 404
(Not Found) for any of these ports, we will return early and fail to
unbind the remaining ports.

We've seen the problem in the context of Kubernetes on OpenStack. Our
deinstaller is brute-force, so it deletes ports and servers at the same
time, and the resulting race means a port can get deleted early. This
normally wouldn't be an issue, as we'd just "untrunk" the port and
proceed to delete it, but that won't work for SR-IOV ports since in
that case you cannot "untrunk" bound ports.

The solution here is obvious: if we fail to find a port, we should
simply skip it and continue unbinding everything else.

** Affects: nova
     Importance: Medium
     Assignee: Stephen Finucane (stephenfinucane)
         Status: Confirmed

** Tags: neutron

** Tags added: neutron

** Changed in: nova
     Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
       Status: New => Confirmed

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1974173

Title:
  Remaining ports are not unbound if one port is missing

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  As part of the instance deletion process, we must unbind ports
  associated with said instance. To do this, we loop over all ports
  currently attached to an instance. However, if neutron returns HTTP
  404 (Not Found) for any of these ports, we will return early and fail
  to unbind the remaining ports.

  We've seen the problem in the context of Kubernetes on OpenStack. Our
  deinstaller is brute-force, so it deletes ports and servers at the
  same time, and the resulting race means a port can get deleted early.
  This normally wouldn't be an issue, as we'd just "untrunk" the port
  and proceed to delete it, but that won't work for SR-IOV ports since
  in that case you cannot "untrunk" bound ports.

  The solution here is obvious: if we fail to find a port, we should
  simply skip it and continue unbinding everything else.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1974173/+subscriptions
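The fix described above, skipping missing ports rather than aborting the loop,
might look something like this sketch. The exception class and client method
names are hypothetical stand-ins, not nova's or the neutron client's real API:

```python
# Sketch of the proposed fix, with hypothetical names (PortNotFound,
# client.unbind_port): on a 404 for any single port, skip it and keep
# unbinding the rest, instead of returning early.

class PortNotFound(Exception):
    """Stands in for the neutron client's 404 exception."""


def unbind_ports(client, port_ids):
    """Unbind every port we can, tolerating ports deleted under us."""
    unbound = []
    for port_id in port_ids:
        try:
            client.unbind_port(port_id)
        except PortNotFound:
            # The port was deleted out from under us (e.g. by a racing
            # teardown); there is nothing left to unbind, so move on.
            continue
        unbound.append(port_id)
    return unbound
```

With the pre-fix behaviour, a missing port in the middle of the list would
leave every later port still bound; here the loop simply carries on.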
[Yahoo-eng-team] [Bug 1934770] [NEW] Mismatch between forced host and AZ prevents move operations
Public bug reported:

When spawning a new instance, it's possible to force the instance to a
specific host by using a special 'availability_zone[:host[:node]]'
syntax for the 'availability_zone' field in the request. For example,
when using OSC:

  openstack server create --availability-zone my-az:my-host ... my-server

Doing so bypasses the scheduler, which means the
'AvailabilityZoneFilter' never runs to validate the availability
zone-host combo. As a result, the availability zone portion of this
value is effectively ignored and the host will be used regardless of
the availability zone requested.

This has some nasty side-effects. For one, the availability zone
information stored on the instance is generated from the availability
zone of the host the instance boots on, *not* the availability zone
requested in the hint. This means that when a user runs 'openstack
server show' or 'openstack server list --long', they'll see different
availability zone information from what they requested. However, the
value requested *is* recorded in the 'RequestSpec' object created for
the instance. This is reused if we attempt future move operations, and
because the availability zone information was never verified, it's
possible to end up with an instance that can't be moved since no host
with the matching availability zone information exists.

The two issues collide with each other, since the failure logs in the
latter case will reference one availability zone value while inspecting
the instance record will show another. This is seriously confusing.

The solution seems to be to either (a) error out when an invalid
availability zone-host combo is requested or (b) simply ignore the
availability zone aspect of the request, opting to use the value of the
host instead (with a warning, ideally).

Note that microversion 2.74 introduced a better way of requesting a
specific host without bypassing the scheduler, using the 'host' and
'hypervisor_hostname' fields in the body of the instance create
request. However, the old way of doing things is not yet deprecated,
and even if it was, we'd still have to support this for older
microversions. We should fix this DB discrepancy one way or the other.

** Affects: nova
     Importance: Medium
     Assignee: Stephen Finucane (stephenfinucane)
         Status: Confirmed

** Tags: availability-zones scheduler

** Tags added: availability-zones

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1934770

Title:
  Mismatch between forced host and AZ prevents move operations

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  When spawning a new instance, it's possible to force the instance to
  a specific host by using a special 'availability_zone[:host[:node]]'
  syntax for the 'availability_zone' field in the request. For example,
  when using OSC:

    openstack server create --availability-zone my-az:my-host ... my-server

  Doing so bypasses the scheduler, which means the
  'AvailabilityZoneFilter' never runs to validate the availability
  zone-host combo. As a result, the availability zone portion of this
  value is effectively ignored and the host will be used regardless of
  the availability zone requested.

  This has some nasty side-effects. For one, the availability zone
  information stored on the instance is generated from the availability
  zone of the host the instance boots on, *not* the availability zone
  requested in the hint. This means that when a user runs 'openstack
  server show' or 'openstack server list --long', they'll see different
  availability zone information from what they requested. However, the
  value requested *is* recorded in the 'RequestSpec' object created for
  the instance. This is reused if we attempt future move operations,
  and because the availability zone information was never verified,
  it's possible to end up with an instance that can't be moved since no
  host with the matching availability zone information exists.

  The two issues collide with each other, since the failure logs in the
  latter case will reference one availability zone value while
  inspecting the instance record will show another. This is seriously
  confusing.

  The solution seems to be to either (a) error out when an invalid
  availability zone-host combo is requested or (b) simply ignore the
  availability zone aspect of the request, opting to use the value of
  the host instead (with a warning, ideally).

  Note that microversion 2.74 introduced a better way of requesting a
  specific host without bypassing the scheduler, using the 'host' and
  'hypervisor_hostname' fields in the body of the instance create
  request. However, the old way of doing things is not yet deprecated,
  and even if it was, we'd still have to support this for older
  microversions. We should fix this DB discrepancy one way or the
  other.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1934770/+subscriptions
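For illustration, the 'availability_zone[:host[:node]]' syntax described above
splits into its three parts roughly as follows. This is a sketch, not nova's
actual parser:

```python
# Illustrative parser (not nova's actual code) for the special
# 'availability_zone[:host[:node]]' hint: the forced host and node ride
# along inside the availability_zone field, which is what lets them
# bypass the scheduler entirely.

def parse_az_hint(value):
    """Split an AZ hint into (availability_zone, host, node)."""
    parts = value.split(':')
    az = parts[0] or None
    host = parts[1] if len(parts) > 1 and parts[1] else None
    node = parts[2] if len(parts) > 2 and parts[2] else None
    return az, host, node


parse_az_hint('my-az')          # plain AZ request; the scheduler runs
parse_az_hint('my-az:my-host')  # forced host; the scheduler is bypassed
```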
[Yahoo-eng-team] [Bug 1933954] Re: The binding-extended extension is no longer reported for ML2/OVN
This appears to have been introduced with [1]. The solution is likely
to add the missing extension to the list of supported extensions
reported for this backend [2].

[1] https://review.opendev.org/c/openstack/neutron/+/793141
[2] https://github.com/openstack/neutron/blob/cbbab2fac5ae85d049a8201c06b58f4d7cb33495/neutron/common/ovn/extensions.py#L85

** Summary changed:

- test_live_migration_with_trunk failing due to Call _is_port_status_active returns false in 60.00 seconds
+ The binding-extended extension is no longer reported for ML2/OVN

** No longer affects: nova

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1933954

Title:
  The binding-extended extension is no longer reported for ML2/OVN

Status in neutron:
  In Progress

Bug description:
  https://zuul.opendev.org/t/openstack/builds?job_name=nova-live-migration=master

  Started failing on the 28th, I assume because of changes outside of
  Nova?

  https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8f7/771362/28/check/nova-live-migration/8f76ccd/testr_results.html

  2021-06-29 06:35:24,460 125131 DEBUG [tempest.lib.common.utils.test_utils] Call _is_port_status_active returns false in 60.00 seconds

  Traceback (most recent call last):
    File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper
      return func(*func_args, **func_kwargs)
    File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 70, in wrapper
      return f(*func_args, **func_kwargs)
    File "/opt/stack/tempest/tempest/api/compute/admin/test_live_migration.py", line 285, in test_live_migration_with_trunk
      self.assertTrue(
    File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/unittest2/case.py", line 702, in assertTrue
      raise self.failureException(msg)
  AssertionError: False is not true

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1933954/+subscriptions
[Yahoo-eng-team] [Bug 1791243] Re: launch-instance-from-volume.rst is not latest version
This doc needs to be reworked, but I think we should do so from scratch
rather than copying the (now very old) content from the manuals.

** Changed in: nova
       Status: In Progress => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1791243

Title:
  launch-instance-from-volume.rst is not latest version

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  We lost some changes to doc/source/user/launch-instance-from-volume.rst
  in openstack-manuals after Ocata. We need to upload the latest doc
  from the manuals repo [1] and merge all later changes [2][3][4] into
  this doc.

  [1] I4a556b6a596a28c0350c7411c147459c3f06d084
  [2] Ifa2e2bbb4c5f51f13d1a5832bd7dbf9f690fcad7
  [3] Ida4cf70a7e53fd37ceeadb5629e3221072219689
  [4] Ifb99e727110c4904a85bc4a13366c2cae300b8df

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1791243/+subscriptions
[Yahoo-eng-team] [Bug 1930448] [NEW] 'VolumeNotFound' exception is not handled
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     return self._cs_request(url, 'GET', **kwargs)
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/client.py", line 206, in _cs_request
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     return self.request(url, method, **kwargs)
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/client.py", line 192, in request
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     raise exceptions.from_response(resp, body)
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi nova.exception.VolumeNotFound: Volume 44d317a3-6183-4063-868b-aa0728576f5f could not be found.
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: INFO nova.api.openstack.wsgi [None req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3 demo admin] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]:
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: DEBUG nova.api.openstack.wsgi [None req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3 demo admin] Returning 500 to user: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
** Affects: nova
     Importance: Medium
     Assignee: Stephen Finucane (stephenfinucane)
         Status: In Progress

** Changed in: nova
       Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
     Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1930448

Title:
  'VolumeNotFound' exception is not handled

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Attempting to attach a volume using an invalid ID currently results
  in an HTTP 500 error. This error should be handled and an HTTP 4xx
  error returned instead.

    $ openstack server create ... \
        --block-device source_type=volume,uuid=44d317a3-6183-4063-868b-aa0728576f5f,destination_type=volume,delete_on_termination=true \
        --wait test-server
    Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. (HTTP 500) (Request-ID: req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3)

  where '44d317a3-6183-4063-868b-aa0728576f5f' is not a UUID
  corresponding to a valid volume. A full traceback from nova-api is
  provided below.

  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi [None req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3 demo admin] Unexpected exception in API method: nova.exception.VolumeNotFound: Volume 44d317a3-6183-4063-868b-aa0728576f5f could not be found.
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi Traceback (most recent call last):
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/opt/stack/nova/nova/volume/cinder.py", line 432, in wrapper
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     res = method(self, ctx, volume_id, *args, **kwargs)
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/opt/stack/nova/nova/volume/cinder.py", line 498, in get
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     item = cinderclient(
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/v2/volumes.py", line 281, in get
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     return self._get("/volumes/%s" % volume_id, "volume")
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/base.py", line 293, in _get
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     resp, body = self.api.client.get(url)
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@
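The handling the report asks for, mapping the volume lookup failure to a
client error rather than letting it bubble up as a 500, can be sketched like
this. The exception classes here are stand-ins for nova's and the WSGI
layer's real ones, not the actual nova API plumbing:

```python
# Sketch of the expected handling (hypothetical names, not nova's actual
# code): catch the volume lookup failure at the API boundary and turn it
# into a 4xx response instead of an unhandled 500.

class VolumeNotFound(Exception):
    """Stands in for nova.exception.VolumeNotFound."""


class HTTPBadRequest(Exception):
    """Stands in for the WSGI layer's 400 response."""
    status = 400


def get_volume_or_400(lookup, volume_id):
    try:
        return lookup(volume_id)
    except VolumeNotFound as exc:
        # An invalid volume ID is a client error, not a server fault.
        raise HTTPBadRequest(str(exc))
```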
[Yahoo-eng-team] [Bug 1914592] Re: oslo.policy 3.6.1 breaks nova
** Changed in: oslo.policy
       Status: Confirmed => Fix Released

** Changed in: nova
       Status: Confirmed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1914592

Title:
  oslo.policy 3.6.1 breaks nova

Status in OpenStack Compute (nova):
  Fix Released
Status in oslo.policy:
  Fix Released

Bug description:
  As seen on the requirements change [1], a recently introduced version
  of oslo.policy appears to be breaking nova [2]. Initial
  investigations suggest both oslo.policy and nova are partially to
  blame.

  [1] https://review.opendev.org/c/openstack/requirements/+/773779
  [2] https://d138d4f526b4feb9aa23-c0b1a48165a1318087e38ccc28dcb2b0.ssl.cf5.rackcdn.com/773779/1/check/cross-nova-functional/d9729b8/testr_results.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1914592/+subscriptions
[Yahoo-eng-team] [Bug 1914592] [NEW] oslo.policy 3.6.1 breaks nova
Public bug reported:

As seen on the requirements change [1], a recently introduced version
of oslo.policy appears to be breaking nova [2]. Initial investigations
suggest both oslo.policy and nova are partially to blame.

[1] https://review.opendev.org/c/openstack/requirements/+/773779
[2] https://d138d4f526b4feb9aa23-c0b1a48165a1318087e38ccc28dcb2b0.ssl.cf5.rackcdn.com/773779/1/check/cross-nova-functional/d9729b8/testr_results.html

** Affects: nova
     Importance: High
     Assignee: Stephen Finucane (stephenfinucane)
         Status: Confirmed

** Affects: oslo.policy
     Importance: Critical
     Assignee: Stephen Finucane (stephenfinucane)
         Status: Confirmed

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova
       Status: New => Confirmed

** Changed in: nova
     Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Also affects: oslo.policy
   Importance: Undecided
       Status: New

** Changed in: oslo.policy
       Status: New => Confirmed

** Changed in: oslo.policy
   Importance: Undecided => Critical

** Changed in: oslo.policy
     Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1914592

Title:
  oslo.policy 3.6.1 breaks nova

Status in OpenStack Compute (nova):
  Confirmed
Status in oslo.policy:
  Confirmed

Bug description:
  As seen on the requirements change [1], a recently introduced version
  of oslo.policy appears to be breaking nova [2]. Initial
  investigations suggest both oslo.policy and nova are partially to
  blame.

  [1] https://review.opendev.org/c/openstack/requirements/+/773779
  [2] https://d138d4f526b4feb9aa23-c0b1a48165a1318087e38ccc28dcb2b0.ssl.cf5.rackcdn.com/773779/1/check/cross-nova-functional/d9729b8/testr_results.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1914592/+subscriptions
[Yahoo-eng-team] [Bug 1912167] Re: Mistake in unit test: test_get_pinning_isolate_policy_bug_1889633
That is incorrect. The pcpuset field was added to the InstanceNUMACell
object in commit 867d4471013bf6a70cd3e9e809daf80ea358df92 [1].

[1] https://github.com/openstack/nova/commit/867d4471013bf6a70cd3e9e809daf80ea358df92

** Changed in: nova
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1912167

Title:
  Mistake in unit test: test_get_pinning_isolate_policy_bug_1889633

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===========

  objects.InstanceNUMACell doesn't take a pcpuset attribute. In
  test_get_pinning_isolate_policy_bug_1889633,
  objects.InstanceNUMACell has both cpuset and pcpuset passed to its
  constructor. Only cpuset is valid, but it has the wrong value (it
  should contain the CPUs to pin if pinning is required). This test may
  mislead developers into thinking pcpuset is available on
  objects.InstanceNUMACell, and it may not pass on custom code that
  requires a proper cpuset value.

  Fix for nova Train:

  diff --git a/nova/tests/unit/virt/test_hardware.py b/nova/tests/unit/virt/test_hardware.py
  index 8e6c049f04..5a153f7480 100644
  --- a/nova/tests/unit/virt/test_hardware.py
  +++ b/nova/tests/unit/virt/test_hardware.py
  @@ -3247,8 +3247,7 @@ class CPUPinningCellTestCase(test.NoDBTestCase, _CPUPinningTestCaseBase):
               mempages=[],
           )
           inst_pin = objects.InstanceNUMACell(
  -            cpuset=set(),
  -            pcpuset={0, 1},
  +            cpuset={0, 1},
               memory=2048,
               cpu_policy=fields.CPUAllocationPolicy.DEDICATED,
               cpu_thread_policy=fields.CPUThreadAllocationPolicy.ISOLATE,

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1912167/+subscriptions
[Yahoo-eng-team] [Bug 1914259] [NEW] Disabled USB controller breaks PPC64LE hosts
Public bug reported:

As discussed on the mailing list [1], a recent change disabling the USB
controller when no USB devices are found [2] has broken the PPC64LE
third-party CI job. It seems libvirt will add an implicit USB keyboard
and mouse on the PPC64 and PPC64LE architectures [3]. We probably need
to special-case this architecture.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020153.html
[2] https://review.opendev.org/c/openstack/nova/+/756549
[3] https://github.com/libvirt/libvirt/blob/3d42a57666/src/qemu/qemu_domain.c#L3559-L3560

** Affects: nova
     Importance: High
     Assignee: Stephen Finucane (stephenfinucane)
         Status: Confirmed

** Tags: libvirt ppc64

** Changed in: nova
       Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => High

** Tags added: libvirt ppc64

** Changed in: nova
     Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1914259

Title:
  Disabled USB controller breaks PPC64LE hosts

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  As discussed on the mailing list [1], a recent change disabling the
  USB controller when no USB devices are found [2] has broken the
  PPC64LE third-party CI job. It seems libvirt will add an implicit USB
  keyboard and mouse on the PPC64 and PPC64LE architectures [3]. We
  probably need to special-case this architecture.

  [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020153.html
  [2] https://review.opendev.org/c/openstack/nova/+/756549
  [3] https://github.com/libvirt/libvirt/blob/3d42a57666/src/qemu/qemu_domain.c#L3559-L3560

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1914259/+subscriptions
[Yahoo-eng-team] [Bug 1909269] Re: I create a server_groups vm , but server_group_members doesn't add one.
This is one of two per-user quotas, the other being 'key_pairs', where
usages are not considered when validating limit create/update: they are
always reported as zero. You can find more information at [1].

[1] https://github.com/openstack/nova/blob/7527fdf6eafe47f0f783e9cdae8b79b76d6ca6b3/nova/quota.py#L178-L182

** Tags added: quotas

** Changed in: nova
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1909269

Title:
  I create a server_groups vm , but server_group_members doesn't add one.

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I create a VM in a server group, but the server_group_members usage
  doesn't increase by one. After the virtual machine is created
  successfully, "nova quota-show --detail" is executed on the compute
  node, and the "in_use" value of the "server_group_members" quota does
  not increase.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1909269/+subscriptions
[Yahoo-eng-team] [Bug 1909972] Re: a number of tests fail under ppc64el arch
As noted in the libvirt driver [1], we only test against x86 and
x86_64. While this would be relatively easy to fix, the lack of a gate
job means it would likely regress again in the future, and it also
means we can't justifiably mark this architecture as supported. I think
the more likely issue is this:

  I'm marking this bug as severity:serious since your package has only
  Architecture:all binary packages, and should thus, in theory, build
  everywhere. Failure to build on ppc64el might indicate a serious
  issue in this package or in another package.

Setting Architecture to indicate support for x86 and x86_64 only would
seem far more sensible to me.

[1] https://github.com/openstack/nova/blob/46899968619e4ea0ff2ab380977619bb29578d43/nova/virt/libvirt/driver.py#L572-L581

** Changed in: nova
       Status: New => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1909972

Title:
  a number of tests fail under ppc64el arch

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Hi,

  As per this Debian bug entry:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=976954

  a number of unit tests are failing under the ppc64el arch. Please fix
  these or exclude the tests on this arch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1909972/+subscriptions
[Yahoo-eng-team] [Bug 1908507] Re: vif quotas not set for tap interface
This feature is essentially deprecated, given that it is only supported by specific backends, and it is unlikely that we will extend it any further. As a result, I'm marking this as WONTFIX and suggest you investigate neutron's native QoS support instead. You can find documentation for this here [1]. To the best of my knowledge, the QoS support is available for any ML2-based backend, including the Calico plugin, but this will require some background reading. [1] https://docs.openstack.org/neutron/latest/admin/config-qos.html ** Changed in: nova Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1908507 Title: vif quotas not set for tap interface Status in OpenStack Compute (nova): Won't Fix Bug description: Description === Despite vif_inbound_average and vif_outbound_average being set, bandwidth settings are not propagated to an instance xml config in libvirt when using a tap interface. Steps to reproduce == - nova flavor-key network_test set quota:vif_inbound_average=10240 - nova flavor-key network_test set quota:vif_outbound_average=10240 - create a VM with said flavor - verify the VM's libvirt xml config Expected result === - the <bandwidth> tag is present in the instance-id.xml config - bandwidth measured via an iperf test is being shaped Actual result === - the <bandwidth> tag is not set - traffic is not limited Environment === - nova-compute 2:21.1.0-0ubuntu1~cloud0 - libvirt-daemon 6.0.0-0ubuntu8.4~cloud0 - Calico neutron plugin with network_type set to flat - Libvirt + KVM Proposed fix === Probably missing "designer.set_vif_bandwidth_config(conf, inst_type)" in method get_config_tap(..)
Logs === nova-compute.log 2020-12-16 13:13:10.202 74913 DEBUG nova.virt.hardware [req-66bf23dd-7486-4e3d-9bda-1f23943f2379 2f8c89255e23468bbd2bd0ea6391a3cd c9604e4b7a0c443eb451181727e4e00a - default default] Getting desirable topologies for flavor Flavor(created_at=2020-12-16T10:20:59Z,deleted=False,deleted_at=None,description=None,disabled=False,ephemeral_gb=0,extra_specs={quota:vif_inbound_average='10240',quota:vif_inbound_peak='10240',quota:vif_outbound_average='10240',quota:vif_outbound_peak='10240'},flavorid='89c4daca-4ef3-4835-83e5-891f8e3c2664',id=204,is_public=True,memory_mb=4096,name='network_test',projects=,root_gb=10,rxtx_factor=1.0,swap=0,updated_at=None,vcpu_weight=0,vcpus=4) and image_meta ImageMeta(checksum='ecf90ee0a6b453638f95c7bfba9d17e2',container_format='bare',created_at=2020-10-07T09:17:40Z,direct_url=,disk_format='qcow2',id=f619fd08-3e7e-4ab8-a9b4-a8a13e575863,min_disk=0,min_ram=0,name='centos-7-chef',owner='c9604e4b7a0c443eb451181727e4e00a',properties=ImageMetaProps,protected=,size=1470693376,status='active',tags=,updated_at=2020-10-07T09:19:45Z,virtual_size=,visibility=), allow threads: True _get_desirable_cpu_topologies /usr/lib/python3/dist-packages/nova/virt/hardware.py:594 ... 
2020-12-16 13:13:10.235 74913 DEBUG nova.virt.libvirt.vif [req-66bf23dd-7486-4e3d-9bda-1f23943f2379 2f8c89255e23468bbd2bd0ea6391a3cd c9604e4b7a0c443eb451181727e4e00a - default default] vif_type=tap instance=Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone='dc2',cell_name=None,cleaned=False,config_drive='',created_at=2020-12-16T12:13:30Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,device_metadata=None,disable_terminate=False,display_description='martin-net-test-1',display_name='martin-net-test-1',ec2_ids=EC2Ids,ephemeral_gb=0,ephemeral_key_uuid=None,fault=,flavor=Flavor(204),hidden=False,host='cmp08-dc2.ost.mall.local',hostname='martin-net-test-1',id=35253,image_ref='f619fd08-3e7e-4ab8-a9b4-a8a13e575863',info_cache=InstanceInfoCache,instance_type_id=204,kernel_id='',key_data='abc123',key_name='molexa',keypairs=KeyPairList,launch_index=0,launched_at=None,launched_on='cmp08-dc2.ost.mall.local',locked=False,locked_by=None,memory_mb=4096,metadata={},migration_context=None,new_flavor=None,node='cmp08-dc2.ost.mall.local',numa_topology=None,old_flavor=None,os_type=None,pci_devices=,pci_requests=InstancePCIRequests,power_state=0,progress=0,project_id='c9604e4b7a0c443eb451181727e4e00a',ramdisk_id='',reservation_id='r-48w9d5fs',resources=None,root_device_name='/dev/vda',root_gb=10,security_groups=SecurityGroupList,services=,shutdown_terminate=False,system_metadata={boot_roles='admin,member,reader,heat_stack_owner',image_base_image_ref='f619fd08-3e7e-4ab8-a9b4-a8a13e575863',image_container_format='bare',image_disk_format='qcow2',image_min_disk='10',image_min_ram='0',image_ow
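The reporter's proposed fix lends itself to a small sketch. Everything below is a hypothetical, self-contained simplification rather than nova's actual code: `Conf` stands in for the libvirt interface config object, and `set_vif_bandwidth_config` imitates the `designer` helper named in the report, which copies `quota:vif_*` flavor extra specs onto the config. The point is that, per the report, `get_config_tap()` omits the call that other VIF types make.

```python
class Conf:
    """Stand-in for nova's libvirt interface config object (hypothetical)."""
    def __init__(self):
        self.vif_inbound_average = None
        self.vif_outbound_average = None

def set_vif_bandwidth_config(conf, inst_type):
    """Simplified imitation of the designer helper named in the report:
    copy quota:vif_* extra specs from the flavor onto the config."""
    for key, value in inst_type.get('extra_specs', {}).items():
        scope = key.split(':', 1)
        if len(scope) == 2 and scope[0] == 'quota' and scope[1].startswith('vif_'):
            setattr(conf, scope[1], int(value))

def get_config_tap(inst_type):
    conf = Conf()
    # The reporter's proposed one-line fix: make the 'tap' path call the
    # same bandwidth helper the other VIF types already call.
    set_vif_bandwidth_config(conf, inst_type)
    return conf

flavor = {'extra_specs': {'quota:vif_inbound_average': '10240',
                          'quota:vif_outbound_average': '10240'}}
conf = get_config_tap(flavor)
print(conf.vif_inbound_average)   # 10240
print(conf.vif_outbound_average)  # 10240
```

With the call in place, the flavor's shaping values reach the interface config, which is what would produce the missing bandwidth settings in the instance XML.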
[Yahoo-eng-team] [Bug 1906266] Re: After upgrade: "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified"
Given the above, the solution here seems to be to update your version of libvirt to >= 6.1.0. I'm going to mark this as WONTFIX. If this does not resolve the issue, please reset the status to New and provide information on the version of libvirt you've tested with and detailed logs from nova-compute showing the error. ** Changed in: nova Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1906266 Title: After upgrade: "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified" Status in OpenStack Compute (nova): Won't Fix Bug description: In a site upgraded to Ussuri we are getting faults starting instances 2020-11-30 13:41:40.586 232871 ERROR oslo_messaging.rpc.server libvirt.libvirtError: Requested operation is not valid: format of backing image '/var/lib/nova/instances/_base/xxx' of image '/var/lib/nova/instances/xxx' was not specified in the image metadata (See https://libvirt.org/kbase/backing_chains.html for troubleshooting) Bug #1864020 reports similar symptoms, where due to an upstream change in Libvirt v6.0.0+ images need the backing format specified. The fix for Bug #1864020 handles the case for new instances. However, for upgraded instances we're hitting the same problem, as those still don't have the backing format specified. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1906266/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1907216] Re: Wrong image ref after unshelve
Moving this to Invalid based on the comments from Lucian on [1]. I think this should be fixed at the Hyper-V driver level. The shelved snapshot image will be removed from Glance once the instance is unshelved, so there's no value in updating the Nova instance db record to point to it. In fact, users are probably interested in the original image. [1] https://review.opendev.org/c/openstack/nova/+/765924 ** Changed in: nova Status: New => Confirmed ** Changed in: nova Status: Confirmed => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1907216 Title: Wrong image ref after unshelve Status in compute-hyperv: New Status in OpenStack Compute (nova): Invalid Bug description: After an instance is unshelved, the instance image ref will point to the original image instead of the snapshot created during the shelving [1][2]. Subsequent instance operations will use the wrong image id. For example, in case of cold migrations, Hyper-V instances will be unable to boot since the differencing images will have the wrong base [3]. Other image related operations might be affected as well. As pointed out by Matt Riedemann on the patch [1], Nova shouldn't set back the original image id, instead it should use the snapshot id. [1] I3bba0a230044613e07122a6d122597e5b8d43438 [2] https://github.com/openstack/nova/blob/22.0.1/nova/compute/manager.py#L6625 [3] http://paste.openstack.org/raw/800822/ To manage notifications about this bug go to: https://bugs.launchpad.net/compute-hyperv/+bug/1907216/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1908133] Re: Nova does not track shared ceph pools across multiple nodes
*** This bug is a duplicate of bug 1522307 *** https://bugs.launchpad.net/bugs/1522307 This is a well-known issue. Closing as a duplicate. ** This bug has been marked a duplicate of bug 1707256 Scheduler report client does not account for shared resource providers ** This bug is no longer a duplicate of bug 1707256 Scheduler report client does not account for shared resource providers ** This bug has been marked a duplicate of bug 1522307 Disk usage not work for shared storage -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1908133 Title: Nova does not track shared ceph pools across multiple nodes Status in OpenStack Compute (nova): New Bug description: Environment: - tested in focal-victoria and bionic-stein == Steps to reproduce: 1) Deploy OpenStack having 2 nova-compute nodes 2) Configure both compute nodes to have a RBD backend pointing to the same pool in ceph as below: [libvirt] images_type = rbd images_rbd_pool = nova 3) run "openstack hypervisor show" on each node. Both will show the full pool capacity: local_gb | 29 local_gb_used| 0 free_disk_gb | 29 disk_available_least | 15 4) create a 20gb instance and run "openstack hypervisor show" again on the node it landed: local_gb | 29 local_gb_used| 20 free_disk_gb | 9 disk_available_least | 15 5) create another 20GB one. It will land on the other hypervisor 6) try to create a third 20GB one, it will fail because placement will not return an allocation candidate. This is correct. 7) Now ssh to both the instances and fill their disk (actually based on disk_available_least that is read from ceph df, only one may need to be filled) 8) I/O for all instances will be frozen as the ceph pool runs out of space, and the nova-compute service freezes on "create_image" whenever a new instance is attempted to be created there, causing it to be reported as "down". 
9) disk_available_least will be updated to 0, but that doesn't prevent new instances from being scheduled. This is the first problem as both compute nodes have their tracking disconnected from the ceph pool on "free_disk_gb" and "local_gb_used", while "disk_available_least" is not used by the scheduler to prevent the problem while disk_allocation_ratio is 1.0 (it is used by live- migration appropriately though). Alternatively (as a possible solution/fix/workaround), following the steps in [0] and [1] to have placement as a centralized place for the shared ceph pool. I ran the following steps: 10) openstack resource provider create ceph_nova_pool 11) openstack resource provider inventory set --os-placement-api- version 1.19 --resource DISK_GB=30 12) openstack resource provider trait set --os-placement-api-version 1.19 --trait MISC_SHARES_VIA_AGGREGATE 13) openstack resource provider aggregate set --aggregate --aggregate --generation 2 --os-placement-api-version 1.19 14) Deleted all instances and repeated steps 4, 5 and 6 but same result 15) openstack resource provider set --name --parent-provider --os-placement-api-version 1.19 16) openstack resource provider set --name --parent-provider --os-placement-api-version 1.19 17) Deleted all instances and repeated steps 4, 5 and 6. Now I was able to create 3 instances, where 1 of them had allocations from the ceph_nova_pool resource provider. The created resource_provider is being treated as an "extra" resource provider. 18) Deleted 2 instances that had allocations from the compute nodes 19) openstack resource provider inventory delete --resource-class DISK_GB 20) openstack resource provider inventory delete --resource-class DISK_GB 21) watch openstack allocation candidate list --resource DISK_GB=20 --os-placement-api-version 1.19 Now, the list would be empty, until nova-compute periodically updates the inventory with its local_gb value and we go back to the state at step 17. 
== Expected result: - For the first approach, it is expected that scheduling would be affected by the disk_available_least value (accordingly to disk_allocation_ratio as well) to avoid allowing the creation of instances when there is no space. - For the second approach, it is expected that there is a way to prevent nova-compute when periodically updating a specific inventory, or guarantee that its inventory is shared with another resource_provider instead of an "extra" one. [0] https://github.com/openstack/placement/blob/c02a073c523d363d7136677ab12884dc4ec03e6f/placement/objects/research_context.py#L1107 [1] https://docs.openstack.org/placement/latest/user/provider-tree.html To manage notifications about this bug go to:
[Yahoo-eng-team] [Bug 1907179] Re: resize revert will let new data lost!
I assume you're referring to instances with ephemeral or local storage? If so, this is expected behaviour. Resize revert deletes the instance on the destination host and resumes the old instance on the source host. You can work around this by using boot from volume or RBD. ** Changed in: nova Status: New => Opinion ** Changed in: nova Status: Opinion => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1907179 Title: resize revert will let new data lost! Status in OpenStack Compute (nova): Invalid Bug description: * Description: Hi all, I found a serious problem with the revert resize action for a VM: the revert resize operation loses data written after the resize. The reproduction steps are shown below. * Step-by-step reproduction steps: 1. Create a new VM with flavor C1-R2-D10 (Core: 1, RAM: 2G, Disk: 10) and image centos 76. 2. Log in to the VM, touch a new file 1.txt, and write some data to it, e.g. echo 'aaa' > 1.txt. 3. Resize the VM, changing the flavor to C2-R4-D20, but do not confirm or revert the operation yet. 4. Log in to the VM again, touch another new file 2.txt, and write some data to it, e.g. echo 'bbb' > 2.txt. 5. Now revert the resize operation started in step 3. 6. When the revert resize is done, log in to the VM: 2.txt is lost. * Expected output: The file 2.txt, added between the resize and the revert, should be preserved. * Actual output: 2.txt is lost! * Version: ** OpenStack version: Rocky ** Linux distro, kernel.
CentOS Linux release 7.8.2003 (Core), kernel 3.10.0-1127.el7.x86_64 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1907179/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1906781] Re: One of the image's name is Chinese, the execution of glance image-list shows the error "ascii codec can't encode characters in position 953-954: ordinal not in range
I don't know how this is related to nova. Moving to glanceclient. ** Project changed: nova => python-glanceclient -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1906781 Title: One of the image's name is Chinese, the execution of glance image-list shows the error "ascii codec can't encode characters in position 953-954: ordinal not in range(128)". Status in Glance Client: New Bug description: One of the images has a Chinese name; executing glance image-list shows the error "ascii codec can't encode characters in position 953-954: ordinal not in range(128)". 1. Create an image with a Chinese name. 2. Execute "glance image-list". 3. Error: ascii codec can't encode characters in position 953-954: ordinal not in range(128) To manage notifications about this bug go to: https://bugs.launchpad.net/python-glanceclient/+bug/1906781/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
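The error class in this report is easy to reproduce in isolation. This is a generic sketch, not glanceclient's actual code path: encoding a non-ASCII string with Python's ascii codec raises the same "ordinal not in range(128)" error.

```python
name = '测试镜像'  # an arbitrary Chinese image name, for illustration
reason = None
try:
    # The ascii codec can only represent code points below 128, so any
    # Chinese character makes this fail.
    name.encode('ascii')
except UnicodeEncodeError as exc:
    reason = exc.reason
    print(reason)  # ordinal not in range(128)
```

The fix in a client is to handle output as Unicode (or encode as UTF-8) rather than forcing the ascii codec on user-supplied names.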
[Yahoo-eng-team] [Bug 1903879] Re: Server Remove Fixed IP is not working in the Rocky
This is an issue with OSC, not nova [1]. The fix should be relatively easy but no one has had a chance to address it yet, unfortunately. [1] https://storyboard.openstack.org/#!/story/2002925 ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1903879 Title: Server Remove Fixed IP is not working in the Rocky Status in OpenStack Compute (nova): Invalid Bug description: Description === I am testing features of the Rocky release before we upgrade from Queens and I have possibly found a bug, but maybe I am just doing something wrong. Steps to reproduce == 1. create a server instance (IP given by DHCP = 10.244.255.28) 2. add a new fixed IP address to this instance (IP given by DHCP = 10.244.255.22) 3. list instance to see the result os1-lab1:~ # openstack server list --long | ID | Name | Status | Task State | Power State | Networks | Image Name | Image ID | Flavor Name | Flavor ID | Availability Zone | Host | Properties | | 70f85125-d90f-4eba-899d-3c89e2bea697 | lwq-test-snap | ACTIVE | None | Running | lab-net=10.244.255.28, 2aff:::::3b, 10.244.255.22, 2aff:::::26 | lwq-test-snap | 554564df-245c-4e6b-8a02-47556e684c0b | t1.2c2r10d | a905bd3c-db79-415f-abd8-29666db713b4 | az2 | os1-lab10 | | 4. try to remove the last added IP os1-lab1.ko:~ # openstack server remove fixed ip lwq-test-snap 10.244.255.22 remove_fixed_ip 5. nothing happened; the logs are clear and the server list shows the exact same output as posted above Expected result === The specified IP address should be removed. I've used this procedure on the Queens release and earlier countless times. Actual result === Nothing happened, as shown above. Environment === 1. Exact version of OpenStack you are running.
See the following list for all releases: http://docs.openstack.org/releases/ os1-lab1.ko:~ # dpkg -l | grep nova ii nova-api 2:18.3.0-0ubuntu1~cloud1 all OpenStack Compute - API frontend ii nova-common 2:18.3.0-0ubuntu1~cloud1 all OpenStack Compute - common files ii nova-conductor 2:18.3.0-0ubuntu1~cloud1 all OpenStack Compute - conductor service ii nova-novncproxy 2:18.3.0-0ubuntu1~cloud1 all OpenStack Compute - NoVNC proxy ii nova-placement-api 2:18.3.0-0ubuntu1~cloud1 all OpenStack Compute - placement API frontend ii nova-scheduler 2:18.3.0-0ubuntu1~cloud1 all OpenStack Compute - virtual machine scheduler ii python3-nova 2:18.3.0-0ubuntu1~cloud1 all OpenStack Compute Python 3 libraries ii python3-novaclient 2:11.0.0-0ubuntu1~cloud0 all client library for OpenStack Compute API - 3.x 2. Which hypervisor did you use? - Libvirt + KVM os1-lab1.ko:~ # dpkg -l | grep libvirt ii libvirt0:amd64 4.0.0-1ubuntu8.17 amd64 3. Which storage type did you use? - local storage + raw qcow2 4. Which networking type did you use? - nova-network + calico Logs & Configs == - I did not find anything useful; please specify what you would like me to collect. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1903879/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1581977] Re: Invalid input for dns_name when spawning instance with .number at the end
I disagree. We already do sanitization of the hostname and fall back to a hostname 'Server-{instance.uuid}' if that returns an empty string. I think we should also do this fallback if the hostname is not a valid FQDN. Personally, I'd rather we provided a mechanism to set hostnames that was entirely decoupled from the instance name, like below, but that's a lot of work and I don't want to do it :) openstack server create --hostname foo.bar ... Until someone puts in the effort to do that, extending what we have will do just fine. ** Changed in: nova Status: Opinion => Triaged ** Changed in: nova Importance: Wishlist => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1581977 Title: Invalid input for dns_name when spawning instance with .number at the end Status in OpenStack Compute (nova): Triaged Bug description: When attempting to deploy an instance with a name which ends in dot (e.g. .123, as in an all-numeric TLD) or simply a name that, after conversion to dns_name, ends as ., nova conductor fails with the following error: 2016-05-15 13:15:04.824 ERROR nova.scheduler.utils [req-4ce865cd-e75b-4de8-889a-ed7fc7fece18 admin demo] [instance: c4333432-f0f8-4413-82e8-7f12cdf3b5c8] Error from last host: silpixa00394065 (node silpixa00394065): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 1926, in _do_build_and_run_instance\nfilter_properties)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2116, in _build_and_run_instance\ninstance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance c4333432-f0f8-4413-82e8-7f12cdf3b5c8 was re-scheduled: Invalid input for dns_name. Reason: 'networking-ovn-ubuntu-16.04' not a valid PQDN or FQDN.
Reason: TLD '04' must not be all numeric.\nNeutron server returns request_ids: ['req-7317c3e3-2875-4073-8076-40e944845b69']\n"] This throws one instance of the infamous Horizon message: Error: No valid host was found. There are not enough hosts available. This issue was observed using stable/mitaka via DevStack (nova commit fb3f1706c68ea5b58f05ea810c6339f2449959de). In the above example, the instance name is "networking-ovn (Ubuntu 16.04)", which resulted in an attempted dns_name="networking-ovn- ubuntu-16.04", where the 04 was interpreted as a TLD and, consequently, an invalid TLD. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1581977/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
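The fallback suggested in the triage comment can be sketched as follows. This is a hypothetical illustration, not nova's actual sanitization code: sanitize the display name, then fall back to 'Server-{instance.uuid}' when the result is empty or its final label is all-numeric (an invalid TLD, exactly the "04" case from this report).

```python
import re
import uuid

def sanitize_hostname(display_name, instance_uuid):
    """Hypothetical sketch of the proposed fallback: sanitize the display
    name, and fall back to 'Server-<uuid>' when the result is empty or
    not a valid FQDN (here: any label empty, or last label all-numeric)."""
    hostname = re.sub(r'[^\w.-]+', '-', display_name.lower()).strip('-.')
    labels = hostname.split('.')
    valid = bool(hostname) and all(labels) and not labels[-1].isdigit()
    if not valid:
        return 'Server-%s' % instance_uuid
    return hostname

uid = uuid.UUID('c4333432-f0f8-4413-82e8-7f12cdf3b5c8')

# The name from this report sanitizes to 'networking-ovn-ubuntu-16.04',
# whose trailing label '04' is all-numeric, so the fallback kicks in.
print(sanitize_hostname('networking-ovn (Ubuntu 16.04)', uid))

# An unproblematic name sanitizes normally.
print(sanitize_hostname('My Server', uid))  # my-server
```

The point of the sketch is only that invalid-FQDN detection happens before neutron ever sees the dns_name, so the build no longer reschedules with "No valid host was found".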
[Yahoo-eng-team] [Bug 1852727] Re: PCI passthrough documentation does not describe the steps necessary to passthrough PFs
** Also affects: nova/trunk Importance: Undecided Status: New ** Changed in: nova/trunk Status: New => Confirmed ** Changed in: nova Importance: Undecided => Low ** Changed in: nova/trunk Importance: Undecided => Low ** Changed in: nova/trunk Assignee: (unassigned) => Stephen Finucane (stephenfinucane) ** No longer affects: nova/trunk ** Also affects: nova/train Importance: Undecided Status: New ** Changed in: nova/train Importance: Undecided => Low ** Changed in: nova/train Status: New => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1852727 Title: PCI passthrough documentation does not describe the steps necessary to passthrough PFs Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) train series: Confirmed Bug description: This came up on IRC [1]. By default, nova will not allow you to use PF devices unless you specifically request this type of device. This is intentional behavior to allow users to whitelist all devices from a particular vendor and avoid passing through the PF device when they meant to only consume the VFs. In the future, we might want to prevent whitelisting of both PF and VFs, but for now we should document the current behavior. [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova /%23openstack-nova.2019-11-15.log.html#t2019-11-15T08:39:17 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1852727/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1904446] [NEW] 'GetPMEMNamespacesFailed' is not a valid exception
Public bug reported: Attempting to retrieve a non-existent PMEM device results in the following traceback: ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova During handling of the above exception, another exception occurred: ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova Traceback (most recent call last): ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/bin/nova-compute", line 10, in ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova sys.exit(main()) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line 57, in main ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova topic=compute_rpcapi.RPC_TOPIC) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/service.py", line 271, in create ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova periodic_interval_max=periodic_interval_max) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/service.py", line 129, in __init__ ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova self.manager = manager_class(host=self.host, *args, **kwargs) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 571, in __init__ ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova self.driver = driver.load_compute_driver(self.virtapi, compute_driver) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/virt/driver.py", line 1911, in load_compute_driver ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova virtapi) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/oslo_utils/importutils.py", line 44, in import_object ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova return 
import_class(import_str)(*args, **kwargs) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 446, in __init__ ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova vpmem_conf=CONF.libvirt.pmem_namespaces) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 477, in _discover_vpmems ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova vpmems_host = self._get_vpmems_on_host() ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 512, in _get_vpmems_on_host ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova raise exception.GetPMEMNamespacesFailed(reason=reason) ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova AttributeError: module 'nova.exception' has no attribute 'GetPMEMNamespacesFailed' ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova It seems there was a typo introduced when this code was added. The code referenced 'GetPMEMNamespacesFailed' but the exception, which has since been removed since it was "unused", was called 'GetPMEMNamespaceFailed'. ** Affects: nova Importance: Medium Status: Confirmed ** Tags: libvirt ** Changed in: nova Importance: Undecided => Medium ** Changed in: nova Status: New => Confirmed ** Tags added: libvirt -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1904446 Title: 'GetPMEMNamespacesFailed' is not a valid exception Status in OpenStack Compute (nova): Confirmed Bug description: Attempting to retrieve a non-existent PMEM device results in the traceback above.
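The failure mode is just an attribute lookup on a name that no longer exists. A minimal stand-alone reproduction, using a fake stand-in for the nova.exception module, looks like:

```python
class fake_exception:
    """Stand-in for nova.exception: only the singular 'Namespace'
    spelling was ever defined (and it has since been removed)."""
    class GetPMEMNamespaceFailed(Exception):
        pass

caught = None
try:
    # The driver referenced the plural spelling, which is the typo
    # described in the bug: the lookup fails before the exception is
    # ever constructed or raised.
    raise fake_exception.GetPMEMNamespacesFailed('no PMEM device')
except AttributeError as exc:
    caught = str(exc)
    print(caught)
```

Because the AttributeError fires during the raise statement itself, nova-compute dies with the secondary "module 'nova.exception' has no attribute ..." error instead of the intended, descriptive one.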
[Yahoo-eng-team] [Bug 1904051] [NEW] Intermittent failures in cross-cell functional tests
Public bug reported: Some functional tests are failing due to the following error: Captured traceback: ~~~ Traceback (most recent call last): File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/test_cross_cell_migrate.py", line 1076, in test_resize_revert_from_stopped self.api.post_server_action(server['id'], {'migrate': None}) File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py", line 268, in post_server_action return self.api_post( File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py", line 210, in api_post return APIResponse(self.api_request(relative_uri, **kwargs)) File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py", line 186, in api_request raise OpenStackApiException( nova.tests.functional.api.client.OpenStackApiException: Unexpected status code: {"conflictingRequest": {"code": 409, "message": "Cannot 'migrate' instance 8841d71c-c29d-4dc8-9736-98dbc6ee221f while it is in task_state resize_reverting"}} This appears to be because we're not waiting for the resize-revert operation to fully complete before attempting other operations. We need to wait for the versioned notification emitted on the source compute, which occurs after the instance's task state has been updated, as opposed to simply waiting for the migration record to change status, which occurs before (and on the destination node). ** Affects: nova Importance: Medium Assignee: Stephen Finucane (stephenfinucane) Status: In Progress ** Tags: gate-failure ** Changed in: nova Status: New => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1904051 Title: Intermittent failures in cross-cell functional tests Status in OpenStack Compute (nova): In Progress To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1904051/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
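The described fix, waiting on the versioned notification rather than the migration record, amounts to polling for the later, authoritative event. A generic sketch of that pattern follows (hypothetical helper and event names shaped like nova's, not the actual test fixture):

```python
import time

def wait_for(predicate, timeout=2.0, interval=0.01):
    """Poll until predicate() returns true or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

# Simulated event log: the migration record flips status first, but only
# the versioned notification emitted by the source compute (after the
# task state is cleared) means the revert has fully completed.
events = ['migration.status.reverted']
events.append('instance.resize_revert.end')

# Waiting on the notification avoids racing ahead of the task state and
# hitting the 409 "while it is in task_state resize_reverting" error.
print(wait_for(lambda: 'instance.resize_revert.end' in events))  # True
```

Waiting on the migration record alone is exactly the race the traceback shows: the next API call lands while the instance is still in task_state resize_reverting.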
[Yahoo-eng-team] [Bug 1898554] [NEW] Legacy 'InstanceNUMACell' with 'mixed' policy results in 'TypeError'
Public bug reported: We added support for the 'mixed' CPU policy in Victoria. This required changes to the 'cpu_policy' field of the 'InstanceNUMACell' object. As part of that change, we had to check that the consumer of the o.vo supported the 'mixed' policy and, if not, raise an 'ObjectActionError'. Unfortunately we're attempting to use a tuple as a string in the string formatting for that exception's error message. As a result, if you attempt to actually raise it, you see the following: TypeError: not all arguments converted during string formatting ** Affects: nova Importance: Low Assignee: Stephen Finucane (stephenfinucane) Status: Confirmed ** Tags: libvirt numa ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Low ** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane) ** Tags added: libvirt numa -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1898554 Title: Legacy 'InstanceNUMACell' with 'mixed' policy results in 'TypeError' Status in OpenStack Compute (nova): Confirmed Bug description: We added support for the 'mixed' CPU policy in Victoria. This required changes to the 'cpu_policy' field of the 'InstanceNUMACell' object. As part of that change, we had to check that the consumer of the o.vo supported the 'mixed' policy and, if not, raise an 'ObjectActionError'. Unfortunately we're attempting to use a tuple as a string in the string formatting for that exception's error message. 
As a result, if you attempt to actually raise it, you see the following: TypeError: not all arguments converted during string formatting To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1898554/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
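The error class is easy to reproduce outside nova. Interpolating a tuple into a single `%s` placeholder makes Python treat each tuple element as a separate format argument; the snippet below is illustrative, not nova's exact code:

```python
# Two-element tuple, but only one '%s' placeholder: each element is
# treated as a separate argument, so one is left over -> TypeError.
supported = ('shared', 'dedicated')

try:
    msg = 'supported policies: %s' % supported
except TypeError as exc:
    msg = str(exc)

print(msg)  # not all arguments converted during string formatting

# Fix: wrap the tuple in a one-element tuple so the whole tuple fills
# the single placeholder.
fixed = 'supported policies: %s' % (supported,)
print(fixed)  # supported policies: ('shared', 'dedicated')
```

The one-character fix (`%s' % (value,)` instead of `%s' % value`) is why such bugs survive unnoticed until the error path is actually exercised.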
[Yahoo-eng-team] [Bug 1898272] [NEW] "mixed" policy calculations don't account for host cells with no free shared CPUs
Public bug reported: The 'mixed' CPU policy allows us to use both shared and dedicated CPUs (VCPU and PCPU) in the same instance. The expectation is that the both sets of CPUs will use host cores from the same NUMA node(s). The current code does appear to be doing this, at least for single NUMA nodes, however, it does not account for NUMA nodes without any shared CPUs. # Steps to reproduce Configure a dual NUMA node host so that all cores from one node are assigned to '[compute] cpu_shared_set', while all the cores from the other node are assigned to '[compute] cpu_dedicated_set'. For example, on a host where cores 0-5 are on node 0, while cores 6-11 are on node 1: [compute] cpu_shared_set = 0-5 cpu_dedicated_set = 6-11 Now attempt to boot a guest using the mixed policy, e.g. $ openstack flavor create --vcpu 4 --ram 512 --disk 1 \ --property 'hw:cpu_policy=mixed' --property 'hw:cpu_dedicated_mask=^0' \ test.mixed $ openstack server create --os-compute-api-version=2.latest \ --flavor test.mixed --image cirros-0.5.1-x86_64-disk --nic none --wait \ test-server # Expected result The instance should fail to schedule as the 'NUMATopologyFilter' should reject the host. # Actual result The instance is scheduled but fails to boot since the following invalid XML snippet is generated: 4096 # <--- here This results in the following traceback in the nova-compute logs. ERROR nova.compute.manager [instance: ...] Traceback (most recent call last): ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/compute/manager.py", line 2625, in _build_resources ERROR nova.compute.manager [instance: ...] yield resources ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/compute/manager.py", line 2398, in _build_and_run_instance ERROR nova.compute.manager [instance: ...] accel_info=accel_info) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3752, in spawn ERROR nova.compute.manager [instance: ...] 
cleanup_instance_disks=created_disks) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6749, in _create_guest_with_network ERROR nova.compute.manager [instance: ...] cleanup_instance_disks=cleanup_instance_disks) ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ ERROR nova.compute.manager [instance: ...] self.force_reraise() ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, self.value, self.tb) ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise ERROR nova.compute.manager [instance: ...] raise value ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6718, in _create_guest_with_network ERROR nova.compute.manager [instance: ...] post_xml_callback=post_xml_callback) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6643, in _create_guest ERROR nova.compute.manager [instance: ...] guest = libvirt_guest.Guest.create(xml, self._host) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 145, in create ERROR nova.compute.manager [instance: ...] encodeutils.safe_decode(xml)) ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ ERROR nova.compute.manager [instance: ...] self.force_reraise() ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, self.value, self.tb) ERROR nova.compute.manager [instance: ...] 
File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise ERROR nova.compute.manager [instance: ...] raise value ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 141, in create ERROR nova.compute.manager [instance: ...] guest = host.write_instance_config(xml) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/host.py", line 1144, in write_instance_config ERROR nova.compute.manager [instance: ...] domain = self.get_connection().defineXML(xml) ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 190, in doit ERROR nova.compute.manager [instance: ...] result = proxy_call(self._autowrap, f, *args, **kwargs)
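The missing check boils down to rejecting a host NUMA cell that cannot satisfy both CPU kinds at once. A minimal sketch of that predicate, with invented names, not nova's actual fitting code:

```python
# Illustrative check: a 'mixed' instance cell needs both free shared
# (VCPU) and free dedicated (PCPU) cores on the same host NUMA cell.
# A cell whose cores are all in cpu_dedicated_set has zero shared
# cores free and must be rejected, not silently given zero VCPUs.

def cell_can_fit_mixed(host_shared_free, host_dedicated_free,
                       want_shared, want_dedicated):
    """Return True only if the host cell can satisfy both CPU kinds."""
    if want_shared and host_shared_free < want_shared:
        return False
    if want_dedicated and host_dedicated_free < want_dedicated:
        return False
    return True

# Node 1 from the reproducer: 6 dedicated cores, no shared cores, and a
# flavor wanting 1 shared + 3 dedicated CPUs -> should be rejected.
print(cell_can_fit_mixed(0, 6, want_shared=1, want_dedicated=3))  # False
```

With such a check in the `NUMATopologyFilter` path, the reproducer's instance would fail to schedule instead of reaching libvirt with invalid XML.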
[Yahoo-eng-team] [Bug 1896496] [NEW] Combination of 'hw_video_ram' image metadata prop, 'hw_video:ram_max_mb' extra spec raises error
nager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/libvirt.py", line 3703, in defineXML ERROR nova.compute.manager [instance: ...] if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self) ERROR nova.compute.manager [instance: ...] libvirt.libvirtError: XML error: cannot parse video vram '8192.0' ERROR nova.compute.manager [instance: ...] This appears to be a Python 3 thing, introduced by division of ints now returning a float. Steps to reproduce: 1. Set the 'hw_video_ram' image metadata property on an image: $ openstack image set --property hw_video_ram=8 $IMAGE 2. Set the 'hw_video:ram_max_mb' flavor extra spec on a flavor: $ openstack flavor update --property hw_video:ram_max_mb=16384 $FLAVOR 3. Create a server using this flavor and image: $ openstack server create --image $IMAGE --flavor $FLAVOR ... test- server Expected result: Instance should be created with 8MB of VRAM. Actual result: Instance fails to create. ** Affects: nova Importance: Low Assignee: Stephen Finucane (stephenfinucane) Status: Confirmed ** Tags: libvirt ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Low ** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane) ** Tags added: libvirt -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1896496 Title: Combination of 'hw_video_ram' image metadata prop, 'hw_video:ram_max_mb' extra spec raises error Status in OpenStack Compute (nova): Confirmed Bug description: The 'hw_video_ram' image metadata property is used to configure the amount of memory allocated to VRAM. Using it requires specifying the 'hw_video:ram_max_mb' extra spec or you'll get the following error: nova.exception.RequestedVRamTooHigh: The requested amount of video memory 8 is higher than the maximum allowed by flavor 0. 
However, specifying these currently results in a libvirt failure. ERROR nova.compute.manager [None ...] [instance: 11a71ae4-e410-4856-aeab-eea6ca4784c5] Failed to build and run instance: libvirt.libvirtError: XML error: cannot parse video vram '8192.0' ERROR nova.compute.manager [instance: ...] Traceback (most recent call last): ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/compute/manager.py", line 2333, in _build_and_run_instance ERROR nova.compute.manager [instance: ...] accel_info=accel_info) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3632, in spawn ERROR nova.compute.manager [instance: ...] cleanup_instance_disks=created_disks) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6527, in _create_domain_and_network ERROR nova.compute.manager [instance: ...] cleanup_instance_disks=cleanup_instance_disks) ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ ERROR nova.compute.manager [instance: ...] self.force_reraise() ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, self.value, self.tb) ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise ERROR nova.compute.manager [instance: ...] raise value ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6496, in _create_domain_and_network ERROR nova.compute.manager [instance: ...] post_xml_callback=post_xml_callback) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6425, in _create_domain ERROR nova.compute.manager [instance: ...] 
guest = libvirt_guest.Guest.create(xml, self._host) ERROR nova.compute.manager [instance: ...] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 127, in create ERROR nova.compute.manager [instance: ...] encodeutils.safe_decode(xml)) ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ ERROR nova.compute.manager [instance: ...] self.force_reraise() ERROR nova.compute.manager [instance: ...] File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, self.value, self.tb) ERROR nova.compute.manager
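The "Python 3 thing" named in the report is true division: in Python 3, `/` always returns a float, even when the result is exact, so a value bound for an integer XML attribute stringifies with a trailing `.0`. The arithmetic below is illustrative, not nova's exact expression:

```python
# Python 2: 16384 * 1024 / 2048 was an int. Python 3: '/' is true
# division and returns a float, so str() yields '8192.0', which libvirt
# rejects as the integer vram attribute.
ram_max_kib = 16384 * 1024

vram = ram_max_kib / 2048
print(str(vram))   # 8192.0  -> serialized as vram='8192.0', invalid

vram = ram_max_kib // 2048   # floor division preserves int
print(str(vram))   # 8192    -> valid integer attribute
```

The usual fix is `//` (or an explicit `int()` cast) at the point where the value enters the XML builder.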
[Yahoo-eng-team] [Bug 1599400] Re: nova boot has unexpected API error
The move to validate these parameters at the API layer, introduced in Stein, combined with the flavor extra spec validation work in Ussuri (API microversion 2.86 or later), should have seen off this issue. ** Changed in: nova Status: In Progress => Won't Fix ** Changed in: nova Assignee: Ken'ichi Ohmichi (oomichi) => (unassigned) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1599400 Title: nova boot has unexpected API error Status in OpenStack Compute (nova): Won't Fix Bug description: Description: = Nova allows users to set free-form flavor extra specs "hw:cpu_policy" and "hw:cpu_thread_policy". However, these values are not truly free-form, but rather enum values. Specifying an invalid value for one of them and booting an instance with the invalid flavor results in an uncaught ValueError in Nova and an HTTP 500 code being returned to the user. Reproduce: = # 1. create flavor 11 with an illegal extra_spec "hw:cpu_thread_policy=shared" $ nova flavor-create test 11 128 1 3 $ nova flavor-key 11 set hw:cpu_policy=dedicated $ nova flavor-key 11 set hw:cpu_thread_policy=shared # 2. boot an instance from that malformed flavor 11 $ nova boot --image --flavor 11 test Output: = ERROR (ClientException): Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. (HTTP 500) (Request-ID: req-a26ad5f3-7982-4361-8817-0ab111ac9ab1) To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1599400/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
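The API-layer validation that closed this class of bug is enum checking: reject unknown values with a 4xx error instead of letting a ValueError surface as a 500. A sketch, with valid values per nova's documented options ('mixed' was added later, in Victoria); the function shape is illustrative, not nova's actual validator. Note that 'shared' is a valid cpu_policy but not a valid cpu_thread_policy, which is exactly the reproducer's mistake:

```python
# Enum validation for the two extra specs from the bug report.
VALID_CPU_POLICIES = {'shared', 'dedicated', 'mixed'}
VALID_CPU_THREAD_POLICIES = {'prefer', 'isolate', 'require'}

def validate_extra_specs(extra_specs):
    """Raise ValueError (mapped to HTTP 400) instead of a late 500."""
    policy = extra_specs.get('hw:cpu_policy')
    if policy is not None and policy not in VALID_CPU_POLICIES:
        raise ValueError('invalid hw:cpu_policy: %r' % policy)
    thread_policy = extra_specs.get('hw:cpu_thread_policy')
    if (thread_policy is not None
            and thread_policy not in VALID_CPU_THREAD_POLICIES):
        raise ValueError('invalid hw:cpu_thread_policy: %r' % thread_policy)
```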
[Yahoo-eng-team] [Bug 1616539] Re: architecture not validated in "openstack image create"
This bug should be filed against glance, rather than nova. From what I can tell, glance provides a config option to allow users to opt-in to only allowing valid image metadata properties. ** Changed in: nova Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1616539 Title: architecture not validated in "openstack image create" Status in OpenStack Compute (nova): Invalid Bug description: On Liberty $ openstack image create \ --public \ --container-format bare \ --disk-format qcow2 \ --min-disk 2 --min-ram 512 \ --file /home/images/SLES12SP1-cloudimage.qcow2 \ SLES12SP1-x86_64 +--+--+ | Field| Value| +--+--+ | checksum | fcdeb8b10730ac96bccc9a121ee030f4 | | container_format | bare | | created_at | 2016-08-02T20:51:45Z | | disk_format | qcow2| | file | /v2/images/e7f289aa-e689-4f0a-a0a0-43f341986fd5/file | | id | e7f289aa-e689-4f0a-a0a0-43f341986fd5 | | min_disk | 2| | min_ram | 512 | | name | SLES12SP1-x86_64 | | owner| f7ed231f244b4b2db8b1e580f36e1580 | | protected| False| | schema | /v2/schemas/image| | size | 362847744| | status | active | | updated_at | 2016-08-02T20:51:51Z | | virtual_size | None | | visibility | public | +--+--+ # openstack image set \ --name SLES12-SP1 \ --architecture x96_64 \ --os-distro sles \ # <-- the problem --os-version 12.1 \ SLES12SP1-x86_64 +--+--+ | Field| Value| +--+--+ | architecture | x96_64 | | checksum | fcdeb8b10730ac96bccc9a121ee030f4 | | container_format | bare | | created_at | 2016-08-02T20:51:45Z | | disk_format | qcow2| | file | /v2/images/e7f289aa-e689-4f0a-a0a0-43f341986fd5/file | | id | e7f289aa-e689-4f0a-a0a0-43f341986fd5 | | min_disk | 2| | min_ram | 512 | | name | SLES12-SP1 | | os_distro| sles | | os_version | 12.1 | | owner| f7ed231f244b4b2db8b1e580f36e1580 | | protected| False| | schema | /v2/schemas/image| | size | 362847744| | status | active | | tags | [] | | 
updated_at | 2016-08-02T20:53:00Z | | virtual_size | None | | visibility | public | +--+--+ $ openstack server create \ --flavor m1.smaller \ --image SLES12-SP1 \ vm01 Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. (HTTP 500) (Request-ID: req-5fc1ad67-86ad-4a69-b7f4-905861a8f2fc) # openstack image set \ --name SLES12-SP1 \ --architecture x86_64 \ --os-distro sles \ --os-version 12.1 \ SLES12-SP1
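The failure above stems from the `x96_64` typo in `--architecture` passing through unvalidated. The opt-in validation glance offers amounts to checking properties against a known-values list; a minimal sketch of that idea, where `KNOWN_ARCHITECTURES` is a small illustrative subset rather than glance's canonical set:

```python
# Client- or server-side validation against known architecture names
# would have caught 'x96_64' at image-set time instead of at boot time.
KNOWN_ARCHITECTURES = {'i686', 'x86_64', 'aarch64', 'armv7l',
                       'ppc64', 'ppc64le', 's390x'}

def validate_architecture(value):
    if value not in KNOWN_ARCHITECTURES:
        raise ValueError('unknown architecture: %r' % value)

validate_architecture('x86_64')      # passes silently
# validate_architecture('x96_64')    # would raise ValueError
```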
[Yahoo-eng-team] [Bug 1466451] Re: Nova should verify that devname in pci_passthrough_whitelist is not empty
** Changed in: nova Status: In Progress => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1466451 Title: Nova should verify that devname in pci_passthrough_whitelist is not empty Status in OpenStack Compute (nova): Won't Fix Bug description: According to https://wiki.openstack.org/wiki/SR-IOV-Passthrough-For-Networking: "The devname can be a valid PCI device name. The only device names that are supported are those displayed by the Linux utility ifconfig -a and correspond to either a PF or a VF on a vNIC" However it's possible to supply an empty string as devname e.g. pci_passthrough_whitelist = {"devname": "", "physical_network":"physnet2"} It's also possible to have an entry: pci_passthrough_whitelist = {"physical_network":"physnet2"} which shouldn't be valid. Nova should verify that devname is not an empty string and that devname,address or product_id/vendor_id are supplied. Version == python-nova-2015.1.0-4.el7ost.noarch Expected result = Nova compute should fail to start when specifying an empty string for devname when using physical_network or when not specifying devname,address or product_id/vendor_id To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1466451/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
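The requested startup check amounts to: every whitelist entry must carry a non-empty device selector, either `devname`, `address`, or a `vendor_id`/`product_id` pair. Field names follow the bug report; the function itself is an illustrative sketch, not nova's actual code:

```python
# Reject whitelist entries with no usable device selector, including
# the empty-string devname case from the report.

def validate_whitelist_entry(entry):
    devname = (entry.get('devname') or '').strip()
    address = (entry.get('address') or '').strip()
    has_ids = bool(entry.get('vendor_id')) and bool(entry.get('product_id'))
    if not (devname or address or has_ids):
        raise ValueError('whitelist entry %r has no usable device selector'
                         % entry)

# Valid: a real devname alongside the physical network.
validate_whitelist_entry({'devname': 'eth3', 'physical_network': 'physnet2'})
```

Running such a check when nova-compute parses its config would make both invalid examples from the report fail fast at service start.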
[Yahoo-eng-team] [Bug 1299151] Re: nova-consoleauth processes requests when disabled
As noted in the review, nova-consoleauth has been removed so this bug no longer makes sense. ** Changed in: nova Status: In Progress => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1299151 Title: nova-consoleauth processes requests when disabled Status in OpenStack Compute (nova): Won't Fix Bug description: Not sure if this is a bug or not. But nova-consoleauth will process requests even if it is listed as disabled in the service list. | nova-consoleauth | u9-p| internal | disabled | up | | nova-consoleauth | u10-p | internal | enabled | up| | nova-consoleauth | u11-p | internal | enabled | up| In this case I can watch as u9-p continues to process requests from the message bus. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1299151/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1810490] Re: wrong link of gabbits
** Changed in: nova Status: In Progress => Fix Released ** Changed in: nova Importance: Undecided => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1810490 Title: wrong link of gabbits Status in OpenStack Compute (nova): Fix Released Bug description: This bug tracker is for errors with the documentation, use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes: - [x] This doc is inaccurate in this way: The "gabbits" url is incorrect at https://docs.openstack.org/placement/latest/#rest-api - [ ] This is a doc addition request. - [x] I have a fix to the document that I can paste below including example: input and output. The correct is http://git.openstack.org/cgit/openstack/placement/tree/placement/tests/functional/gabbits If you have a troubleshooting or support issue, use the following resources: - Ask OpenStack: http://ask.openstack.org - The mailing list: http://lists.openstack.org - IRC: 'openstack' channel on Freenode --- Release: 0.0.1.dev10886 on 2018-11-19 19:26:28 SHA: 9d42491910e66ecd15767238bb617ed5984283f2 Source: https://git.openstack.org/cgit/openstack/placement/tree/doc/source/index.rst URL: https://docs.openstack.org/placement/latest/ To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1810490/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1840139] Re: Libvirt: Correct usage _guest_add_memory_balloon
I'm not entirely sure what the issue is here, but this doesn't sound like a bug per se. At least, it's not something an end user will encounter. ** Changed in: nova Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1840139 Title: Libvirt: Correct usage _guest_add_memory_balloon Status in OpenStack Compute (nova): Invalid Bug description: From the code, the function _guest_add_memory_balloon in [1] disables memory usage statistics if mem_stats_period_seconds is set to 0 or a negative value. Can mem_stats_period_seconds control whether the virtual memory balloon device is added? Doesn't it only control memory usage statistics? The virtual memory balloon device is added by libvirt as a default behavior. [2] So the name "_guest_add_memory_balloon" may be misleading. [1] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py [2] https://libvirt.org/formatdomain.html#elementsMemBalloon To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1840139/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1806079] Re: revert use of stestr in stable/pike
** No longer affects: nova -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1806079 Title: revert use of stestr in stable/pike Status in Ubuntu Cloud Archive: Fix Released Bug description: The following commit changed dependencies of nova in the stable/pike branch and switched it to use stestr. There aren't any other projects (as far as I can tell) that use stestr in pike. This causes issues, for example, the Ubuntu cloud archive for pike doesn't have stestr. If possible I think this should be reverted. commit 5939ae995fdeb2746346ebd81ce223e4fe891c85 Date: Thu Jul 5 16:09:17 2018 -0400 Backport tox.ini to switch to stestr The pike branch was still using ostestr (instead of stestr) which makes running tests significantly different from queens or master. To make things behave the same way this commit backports most of the tox.ini from queens so that pike will behave the same way for running tests. This does not use the standard backport mechanism because it involves a lot of different commits over time. It's also not a functional change for nova itself, so the proper procedure is less important here. Change-Id: Ie207afaf8defabc1d1eb9332f43a9753a00f784d To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1806079/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1855934] Re: new versions of flake8 parse typing comments
This was fixed in 26c1567a16d0bbf9ae19327aeafaa7ebc4394946. ** Changed in: nova Status: In Progress => Invalid ** Changed in: nova Status: Invalid => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1855934 Title: new versions of flake8 parse typing comments Status in OpenStack Compute (nova): Fix Released Bug description: While playing with pre-commit I noticed that new versions of flake8 parse type annotation comments. If you have not imported the relevant typing module, it fails with F821 undefined name: nova/virt/hardware.py:1396:5: F821 undefined name 'Optional' nova/virt/hardware.py:1396:5: F821 undefined name 'List' nova/virt/hardware.py:1396:5: F821 undefined name 'Set' nova/virt/hardware.py:1426:5: F821 undefined name 'Optional' nova/virt/hardware.py:1426:5: F821 undefined name 'List' nova/virt/hardware.py:1426:5: F821 undefined name 'Set' nova/virt/hardware.py:1456:5: F821 undefined name 'Optional' nova/virt/hardware.py:1483:5: F821 undefined name 'Optional' nova/virt/hardware.py:1525:5: F821 undefined name 'Optional' nova/virt/hardware.py:1624:5: F821 undefined name 'Tuple' nova/virt/hardware.py:1646:5: F821 undefined name 'Optional' nova/virt/hardware.py:1658:5: F821 undefined name 'Optional' nova/virt/hardware.py:1674:5: F821 undefined name 'List' nova/virt/hardware.py:1696:5: F821 undefined name 'Optional' nova/virt/hardware.py:1920:29: F821 undefined name 'List' nova/virt/hardware.py:1939:31: F821 undefined name 'Set' While this is not an issue today, because we pin to an old version of flake8, we should still fix it as a code hygiene issue. Given this has no impact on the running code, I'm going to triage this as Low and push a trivial patch. 
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1855934/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
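The fix the reporter describes is simply to import the names that `# type:` comments reference, since newer flake8 (via pyflakes) resolves names inside type comments at module scope. The function below is an illustrative stand-in for the `nova/virt/hardware.py` helpers listed in the F821 output, not nova's actual code:

```python
# Without this import, newer flake8 reports:
#   F821 undefined name 'Optional' / 'Set'
# because the names appear in the '# type:' comment below.
from typing import Optional, Set

def parse_cpu_range(mask=None):
    # type: (Optional[str]) -> Set[int]
    """Parse a '0-3' style CPU range into a set of core IDs."""
    if mask is None:
        return set()
    start, _, end = mask.partition('-')
    return set(range(int(start), int(end or start) + 1))

print(sorted(parse_cpu_range('0-3')))  # [0, 1, 2, 3]
```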
[Yahoo-eng-team] [Bug 1847095] Re: The Nova Quobyte driver should use the LibvirtMountedFileSystemVolumeDriver parent class
** Changed in: nova Status: In Progress => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1847095 Title: The Nova Quobyte driver should use the LibvirtMountedFileSystemVolumeDriver parent class Status in OpenStack Compute (nova): Won't Fix Bug description: See note at [1] stating that all LibvirtBaseFileSystemVolumeDriver children should subclass LibvirtMountedFileSystemVolumeDriver instead. [1] https://github.com/openstack/nova/blob/c6218428e9b29a2c52808ec7d27b4b21aadc0299/nova/virt/libvirt/volume/fs.py#L101 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1847095/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1741810] Re: Filter AggregateImagePropertiesIsolation doesn't Work
We discussed this on IRC today [1]. In short, we realize that this was a change in behaviour introduced in Liberty that should have been better discussed at the time. However, Liberty was many years ago and it's genuinely debatable whether this was ever intended behaviour, let alone something we'd want to reintroduce support for. Having discussed this, we're going to document this change in behaviour in the docs and leave it there. If this (support for arbitrary image metadata properties in this filter) is something you still see value in, we'd probably have to treat it as a new feature. I'd encourage you to file a spec [2] so we can evaluate the idea. If not, hopefully the documentation change helps clarify things. [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2020-09-16.log.html#t2020-09-16T13:18:37 [2] https://specs.openstack.org/openstack/nova-specs/readme.html ** Changed in: nova Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1741810 Title: Filter AggregateImagePropertiesIsolation doesn't Work Status in OpenStack Compute (nova): Won't Fix Bug description: Description === I tried to use filter AggregateImagePropertiesIsolation to isolate Windows instance for reducing number of Windows Licenses. I think nova scheduler in pike release, filter AggregateImagePropertiesIsolation always returned all hosts. If this is a bug, filter AggregateImagePropertiesIsolation needs to be fixed. 
Steps to reproduce == # add filter to nova.conf and restart nova scheduler [filter_scheduler] enabled_filters = AggregateImagePropertiesIsolation,RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter # image create with os property openstack image create --min-disk 3 --min-ram 512 --disk-format qcow2 --public --file windows.img img_windows openstack image create --min-disk 1 --min-ram 64 --disk-format qcow2 --public --file cirros-0.3.5-x86_64-disk.img img_linux openstack image set --property os=windows img_windows openstack image set --property os=linux img_linux # host aggregate create with os property openstack aggregate create os_win openstack aggregate add host os_win compute01 openstack aggregate add host os_win compute02 openstack aggregate set --property os=windows os_win openstack aggregate create os_linux openstack aggregate add host os_linux compute03 openstack aggregate add host os_linux compute04 openstack aggregate add host os_linux compute05 openstack aggregate set --property os=linux os_linux # create flavor openstack flavor create --ram 1024 --disk 1 --vcpus 1 --public small openstack flavor create --ram 4096 --disk 20 --vcpus 2 --public medium # create windows instances openstack server create --image img_windows --network test-net --flavor medium --max 10 test-win Expected result === Windows instances can be found in compute01, compute02 only Actual result = Windows instance was found in every hosts. Environment === 1. Nova's version (nova-scheduler)[nova@control01 /]$ rpm -qa | grep nova python-nova-17.0.0-0.20171206190932.cbdc893.el7.centos.noarch openstack-nova-scheduler-17.0.0-0.20171206190932.cbdc893.el7.centos.noarch openstack-nova-common-17.0.0-0.20171206190932.cbdc893.el7.centos.noarch python2-novaclient-9.1.0-0.20170804194758.0a53d19.el7.centos.noarch 2. 
hypervisor (nova-libvirt)[root@compute01 /]# rpm -qa | grep kvm qemu-kvm-common-ev-2.9.0-16.el7_4.11.1.x86_64 libvirt-daemon-kvm-3.2.0-14.el7_4.5.x86_64 qemu-kvm-ev-2.9.0-16.el7_4.11.1.x86_64 2. Storage ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable) 3. Networking Neutron with OpenVSwitch Logs & Configs == $ tail -f nova-scheduler.log | grep AggregateImagePropertiesIsolation 2018-01-08 11:52:53.964 6 DEBUG nova.filters [req-3828686f-1d46-407a-bebb-14f7a573c52e 9b1f4f0bcea2428c93b8b4276ba67cb7 188be4011b2b49529cbdd6eade152233 - default default] Filter AggregateImagePropertiesIsolation returned 5 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 # add filter to nova.conf and restart nova scheduler [filter_scheduler] enabled_filters = AggregateImagePropertiesIsolation,RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1741810/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe :
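The behaviour change discussed above can be illustrated in miniature: since the Liberty-era rework, the filter compares only image properties nova recognizes (fields on its ImageMetaProps object, such as `os_distro`), so an arbitrary aggregate key like `os` never filters anything. The sketch below is illustrative, not nova's actual filter code, and `RECOGNIZED_PROPS` is a small invented subset:

```python
# Minimal model of the matching rule: unrecognized aggregate keys are
# skipped, which is why the reproducer's 'os' property had no effect
# and every host passed.
RECOGNIZED_PROPS = {'os_distro', 'os_type', 'hw_architecture'}

def host_passes(aggregate_meta, image_props):
    for key, agg_value in aggregate_meta.items():
        if key not in RECOGNIZED_PROPS:
            continue  # arbitrary keys such as 'os' are ignored
        if key in image_props and image_props[key] != agg_value:
            return False
    return True

print(host_passes({'os': 'windows'}, {'os': 'linux'}))  # True: key ignored
```

Using a recognized property (`os_distro` instead of a free-form `os` key) on both the images and the aggregates would have given the reporter the isolation they were after.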
[Yahoo-eng-team] [Bug 1728600] Re: Test test_network_basic_ops fails time to time, port doesn't become ACTIVE quickly
** Changed in: nova Status: New => Incomplete

** No longer affects: nova

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1728600

Title: Test test_network_basic_ops fails from time to time, port doesn't become ACTIVE quickly

Status in tempest: In Progress

Bug description: Test test_network_basic_ops fails from time to time, port doesn't become ACTIVE quickly

Trace:
Traceback (most recent call last):
  File "tempest/scenario/test_security_groups_basic_ops.py", line 185, in setUp
    self._deploy_tenant(self.primary_tenant)
  File "tempest/scenario/test_security_groups_basic_ops.py", line 349, in _deploy_tenant
    self._set_access_point(tenant)
  File "tempest/scenario/test_security_groups_basic_ops.py", line 316, in _set_access_point
    self._assign_floating_ips(tenant, server)
  File "tempest/scenario/test_security_groups_basic_ops.py", line 322, in _assign_floating_ips
    client=tenant.manager.floating_ips_client)
  File "tempest/scenario/manager.py", line 836, in create_floating_ip
    port_id, ip4 = self._get_server_port_id_and_ip4(thing)
  File "tempest/scenario/manager.py", line 814, in _get_server_port_id_and_ip4
    "No IPv4 addresses found in: %s" % ports)
  File "/usr/local/lib/python2.7/dist-packages/unittest2/case.py", line 845, in assertNotEqual
    raise self.failureException(msg)
AssertionError: 0 == 0 : No IPv4 addresses found in: [{u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'updated_at': u'2017-10-30T10:04:41Z', u'device_owner': u'compute:None', u'revision_number': 9, u'port_security_enabled': True, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': u'd522b2e5-7e56-4d08-843c-c434c3c2af97', u'ip_address': u'10.100.0.12'}], u'id': u'20d59775-906d-4390-b193-a8ec81817ddb', u'security_groups': [u'908eb03d-2477-49ab-ab9a-fcfae454', u'cf62ee1b-eb73-44d0-9ad8-65bb32885505'], u'binding:vif_details': {u'port_filter': True, u'ovs_hybrid_plug': True}, u'binding:vif_type':
u'ovs', u'mac_address': u'fa:16:3e:02:f3:e8', u'project_id': u'0a8532fba2194d32996c3ba46ae35c96', u'status': u'BUILD', u'binding:host_id': u'cfg01', u'description': u'', u'tags': [], u'device_id': u'5ad8f2be-3cbb-49aa-8d72-e81ca6789665', u'name': u'', u'admin_state_up': True, u'network_id': u'49491fd4-2c1e-4c46-8166-b4648eb75f84', u'tenant_id': u'0a8532fba2194d32996c3ba46ae35c96', u'created_at': u'2017-10-30T10:04:37Z', u'binding:vnic_type': u'normal'}] Ran 1 test in 25.096s To manage notifications about this bug go to: https://bugs.launchpad.net/tempest/+bug/1728600/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1892562] Re: Choose a security group when creating an instance with a port that has disabled port security
This is expected behavior. From the API reference:

  One or more security groups. Specify the name of the security group in the name attribute. If you omit this attribute, the API creates the server in the default security group. Requested security groups are not applied to pre-existing ports.

This is a pre-existing port, so the security groups will not apply.

[1] https://docs.openstack.org/api-ref/compute/?expanded=create-server-detail#id11

** Changed in: nova Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1892562

Title: Choose a security group when creating an instance with a port that has disabled port security

Status in OpenStack Compute (nova): Invalid

Bug description:

Description
===========
When creating an instance using a port that has port security disabled, choosing a security group should be expected to throw an exception, but the creation actually succeeds. Although the instance, as expected, ends up with no security group, I think the instance-creation process should throw an exception and give some hint, instead of succeeding but not applying the security group.

Steps to reproduce
==================
* Create a port that has port security disabled
* Use the port in the previous step to create an instance, and select a security group when creating the instance.

Expected result
===============
Instance creation fails and an exception is thrown.

Actual result
=============
Instance is created successfully.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1892562/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
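The documented semantics quoted above can be summarised in a short sketch. The helper and dict layout here are hypothetical, not nova code: requested security groups only apply to ports nova itself creates, never to pre-existing ports.

```python
def effective_security_groups(requested_groups, port):
    """Sketch of the documented behaviour: a pre-existing port keeps
    whatever security groups it already has; only for nova-created
    ports do the requested groups (or 'default') apply."""
    if port.get('pre_existing'):
        return port.get('security_groups', [])
    return requested_groups if requested_groups else ['default']

# A pre-existing port with port security disabled carries no security
# groups, and the group requested at server-create time is ignored:
print(effective_security_groups(['my-sg'],
                                {'pre_existing': True,
                                 'security_groups': []}))  # []
```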
[Yahoo-eng-team] [Bug 1894771] Re: Hypervisor shows negative numbers after launching instances on baremetal nodes
This is expected behavior. The ironic driver does not report free disk, RAM or CPU via the 'get_available_resource' driver API [1], which means the resource tracker is essentially subtracting usage from 0. That's considered okay though [2]. In general, the 'os-hypervisors' API, which the 'nova hypervisor-show' command uses, is considered very broken and will likely be removed in a future release. You should rely on placement for an authoritative view of resource consumption.

[1] https://github.com/openstack/nova/blob/e0f088c95d05e9cf32d4af4c7cfc20566b17f8e1/nova/virt/ironic/driver.py#L355-L357
[2] https://github.com/openstack/nova/blob/e0f088c95d05e9cf32d4af4c7cfc20566b17f8e1/nova/compute/resource_tracker.py#L1255

** Changed in: nova Status: New => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1894771

Title: Hypervisor shows negative numbers after launching instances on baremetal nodes

Status in OpenStack Compute (nova): Won't Fix

Bug description: Testing with the Train version with the ironic driver. Before launching instances on baremetal nodes, the # nova hypervisor-show command shows 0 for the vcpus, memory and disk fields, which are set to zero in the ironic driver code. This is still acceptable, as baremetal resources are counted via the resource class; however, after launching an instance on the baremetal node, the vcpu/mem/disk fields appear negative in the hypervisor-show details, and the negative numbers correlate with the flavor's vcpu/mem/disk fields.
[root@train ~(keystone_admin)]# nova hypervisor-show e12c91fb-4c73-406f-8b9e-b0ef3c9c829a
+-------------------------+--------------------------------------+
| Property                | Value                                |
+-------------------------+--------------------------------------+
| cpu_info                | {}                                   |
| current_workload        | 0                                    |
| disk_available_least    | 0                                    |
| free_disk_gb            | -100                                 |
| free_ram_mb             | -16384                               |
| host_ip                 | 192.168.10.111                       |
| hypervisor_hostname     | e12c91fb-4c73-406f-8b9e-b0ef3c9c829a |
| hypervisor_type         | ironic                               |
| hypervisor_version      | 1                                    |
| id                      | e12c91fb-4c73-406f-8b9e-b0ef3c9c829a |
| local_gb                | 0                                    |
| local_gb_used           | 100                                  |
| memory_mb               | 0                                    |
| memory_mb_used          | 16384                                |
| running_vms             | 1                                    |
| service_disabled_reason | None                                 |
| service_host            | train.ironic                         |
| service_id              | 23464515-e938-47b1-807e-fb0e3d8250e3 |
| state                   | up                                   |
| status                  | enabled                              |
| vcpus                   | 0                                    |
| vcpus_used              | 8                                    |
+-------------------------+--------------------------------------+

The hypervisor detail does not affect the functions of baremetal instances, but it is quite confusing. Besides, nova quotas and usages are also affected by the baremetal flavor's vcpu/mem/disk fields, which may not be able to describe the resources that the instance occupies.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1894771/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
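The arithmetic behind those negative fields is simple to illustrate. A sketch (not nova code) of what the resource tracker effectively computes when the driver reports zero totals, using this report's flavor values:

```python
# The ironic driver reports zero totals for vcpus/memory/disk.
totals = {'vcpus': 0, 'memory_mb': 0, 'local_gb': 0}
# Usage comes from the baremetal flavor (8 vCPUs, 16 GB RAM, 100 GB disk).
usage = {'vcpus': 8, 'memory_mb': 16384, 'local_gb': 100}

# free = total - used, so every field goes negative.
free = {key: totals[key] - usage[key] for key in totals}
print(free)  # {'vcpus': -8, 'memory_mb': -16384, 'local_gb': -100}
```

This reproduces the free_ram_mb of -16384 and free_disk_gb of -100 shown in the output above.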
[Yahoo-eng-team] [Bug 1892033] Re: Failed to start nova-compute with libvirt-xen
The libvirt+xen driver has been untested for many cycles and has been deprecated in Victoria, with an eye on removal in Wallaby or later. I don't think it warrants being fixed.

** Changed in: nova Status: New => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1892033

Title: Failed to start nova-compute with libvirt-xen

Status in OpenStack Compute (nova): Won't Fix

Bug description:

Description
===========
I deployed a ussuri env from ubuntu-cloud:ussuri. After configuring one compute node with xen and libvirt, the nova-compute service cannot be started. Got error 'libvirt.libvirtError: this function is not supported by the connection driver: virNodeGetCPUMap'.

Steps to reproduce
==================
1. Install nova-compute
2. Configure nova.conf as below:
   [libvirt]
   virt_type = xen
3. Start the nova-compute service

Expected result
===============
Nova-compute starts successfully

Actual result
=============
Got error

Environment
===========
root@xen-cmp01:~# dpkg -l | grep nova-compute
ii nova-compute          2:21.0.0-0ubuntu0.20.04.1~cloud0  all    OpenStack Compute - compute node base
ii nova-compute-kvm      2:21.0.0-0ubuntu0.20.04.1~cloud0  all    OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt  2:21.0.0-0ubuntu0.20.04.1~cloud0  all    OpenStack Compute - compute node libvirt support

root@xen-cmp01:~# dpkg -l | grep libvirt
ii libvirt-clients                    6.0.0-0ubuntu8.2~cloud0           amd64  Programs for the libvirt library
ii libvirt-daemon                     6.0.0-0ubuntu8.2~cloud0           amd64  Virtualization daemon
ii libvirt-daemon-driver-qemu         6.0.0-0ubuntu8.2~cloud0           amd64  Virtualization daemon QEMU connection driver
ii libvirt-daemon-driver-storage-rbd  6.0.0-0ubuntu8.2~cloud0           amd64  Virtualization daemon RBD storage driver
ii libvirt-daemon-driver-xen          6.0.0-0ubuntu8.2~cloud0           amd64  Virtualization daemon Xen connection driver
ii libvirt-daemon-system              6.0.0-0ubuntu8.2~cloud0           amd64  Libvirt daemon configuration files
ii libvirt-daemon-system-systemd      6.0.0-0ubuntu8.2~cloud0           amd64  Libvirt daemon configuration files (systemd)
ii libvirt0:amd64                     6.0.0-0ubuntu8.2~cloud0           amd64  library for interfacing with different virtualization systems
ii nova-compute-libvirt               2:21.0.0-0ubuntu0.20.04.1~cloud0  all    OpenStack Compute - compute node libvirt support
ii python3-libvirt                    6.1.0-1~cloud0                    amd64  libvirt Python 3 bindings

root@xen-cmp01:~# dpkg -l | grep xen
ii grub-xen-bin               2.02-2ubuntu8.17         amd64  GRand Unified Bootloader, version 2 (Xen binaries)
ii grub-xen-host              2.02-2ubuntu8.17         amd64  GRand Unified Bootloader, version 2 (Xen host version)
ii libvirt-daemon-driver-xen  6.0.0-0ubuntu8.2~cloud0  amd64  Virtualization daemon Xen connection driver
ii libxen-4.9:amd64           4.9.2-0ubuntu1           amd64  Public libs for Xen
ii libxenstore3.0:amd64       4.9.2-0ubuntu1           amd64  Xenstore communications library for Xen
ii python3-os-xenapi          0.3.4-0ubuntu3~cloud0    all    XenAPI library for OpenStack projects - Python 3.x
ii xen-hypervisor-4.9-amd64   4.9.2-0ubuntu1           amd64  Xen Hypervisor on AMD64
ii xen-utils-4.9              4.9.2-0ubuntu1           amd64  XEN administrative tools
ii xen-utils-common           4.9.2-0ubuntu1           all    Xen administrative tools - common files
ii xenstore-utils             4.9.2-0ubuntu1           amd64  Xenstore command line utilities for Xen

Logs & Configs
==============
2020-08-18 12:23:30.739 12029 ERROR nova.compute.manager
[Yahoo-eng-team] [Bug 1685152] Re: [RFE] SR-IOV - HotPlug support
*** This bug is a duplicate of bug 1499269 ***
https://bugs.launchpad.net/bugs/1499269

** This bug has been marked a duplicate of bug 1499269: cannot attach direct type port (sr-iov) to existing instance

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1685152

Title: [RFE] SR-IOV - HotPlug support

Status in OpenStack Compute (nova): Expired

Bug description: The Nova interface-attach API needs to be enhanced to support SR-IOV. There is a Newton blueprint for this: https://review.openstack.org/#/c/139910/ It has been abandoned and needs to be picked up for ocata/pike.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1685152/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1894095] [NEW] Running periodic task during live migration results in incorrect usage
Public bug reported: With the introduction of NUMA-aware live migration in Train, we now do proper claiming and, if necessary, unclaiming of resources at the destination host. However, the latter uses the same mechanism as resize/cold migrate confirm/revert, which means it's subject to the same races as those highlighted in bug 1879878. This bug tracks the live migration side of the work to fix that.

** Affects: nova Importance: Medium Assignee: Stephen Finucane (stephenfinucane) Status: Confirmed
** Affects: nova/train Importance: Undecided Status: New
** Affects: nova/ussuri Importance: Undecided Status: New
** Tags: libvirt live-migration numa
** Tags added: numa
** Tags added: libvirt live-migration
** Changed in: nova Status: New => Confirmed
** Changed in: nova Importance: Undecided => Medium
** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane)
** Also affects: nova/train Importance: Undecided Status: New
** Also affects: nova/ussuri Importance: Undecided Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1894095

Title: Running periodic task during live migration results in incorrect usage

Status in OpenStack Compute (nova): Confirmed
Status in OpenStack Compute (nova) train series: New
Status in OpenStack Compute (nova) ussuri series: New

Bug description: With the introduction of NUMA-aware live migration in Train, we now do proper claiming and, if necessary, unclaiming of resources at the destination host. However, the latter uses the same mechanism as resize/cold migrate confirm/revert, which means it's subject to the same races as those highlighted in bug 1879878. This bug tracks the live migration side of the work to fix that.
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1894095/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1893864] Re: resolve ResourceProviderSyncFailed issue in python3
** Also affects: nova/trunk Importance: Undecided Status: New
** Also affects: nova/ussuri Importance: Undecided Status: New
** Also affects: nova/victoria Importance: Undecided Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1893864

Title: resolve ResourceProviderSyncFailed issue in python3

Status in OpenStack Compute (nova): New
Status in OpenStack Compute (nova) trunk series: New
Status in OpenStack Compute (nova) ussuri series: New
Status in OpenStack Compute (nova) victoria series: New

Bug description:

Description
===========
With recent Train code, booted VMs run into ERROR state. Synchronizing the placement service with resource provider information supplied by the compute host fails. This is caused by the "/" operator behaving differently between Python 2.x and Python 3.x: in Python 2.x, int / int returns an int (floored), while in Python 3.x, int / int returns the real (float) result.

Environment
===========
vmware setup

Logs & Configs
==============
ERROR nova.scheduler.client.report [None req-030a4d73-5bf1-4080-8b39-d637270055e0 admin admin] [req-b1da7c1f-6f45-412c-9740-0f27495b1f23] Failed to update inventory to [{'VCPU': {'total': 12, 'reserved': 0, 'min_unit': 1, 'max_unit': 4, 'step_size': 1, 'allocation_ratio': 100.0}, 'MEMORY_MB': {'total': 49149, 'reserved': 512, 'min_unit': 1, 'max_unit': 16383, 'step_size': 1, 'allocation_ratio': 1.5}, 'DISK_GB': {'total': 3025, 'reserved': 0, 'min_unit': 1, 'max_unit': 0, 'step_size': 1, 'allocation_ratio': 1.0}}] for resource provider with UUID 33c124a0-1ebc-4a36-a1fd-b6cd9d104c49.
Got 400: {"errors": [{"status": 400, "title": "Bad Request", "detail": "The server could not comply with the request since it is either malformed or otherwise incorrect.\n\n JSON does not validate: 0 is less than the minimum of 1 Failed validating 'minimum' in schema['properties']['inventories']['patternProperties']['^[A-Z0-9_]+$']['properties']['max_unit']: {'maximum': 2147483647, 'minimum': 1, 'type': 'integer'} On instance['inventories']['DISK_GB']['max_unit']: 0 ", "code": "placement.undefined_code", "request_id": "req-b1da7c1f-6f45-412c-9740-0f27495b1f23"}]}
DEBUG oslo_concurrency.lockutils [None req-030a4d73-5bf1-4080-8b39-d637270055e0 admin admin] Lock "compute_resources" released by "nova.compute.resource_tracker.ResourceTracker.instance_claim" :: held 1.016s {{(pid=6347) inner /usr/local/lib/python3.6/dist-packages/oslo_concurrency/lockutils.py:339}}
ERROR nova.compute.manager [None req-030a4d73-5bf1-4080-8b39-d637270055e0 admin admin] [instance: e5083b01-f490-4031-b25c-edd6d86dd62f] Failed to build and run instance: nova.exception.ResourceProviderSyncFailed: Failed to synchronize the placement service with resource provider information supplied by the compute host.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1893864/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
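The division behaviour at the heart of this is easy to demonstrate, and the portable fix is the floor-division operator. The divisor below is illustrative only; this is a sketch of the semantic difference, not the actual nova patch:

```python
total_disk_gb = 3025

# Python 2: 3025 / 2 == 1512 (int, floored)
# Python 3: 3025 / 2 == 1512.5 (float)
# Code that relied on '/' truncating can therefore produce values that
# fail placement's integer schema checks on Python 3 (in this report,
# max_unit ended up as 0 for DISK_GB).
legacy_style = total_disk_gb / 2   # float under Python 3
portable = total_disk_gb // 2      # int on both Python 2 and 3

print(portable)  # 1512
```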
[Yahoo-eng-team] [Bug 1892961] Re: set different VirtualDevice.key
** Also affects: nova/stein Importance: Undecided Status: New
** Also affects: nova/train Importance: Undecided Status: New
** Also affects: nova/victoria Importance: Undecided Assignee: Yingji Sun (yingjisun) Status: In Progress
** Also affects: nova/ussuri Importance: Undecided Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1892961

Title: set different VirtualDevice.key

Status in OpenStack Compute (nova): In Progress
Status in OpenStack Compute (nova) stein series: New
Status in OpenStack Compute (nova) train series: New
Status in OpenStack Compute (nova) ussuri series: New
Status in OpenStack Compute (nova) victoria series: In Progress

Bug description: When creating an instance with multiple nics on vSphere 7, for example creating a server using ports: "networks": [{"port": "1ff1fd0e-a7c1-400d-8ee4-d8b6c94ed33b"}, {"port": "87aee6b2-c76a-4f10-9eab-a23ff9694b34"}], it will report an error as below.

2020-02-03 22:56:02.654 13279 ERROR nova.compute.manager [req-b1ec16f5-e529-4c98-9a4c-4cb8782489d2 a2fa852dc11546dfaf4bb2d9c0460dcf ee69d923dc594779a5775abd2077bea8 - default default] [instance: a80f85ce-2c16-4022-b01e-fb6953243fc0] Instance failed to spawn: VimFaultException: Network interface 'VirtualE1000' uses network 'nsx.LogicalSwitch:3a603a1c-4df4-4b09-afd1-ac9b56979f5e', which is not accessible.

The root cause is that starting from vSphere 7, VirtualDevice.key cannot be the same any more.
Originally, the request to vCenter was:

--> deviceChange = (vim.vm.device.VirtualDeviceSpec) [
-->   (vim.vm.device.VirtualDeviceSpec) {
-->     operation = "add",
-->     device = (vim.vm.device.VirtualE1000) {
-->       key = -47,
-->       backing = (vim.vm.device.VirtualEthernetCard.OpaqueNetworkBackingInfo) {
-->         opaqueNetworkId = "9c0d11f9-8388-465f-9a78-988134d44ab7",
-->         opaqueNetworkType = "nsx.LogicalSwitch"
-->       },
-->       connectable = (vim.vm.device.VirtualDevice.ConnectInfo) {
-->         startConnected = true,
-->         allowGuestControl = true,
-->         connected = true,
-->       },
-->       addressType = "manual",
-->       macAddress = "fa:16:3e:58:a9:24",
-->       wakeOnLanEnabled = true,
-->       externalId = "1ff1fd0e-a7c1-400d-8ee4-d8b6c94ed33b",
-->     },
-->   },
-->   (vim.vm.device.VirtualDeviceSpec) {
-->     operation = "add",
-->     device = (vim.vm.device.VirtualE1000) {
-->       key = -47,
-->       backing = (vim.vm.device.VirtualEthernetCard.OpaqueNetworkBackingInfo) {
-->         opaqueNetworkId = "00b14b88-4650-40c9-8216-f188b3f865cf",
-->         opaqueNetworkType = "nsx.LogicalSwitch"
-->       },
-->       connectable = (vim.vm.device.VirtualDevice.ConnectInfo) {
-->         startConnected = true,
-->         allowGuestControl = true,
-->         connected = true,
-->       },
-->       addressType = "manual",
-->       macAddress = "fa:16:3e:fc:d7:20",
-->       wakeOnLanEnabled = true,
-->       externalId = "87aee6b2-c76a-4f10-9eab-a23ff9694b34",
-->     },
-->   },

We need to change 'key' to different values.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1892961/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
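A minimal sketch of the fix direction: give each device spec a distinct (negative) key instead of reusing -47. The dict layout is a hypothetical stand-in for the vim VirtualDeviceSpec objects:

```python
def assign_unique_device_keys(device_specs, start=-47):
    """Assign each VirtualDeviceSpec a distinct negative key, as
    vSphere 7 requires (sketch; real specs are vim objects)."""
    key = start
    for spec in device_specs:
        spec['device']['key'] = key
        key -= 1
    return device_specs

specs = [{'operation': 'add', 'device': {'key': -47}},
         {'operation': 'add', 'device': {'key': -47}}]
keys = [s['device']['key'] for s in assign_unique_device_keys(specs)]
print(keys)  # [-47, -48]
```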
[Yahoo-eng-team] [Bug 1889633] [NEW] Pinned instance with thread policy can consume VCPU
Public bug reported: In Train, we introduced the concept of the 'PCPU' resource type to track pinned instance CPU usage. The '[compute] cpu_dedicated_set' is used to indicate which host cores should be used by pinned instances and, once this config option was set, nova would start reporting 'PCPU' resource types in addition to (or entirely instead of, if 'cpu_shared_set' was unset) 'VCPU'. Requests for pinned instances (via the 'hw:cpu_policy=dedicated' flavor extra spec or equivalent image metadata property) would result in a query for 'PCPU' inventory rather than 'VCPU', as previously done. We anticipated some upgrade issues with this change, whereby there could be a period during an upgrade in which some hosts would have the new configuration, meaning they'd be reporting PCPU, but the remainder would still be on legacy config and therefore would continue reporting just VCPU. An instance could be reasonably expected to land on any host, but since only the hosts with the new configuration were reporting 'PCPU' inventory and the 'hw:cpu_policy=dedicated' extra spec was resulting in a request for 'PCPU', the hosts with legacy configuration would never be consumed. We worked around this issue by adding support for a fallback placement query, enabled by default, which would make a second request using 'VCPU' inventory instead of 'PCPU'. The idea behind this was that the hosts with 'PCPU' inventory would be preferred, meaning we'd only try the 'VCPU' allocation if the preferred path failed. Crucially, we anticipated that if a host with new style configuration was picked up by this second 'VCPU' query, an instance would never actually be able to build there. This is because the new-style configuration would be reflected in the 'numa_topology' blob of the 'ComputeNode' object, specifically via the 'cpuset' (for cores allocated to 'VCPU') and 'pcpuset' (for cores allocated to 'PCPU') fields. With new-style configuration, both of these are set to unique values. 
If the scheduler had determined that there wasn't enough 'PCPU' inventory available for the instance, that would implicitly mean there weren't enough of the cores listed in the 'pcpuset' field still available. Turns out there's a gap in this thinking: thread policies. The 'isolate' CPU thread policy previously meant "give me a host with no hyperthreads, else a host with hyperthreads but mark the thread siblings of the cores used by the instance as reserved". This didn't translate to a new 'PCPU' world where we needed to know how many cores we were consuming up front before landing on the host. To work around this, we removed support for the latter case and instead relied on a trait, 'HW_CPU_HYPERTHREADING', to indicate whether a host had hyperthread support or not. Using the 'isolate' policy meant that trait could not be defined on the host, or the trait was "forbidden". The gap comes via a combination of this trait request and the fallback query. If we request the isolate thread policy, hosts with new-style configuration and sufficient PCPU inventory would nonetheless be rejected if they reported the 'HW_CPU_HYPERTHREADING' trait. However, these could get picked up in the fallback query and the instance would not fail to build on the host because of lack of 'PCPU' inventory. This means we end up with a pinned instance on a host using new-style configuration that is consuming 'VCPU' inventory. Boo.

# Steps to reproduce

1. Using a host with hyperthreading support enabled, configure both '[compute] cpu_dedicated_set' and '[compute] cpu_shared_set'
2. Boot an instance with the 'hw:cpu_thread_policy=isolate' extra spec.

# Expected result

Instance should not boot since the host has hyperthreads.

# Actual result

Instance boots.

** Affects: nova Importance: Undecided Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1889633 Title: Pinned instance with thread policy can consume VCPU Status in OpenStack Compute (nova): New Bug description: In Train, we introduced the concept of the 'PCPU' resource type to track pinned instance CPU usage. The '[compute] cpu_dedicated_set' is used to indicate which host cores should be used by pinned instances and, once this config option was set, nova would start reporting 'PCPU' resource types in addition to (or entirely instead of, if 'cpu_shared_set' was unset) 'VCPU'. Requests for pinned instances (via the 'hw:cpu_policy=dedicated' flavor extra spec or equivalent image metadata property) would result in a query for 'PCPU' inventory rather than 'VCPU', as previously done. We anticipated some upgrade issues with this change, whereby there could be a period during an upgrade in which some hosts would have the new configuration, meaning they'd be reporting PCPU, but the remainder would still be on legacy config and therefore would continue
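The scheduling flow described in this report can be sketched as follows. The structure is hypothetical (the real code builds placement allocation-candidate requests), but it shows where the gap opens up: the fallback query drops both the PCPU requirement and the trait constraint.

```python
def build_candidate_requests(flavor):
    """Sketch of the primary PCPU request plus the VCPU fallback.
    'flavor' is a plain dict here, not a nova Flavor object."""
    vcpus = flavor['vcpus']
    extra = flavor.get('extra_specs', {})

    primary = {'resources': {'PCPU': vcpus}, 'forbidden_traits': set()}
    if extra.get('hw:cpu_thread_policy') == 'isolate':
        # 'isolate' forbids hosts advertising hyperthreading.
        primary['forbidden_traits'].add('HW_CPU_HYPERTHREADING')

    # The fallback targets legacy hosts, so it asks for VCPU and does
    # not forbid the trait -- a hyperthreaded host rejected by the
    # primary query can still satisfy this one.
    fallback = {'resources': {'VCPU': vcpus}, 'forbidden_traits': set()}
    return [primary, fallback]

flavor = {'vcpus': 4, 'extra_specs': {'hw:cpu_policy': 'dedicated',
                                      'hw:cpu_thread_policy': 'isolate'}}
primary, fallback = build_candidate_requests(flavor)
```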
[Yahoo-eng-team] [Bug 1889257] [NEW] Live migration of realtime instances is broken
Public bug reported: Attempting to live migrate an instance with realtime enabled fails on master (commit d4c857dfcb1). This appears to be a bug in the live migration of pinned instances feature introduced in Train.

# Steps to reproduce

Create a server using realtime attributes and then attempt to live migrate it. For example:

$ openstack flavor create --ram 1024 --disk 0 --vcpu 4 \
    --property 'hw:cpu_policy=dedicated' \
    --property 'hw:cpu_realtime=yes' \
    --property 'hw:cpu_realtime_mask=^0-1' \
    realtime
$ openstack server create --os-compute-api-version=2.latest \
    --flavor realtime --image cirros-0.5.1-x86_64-disk --nic none \
    --boot-from-volume 1 --wait \
    test.realtime
$ openstack server migrate --live-migration test.realtime

# Expected result

Instance should be live migrated.

# Actual result

The live migration never happens. Looking at the logs we see the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/eventlet/hubs/hub.py", line 461, in fire_timers
    timer()
  File "/usr/local/lib/python3.6/dist-packages/eventlet/hubs/timer.py", line 59, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python3.6/dist-packages/eventlet/event.py", line 175, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python3.6/dist-packages/eventlet/greenthread.py", line 221, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/utils.py", line 670, in context_wrapper
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 8966, in _live_migration_operation
    # is still ongoing, or failed
  File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
    raise value
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 8959, in _live_migration_operation
    # 2. src==running, dst==paused
  File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 658, in migrate
    destination, params=params, flags=flags)
  File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 190, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 148, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 129, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/libvirt.py", line 1745, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirt.libvirtError: vcpussched attributes 'vcpus' must not overlap

Looking further, we see there are issues with the XML we are generating for the destination. Compare what we have on the source before updating the XML for the destination:

DEBUG nova.virt.libvirt.migration [-] _update_numa_xml input xml= ... 4096 {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:97}}

To what we have after the update:

DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml= ... 4096 ... {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:131}}

The issue is the 'vcpusched' elements. We're assuming there is only one of these elements when updating the XML for the destination [1]. We have to figure out why there are multiple elements and how best to handle this (likely by deleting and recreating everything). I suspect the reason we didn't spot this is because libvirt is rewriting the XML on us. This is what nova is providing libvirt upon boot:

DEBUG nova.virt.libvirt.driver [...] [instance: ...] End _get_guest_xml xml= ... 4096 ... {{(pid=12600) _get_guest_xml /opt/stack/nova/nova/virt/libvirt/driver.py:6331}}

but that's changed by the time we get to recalculating things. The solution is probably to remove all 'vcpusched' elements and recreate them, rather than trying to update stuff inline.

[1] https://github.com/openstack/nova/blob/21.0.0/nova/virt/libvirt/migration.py#L152-L155

** Affe
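The suggested fix, deleting every existing 'vcpusched' element and recreating them rather than updating one in place, can be sketched with the stdlib ElementTree. Element and attribute names follow the libvirt domain XML; this is not the actual nova patch:

```python
import xml.etree.ElementTree as ET

def rewrite_vcpusched(domain_xml, scheds):
    """Drop all <vcpusched> children of <cputune>, then recreate them
    from 'scheds', a list of attribute dicts."""
    root = ET.fromstring(domain_xml)
    cputune = root.find('cputune')
    for old in cputune.findall('vcpusched'):
        cputune.remove(old)
    for attrs in scheds:
        ET.SubElement(cputune, 'vcpusched', attrs)
    return ET.tostring(root, encoding='unicode')

xml = ("<domain><cputune>"
       "<vcpusched vcpus='2' scheduler='fifo' priority='1'/>"
       "<vcpusched vcpus='3' scheduler='fifo' priority='1'/>"
       "</cputune></domain>")
out = rewrite_vcpusched(xml, [{'vcpus': '2-3', 'scheduler': 'fifo',
                               'priority': '1'}])
print(out.count('<vcpusched'))  # 1
```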
[Yahoo-eng-team] [Bug 1888414] Re: Snapshot of stopped, suspended instance fails
I also see that when this fails, there are left over base files in '/opt/stack/data/nova/instances/_base'. ** Changed in: nova Status: Confirmed => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1888414 Title: Snapshot of stopped, suspended instance fails Status in OpenStack Compute (nova): Invalid Bug description: Attempting to create a snapshot of a shutdown instance fails. It seems nova assumes the instance exists and is running when attempting to create the snapshot. # Steps to reproduce $ openstack server create \ --os-compute-api-version=2.latest --flavor m1.tiny --image cirros-0.5.1-x86_64-disk \ --nic none --wait test.server $ openstack server stop test.server $ openstack server image create test.server # Expected result A snapshot of the instance root disk should be created. # Actual result The snapshot is not created. Attempts to resume the instance fail: $ openstack server start test.server Cannot 'start' instance aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda while it is in task_state image_pending_upload (HTTP 409) (Request-ID: req-39d4bd58-366b-4b93-8d7d-72a487183088) # Additional details I see the following in the logs: nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest agent is not enabled. 
nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance instance-000c disappeared while taking snapshot of it: [Error Code 42] Domain not found: no domain with matching uuid 'aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda' (instance-000c) nova-compute[20898]: DEBUG nova.compute.manager [None req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance disappeared during snapshot {{(pid=20898) _snapshot_instance /opt/stack/nova/nova/compute/manager.py:3874}} Compare with logs from snapshot of a running guest: nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest agent is not enabled. nova-compute[20898]: DEBUG nova.privsep.utils [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Path '/opt/stack/data/nova/instances' supports direct I/O {{(pid=20898) supports_direct_io /opt/stack/nova/nova/privsep/utils.py:64}} nova-compute[20898]: DEBUG oslo_concurrency.processutils [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Running cmd (subprocess): qemu-img convert -t none -O qcow2 -f qcow2 /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581 {{(pid=20898) execute /usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:371}} nova-compute[20898]: DEBUG oslo_concurrency.processutils [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] CMD "qemu-img convert -t none -O qcow2 -f qcow2 /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581" returned: 0 in 0.403s {{(pid=20898) execute 
/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:408}}
nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot extracted, beginning image upload
nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot image upload complete
nova-compute[20898]: INFO nova.compute.manager [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Took 2.44 seconds to snapshot the instance on the hypervisor.

We see the same issue if the instance is suspended ('openstack server suspend'). There are no issues if the instance is paused ('openstack server pause'), however.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1888414/+subscriptions
-- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1888414] [NEW] Snapshot of stopped, suspended instance fails
Public bug reported: Attempting to create a snapshot of a shutdown instance fails. It seems nova assumes the instance exists and is running when attempting to create the snapshot. # Steps to reproduce $ openstack server create \ --os-compute-api-version=2.latest --flavor m1.tiny --image cirros-0.5.1-x86_64-disk \ --nic none --wait test.server $ openstack server stop test.server $ openstack server image create test.server # Expected result A snapshot of the instance root disk should be created. # Actual result The snapshot is not created. Attempts to resume the instance fail: $ openstack server start test.server Cannot 'start' instance aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda while it is in task_state image_pending_upload (HTTP 409) (Request-ID: req-39d4bd58-366b-4b93-8d7d-72a487183088) # Additional details I see the following in the logs: nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest agent is not enabled. nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance instance-000c disappeared while taking snapshot of it: [Error Code 42] Domain not found: no domain with matching uuid 'aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda' (instance-000c) nova-compute[20898]: DEBUG nova.compute.manager [None req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance disappeared during snapshot {{(pid=20898) _snapshot_instance /opt/stack/nova/nova/compute/manager.py:3874}} Compare with logs from snapshot of a running guest: nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest agent is not enabled. 
nova-compute[20898]: DEBUG nova.privsep.utils [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Path '/opt/stack/data/nova/instances' supports direct I/O {{(pid=20898) supports_direct_io /opt/stack/nova/nova/privsep/utils.py:64}}
nova-compute[20898]: DEBUG oslo_concurrency.processutils [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Running cmd (subprocess): qemu-img convert -t none -O qcow2 -f qcow2 /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581 {{(pid=20898) execute /usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:371}}
nova-compute[20898]: DEBUG oslo_concurrency.processutils [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] CMD "qemu-img convert -t none -O qcow2 -f qcow2 /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta /opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581" returned: 0 in 0.403s {{(pid=20898) execute /usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:408}}
nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot extracted, beginning image upload
nova-compute[20898]: INFO nova.virt.libvirt.driver [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot image upload complete
nova-compute[20898]: INFO nova.compute.manager [None req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Took 2.44 seconds to snapshot the instance on the hypervisor.

We see the same issue if the instance is suspended ('openstack server suspend').
There are no issues if the instance is paused ('openstack server pause'), however.

** Affects: nova Importance: Medium Status: Confirmed
** Tags: libvirt snapshot
** Tags added: libvirt snapshot
** Changed in: nova Status: New => Confirmed
** Changed in: nova Importance: Undecided => Medium
** Description changed:
  Attempting to create a snapshot of a shutdown instance fails. It seems nova assumes the instance exists and is running when attempting to create the snapshot.
  # Steps to reproduce
- $ openstack server create \
- --os-compute-api-version=2.latest --flavor m1.tiny --image cirros-0.5.1-x86_64-disk \
- --nic none --wait test.server
- $ openstack server stop test.server
- $ openstack server image create test.server
+ $ openstack server create \
+ --os-compute-api-version=2.latest --flavor m1.tiny --image cirros-0.5.1-x86_64-disk \
+ --nic none --wait test.server
+ $ openstack server stop test.server
+ $ openstack server image create test.server
  # Expected result
  A
[Yahoo-eng-team] [Bug 1884231] [NEW] 'hw:realtime_mask' extra spec is not validated
or thread policy. For example:

openstack flavor create --ram 512 --disk 1 --vcpus 2 \
  --property 'hw:cpu_policy=dedicated' \
  --property 'hw:emulator_threads_policy=isolate' \
  --property 'hw:cpu_realtime=true' \
  --property 'hw:cpu_realtime_mask=^2' \
  test.rt

Similarly, they could ensure at least one core in the range is valid:

openstack flavor create --ram 512 --disk 1 --vcpus 2 \
  --property 'hw:cpu_policy=dedicated' \
  --property 'hw:emulator_threads_policy=isolate' \
  --property 'hw:cpu_realtime=true' \
  --property 'hw:cpu_realtime_mask=^1-5' \
  test.rt

However, both cases are still wrong and the 'hw:cpu_realtime_mask' value is almost certainly user error. Nova should be validating things properly and rejecting invalid values. We could probably also look at dropping the requirement to specify 'hw:cpu_realtime_mask' if 'hw:emulator_threads_policy' is configured; however, that's more of a feature than a bug.

** Affects: nova Importance: Medium Assignee: Stephen Finucane (stephenfinucane) Status: Confirmed
** Changed in: nova Importance: Undecided => Medium
** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane)
** Changed in: nova Status: New => Confirmed
** Description changed:
  The 'hw:realtime_mask' extra spec is (currently) used to specify what cores in a host should *not* be part of the realtime set of cores on the host. Currently, this is mandatory and omitting it will cause an HTTP 400 error. For example:
- $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
- --property hw:cpu_policy=dedicated
- --property hw:cpu_realtime=yes \
- test.rt
+ $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
+ --property hw:cpu_policy=dedicated
+ --property hw:cpu_realtime=yes \
+ test.rt
  will fail with:
- Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
+ Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU and 1 ordinary vCPU. See hw:cpu_realtime_mask or hw_cpu_realtime_mask
  Similarly, attempting to mask *all* values will result in a failure. For example:
- $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
- --property hw:cpu_policy=dedicated
- --property hw:cpu_realtime=yes \
- --property hw:cpu_realtime_mask=^0-1
- test.rt
+ $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
+ --property hw:cpu_policy=dedicated
+ --property hw:cpu_realtime=yes \
+ --property hw:cpu_realtime_mask=^0-1
+ test.rt
  will also fail with:
- Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
+ Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU and 1 ordinary vCPU. See hw:cpu_realtime_mask or hw_cpu_realtime_mask
  However, the value is otherwise unvalidated by nova, which can cause libvirt to explode when specific values are passed. For example, consider the following flavor:
- $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
- --property hw:cpu_policy=dedicated
- --property hw:cpu_realtime=yes \
- --property hw:cpu_realtime_mask='^2' \
- test.rt
+ $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
+ --property hw:cpu_policy=dedicated
+ --property hw:cpu_realtime=yes \
+ --property hw:cpu_realtime_mask='^2' \
+ test.rt
  This states that the instance should have two cores, and some imaginary third core (masks are 0-indexed) will be the non-realtime one.
This is clearly nonsensical and, sure enough, creating an instance using this flavor causes things to go bang:

- Failed to build and run instance: libvirt.libvirtError: invalid argument: Failed to parse bitmap ''
- Traceback (most recent call last):
-   File "/opt/stack/nova/nova/compute/manager.py", line 2378, in _build_and_run_instance
-     accel_info=accel_info)
-   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3702, in spawn
-     cleanup_instance_disks=created_disks)
-   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6664, in _create_domain_and_network
-     cleanup_instance_disks=cleanup_instance_disks)
-   File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
-     self.force_reraise()
-   File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
-     six.reraise(self.type_, self.value, self.tb)
-   File "/usr/local/lib/python3.7/site-packages/six.py", line 703, in reraise
-     raise value
-   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6633, in _create_domain_and_network
-     post_xml_callback=post_xml_callback)
-   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6559, in _create_domain
-     guest = libvirt
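The missing validation amounts to checking the exclusion mask against the flavor's vCPU range. A minimal sketch, assuming the '^'-prefixed mask syntax shown above; the function is hypothetical and nova's real parsing lives elsewhere (in nova.virt.hardware):

```python
def validate_realtime_mask(vcpus, mask):
    """Validate an 'hw:cpu_realtime_mask'-style exclusion mask against a
    flavor's vCPU count.

    The mask must be an exclusion mask naming only cores in [0, vcpus),
    and applying it must leave at least one realtime vCPU. Returns the
    sorted list of realtime vCPUs, or raises ValueError.
    """
    if not mask.startswith('^'):
        raise ValueError("mask must be an exclusion mask, e.g. '^0-1'")
    excluded = set()
    for part in mask[1:].split(','):
        if '-' in part:
            start, end = part.split('-')
            excluded.update(range(int(start), int(end) + 1))
        else:
            excluded.add(int(part))
    if not excluded or max(excluded) >= vcpus:
        raise ValueError(
            'mask references cores outside the 0-%d range' % (vcpus - 1))
    realtime = set(range(vcpus)) - excluded
    if not realtime:
        raise ValueError('at least one realtime vCPU is required')
    return sorted(realtime)
```

With the two-vCPU flavors above, '^2' and '^1-5' would be rejected for naming cores outside 0-1, and '^0-1' for leaving no realtime vCPU at all.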
[Yahoo-eng-team] [Bug 1879969] Re: confusing error message
*** This bug is a duplicate of bug 1879964 ***
https://bugs.launchpad.net/bugs/1879964
** This bug has been marked a duplicate of bug 1879964: Invalid value for 'hw:mem_page_size' raises confusing error
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1879969

Title: confusing error message
Status in OpenStack Compute (nova): In Progress

Bug description:
Description: When booting an instance using a flavor where hugepages is activated with an incorrect value, we get an error:
Invalid memory page size '0' (HTTP 400) (Request-ID: req-338bf619-3a54-45c5-9c59-ad8c1d425e91)
Steps to reproduce:
openstack flavor create hugepage --ram 1024 --disk 10 --vcpus 1
openstack flavor set hugepage --property hw:mem_page_size=2M
openstack server create --flavor hugepage..
Invalid memory page size '0' (HTTP 400) (Request-ID: req-338bf619-3a54-45c5-9c59-ad8c1d425e91)
Expected result: A correct message that hugepages is wrongly set to 2M instead of 2MB
Actual result: Invalid memory page size '0' (HTTP 400) (Request-ID: req-338bf619-3a54-45c5-9c59-ad8c1d425e91)
Environment: deployment tool: kolla-ansible https://github.com/openstack/kolla https://github.com/openstack/kolla-ansible Train + Centos8 + Libvirt/KVM + ZFS + Neutron/OVS
Logs: Output after trying to boot an instance:
Invalid memory page size '0' (HTTP 400) (Request-ID: req-338bf619-3a54-45c5-9c59-ad8c1d425e91)

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1879969/+subscriptions
-- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1882919] [NEW] e1000e interface reported as unsupported
Public bug reported: Per this downstream bug [1], attempting to boot a Windows Server 2012 or 2016 image will fail because something (libosinfo?) is attempting to configure an e1000e VIF which nova does not explicitly support. There doesn't appear to be any reason not to support this, since libvirt, and specifically QEMU/KVM, support it. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1839808 ** Affects: nova Importance: Medium Assignee: Stephen Finucane (stephenfinucane) Status: In Progress ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Medium ** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1882919 Title: e1000e interface reported as unsupported Status in OpenStack Compute (nova): In Progress Bug description: Per this downstream bug [1], attempting to boot a Windows Server 2012 or 2016 image will fail because something (libosinfo?) is attempting to configure an e1000e VIF which nova does not explicitly support. There doesn't appear to be any reason not to support this, since libvirt, and specifically QEMU/KVM, support it. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1839808 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1882919/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1882821] [NEW] '[libvirt] file_backed_memory' and '[DEFAULT] reserved_host_memory_mb' are incompatible
Public bug reported: Per title, the '[libvirt] file_backed_memory' and '[DEFAULT] reserved_host_memory_mb' config options are incompatible. Not only does '[DEFAULT] reserved_host_memory_mb' not really make sense for file-backed memory (if you want to reserve "memory", configure a lower '[libvirt] file_backed_memory' value), but configuring a value for '[libvirt] file_backed_memory' that is lower than the value for '[DEFAULT] reserved_host_memory_mb', which currently defaults to 512MB, will break nova's resource reporting to placement:

nova.exception.ResourceProviderUpdateFailed: Failed to update resource provider via URL /resource_providers/f39bde61-6f73-4ccb-9488-6efb9689730f/inventories: {"errors": [{"status": 400, "title": "Bad Request", "detail": "The server could not comply with the request since it is either malformed or otherwise incorrect.\n\n Unable to update inventory for resource provider f39bde61-6f73-4ccb-9488-6efb9689730f: Invalid inventory for 'MEMORY_MB' on resource provider 'f39bde61-6f73-4ccb-9488-6efb9689730f'. The reserved value is greater than total. ", "code": "placement.undefined_code", "request_id": "req-977e43e7-1a7c-4309-96ec-49a75bdea58a"}]}

Ideally we should error out if both values are configured; however, doing so would be a breaking change. Instead, we can warn if these are incompatible and then error out in a future release.

** Affects: nova Importance: Medium Assignee: Stephen Finucane (stephenfinucane) Status: Confirmed
** Changed in: nova Importance: Undecided => Medium
** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane)
** Changed in: nova Status: New => Confirmed
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1882821

Title: '[libvirt] file_backed_memory' and '[DEFAULT] reserved_host_memory_mb' are incompatible
Status in OpenStack Compute (nova): Confirmed

Bug description: Per title, the '[libvirt] file_backed_memory' and '[DEFAULT] reserved_host_memory_mb' config options are incompatible. Not only does '[DEFAULT] reserved_host_memory_mb' not really make sense for file-backed memory (if you want to reserve "memory", configure a lower '[libvirt] file_backed_memory' value), but configuring a value for '[libvirt] file_backed_memory' that is lower than the value for '[DEFAULT] reserved_host_memory_mb', which currently defaults to 512MB, will break nova's resource reporting to placement:

nova.exception.ResourceProviderUpdateFailed: Failed to update resource provider via URL /resource_providers/f39bde61-6f73-4ccb-9488-6efb9689730f/inventories: {"errors": [{"status": 400, "title": "Bad Request", "detail": "The server could not comply with the request since it is either malformed or otherwise incorrect.\n\n Unable to update inventory for resource provider f39bde61-6f73-4ccb-9488-6efb9689730f: Invalid inventory for 'MEMORY_MB' on resource provider 'f39bde61-6f73-4ccb-9488-6efb9689730f'. The reserved value is greater than total. ", "code": "placement.undefined_code", "request_id": "req-977e43e7-1a7c-4309-96ec-49a75bdea58a"}]}

Ideally we should error out if both values are configured; however, doing so would be a breaking change. Instead, we can warn if these are incompatible and then error out in a future release.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1882821/+subscriptions
-- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
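The proposed warn-now, error-later check is straightforward. A sketch, assuming both values are expressed in MiB and that file-backed memory is 0 when disabled; the helper is hypothetical, not nova's actual startup code:

```python
import logging

LOG = logging.getLogger(__name__)

def check_memory_config(file_backed_memory_mb, reserved_host_memory_mb):
    """Warn when '[libvirt] file_backed_memory' and '[DEFAULT]
    reserved_host_memory_mb' are combined incompatibly.

    Returns False for the combination that would report reserved > total
    to placement, True otherwise.
    """
    if not file_backed_memory_mb:
        # File-backed memory disabled: reserved_host_memory_mb is fine.
        return True
    if reserved_host_memory_mb >= file_backed_memory_mb:
        LOG.warning(
            'Invalid memory configuration: reserved_host_memory_mb (%d) '
            'must be less than file_backed_memory (%d); this will become '
            'an error in a future release',
            reserved_host_memory_mb, file_backed_memory_mb)
        return False
    return True
```

Note how the default reserved_host_memory_mb of 512 alone is enough to trip the failure whenever file_backed_memory is set to 512 or less.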
[Yahoo-eng-team] [Bug 1882233] [NEW] Libvirt driver always reports 'memory_mb_used' of 0
Public bug reported: The nova-compute service periodically logs a summary of the free RAM, disk and vCPUs as reported by the hypervisor. For example:

Hypervisor/Node resource view: name=vtpm-f31.novalocal free_ram=7960MB free_disk=11.379043579101562GB free_vcpus=7 pci_devices=[{...}]

On a recent deployment using the libvirt driver, it's observed that the 'free_ram' value never changes despite instances being created and destroyed. This is because the 'get_memory_mb_used' function in 'nova.virt.libvirt.host' always returns 0 unless the host platform - reported by 'sys.platform' - is either 'linux2' or 'linux3'. Since Python 3.3, the major version is not included in this return value since it was misleading [1].

This is low priority because the value only appears to be used for logging purposes and the values stored in e.g. the 'ComputeNode' object and reported to placement are calculated based on config options and the number of instances on the node. We may wish to stop reporting this information instead.

[1] https://stackoverflow.com/a/10429736/613428

** Affects: nova Importance: Low Assignee: Stephen Finucane (stephenfinucane) Status: Confirmed
** Changed in: nova Importance: Undecided => Low
** Changed in: nova Status: New => Confirmed
** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane)
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1882233

Title: Libvirt driver always reports 'memory_mb_used' of 0
Status in OpenStack Compute (nova): Confirmed

Bug description: The nova-compute service periodically logs a summary of the free RAM, disk and vCPUs as reported by the hypervisor.
For example:

Hypervisor/Node resource view: name=vtpm-f31.novalocal free_ram=7960MB free_disk=11.379043579101562GB free_vcpus=7 pci_devices=[{...}]

On a recent deployment using the libvirt driver, it's observed that the 'free_ram' value never changes despite instances being created and destroyed. This is because the 'get_memory_mb_used' function in 'nova.virt.libvirt.host' always returns 0 unless the host platform - reported by 'sys.platform' - is either 'linux2' or 'linux3'. Since Python 3.3, the major version is not included in this return value since it was misleading [1].

This is low priority because the value only appears to be used for logging purposes and the values stored in e.g. the 'ComputeNode' object and reported to placement are calculated based on config options and the number of instances on the node. We may wish to stop reporting this information instead.

[1] https://stackoverflow.com/a/10429736/613428

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1882233/+subscriptions
-- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
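A sketch of the fix: compare 'sys.platform' with startswith() rather than against the versioned 'linux2'/'linux3' literals. The helper names are hypothetical and the /proc/meminfo arithmetic is simplified from what nova actually does:

```python
import sys

def is_linux(platform=None):
    """Robust platform check: matches 'linux' (Python >= 3.3) as well as
    the old versioned 'linux2'/'linux3' values this bug is about."""
    if platform is None:
        platform = sys.platform
    return platform.startswith('linux')

def get_memory_mb_used():
    """Parse /proc/meminfo rather than silently returning 0 on modern
    Python. Illustration only, not the real nova.virt.libvirt.host code."""
    if not is_linux():  # was effectively: sys.platform in ('linux2', 'linux3')
        return 0
    fields = {}
    with open('/proc/meminfo') as fh:
        for line in fh:
            key, value = line.split(':', 1)
            fields[key.strip()] = int(value.split()[0])  # values are in kiB
    free = fields['MemFree'] + fields['Buffers'] + fields['Cached']
    return (fields['MemTotal'] - free) // 1024  # used memory, in MiB
```

The versioned comparison was never wrong on Python 2, which is why the regression only surfaced after the move to Python 3.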
[Yahoo-eng-team] [Bug 1736920] Re: Glance images are loaded into memory
I finally got around to investigating this today. tl;dr: there does not appear to be an issue here.

The return value of 'glanceclient.Client.images.data' is 'glanceclient.common.utils.RequestIdProxy', owing to the use of the 'add_req_id_to_object' decorator [2]. This is *not* a generator, which means the 'inspect.isgenerator' conditional at [1] is False and we will never convert these large images to a list. In fact, there appears to be only one case that does trigger this: the 'glanceclient.Client.images.list' case, which returns a 'glanceclient.common.utils.GeneratorProxy' object due to the use of the 'add_req_id_to_generator' decorator [3]. This is the function at the root of bug #1557584. As such, the fix is correct and there's nothing to do here besides possibly documenting things better in the code.

[1] https://github.com/openstack/nova/blob/16.0.0/nova/image/glance.py#L167
[2] https://github.com/openstack/python-glanceclient/blob/3.1.1/glanceclient/v2/images.py#L200
[3] https://github.com/openstack/python-glanceclient/blob/3.1.1/glanceclient/v2/images.py#L85

** Changed in: nova Status: New => Invalid
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1736920

Title: Glance images are loaded into memory
Status in OpenStack Compute (nova): Invalid
Status in OpenStack Security Advisory: Incomplete

Bug description: Nova appears to be loading entire responses from glance into memory [1]. This is generally not an issue but these responses could be entire images [2]. Given a large enough image, this seems like a potential avenue for DoS, not to mention being highly inefficient.
[1] https://github.com/openstack/nova/blob/16.0.0/nova/image/glance.py#L167-L170 [2] https://github.com/openstack/nova/blob/16.0.0/nova/image/glance.py#L292-L295 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1736920/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
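The distinction above can be demonstrated without glanceclient at all. 'IterableProxy' below is a hypothetical stand-in for the 'RequestIdProxy'/'GeneratorProxy' wrappers: it is iterable, but 'inspect.isgenerator' only returns True for true generator objects:

```python
import inspect

class IterableProxy:
    """Hypothetical stand-in for glanceclient's proxy wrappers: iterable,
    but not a generator object, so inspect.isgenerator() returns False."""
    def __init__(self, wrapped):
        self._wrapped = iter(wrapped)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._wrapped)

def image_chunks():
    """A true generator: the only thing isgenerator() matches."""
    yield b'chunk1'
    yield b'chunk2'

assert inspect.isgenerator(image_chunks())
assert not inspect.isgenerator(IterableProxy(image_chunks()))
```

So the list() conversion guarded by 'inspect.isgenerator' in nova never fires for 'images.data', which returns a proxy, and large images are streamed rather than buffered in memory.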
[Yahoo-eng-team] [Bug 1879964] [NEW] Invalid value for 'hw:mem_page_size' raises confusing error
Public bug reported: Configure a flavor like so: openstack flavor create hugepage --ram 1024 --disk 10 --vcpus 1 test openstack flavor set hugepage --property hw:mem_page_size=2M test Attempt to boot an instance. It will fail with the following error message: Invalid memory page size '0' (HTTP 400) (Request-ID: req- 338bf619-3a54-45c5-9c59-ad8c1d425e91) You wouldn't know from reading it, but this is because the property should read 'hw:mem_page_size=2MB' (note the extra 'B'). ** Affects: nova Importance: Low Assignee: Stephen Finucane (stephenfinucane) Status: In Progress ** Changed in: nova Importance: Undecided => Low ** Changed in: nova Status: New => Confirmed ** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1879964 Title: Invalid value for 'hw:mem_page_size' raises confusing error Status in OpenStack Compute (nova): In Progress Bug description: Configure a flavor like so: openstack flavor create hugepage --ram 1024 --disk 10 --vcpus 1 test openstack flavor set hugepage --property hw:mem_page_size=2M test Attempt to boot an instance. It will fail with the following error message: Invalid memory page size '0' (HTTP 400) (Request-ID: req- 338bf619-3a54-45c5-9c59-ad8c1d425e91) You wouldn't know from reading it, but this is because the property should read 'hw:mem_page_size=2MB' (note the extra 'B'). To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1879964/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
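The confusing '0' comes from the unit suffix failing to parse. The sketch below is a hypothetical re-creation of that behaviour - nova's real parsing differs and also accepts 'small'/'large'/'any' - but it shows why '2M' collapses to an invalid size of 0 while '2MB' and '2048' work:

```python
import re

# Hypothetical unit table: only 'KB'/'MB'/'GB' style suffixes (or no
# suffix, meaning kiB) are understood, mirroring the behaviour this bug
# describes. 'M' is not in the table.
UNITS_KB = {'': 1, 'KB': 1, 'MB': 1024, 'GB': 1024 * 1024}

def parse_page_size(value):
    """Return the page size in KiB, or 0 for an unrecognised value."""
    match = re.match(r'^(\d+)([A-Z]*)$', value.upper())
    if not match:
        return 0
    number, unit = match.groups()
    if unit not in UNITS_KB:
        return 0  # e.g. '2M' -> unit 'M' -> the confusing "page size '0'"
    return int(number) * UNITS_KB[unit]
```

A better error would echo the rejected input ('2M') and the accepted forms, rather than surfacing the internal sentinel value 0.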
[Yahoo-eng-team] [Bug 1689753] Re: ram_filter ignores hugepages which can create unstable guests
The RAM filter has been removed in recent versions of nova so there is nothing to resolve on master now and it's unlikely to be resolved for past releases. Closed as won't fix.

** Changed in: nova Status: Confirmed => Won't Fix
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1689753

Title: ram_filter ignores hugepages which can create unstable guests
Status in OpenStack Compute (nova): Won't Fix

Bug description: environment info: OS: CentOS 7.1; nova: 15.0.2

problem description: There is 220G of memory in the compute node. 200G of it is hugepages with a page_size of 1G; the other 20G is normal memory. When I boot a normal instance with a flavor of 30G memory and no hugepages, the instance is created successfully, but the OS becomes unstable and may even OOM because memory is exhausted. I think the instance should fail to boot, with ram_filter returning 0 hosts, rather than nova assuming the memory is sufficient and spawning the instance on that compute node.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1689753/+subscriptions
-- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
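What the filter should have compared against can be expressed in a few lines. A hypothetical helper, using the numbers from this report (220G total, 200 x 1G hugepages), not the removed filter's actual code:

```python
def free_small_page_memory_mb(total_mb, hugepages):
    """Memory (in MB) actually usable by non-hugepage guests.

    The RAM filter compared requested RAM against total host RAM,
    ignoring that hugepage reservations cannot back normal allocations.
    'hugepages' is an iterable of (page_size_kb, page_count) tuples.
    """
    reserved_mb = sum(size_kb * count for size_kb, count in hugepages) // 1024
    return total_mb - reserved_mb

# A 220G host with 200 x 1G hugepages leaves only 20G of normal memory,
# so a 30G non-hugepage guest should have been rejected.
```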
[Yahoo-eng-team] [Bug 1803575] Re: RFE: Add an option to enable virtio-scsi for new Nova instances by default
While the issue you highlight here is real, this wouldn't be a good solution. The primary issue with this approach is the same issue we have with the '[libvirt] rx_queue_size' and '[libvirt] tx_queue_size' - namely that it can break live migration as two hosts with different values will result in a change in the instance XML. If we were to take this approach, we'd have to store the information as part of the instance and we don't have a way to do this other than via the flavor or image metadata. As such, I'm closing this as WONTFIX. With that said, we do recognize that there is a definite usability issue here. For context, the libosinfo integration in the libvirt driver [1] was supposed to resolve this kind of issue for us but the implementation of that feature is fundamentally broken and it will probably be ripped out in a future release. We're now working on an improved solution to this broader issue, but it will take a different form to this. [1] https://specs.openstack.org/openstack/nova- specs/specs/liberty/approved/libvirt-hardware-policy-from-libosinfo.html ** Changed in: nova Status: Triaged => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1803575 Title: RFE: Add an option to enable virtio-scsi for new Nova instances by default Status in OpenStack Compute (nova): Won't Fix Bug description: Description === Currently virtio-scsi is used only for libvirt instances created from the images with properties hw_scsi_model=virtio-scsi and hw_disk_bus=scsi set or from the volumes with the same image metadata. What is requested: config option in [libvirt] section of the nova.conf to enable virtio-scsi for all new instances by default, even if hw_scsi_model and hw_disk_bus properties for the image is not set. 
Why: we want virtio-scsi to be enabled for VMs created from users images even if they don't set this property explicitly, because we want to have most of the vms be able to issue BLKDISCARD with fstrim. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1803575/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1858019] Re: The flavor id is not limited when creating a flavor
Also agree. If we're going to do anything, it should be done on the client side. It should be possible to add a flag stating what field we wish to filter on (name or ID), if needed. Since there's nothing to do here from the server side, I'm going to close this as WONTFIX. ** Changed in: nova Status: Triaged => Won't Fix ** Changed in: nova Assignee: Choi-Sung-Hoon (knu-cse) => (unassigned) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1858019 Title: The flavor id is not limited when creating a flavor Status in OpenStack Compute (nova): Won't Fix Bug description: when creating a flavor by 'openstack flavor create --id --vcpus --ram --disk ', the parameter id is not limited. It can lead to ambiguities when id is set to an existed flavor name. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1858019/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1816454] Re: hw:mem_page_size is not respecting all documented values
Looks like this was resolved in https://review.opendev.org/#/c/673252/ ** Changed in: nova Status: New => Fix Released ** Tags added: doc -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1816454 Title: hw:mem_page_size is not respecting all documented values Status in OpenStack Compute (nova): Fix Released Bug description: Per the Rocky documentation for hugepages: https://docs.openstack.org/nova/rocky/admin/huge-pages.html 2MB hugepages can be specified either as: --property hw:mem_page_size=2Mb, or --property hw:mem_page_size=2048 However, whenever I use the former notation (2Mb), conductor fails with the misleading NUMA error below... whereas with the latter notation (2048), allocation succeeds and the resulting instance is backed with 2MB hugepages on an x86_64 platform (as verified by checking `/proc/meminfo | grep HugePages_Free` before/after stopping the created instance). 
ERROR nova.scheduler.utils [req-de6920d5-829b-411c-acd7-1343f48824c9 cb2abbb91da54209a5ad93a845b4cc26 cb226ff7932d40b0a48ec129e162a2fb - default default] [instance: 5b53d1d4-6a16-4db9-ab52-b267551c6528] Error from last host: node1 (node FQDN-REDACTED): ['Traceback (most recent call last):\n', ' File "/usr/lib/python3/dist- packages/nova/compute/manager.py", line 2106, in _build_and_run_instance\nwith rt.instance_claim(context, instance, node, limits):\n', ' File "/usr/lib/python3/dist- packages/oslo_concurrency/lockutils.py", line 274, in inner\n return f(*args, **kwargs)\n', ' File "/usr/lib/python3/dist- packages/nova/compute/resource_tracker.py", line 217, in instance_claim\npci_requests, overhead=overhead, limits=limits)\n', ' File "/usr/lib/python3/dist- packages/nova/compute/claims.py", line 95, in __init__\n self._claim_test(resources, limits)\n', ' File "/usr/lib/python3 /dist-packages/nova/compute/claims.py", line 162, in _claim_test\n "; ".join(reasons))\n', 'nova.exception.ComputeResourcesUnavailable: Insufficient compute resources: Requested instance NUMA topology cannot fit the given host NUMA topology.\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', ' File "/usr/lib/python3/dist- packages/nova/compute/manager.py", line 1940, in _do_build_and_run_instance\nfilter_properties, request_spec)\n', ' File "/usr/lib/python3/dist-packages/nova/compute/manager.py", line 2156, in _build_and_run_instance\ninstance_uuid=instance.uuid, reason=e.format_message())\n', 'nova.exception.RescheduledException: Build of instance 5b53d1d4-6a16-4db9-ab52-b267551c6528 was re- scheduled: Insufficient compute resources: Requested instance NUMA topology cannot fit the given host NUMA topology.\n'] Additional info: I am using Debian testing (buster) and all OpenStack packages included therein. 
$ dpkg -l | grep nova ii nova-common 2:18.1.0-2 all OpenStack Compute - common files ii nova-compute 2:18.1.0-2 all OpenStack Compute - compute node ii nova-compute-kvm 2:18.1.0-2 all OpenStack Compute - compute node (KVM) ii python3-nova 2:18.1.0-2 all OpenStack Compute - libraries ii python3-novaclient2:11.0.0-2 all client library for OpenStack Compute API - 3.x $ dpkg -l | grep qemu ii ipxe-qemu 1.0.0+git-20161027.b991c67-1 all PXE boot firmware - ROM images for qemu ii qemu-block-extra:amd641:3.1+dfsg-2+b1 amd64extra block backend modules for qemu-system and qemu-utils ii qemu-kvm 1:3.1+dfsg-2+b1 amd64QEMU Full virtualization on x86 hardware ii qemu-system-common1:3.1+dfsg-2+b1 amd64QEMU full system emulation binaries (common files) ii qemu-system-data 1:3.1+dfsg-2 all QEMU full system emulation (data files) ii qemu-system-gui 1:3.1+dfsg-2+b1 amd64QEMU full system emulation binaries (user interface and audio support) ii qemu-system-x86 1:3.1+dfsg-2+b1 amd64QEMU full system emulation binaries (x86) ii qemu-utils1:3.1+dfsg-2+b1 amd64QEMU utilities * I forced nova to allocate on the same hypervisor (node1) when checking for the issue and can
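The equivalence the hugepages docs promise (that '2Mb' and '2048' name the same page size) can be sketched as a small normalizer into KiB. This is an illustration of the documented behavior, not nova's actual parsing code:

```python
import re

# Multipliers into KiB; a bare number is already in KiB per the docs.
UNITS = {'': 1, 'kb': 1, 'mb': 1024, 'gb': 1024 * 1024}

def page_size_kib(value):
    """Normalize a hw:mem_page_size value such as '2048' or '2MB' to KiB."""
    m = re.match(r'^(\d+)\s*([a-z]*)$', str(value).lower())
    if not m or m.group(2) not in UNITS:
        raise ValueError('invalid page size: %s' % value)
    return int(m.group(1)) * UNITS[m.group(2)]

# Both documented spellings should resolve to the same 2 MiB page size.
assert page_size_kib('2Mb') == page_size_kib('2048') == 2048
```

Under this reading, the reporter's failure with '2Mb' was a parsing/documentation gap rather than a genuine NUMA capacity problem, which matches the doc-tagged fix referenced above.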
[Yahoo-eng-team] [Bug 1863058] Re: Arm64 CI for Nova
Awesome! Could you bring this up on openstack-discuss [1]? It's far more likely to get eyes (and volunteers to help with any issues you hit) there. [1] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1863058 Title: Arm64 CI for Nova Status in OpenStack Compute (nova): Invalid Bug description: Linaro has donated a cluster for OpenStack CI on Arm64. Now the cluster is ready: https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl03.openstack.org.yaml#L414 We'd like to set up CI for Nova first. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1863058/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1864279] Re: Unable to attach more than 6 scsi volumes
Looks like it's been fixed on RHEL 7.7 too [1]. If you're on a different OS, I'd suggest opening a bug against the libvirt component for same and requesting a backport. I don't think there's much to do here from a nova perspective. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1741782 ** Bug watch added: Red Hat Bugzilla #1741782 https://bugzilla.redhat.com/show_bug.cgi?id=1741782 ** Changed in: nova Status: Confirmed => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1864279 Title: Unable to attach more than 6 scsi volumes Status in OpenStack Compute (nova): Won't Fix Bug description: Scsi volume with unit number 7 can not be attached because of this libvirt check: https://github.com/libvirt/libvirt/blob/89237d534f0fe950d06a2081089154160c6c2224/src/conf/domain_conf.c#L4796 Nova automatically increase volume unit number by 1, and when I attach 7th volume to vm I've got this error: 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [req-156a4725-279d-4173-9f11-85125e4a3e47] [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] Failed to attach volume at mountpoint: /dev/sdh: libvirt.libvirtError: Requested operation is not valid: Domain already contains a disk with that address 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] Traceback (most recent call last): 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 1810, in attach_volume 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] guest.attach_device(conf, persistent=True, live=live) 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File 
"/usr/lib/python3/dist-packages/nova/virt/libvirt/guest.py", line 305, in attach_device 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] self._domain.attachDeviceFlags(device_xml, flags=flags) 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 190, in doit 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] result = proxy_call(self._autowrap, f, *args, **kwargs) 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 148, in proxy_call 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] rv = execute(f, *args, **kwargs) 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 129, in execute 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] six.reraise(c, e, tb) 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] raise value 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 83, in tworker 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] rv = meth(*args, **kwargs) 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] File 
"/usr/lib/python3/dist-packages/libvirt.py", line 605, in attachDeviceFlags 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self) 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] libvirt.libvirtError: Requested operation is not valid: Domain already contains a disk with that address 2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 3532baf6-a0a4-4a81-84f9-3622c713435f] After patching libvirt driver to skip unit 7 I can attach more than 6 volumes. ii nova-compute 2:20.0.0-0ubuntu1~cloud0 ii nova-compute-kvm 2:20.0.0-0ubuntu1~cloud0 ii nova-compute-libvirt 2:20.0.0-0ubuntu1~cloud0 ii
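The reporter's workaround, skipping SCSI unit 7 (which libvirt rejects on virtio-scsi buses), can be sketched as follows; the function and its interface are illustrative, not nova's actual device-address allocator:

```python
def next_scsi_unit(used_units):
    """Return the lowest free SCSI unit number, never handing out unit 7."""
    unit = 0
    while unit in used_units or unit == 7:
        unit += 1
    return unit

# Attaching eight disks in sequence allocates units 0-6, then jumps to 8.
units = set()
for _ in range(8):
    units.add(next_scsi_unit(units))
print(sorted(units))  # [0, 1, 2, 3, 4, 5, 6, 8]
```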
[Yahoo-eng-team] [Bug 1863757] Re: Insufficient memory for guest pages when using NUMA
*** This bug is a duplicate of bug 1734204 *** https://bugs.launchpad.net/bugs/1734204 Yes, this has been resolved since Stein, as noted in bug 1734204. Unfortunately Queens is in Extended Maintenance and we no longer release new versions, so this is not likely to be fixed there. ** This bug has been marked a duplicate of bug 1734204 Insufficient free host memory pages available to allocate guest RAM with Open vSwitch DPDK in Newton -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1863757 Title: Insufficient memory for guest pages when using NUMA Status in OpenStack Compute (nova): New Bug description: This is a Queens / Bionic openstack deploy. Compute nodes are using hugepages for nova instances (reserved at boot time): root@compute1:~# cat /proc/meminfo | grep -i huge AnonHugePages: 0 kB ShmemHugePages:0 kB HugePages_Total: 332 HugePages_Free: 184 HugePages_Rsvd:0 HugePages_Surp:0 Hugepagesize:1048576 kB There are two numa nodes, as follows: root@compute1:~# lscpu | grep -i numa NUMA node(s):2 NUMA node0 CPU(s): 0-19,40-59 NUMA node1 CPU(s): 20-39,60-79 Compute nodes are using DPDK, and memory for it has been reserved with the following directive: reserved-huge-pages: "node:0,size:1GB,count:8;node:1,size:1GB,count:8" A number of instances have already been created on node "compute1", until the point that current memory usage is as follows: root@compute1:~# cat /sys/devices/system/node/node*/meminfo | grep -i huge Node 0 AnonHugePages: 0 kB Node 0 ShmemHugePages:0 kB Node 0 HugePages_Total: 166 Node 0 HugePages_Free: 26 Node 0 HugePages_Surp: 0 Node 1 AnonHugePages: 0 kB Node 1 ShmemHugePages:0 kB Node 1 HugePages_Total: 166 Node 1 HugePages_Free:158 Node 1 HugePages_Surp: 0 Problem: When a new instance is created (8 cores and 32gb ram), nova tries to schedule it on numa node 0 and fails with "Insufficient free host memory pages available to 
allocate guest RAM", even though there is enough memory available on numa node 1. This behavior has been seen by other users here as well (although the solution on that bug seems to be more a coincidence than a proper solution -- it was then classified as not a bug, which I don't believe is the case): https://bugzilla.redhat.com/show_bug.cgi?id=1517004 The flavor being used has nothing special except a property for hw:mem_page_size='large'. The instance is being forced to be created on "zone1::compute1", but otherwise there is no pinning of cpus or other resources. Forcing the VM onto node0 seems to be nova's own decision when instantiating it. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1863757/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
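A quick sanity check of the numbers in the report above: a 32 GiB guest backed by 1 GiB hugepages needs 32 free pages on a single NUMA node, so node 0 (26 free) cannot host it while node 1 (158 free) can:

```python
GIB = 1024 ** 3

# 32 GiB of guest RAM divided into 1 GiB hugepages (Hugepagesize above).
requested_pages = (32 * GIB) // (1 * GIB)

# HugePages_Free per NUMA node, taken from the reporter's meminfo output.
node_free = {0: 26, 1: 158}
fits = {node: free >= requested_pages for node, free in node_free.items()}
print(fits)  # {0: False, 1: True}
```

This is consistent with the duplicate bug: the failure is nova picking node 0 rather than an actual shortage of hugepages on the host.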
[Yahoo-eng-team] [Bug 1864422] Re: can the instance supports online keys updates?
To the best of my knowledge, this is not currently supported. We only support it for rebuild [1], which is a destructive operation. [1] https://specs.openstack.org/openstack/nova- specs/specs/queens/implemented/rebuild-keypair-reset.html ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1864422 Title: can the instance supports online keys updates? Status in OpenStack Compute (nova): Invalid Bug description: Description === As a tenant, the private key of the key may be lost, and the user needs to update the keys rather than break the business and stop instance. Do you have any Suggestions and ideas? Thank you very much. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1864422/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1864160] Re: mete_date shows region information
Please bring questions and support requests to either the openstack-discuss mailing list or the #openstack-nova IRC channel. ** Changed in: nova Status: New => Opinion ** Changed in: nova Status: Opinion => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1864160 Title: mete_date shows region information Status in OpenStack Compute (nova): Invalid Bug description: Description === I want to show region_name in mete_date. Do you have any Suggestions and ideas? In the instance execution 'curl http://169.254.169.254/openstack/2013-04-04/mete_date.json ', the information of the current instance region (region_name) is displayed. Do you have any good ideas? Thank you very much. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1864160/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1866288] Re: tox pep8 fails on ubuntu 18.04.3
Yup, Rocky was tested on Xenial (16.04), not Bionic (18.04). Bionic doesn't provide a suitable Python interpreter for this older code. This is expected. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1866288 Title: tox pep8 fails on ubuntu 18.04.3 Status in OpenStack Compute (nova): Invalid Bug description: pep8 checking fails for rocky branch on ubuntu 18.04.3 root@mgt02:~/src/nova# tox -epep8 -vvv removing /root/src/nova/.tox/log using tox.ini: /root/src/nova/tox.ini using tox-3.1.0 from /usr/local/lib/python2.7/dist-packages/tox/__init__.pyc skipping sdist step pep8 start: getenv /root/src/nova/.tox/shared pep8 recreate: /root/src/nova/.tox/shared ERROR: InterpreterNotFound: python3.5 pep8 finish: getenv after 0.00 seconds __ summary ___ ERROR: pep8: InterpreterNotFound: python3.5 root@mgt02:~/src/nova# uname -a Linux mgt02 4.15.0-88-generic #88-Ubuntu SMP Tue Feb 11 20:11:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1866288/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1866373] [NEW] URLS in os-keypairs 'links' body are incorrect
Public bug reported: Similar to https://bugs.launchpad.net/nova/+bug/1864428, the URLs in the 'links' element of the response are incorrect. They read '/keypairs', not '/os-keypairs'. From the current api-ref (2020-03-06): { "keypairs": [ { "keypair": { "fingerprint": "7e:eb:ab:24:ba:d1:e1:88:ae:9a:fb:66:53:df:d3:bd", "name": "keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3", "type": "ssh", "public_key": "ssh-rsa B3NzaC1yc2EDAQABAAABAQCkF3MX59OrlBs3dH5CU7lNmvpbrgZxSpyGjlnE8Flkirnc/Up22lpjznoxqeoTAwTW034k7Dz6aYIrZGmQwe2TkE084yqvlj45Dkyoj95fW/sZacm0cZNuL69EObEGHdprfGJQajrpz22NQoCD8TFB8Wv+8om9NH9Le6s+WPe98WC77KLw8qgfQsbIey+JawPWl4O67ZdL5xrypuRjfIPWjgy/VH85IXg/Z/GONZ2nxHgSShMkwqSFECAC5L3PHB+0+/12M/iikdatFSVGjpuHvkLOs3oe7m6HlOfluSJ85BzLWBbvva93qkGmLg4ZAc8rPh2O+YIsBUHNLLMM/oQp Generated-by-Nova\n" } } ], "keypairs_links": [ { "href": "http://openstack.example.com/v2.1/6f70656e737461636b20342065766572/keypairs?limit=1&marker=keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3", "rel": "next" } ] } ** Affects: nova Importance: Low Status: New ** Changed in: nova Importance: Undecided => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1866373 Title: URLS in os-keypairs 'links' body are incorrect Status in OpenStack Compute (nova): New Bug description: Similar to https://bugs.launchpad.net/nova/+bug/1864428, the URLs in the 'links' element of the response are incorrect. They read '/keypairs', not '/os-keypairs'. 
From the current api-ref (2020-03-06): { "keypairs": [ { "keypair": { "fingerprint": "7e:eb:ab:24:ba:d1:e1:88:ae:9a:fb:66:53:df:d3:bd", "name": "keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3", "type": "ssh", "public_key": "ssh-rsa B3NzaC1yc2EDAQABAAABAQCkF3MX59OrlBs3dH5CU7lNmvpbrgZxSpyGjlnE8Flkirnc/Up22lpjznoxqeoTAwTW034k7Dz6aYIrZGmQwe2TkE084yqvlj45Dkyoj95fW/sZacm0cZNuL69EObEGHdprfGJQajrpz22NQoCD8TFB8Wv+8om9NH9Le6s+WPe98WC77KLw8qgfQsbIey+JawPWl4O67ZdL5xrypuRjfIPWjgy/VH85IXg/Z/GONZ2nxHgSShMkwqSFECAC5L3PHB+0+/12M/iikdatFSVGjpuHvkLOs3oe7m6HlOfluSJ85BzLWBbvva93qkGmLg4ZAc8rPh2O+YIsBUHNLLMM/oQp Generated-by-Nova\n" } } ], "keypairs_links": [ { "href": "http://openstack.example.com/v2.1/6f70656e737461636b20342065766572/keypairs?limit=1&marker=keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3", "rel": "next" } ] } To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1866373/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1533087] Re: there is useless 'u' in the wrong info when execute a wrong nova command
This should not be an issue with Python 3, which is all we support now. Closing as a result. ** Changed in: python-novaclient Status: In Progress => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1533087 Title: there is useless 'u' in the wrong info when execute a wrong nova command Status in OpenStack Compute (nova): Invalid Status in python-novaclient: Won't Fix Bug description: [Summary] there is useless 'u' in the wrong info when execute a wrong nova command [Topo] devstack all-in-one node [Description and expect result] no useless 'u' in the wrong info when execute a wrong nova command [Reproduceable or not] reproduceable [Recreate Steps] 1) there is useless 'u' in the wrong info when execute a wrong nova command: root@45-59:/opt/stack/devstack# nova wrongcmd usage: nova [--version] [--debug] [--os-cache] [--timings] [--os-region-name ] [--service-type ] [--service-name ] [--volume-service-name ] [--os-endpoint-type ] [--os-compute-api-version ] [--bypass-url ] [--insecure] [--os-cacert ] [--os-cert ] [--os-key ] [--timeout ] [--os-auth-type ] [--os-auth-url OS_AUTH_URL] [--os-domain-id OS_DOMAIN_ID] [--os-domain-name OS_DOMAIN_NAME] [--os-project-id OS_PROJECT_ID] [--os-project-name OS_PROJECT_NAME] [--os-project-domain-id OS_PROJECT_DOMAIN_ID] [--os-project-domain-name OS_PROJECT_DOMAIN_NAME] [--os-trust-id OS_TRUST_ID] [--os-default-domain-id OS_DEFAULT_DOMAIN_ID] [--os-default-domain-name OS_DEFAULT_DOMAIN_NAME] [--os-user-id OS_USER_ID] [--os-user-name OS_USERNAME] [--os-user-domain-id OS_USER_DOMAIN_ID] [--os-user-domain-name OS_USER_DOMAIN_NAME] [--os-password OS_PASSWORD] ... error: argument : invalid choice: u'wrongcmd' ISSUE Try 'nova help ' for more information. 
root@45-59:/opt/stack/devstack# 2)below is a correct example for reference: root@45-59:/opt/stack/devstack# keystone wrongcmd usage: keystone [--version] [--debug] [--os-username ] [--os-password ] [--os-tenant-name ] [--os-tenant-id ] [--os-auth-url ] [--os-region-name ] [--os-identity-api-version ] [--os-token ] [--os-endpoint ] [--os-cache] [--force-new-token] [--stale-duration ] [--insecure] [--os-cacert ] [--os-cert ] [--os-key ] [--timeout ] ... keystone: error: argument : invalid choice: 'wrongcmd' [Configration] reproduceable bug, no need [logs] reproduceable bug, no need [Root cause anlyze or debug inf] reproduceable bug [Attachment] None To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1533087/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
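The stray 'u' was an artifact of Python 2's unicode repr leaking into argparse's error message; under Python 3, which is all that is supported now, it cannot occur:

```python
# Python 3 has a single str type, so repr() carries no 'u' prefix.
# This is the repr that argparse embeds in its "invalid choice" error.
message = "invalid choice: %r" % 'wrongcmd'
print(message)  # invalid choice: 'wrongcmd'
```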
[Yahoo-eng-team] [Bug 1860021] Re: nova-live-migration fails 100% with "mysql: command not found" on subnode
Marking as invalid for nova since the change needed was in DevStack, not nova. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1860021 Title: nova-live-migration fails 100% with "mysql: command not found" on subnode Status in devstack: Fix Released Status in OpenStack Compute (nova): Invalid Bug description: Since [1] nova-live-migration failures can be seen in devstack- subnodes-early.txt.gz like + ./stack.sh:main:1158 : is_glance_enabled + lib/glance:is_glance_enabled:90 : [[ , =~ ,glance ]] + lib/glance:is_glance_enabled:91 : [[ ,c-bak,c-vol,dstat,g-api,n-cpu,peakmem_tracker,placement-client,q-agt =~ ,g- ]] + lib/glance:is_glance_enabled:91 : return 0 + ./stack.sh:main:1159 : echo_summary 'Configuring Glance' + ./stack.sh:echo_summary:452 : [[ -t 3 ]] + ./stack.sh:echo_summary:458 : echo -e Configuring Glance + ./stack.sh:main:1160 : init_glance + lib/glance:init_glance:276 : rm -rf /opt/stack/data/glance/images + lib/glance:init_glance:277 : mkdir -p /opt/stack/data/glance/images + lib/glance:init_glance:280 : recreate_database glance + lib/database:recreate_database:110 : local db=glance + lib/database:recreate_database:111 : recreate_database_mysql glance + lib/databases/mysql:recreate_database_mysql:63 : local db=glance + lib/databases/mysql:recreate_database_mysql:64 : mysql -uroot -psecretmysql -h127.0.0.1 -e 'DROP DATABASE IF EXISTS glance;' /opt/stack/new/devstack/lib/databases/mysql: line 64: mysql: command not found + lib/databases/mysql:recreate_database_mysql:1 : exit_trap [1] https://review.opendev.org/#/c/702707/ To manage notifications about this bug go to: https://bugs.launchpad.net/devstack/+bug/1860021/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : 
https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1860417] [NEW] Use of randint in functional tests is racey
Public bug reported: In change I475ea0fa5f2d5b197118f0ced5a0ff6907411972, we switched to using 'random.randint' to generate flavor.id values in functional tests. This has proven racey, as seen at [1]. [1] https://zuul.opendev.org/t/openstack/build/c308dab9bd2d43d0b40cf999a34af0f7/console ** Affects: nova Importance: Undecided Assignee: Stephen Finucane (stephenfinucane) Status: In Progress ** Tags: testing ** Tags added: testing ** Changed in: nova Status: New => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1860417 Title: Use of randint in functional tests is racey Status in OpenStack Compute (nova): In Progress Bug description: In change I475ea0fa5f2d5b197118f0ced5a0ff6907411972, we switched to using 'random.randint' to generate flavor.id values in functional tests. This has proven racey, as seen at [1]. [1] https://zuul.opendev.org/t/openstack/build/c308dab9bd2d43d0b40cf999a34af0f7/console To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1860417/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
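To illustrate why randint-based ids are racey: tests drawing ids independently from a small range collide with non-trivial probability (the birthday problem), whereas UUID-based ids are effectively collision-free. The helper names below are illustrative, not nova's actual test code:

```python
import random
import uuid

def racy_flavor_id():
    # Two concurrently-running tests can draw the same value here.
    return str(random.randint(1, 1000))

def safe_flavor_id():
    # 122 bits of randomness make a repeated id effectively impossible.
    return str(uuid.uuid4())

ids = [safe_flavor_id() for _ in range(1000)]
assert len(set(ids)) == 1000  # no collisions across many draws
```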
[Yahoo-eng-team] [Bug 1858091] Re: Nova compute api v2.1/servers in train
The main change I can see in stable/train is the inclusion of API microversion 2.75, which made the API stricter and means you will now get a "400 error response for an unknown parameter in the querystring or request body" [1]. This is correct behavior from nova's perspective, and it's Rancher that needs to be fixed. You can get more information by looking at the body of the 4xx responses and the logs of the nova-api services. [1] https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id68 ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1858091 Title: Nova compute api v2.1/servers in train Status in kolla-ansible: New Status in OpenStack Compute (nova): Invalid Bug description: **Environment**: * OS (e.g. from /etc/os-release): Ubuntu * Kernel (e.g. `uname -a`): Linux host 4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux * Docker version if applicable (e.g. `docker version`): 19.03.2 * Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): 9.0.0 * Docker image Install type (source/binary): source * Docker image distribution: train * Are you using official images from Docker Hub or self built? official * If self built - Kolla version and environment used to build: * Share your inventory file, globals.yml and other configuration files if relevant - I have updated kolla-ansible (to 9.0.0) and openstack images (to train) recently. Thus, I was using the Rancher node driver to provision openstack instances and use them to deploy a k8s cluster. With Stein everything was working smoothly. 
However, after I updated to Train version, Rancher started getting 400-403 error codes: ``` Error creating machine: Error in driver during machine creation: Expected HTTP response code [200] when accessing [POST http://10.0.225.254:8774/v2.1/os-keypairs], but got 403 instead or Error creating machine: Error in driver during machine creation: Expected HTTP response code [200] when accessing [POST http://10.0.225.254:8774/v2.1/servers], but got 400 instead ``` Thus, I am wondering if anything was changed to nova compute api's in Train version and what action can be done in order to fix that issue? I have reported that bug on Rancher github as well: https://github.com/rancher/rancher/issues/24813 cause I am not sure if its fully openstack-version related issue. Regards To manage notifications about this bug go to: https://bugs.launchpad.net/kolla-ansible/+bug/1858091/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1852727] [NEW] PCI passthrough documentation does not describe the steps necessary to passthrough PFs
Public bug reported: This came up on IRC [1]. By default, nova will not allow you to use PF devices unless you specifically request this type of device. This is intentional behavior to allow users to whitelist all devices from a particular vendor and avoid passing through the PF device when they meant to only consume the VFs. In the future, we might want to prevent whitelisting of both PF and VFs, but for now we should document the current behavior. [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova /%23openstack-nova.2019-11-15.log.html#t2019-11-15T08:39:17 ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1852727 Title: PCI passthrough documentation does not describe the steps necessary to passthrough PFs Status in OpenStack Compute (nova): New Bug description: This came up on IRC [1]. By default, nova will not allow you to use PF devices unless you specifically request this type of device. This is intentional behavior to allow users to whitelist all devices from a particular vendor and avoid passing through the PF device when they meant to only consume the VFs. In the future, we might want to prevent whitelisting of both PF and VFs, but for now we should document the current behavior. [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova /%23openstack-nova.2019-11-15.log.html#t2019-11-15T08:39:17 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1852727/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
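A hedged sketch of what such documentation might show: whitelisting a device in nova.conf and then explicitly requesting the PF via an alias whose device_type names it. The vendor/product IDs and alias name here are illustrative only:

```ini
[pci]
# Expose the device to nova (IDs are examples, not a recommendation).
passthrough_whitelist = { "vendor_id": "8086", "product_id": "154d" }
# A PF is only handed out when an alias explicitly requests type-PF.
alias = { "vendor_id": "8086", "product_id": "154d", "device_type": "type-PF", "name": "example-pf" }
```

A flavor can then request the PF with --property "pci_passthrough:alias"="example-pf:1"; without the type-PF alias, whitelisted PFs are deliberately not allocated.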
[Yahoo-eng-team] [Bug 1847229] Re: Boot from volume docs don't work
Looks like mriedem tackled this already in commit 16027094ebabc5cd9f2e766431f18aadeff54a40. Excellent. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1847229 Title: Boot from volume docs don't work Status in OpenStack Compute (nova): Invalid Bug description: This bug tracker is for errors with the documentation, use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes: - [x] This doc is inaccurate in this way: The commands referenced don't work. Looks like a dodgy translation from novaclient to osc. - [ ] This is a doc addition request. - [ ] I have a fix to the document that I can paste below including example: input and output. If you have a troubleshooting or support issue, use the following resources: - Ask OpenStack: http://ask.openstack.org - The mailing list: http://lists.openstack.org - IRC: 'openstack' channel on Freenode --- Release: on 2019-02-21 00:29:11 SHA: 19b757a4ba52363607965900fe74533bb2db92a7 Source: https://opendev.org/openstack/nova/src/doc/source/user/launch-instance-from-volume.rst URL: https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1847229/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1847229] [NEW] Boot from volume docs don't work
Public bug reported: This bug tracker is for errors with the documentation, use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes: - [x] This doc is inaccurate in this way: The commands referenced don't work. Looks like a dodgy translation from novaclient to osc. - [ ] This is a doc addition request. - [ ] I have a fix to the document that I can paste below including example: input and output. If you have a troubleshooting or support issue, use the following resources: - Ask OpenStack: http://ask.openstack.org - The mailing list: http://lists.openstack.org - IRC: 'openstack' channel on Freenode --- Release: on 2019-02-21 00:29:11 SHA: 19b757a4ba52363607965900fe74533bb2db92a7 Source: https://opendev.org/openstack/nova/src/doc/source/user/launch-instance-from-volume.rst URL: https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html ** Affects: nova Importance: Undecided Status: Invalid ** Tags: doc -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1847229 Title: Boot from volume docs don't work Status in OpenStack Compute (nova): Invalid Bug description: This bug tracker is for errors with the documentation, use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes: - [x] This doc is inaccurate in this way: The commands referenced don't work. Looks like a dodgy translation from novaclient to osc. - [ ] This is a doc addition request. - [ ] I have a fix to the document that I can paste below including example: input and output. 
If you have a troubleshooting or support issue, use the following resources: - Ask OpenStack: http://ask.openstack.org - The mailing list: http://lists.openstack.org - IRC: 'openstack' channel on Freenode --- Release: on 2019-02-21 00:29:11 SHA: 19b757a4ba52363607965900fe74533bb2db92a7 Source: https://opendev.org/openstack/nova/src/doc/source/user/launch-instance-from-volume.rst URL: https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1847229/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1731865] Re: Use defusedxml function instead of lxml.etree.parse
As noted in the review, this isn't necessarily a huge issue and I'm not sure it's worth investing time on ** Changed in: nova Status: In Progress => Won't Fix ** Changed in: nova Assignee: Spencer Yu (yushb) => (unassigned) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1731865 Title: Use defusedxml function instead of lxml.etree.parse Status in OpenStack Compute (nova): Won't Fix Bug description: Due to https://docs.openstack.org/bandit/latest/blacklists/blacklist_calls.html#b313-b320-xml, we should use defusedxml function instead of lxml.etree.parse to prevent XML attacks. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1731865/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1781558] Re: Change default video model from cirrus to vga
I'm having a hard time understanding why we should change behavior here. There was a bug in QEMU, that bug has been fixed for over two years, and the fix should be present in QEMU packaged by any self-respecting distro. I'm going to mark this as wontfix. If you disagree, let me know. ** Changed in: nova Status: In Progress => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1781558 Title: Change default video model from cirrus to vga Status in OpenStack Compute (nova): Won't Fix Bug description: Change default video model from cirrus to vga. Because of a bug in QEMU [1], using the cirrus video model is dangerous. To fix this problem, change the default video model, and disable cirrus forever. [1]: CVE-2017-2615 https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00015.html To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1781558/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1843836] [NEW] Failure to schedule if flavor contains non-CPU flag traits
Public bug reported: I'm seeing the following error locally: Sep 12 18:52:25 compute-small nova-conductor[28968]: ERROR nova.scheduler.utils [None req-b86b25c8-c89e-4449-bec3-c94948402e02 demo admin] [instance: a4056430-ed06-4cea-91b9-e15fd4b1979f] Error from last host: compute-small (node compute-small): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2038, in _do_build_and_run_instance\nfilter_properties, request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2408, in _build_and_run_instance\ninstance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance a4056430-ed06-4cea-91b9-e15fd4b1979f was re-scheduled: No CPU model match traits, models: ['IvyBridge-IBRS'], required flags: set([None])\n"] This is affecting me when testing the 'PCPU' feature because we're rewriting the 'hw:cpu_thread_policy' to add a 'HW_CPU_HYPERTHREADING' trait; however, this can happen with any non-CPU flag trait (e.g. COMPUTE_SUPPORTS_MULTIATTACH) because of the following code: https://github.com/openstack/nova/blob/7a18209a8/nova/virt/libvirt/utils.py#L600 That means we can return a set containing 'None', which causes this later check to fail: https://github.com/openstack/nova/blob/7a18209a81539217a95ab7daad6bc67002768950/nova/virt/libvirt/driver.py#L4083 Since no CPU model will report a 'None' feature flag. 
** Affects: nova Importance: Undecided Assignee: Stephen Finucane (stephenfinucane) Status: In Progress ** Tags: libvirt ** Description changed: I'm seeing the following error locally: Sep 12 18:52:25 compute-small nova-conductor[28968]: ERROR nova.scheduler.utils [None req-b86b25c8-c89e-4449-bec3-c94948402e02 demo admin] [instance: a4056430-ed06-4cea-91b9-e15fd4b1979f] Error from last host: compute-small (node compute-small): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2038, in _do_build_and_run_instance\nfilter_properties, request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2408, in _build_and_run_instance\ninstance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance a4056430-ed06-4cea-91b9-e15fd4b1979f was re-scheduled: No CPU model match traits, models: ['IvyBridge-IBRS'], required flags: set([None])\n"] This is affecting me when testing the 'PCPU' feature because we're rewriting the 'hw:cpu_thread_policy' to add a 'HW_CPU_HYPERTHREADING' - trait, however, this can happen with any non-CPU flag trait because of - the following code: + trait, however, this can happen with any non-CPU flag trait (e.g. + COMPUTE_SUPPORTS_MULTIATTACH) because of the following code: https://github.com/openstack/nova/blob/7a18209a8/nova/virt/libvirt/utils.py#L600 That will mean we can return a set contains 'None', which causes this later check to fail: https://github.com/openstack/nova/blob/7a18209a81539217a95ab7daad6bc67002768950/nova/virt/libvirt/driver.py#L4083 Since no CPU model will report a 'None' feature flag. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1843836 Title: Failure to schedule if flavor contains non-CPU flag traits Status in OpenStack Compute (nova): In Progress Bug description: I'm seeing the following error locally: Sep 12 18:52:25 compute-small nova-conductor[28968]: ERROR nova.scheduler.utils [None req-b86b25c8-c89e-4449-bec3-c94948402e02 demo admin] [instance: a4056430-ed06-4cea-91b9-e15fd4b1979f] Error from last host: compute-small (node compute-small): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2038, in _do_build_and_run_instance\nfilter_properties, request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2408, in _build_and_run_instance\ninstance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance a4056430-ed06-4cea-91b9-e15fd4b1979f was re-scheduled: No CPU model match traits, models: ['IvyBridge-IBRS'], required flags: set([None])\n"] This is affecting me when testing the 'PCPU' feature because we're rewriting the 'hw:cpu_thread_policy' to add a 'HW_CPU_HYPERTHREADING' trait, however, this can happen with any non-CPU flag trait (e.g. COMPUTE_SUPPORTS_MULTIATTACH) because of the following code: https://github.com/openstack/nova/blob/7a18209a8/nova/virt/libvirt/utils.py#L600 That will mean we can return a set contains 'None', which causes this later check to fail: https://github.com/openstack/nova/blob/7a18209a81539217a95ab7daad6bc67002768950/nova/virt/libvirt/driver.py#L4083 Since no CPU mod
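The failure mode described above can be shown in a few lines (a toy mapping, not nova's actual CPU-flag table in libvirt/utils.py):

```python
# Hypothetical trait -> CPU feature flag table; real nova derives this from
# libvirt's CPU feature names, but the lookup shape is the same.
TRAITS_CPU_MAPPING = {
    'HW_CPU_X86_AVX': 'avx',
    'HW_CPU_X86_AESNI': 'aes',
}

def get_cpu_flags_buggy(traits):
    # dict.get() yields None for traits that are not CPU flags at all, so a
    # trait like COMPUTE_SUPPORTS_MULTIATTACH pollutes the set with None.
    return {TRAITS_CPU_MAPPING.get(t) for t in traits}

def get_cpu_flags_fixed(traits):
    # Only keep traits that actually correspond to a CPU feature flag.
    return {TRAITS_CPU_MAPPING[t] for t in traits if t in TRAITS_CPU_MAPPING}
```

The later driver check looks for each returned flag in the CPU model's feature list; None can never match a real feature, so every model is rejected and the build is rescheduled.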
[Yahoo-eng-team] [Bug 1843714] Re: nova-status documentation not in the list of man-pages
** Changed in: nova Status: Won't Fix => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1843714 Title: nova-status documentation not in the list of man-pages Status in OpenStack Compute (nova): Confirmed Bug description: When running "sphinx-build -b man doc/source doc/build/man", the nova-status man page is not built. It's missing from the man_pages list in doc/source/conf.py To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1843714/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
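The fix is a one-tuple addition to doc/source/conf.py. The tuple shape below follows Sphinx's man_pages convention of (source document, name, description, authors, section); the exact source path and description text used by nova are assumptions here.

```python
# Sphinx builds one man page per entry in this list; nova-status is simply
# missing an entry like the following (path and description illustrative).
man_pages = [
    ('cli/nova-status', 'nova-status',
     'CLI interface for nova status checks', ['OpenStack'], 1),
]
```

With an entry present, "sphinx-build -b man doc/source doc/build/man" emits nova-status.1 alongside the other man pages.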
[Yahoo-eng-team] [Bug 1843714] Re: nova-status documentation not in the list of man-pages
** Changed in: nova Status: In Progress => Won't Fix ** Changed in: nova Importance: Undecided => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1843714 Title: nova-status documentation not in the list of man-pages Status in OpenStack Compute (nova): Confirmed Bug description: When running "sphinx-build -b man doc/source doc/build/man", the nova-status man page is not built. It's missing from the man_pages list in doc/source/conf.py To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1843714/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1840807] [NEW] nova-manage man page refers to non-existent option
Public bug reported: The 'nova-manage db sync' command used to take a '--version' option but this was deprecated some time ago and recently removed. However, the man page for this command still references the old option: https://docs.openstack.org/nova/rocky/cli/nova-manage.html#nova-database ** Affects: nova Importance: Low Assignee: Stephen Finucane (stephenfinucane) Status: In Progress ** Tags: doc ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Low ** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1840807 Title: nova-manage man page refers to non-existent option Status in OpenStack Compute (nova): In Progress Bug description: The 'nova-manage db sync' command used to take a '--version' option but this was deprecated some time ago and recently removed. However, the man page for this command still references the old option: https://docs.openstack.org/nova/rocky/cli/nova-manage.html#nova- database To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1840807/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1831771] [NEW] UnexpectedDeletingTaskStateError exception can leave traces of VIFs on host
Public bug reported: This was originally reported in Bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=1668159 The 'UnexpectedDeletingTaskStateError' exception can be raised by something like aborting a large heat stack, where the instance hasn't finished setting up before the stack is aborted and the instances deleted. https://github.com/openstack/nova/blob/19.0.0/nova/db/sqlalchemy/api.py#L2864 We handle this in the compute manager and as part of that handling, we clean up the resource tracking of network interfaces. https://github.com/openstack/nova/blob/19.0.0/nova/compute/manager.py#L2034-L2040 However, we don't unplug these interfaces. This can result in things being left over on the host. We should attempt to unplug VIFs as part of this cleanup. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1831771 Title: UnexpectedDeletingTaskStateError exception can leave traces of VIFs on host Status in OpenStack Compute (nova): New Bug description: This was originally reported in Bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=1668159 The 'UnexpectedDeletingTaskStateError' exception can be raised by something like aborting a large heat stack, where the instance hasn't finished setting up before the stack is aborted and the instances deleted. https://github.com/openstack/nova/blob/19.0.0/nova/db/sqlalchemy/api.py#L2864 We handle this in the compute manager and as part of that handling, we clean up the resource tracking of network interfaces. https://github.com/openstack/nova/blob/19.0.0/nova/compute/manager.py#L2034-L2040 However, we don't unplug these interfaces. This can result in things being left over on the host. We should attempt to unplug VIFs as part of this cleanup. 
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1831771/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
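A sketch of the proposed fix (the function and driver names here are illustrative, not nova's actual API): before dropping the resource tracking for the interfaces, make a best-effort attempt to unplug them.

```python
class FakeDriver:
    """Stand-in for a virt driver that records unplug calls."""
    def __init__(self):
        self.unplugged = []

    def unplug_vifs(self, instance, network_info):
        self.unplugged.extend(network_info)

def cleanup_after_aborted_build(driver, instance, network_info):
    # Best effort: a failure to unplug should not mask the original
    # UnexpectedDeletingTaskStateError handling.
    try:
        driver.unplug_vifs(instance, network_info)
    except Exception:
        pass
    # The interfaces are dropped from resource tracking either way.
    return []

driver = FakeDriver()
remaining = cleanup_after_aborted_build(driver, 'instance-1', ['vif-a', 'vif-b'])
```

The key point is ordering: unplug while the network_info is still known, then clear the tracking, so nothing is left behind on the host.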
[Yahoo-eng-team] [Bug 1831269] [NEW] Resize ignores mem_page_size in new flavor
Public bug reported: This was originally reported in Bugzilla. When attempting to resize an instance where the new flavor uses a different pagesize to the old flavor, the 'NUMATopologyFilter' evaluates hosts using the original pagesize value rather than the new one. Steps to Reproduce: 1. Create an instance with 2M hugepage size by setting flavor property: hw:mem_page_size=2048 2. Make sure every other compute node is configured with 1G huge pages 3. Create a new flavor with the property: hw:mem_page_size=1048576 4. Resize the instance as " openstack server resize --flavor " Expected results: Resize operation rebuilds the instance just like cold-migration. It should be able to apply all aspects of the new flavor. Actual results: Resize will fail with the error message: "Host does not support requested memory pagesize. Requested: 2048 kB _numa_fit_instance_cell /usr/lib/python2.7/site-packages/nova/virt/hardware.py:936" ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1831269 Title: Resize ignores mem_page_size in new flavor Status in OpenStack Compute (nova): New Bug description: This was originally reported in Bugzilla. When attempting to resize an instance where the new flavor uses a different pagesize to the old flavor, the 'NUMATopologyFilter' evaluates hosts using the original pagesize value rather than the new one. Steps to Reproduce: 1. Create an instance with 2M hugepage size by setting flavor property: hw:mem_page_size=2048 2. Make sure every other compute node is configured with 1G huge pages 3. Create a new flavor with the property: hw:mem_page_size=1048576 4. Resize the instance as " openstack server resize --flavor " Expected results: Resize operation rebuilds the instance just like cold-migration. It should be able to apply all aspects of the new flavor. 
Actual results: Resize will fail with the error message: "Host does not support requested memory pagesize. Requested: 2048 kB _numa_fit_instance_cell /usr/lib/python2.7/site-packages/nova/virt/hardware.py:936" To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1831269/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
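The expected behavior can be expressed as a toy check (not nova's real NUMATopologyFilter; page sizes in KiB): the filter should evaluate the page size from the new flavor, not the one the instance was originally booted with.

```python
HOST_SUPPORTED_KB = {1048576}  # host configured with 1 GiB pages only

def host_passes(requested_page_size_kb):
    # Simplified stand-in for the pagesize check in _numa_fit_instance_cell.
    return requested_page_size_kb in HOST_SUPPORTED_KB

old_flavor = {'hw:mem_page_size': 2048}     # 2 MiB
new_flavor = {'hw:mem_page_size': 1048576}  # 1 GiB

# Buggy behavior: checking the old flavor's value rejects a host that the
# new flavor fits perfectly well.
buggy_result = host_passes(old_flavor['hw:mem_page_size'])
# Expected behavior: the resize is evaluated against the new flavor.
expected_result = host_passes(new_flavor['hw:mem_page_size'])
```

This matches the error in the report: the filter requests 2048 kB (the old flavor's value) against a host that only offers 1 GiB pages.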
[Yahoo-eng-team] [Bug 1830926] [NEW] Links to reno are incorrect
Public bug reported: There are multiple links to reno in the "release notes" section of the contributor guide: https://docs.openstack.org/nova/stein/contributor/releasenotes.html These are versioned links but reno is unversioned. This results in broken links on stable branches like the one above. ** Affects: nova Importance: Undecided Assignee: Stephen Finucane (stephenfinucane) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1830926 Title: Links to reno are incorrect Status in OpenStack Compute (nova): In Progress Bug description: There are multiple links to reno in the "release notes" section of the contributor guide: https://docs.openstack.org/nova/stein/contributor/releasenotes.html These are versioned links but reno is unversioned. This results in broken links on stable branches like the one above. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1830926/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1744455] Re: New instance on compute fails
*** This bug is a duplicate of bug 1672041 *** https://bugs.launchpad.net/bugs/1672041 ** Project changed: nova-solver-scheduler => nova ** This bug has been marked a duplicate of bug 1672041 nova.scheduler.client.report 409 Conflict -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1744455 Title: New instance on compute fails Status in OpenStack Compute (nova): New Bug description: I am installing openstack Pike release. While trying to create the instance on compute node, I see the below errors in nova-schedular logs: 2018-01-20 16:17:31.864 1011 INFO nova.scheduler.host_manager [req-f0b60f13-637f-4856-a321-76914742652c - - - - -] Successfully synced instances from host 'compute'. 2018-01-20 16:18:28.287 1011 WARNING nova.scheduler.client.report [req-07c5ee94-dd71-4328-8f63-f24550f16e37 c8e5bcf05f67431ba5c89518238ef4d7 6a17e79098ab478fa728b4ace304d591 - default default] Unable to submit allocation for instance c9120f12-02b7-4515-ba9f-37faca050cc3 (409 409 Conflict 409 Conflict There was a conflict when trying to complete your request. Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider '5841aceb-452b-44b2-b96d-653c394a543c'. The requested amount would violate inventory constraints. ) 2018-01-20 16:18:28.919 1011 WARNING nova.scheduler.client.report [req-07c5ee94-dd71-4328-8f63-f24550f16e37 c8e5bcf05f67431ba5c89518238ef4d7 6a17e79098ab478fa728b4ace304d591 - default default] Unable to submit allocation for instance c9120f12-02b7-4515-ba9f-37faca050cc3 (409 409 Conflict 409 Conflict There was a conflict when trying to complete your request. Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider '5841aceb-452b-44b2-b96d-653c394a543c'. The requested amount would violate inventory constraints. ) While Checking the resource available in nova registered compute. 
I don't see the problem.

MariaDB [nova]> select id,vcpus,vcpus_used,hypervisor_type,host,uuid,memory_mb,memory_mb_used from compute_nodes;
+----+-------+------------+-----------------+---------+--------------------------------------+-----------+----------------+
| id | vcpus | vcpus_used | hypervisor_type | host    | uuid                                 | memory_mb | memory_mb_used |
+----+-------+------------+-----------------+---------+--------------------------------------+-----------+----------------+
|  1 |     1 |          0 | QEMU            | compute | 5841aceb-452b-44b2-b96d-653c394a543c |      3723 |            696 |
+----+-------+------------+-----------------+---------+--------------------------------------+-----------+----------------+
1 row in set (0.00 sec)

MariaDB [nova]>

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1744455/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
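The 409 in the log above is placement refusing an allocation that would exceed capacity. A simplified sketch of that constraint (the formula shape follows placement's usual capacity model; treat it as illustrative, not placement's actual code):

```python
# Simplified capacity check: an allocation is rejected when current usage
# plus the request would exceed (total - reserved) * allocation_ratio.
def can_allocate(total, reserved, allocation_ratio, used, requested):
    capacity = (total - reserved) * allocation_ratio
    return used + requested <= capacity
```

With vcpus=1 and a VCPU allocation ratio of 16.0, the host in the compute_nodes row could accept requests until 16 VCPUs are allocated; a conflict despite apparently free inventory usually points at stale or duplicated allocations against the resource provider rather than the inventory row itself.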
[Yahoo-eng-team] [Bug 1824813] [NEW] Unsetting '[DEFAULT] dhcp_domain' results in hostname corruption
Public bug reported: Unsetting '[DEFAULT] dhcp_domain' will result in the metadata service/config drive reporting an instance hostname of '${hostname}None' instead of '${hostname}'. This is clearly incorrect behavior. ** Affects: nova Importance: Low Assignee: Stephen Finucane (stephenfinucane) Status: In Progress ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Low ** Changed in: nova Assignee: (unassigned) => Stephen Finucane (stephenfinucane) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1824813 Title: Unsetting '[DEFAULT] dhcp_domain' results in hostname corruption Status in OpenStack Compute (nova): In Progress Bug description: Unsetting '[DEFAULT] dhcp_domain' will result in the metadata service/config drive reporting an instance hostname of '${hostname}None' instead of '${hostname}'. This is clearly incorrect behavior. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1824813/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
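The corruption can be reproduced with a trivial sketch (illustrative, not nova's exact metadata code): string interpolation happily turns a None domain into the literal text 'None'.

```python
# Illustrative sketch: naive interpolation appends the string 'None' when
# the dhcp_domain option is unset (i.e. None).
def make_hostname_buggy(hostname, dhcp_domain):
    return '%s%s' % (hostname, dhcp_domain)

# A guard on the unset case restores the expected behavior.
def make_hostname_fixed(hostname, dhcp_domain):
    if not dhcp_domain:
        return hostname
    return '%s%s' % (hostname, dhcp_domain)
```

The fix is simply to skip the domain suffix entirely when the option is unset, rather than interpolating whatever value is present.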
[Yahoo-eng-team] [Bug 1822355] [NEW] Incomplete stubbing of os-vif in libvirt functional tests
Public bug reported: If a functional test fails, we see the following in the logs: 2019-03-29 17:37:10,856 INFO [nova.compute.manager] Terminating instance 2019-03-29 17:37:10,859 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2/6f70656e737461636b20342065766572/servers/detail" status: 200 len: 1569 microversion: 2.1 time: 0.289134 2019-03-29 17:37:10,867 INFO [nova.virt.libvirt.driver] Instance destroyed successfully. 2019-03-29 17:37:10,869 ERROR [vif_plug_ovs.ovsdb.impl_vsctl] Unable to execute ['ovs-vsctl', '--timeout=120', '--oneline', '--format=json', '--db=tcp:127.0.0.1:6640', '--', '--if-exists', 'del-port', u'br-i nt', u'tap88dae9fa-0d']. Exception: You have attempted to start a privsep helper. This is not allowed in the gate, and indicates a failure to have mocked your tests. 2019-03-29 17:37:10,870 ERROR [os_vif] Failed to unplug vif VIFOpenVSwitch(active=True,address=00:0c:29:0d:11:74,bridge_name='br-int',has_traffic_filtering=False,id=88dae9fa-0dc6-49e3-8c29-3abc41e99ac9,netwo rk=Network(3cb9bc59-5699-4588-a4b1-b87f96708bc6),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap88dae9fa-0d') Traceback (most recent call last): File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/os_vif/__init__.py", line 110, in unplug plugin.unplug(vif, instance_info) File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/vif_plug_ovs/ovs.py", line 344, in unplug self._unplug_vif_generic(vif, instance_info) File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/vif_plug_ovs/ovs.py", line 318, in _unplug_vif_generic self.ovsdb.delete_ovs_vif_port(vif.network.bridge, vif.vif_name) File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/vif_plug_ovs/ovsdb/ovsdb_lib.py", line 90, in delete_ovs_vif_port linux_net.delete_net_dev(dev) File 
"/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 240, in _wrap self.start() File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 251, in start channel = daemon.RootwrapClientChannel(context=self) File "nova/tests/fixtures.py", line 2018, in __init__ raise Exception('You have attempted to start a privsep helper. ' Exception: You have attempted to start a privsep helper. This is not allowed in the gate, and indicates a failure to have mocked your tests. As that suggests, we have a problem we should rectify. ** Affects: nova Importance:
[Yahoo-eng-team] [Bug 1821733] [NEW] Failed to compute_task_build_instances: local variable 'sibling_set' referenced before assignment
Public bug reported: Reproduced from rhbz#1686511 (https://bugzilla.redhat.com/show_bug.cgi?id=1686511) When spawning an Openstack instance, this error is received: 2019-03-07 08:07:38.499 3124 WARNING nova.scheduler.utils [req-e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4 8b869a98a43e4fc48001e0ff6d149fe6 - - -] Failed to compute_task_build_instances: local variable 'sibling_set' referenced before assignment Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming res = self.dispatcher.dispatch(message) File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch return self._do_dispatch(endpoint, method, ctxt, args) File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch result = func(ctxt, **new_args) File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 199, in inner return func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 104, in select_destinations dests = self.driver.select_destinations(ctxt, spec_obj) File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 53, in select_destinations selected_hosts = self._schedule(context, spec_obj) File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 113, in _schedule spec_obj, index=num) File "/usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py", line 576, in get_filtered_hosts hosts, spec_obj, index) File "/usr/lib/python2.7/site-packages/nova/filters.py", line 89, in get_filtered_objects list_objs = list(objs) File "/usr/lib/python2.7/site-packages/nova/filters.py", line 44, in filter_all if self._filter_one(obj, spec_obj): File "/usr/lib/python2.7/site-packages/nova/scheduler/filters/__init__.py", line 44, in _filter_one return self.host_passes(obj, spec) File 
"/usr/lib/python2.7/site-packages/nova/scheduler/filters/numa_topology_filter.py", line 123, in host_passes pci_stats=host_state.pci_stats)) File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 1297, in numa_fit_instance_to_host host_cell, instance_cell, limits) File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 906, in _numa_fit_instance_cell host_cell, instance_cell) File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 854, in _numa_fit_instance_cell_with_pinning max(map(len, host_cell.siblings))) File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 805, in _pack_instance_onto_cores itertools.chain(*sibling_set))) UnboundLocalError: local variable 'sibling_set' referenced before assignment 2019-03-07 08:07:38.500 3124 WARNING nova.scheduler.utils [req- e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4 8b869a98a43e4fc48001e0ff6d149fe6 - - -] [instance: 5bca186a-5a36-4b0f- 8b7a-f2f3bc168b29] Setting instance to ERROR state. This issues appears to be because of: https://github.com/openstack/nova/blob/da9f9c962fe00dbfc9c8fe9c47e964816d67b773/nova/virt/hardware.py#L875 This works normally because of loop variables in Python are available outside of the scope of the loop: >>> for x in range(5): ... pass ... >>> print(x) 4 and because there's usually something in sibling_sets. However, this is presumably failing for this user because there are no free cores at all on the given host. This is likely the race condition between the nova- scheduler and nova-compute services. ** Affects: nova Importance: Undecided Status: New ** Description changed: - Reproduced from rhbz#1686511. 
+ Reproduced from rhbz#1686511 + (https://bugzilla.redhat.com/show_bug.cgi?id=1686511) When spawning an Openstack instance, this error is received: + 2019-03-07 08:07:38.499 3124 WARNING nova.scheduler.utils [req-e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4 8b869a98a43e4fc48001e0ff6d149fe6 - - -] Failed to compute_task_build_instances: local variable 'sibling_set' referenced before assignment + Traceback (most recent call last): - 2019-03-07 08:07:38.499 3124 WARNING nova.scheduler.utils [req-e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4 8b869a98a43e4fc48001e0ff6d149fe6 - - -] Failed to compute_task_build_instances: local variable 'sibling_set' referenced before assignment - Traceback (most recent call last): + File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming + res = self.dispatcher.dispatch(message) - File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py",
[Yahoo-eng-team] [Bug 1815591] [NEW] Out-of-date configuration options and no cross-referencing in scheduler filter guide
Public bug reported: We document all the filter schedulers in [1]. Most of these take some kind of configuration options and document this. However, there is no cross-referencing between these. This lack of cross-referencing also tends to lead to outdated docs as options get moved around, and I suspect there are at least a few typos in here. We should address this here at least. [1] https://docs.openstack.org/nova/rocky/user/filter-scheduler.html ** Affects: nova Importance: Low Assignee: Alexandra Settle (alexandra-settle) Status: New ** Tags: doc ** Changed in: nova Importance: Undecided => Low ** Tags added: doc -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1815591 Title: Out-of-date configuration options and no cross-referencing in scheduler filter guide Status in OpenStack Compute (nova): New Bug description: We document all the filter schedulers in [1]. Most of these take some kind of configuration options and document this. However, there is no cross-referencing between these. This lack of cross-referencing also tends to lead to outdated docs as options get moved around, and I suspect there are at least a few typos in here. We should address this here at least. [1] https://docs.openstack.org/nova/rocky/user/filter-scheduler.html To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1815591/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1814882] [NEW] Bandwidth limits specified in flavors are not applied to generic vHost User interfaces
Public bug reported: Libvirt supports setting bandwidth limits for various VIF types. https://github.com/openstack/nova/blob/bcfd2439bab7cfad942d7e6a187df6edb1d1bf09/nova/virt/libvirt/vif.py#L576 This is supported by pretty much all VIF types including vHost User interfaces defined by os-vif. However, generic vHost user interfaces do not set this field. This is a mistake and should be corrected. ** Affects: nova Importance: Undecided Status: New ** Description changed: Libvirt supports setting bandwidth limits for various VIF types. https://github.com/openstack/nova/blob/bcfd2439bab7cfad942d7e6a187df6edb1d1bf09/nova/virt/libvirt/vif.py#L576 - This is supported by pretty much all VIF types except one: generic vHost - user interfaces. This is a mistake and should be corrected. + This is supported by pretty much all VIF types including vHost User + interfaces defined by os-vif. However, generic vHost user interfaces do + not set this field. This is a mistake and should be corrected. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1814882 Title: Bandwidth limits specified in flavors are not applied to generic vHost User interfaces Status in OpenStack Compute (nova): New Bug description: Libvirt supports setting bandwidth limits for various VIF types. https://github.com/openstack/nova/blob/bcfd2439bab7cfad942d7e6a187df6edb1d1bf09/nova/virt/libvirt/vif.py#L576 This is supported by pretty much all VIF types including vHost User interfaces defined by os-vif. However, generic vHost user interfaces do not set this field. This is a mistake and should be corrected. 
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1814882/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
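For reference, the flavor bandwidth limits in question are rendered by nova as libvirt's per-interface bandwidth element; the generic vhost-user path simply never emits it. An illustrative guest XML fragment (the values are examples, not taken from the report, and whether the hypervisor enforces them for a given interface type depends on the backend):

```xml
<interface type='vhostuser'>
  <!-- average/peak are in kilobytes per second, burst in kilobytes -->
  <bandwidth>
    <inbound average='1000' peak='2000' burst='512'/>
    <outbound average='1000' peak='2000' burst='512'/>
  </bandwidth>
</interface>
```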
[Yahoo-eng-team] [Bug 1811886] [NEW] Overcommit allowed for pinned instances when using hugepages
Public bug reported: When working on a fix for bug 1810977, it was noted that the check to ensure pinned instances do not overcommit was not pagesize aware. This means if an instance without hugepages boots on a host with a large number of hugepages allocated, it may not get all of the memory allocated to it. The solution seems to be to make the check pagesize aware. Test cases to prove this is the case are provided below.

---

# Host information

The memory capacity (and some other stuff) for our node:

$ virsh capabilities | xmllint --xpath '/capabilities/host/topology/cells' -
<cells num='2'>
  <cell id='0'>
    <memory unit='KiB'>16298528</memory>
    <pages unit='KiB' size='4'>3075208</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>
  <cell id='1'>
    <memory unit='KiB'>16512884</memory>
    <pages unit='KiB' size='4'>3128797</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>
</cells>

Clearly there are not 3075208 and 3128797 4k pages on NUMA nodes 0 and 1, respectively, since, for NUMA node 0, (3075208 * 4) + (4000 * 2048) != 16298528. We use [1] to resolve this. Instead we have 16298528 - (4000 * 2048) = 8106528 KiB memory (or 7.73 GiB) for NUMA cell 0 and something similar for cell 1.

To make things easier, cell 1 is totally disabled by adding the following to 'nova-cpu.conf':

[DEFAULT]
vcpu_pin_set = 0-5,12-17

[1] https://review.openstack.org/631038

For all test cases I create the flavor then try to create two servers with the same flavor.

# Test A, unpinned, implicit small pages, oversubscribed

This should work because we're not using a specific page size.

$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=1
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2

Expect: SUCCESS
Actual: SUCCESS

# Test B, unpinned, explicit small pages, oversubscribed

This should fail because we are requesting a specific page size, though that size is small pages (4k).
$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=1
$ openstack flavor set test.numa --property hw:mem_page_size=small
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2

Expect: FAILURE
Actual: FAILURE

# Test C, pinned, implicit small pages, oversubscribed

This should fail because we don't allow oversubscription with CPU pinning.

$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.pinned
$ openstack flavor set test.pinned --property hw:cpu_policy=dedicated
$ openstack server create --flavor test.pinned --image cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.pinned --image cirros-0.3.6-x86_64-disk --wait test2

Expect: FAILURE
Actual: SUCCESS

Interestingly, this fails on the third VM. This is likely because the total memory for that cell, 16298528 KiB, is sufficient to handle two instances but not three.

# Test D, pinned, explicit small pages, oversubscribed

This should fail because we don't allow oversubscription with CPU pinning.

$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.pinned
$ openstack flavor set test.pinned --property hw:cpu_policy=dedicated
$ openstack flavor set test.pinned --property hw:mem_page_size=small
$ openstack server create --flavor test.pinned --image cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.pinned --image cirros-0.3.6-x86_64-disk --wait test2

Expect: FAILURE
Actual: FAILURE

** Affects: nova Importance: Undecided Assignee: Stephen Finucane (stephenfinucane) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1811886 Title: Overcommit allowed for pinned instances when using hugepages Status in OpenStack Compute (nova): In Progress Bug description: When working on a fix for bug 1810977, it was noted that the check to ensure pinned instances do not overcommit was not pagesize aware. This means if an instance without hugepages boots on a host with a large number of hugepages allocated, it may not get all of the memory allocated to it. The solution seems to be to make the check pagesize aware. Test cases to prove this is the case are provided below.

---

# Host information

The memory capacity (and some other stuff) for our node:

$ virsh capabilities | xmllint --xpath '/capabilities/host/topology/cells' -
<cells num='2'>
  <cell id='0'>
    <memory unit='KiB'>16298528</memory>
    <pages unit='KiB' size='4'>3075208</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>
  <cell id='1'>
    <memory unit='KiB'>16512884</memory>
    <pages unit='KiB' size='4'>3128797</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>
</cells>

Clearly there are not 3075208 and 3128797 4k pages on NUMA nodes 0 and 1
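The pagesize-aware check the report argues for amounts to plain per-page-size accounting: a request against a given page size must fit within the free pages of that size, with no oversubscription. A simplified illustration (not nova's implementation):

```python
def fits_without_overcommit(request_kib, page_size_kib, free_pages):
    """free_pages maps page size (KiB) -> free page count on the NUMA cell."""
    pages_needed = -(-request_kib // page_size_kib)  # ceiling division
    return pages_needed <= free_pages.get(page_size_kib, 0)

# Cell 0 from the report: 16298528 KiB total with 4000 huge pages reserved,
# so only 16298528 - 4000 * 2048 = 8106528 KiB is backed by 4 KiB pages.
free = {4: 8106528 // 4, 2048: 4000}

assert fits_without_overcommit(7168 * 1024, 4, free)      # first 7 GiB guest fits
free[4] -= (7168 * 1024) // 4                             # claim its pages
assert not fits_without_overcommit(7168 * 1024, 4, free)  # second one must not
```

Under this accounting the second pinned guest in tests C and D is correctly rejected, whereas a check against the raw cell total (16298528 KiB) would admit two guests.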
[Yahoo-eng-team] [Bug 1811870] [NEW] libvirt reporting incorrect value of 4k (small) pages
Public bug reported: libvirt < 4.3.0 had an issue whereby assigning more than 4 GB of huge pages would result in an incorrect value for the number of 4k (small) pages. This was tracked and fixed via rhbz#1569678 and the fixes appear to have been backported to the libvirt versions for RHEL 7.4+. However, this is still an issue with the versions of libvirt available on Ubuntu 16.04, 18.04 and who knows what else. We should either alert the user that the bug exists or, better again, work around the issue using the rest of the (correct) values for different page sizes.

# Incorrect value (Ubuntu 16.04, libvirt 4.0.0)

$ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
<cell id='0'>
  <memory unit='KiB'>16298528</memory>
  <pages unit='KiB' size='4'>3075208</pages>
  <pages unit='KiB' size='2048'>4000</pages>
  <pages unit='KiB' size='1048576'>0</pages>
  ...

(3075208 * 4) + (4000 * 2048) != 16298528

# Correct values (Fedora ??, libvirt 4.10)

$ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
<cell id='0'>
  <memory unit='KiB'>32359908</memory>
  <pages unit='KiB' size='4'>8038777</pages>
  <pages unit='KiB' size='2048'>100</pages>
  <pages unit='KiB' size='1048576'>0</pages>
  ...

(8038777 * 4) + (100 * 2048) == 32359908

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1569678

** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1811870 Title: libvirt reporting incorrect value of 4k (small) pages Status in OpenStack Compute (nova): New Bug description: libvirt < 4.3.0 had an issue whereby assigning more than 4 GB of huge pages would result in an incorrect value for the number of 4k (small) pages. This was tracked and fixed via rhbz#1569678 and the fixes appear to have been backported to the libvirt versions for RHEL 7.4+. However, this is still an issue with the versions of libvirt available on Ubuntu 16.04, 18.04 and who knows what else. We should either alert the user that the bug exists or, better again, work around the issue using the rest of the (correct) values for different page sizes.
# Incorrect value (Ubuntu 16.04, libvirt 4.0.0)

$ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
<cell id='0'>
  <memory unit='KiB'>16298528</memory>
  <pages unit='KiB' size='4'>3075208</pages>
  <pages unit='KiB' size='2048'>4000</pages>
  <pages unit='KiB' size='1048576'>0</pages>
  ...

(3075208 * 4) + (4000 * 2048) != 16298528

# Correct values (Fedora ??, libvirt 4.10)

$ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
<cell id='0'>
  <memory unit='KiB'>32359908</memory>
  <pages unit='KiB' size='4'>8038777</pages>
  <pages unit='KiB' size='2048'>100</pages>
  <pages unit='KiB' size='1048576'>0</pages>
  ...

(8038777 * 4) + (100 * 2048) == 32359908

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1569678

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1811870/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
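The suggested workaround (trust the cell total and the large-page counts, and derive the 4k count from them) is simple arithmetic. A sketch, under the assumption that only the 4 KiB figure is wrong:

```python
def corrected_small_pages(total_kib, large_pages):
    """large_pages: iterable of (page_size_kib, count) for the non-4K sizes."""
    large_kib = sum(size * count for size, count in large_pages)
    # Whatever memory is not backed by large pages must be 4 KiB pages.
    return (total_kib - large_kib) // 4

# Ubuntu 16.04 / libvirt 4.0.0 cell from the report: libvirt claims
# 3075208 4K pages, but the reserved huge pages leave room for only
# (16298528 - 4000 * 2048) / 4 pages.
print(corrected_small_pages(16298528, [(2048, 4000)]))  # 2026632
```

As a sanity check, applying the same formula to the Fedora cell reproduces the value libvirt itself reports there: (32359908 - 100 * 2048) / 4 = 8038777.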
[Yahoo-eng-team] [Bug 1810977] [NEW] Oversubscription broken for instances with NUMA topologies
Public bug reported: As described in [1], the fix to [2] appears to have inadvertently broken oversubscription of memory for instances with a NUMA topology but no hugepages.

Steps to reproduce:

1. Create a flavor that will consume > 50% available memory for your host(s) and specify an explicit NUMA topology. For example, on my all-in-one deployment where the host has 32GB RAM, we will request a 20GB instance:

$ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=2

2. Boot an instance using this flavor:

$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test

3. Boot another instance using this flavor:

$ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2

# Expected result:

The second instance should boot.

# Actual result:

The second instance fails to boot. We see the following error message in the logs.

nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}

If we revert the patch that addressed the bug [3] then we revert to the correct behaviour and the instance boots. With this though, we obviously lose whatever benefits that change gave us.
[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html [2] https://bugs.launchpad.net/nova/+bug/1734204 [3] https://review.openstack.org/#/c/532168 ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1810977 Title: Oversubscription broken for instances with NUMA topologies Status in OpenStack Compute (nova): New Bug description: As described in [1], the fix to [2] appears to have inadvertently broken oversubscription of memory for instances with a NUMA topology but no hugepages. Steps to reproduce: 1. Create a flavor that will consume > 50% available memory for your host(s) and specify an explicit NUMA topology. For example, on my all- in-one deployment where the host has 32GB RAM, we will request a 20GB instance: $ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa $ openstack flavor set test.numa --property hw:numa_nodes=2 2. Boot an instance using this flavor: $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test 3. Boot another instance using this flavor: $ openstack server create --flavor test.numa --image cirros-0.3.6-x86_64-disk --wait test2 # Expected result: The second instance should boot. # Actual result: The second instance fails to boot. We see the following error message in the logs. nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}} nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. 
{{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}} If we revert the patch that addressed the bug [3] then we revert to the correct behaviour and the instance boots. With this though, we obviously lose whatever benefits that change gave us. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html [2] https://bugs.launchpad.net/nova/+bug/1734204 [3] https://review.openstack.org/#/c/532168 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1810977/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
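For context, the regular (non-pagesize) memory path permits oversubscription up to ram_allocation_ratio, which defaults to 1.5. That is why two 20 GB instances are expected to fit on a 32 GB host, and why a pagesize check that ignores the ratio rejects the second one. The arithmetic, as an illustrative sketch rather than nova code:

```python
def can_schedule(host_mib, used_mib, request_mib, ram_allocation_ratio=1.5):
    # Oversubscription-aware check: capacity is scaled by the ratio.
    return used_mib + request_mib <= host_mib * ram_allocation_ratio

host = 32 * 1024  # 32 GiB host, in MiB

assert can_schedule(host, 0, 20480)       # first 20 GiB instance
assert can_schedule(host, 20480, 20480)   # second fits: 40960 <= 49152
# With no oversubscription (the behaviour the buggy pagesize path
# effectively imposes), the second instance is rejected:
assert not can_schedule(host, 20480, 20480, ram_allocation_ratio=1.0)
```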
[Yahoo-eng-team] [Bug 1809136] [NEW] Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound' on compute restart
Public bug reported: This is a variant of an existing bug: - https://bugs.launchpad.net/nova/+bug/1738373 tracks a similar exception ('_nova_to_osvif_vif_binding_failed') on compute startup. There are also two other closely related bugs: - https://bugs.launchpad.net/nova/+bug/1783917 tracks this same exception ('_nova_to_osvif_vif_unbound') but for live migrations - https://bugs.launchpad.net/nova/+bug/1784579 tracks a similar exception ('_nova_to_osvif_vif_binding_failed') but for live migration In addition, there are a few bugs which are likely the root cause of all of the above issues (and this one) in the first place: - https://bugs.launchpad.net/nova/+bug/1751923 In this instance, as with bug 1738373, we are unable to start nova- compute service on compute node due to an os-vif invoked error. nova-compute.log on compute shows: 2018-05-12 16:42:47.323 305978 INFO os_vif [req-0a72cdea-843a-4932-b8a0-bc24c2f21d9f - - - - -] Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:41:a9:2c,bridge_name='qbr8d027ff4-23',has_traffic_filtering=True,id=8d027ff4-2328-47df-9f9a-2c1a9914a83b,network=Network(9a98b244-b1d2-46b3-ab0e-be8456e3a984),plugin='ovs',port_profile=VIFPortProfileBase,preserve_on_delete=False,vif_name='tap8d027ff4-23') 2018-05-12 16:42:47.369 305978 ERROR oslo_service.service [req-0a72cdea-843a-4932-b8a0-bc24c2f21d9f - - - - -] Error starting thread. 
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service Traceback (most recent call last):
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 708, in run_service
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     service.start()
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/service.py", line 117, in start
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self.manager.init_host()
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1154, in init_host
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self._init_instance(context, instance)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 957, in _init_instance
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self.driver.plug_vifs(instance, net_info)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 703, in plug_vifs
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self.vif_driver.plug(instance, vif)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 771, in plug
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     vif_obj = os_vif_util.nova_to_osvif_vif(vif)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/network/os_vif_util.py", line 408, in nova_to_osvif_vif
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     {'type': vif['type'], 'func': funcname})
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service NovaException: Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound'
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service

Inspecting the available
ports shows the port does exist, so this looks like a caching issue.

[stack@director:~]$ neutron port-list | grep fa:16:3e:41:a9:2c
| 8d027ff4-2328-47df-9f9a-2c1a9914a83b | | fa:16:3e:41:a9:2c | {"subnet_id": "1f5ed9bc-aa7d-49bd-ac48-23b430fc0eb4", "ip_address": "172.19.9.17"} |

[stack@director:~]$ neutron port-show 8d027ff4-2328-47df-9f9a-2c1a9914a83b
+-----------------------+------------------------------------------------+
| Field                 | Value                                          |
+-----------------------+------------------------------------------------+
| admin_state_up        | True                                           |
| allowed_address_pairs |                                                |
| binding:host_id       | overcloud-compute-7.localdomain                |
| binding:profile       | {}                                             |
| binding:vif_details   | {"port_filter": true, "ovs_hybrid_plug": true} |
| binding:vif_type      | ovs                                            |
| binding:vnic_type     | normal
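The exception at the bottom of the traceback comes from a converter lookup keyed on the VIF type: os_vif_util builds a function name of the form '_nova_to_osvif_vif_<type>' and fails when no such converter exists, as is the case for the transient 'unbound' type. The pattern, roughly (a simplified sketch, not the actual module):

```python
class NovaException(Exception):
    pass

def _nova_to_osvif_vif_ovs(vif):
    # Stand-in converter for a supported VIF type.
    return ("ovs", vif["id"])

def nova_to_osvif_vif(vif):
    # Build the converter name from the VIF type and look it up; types
    # with no converter (e.g. 'unbound', left over when a port binding
    # never completed) raise NovaException on compute restart.
    funcname = "_nova_to_osvif_vif_" + vif["type"]
    func = globals().get(funcname)
    if func is None:
        raise NovaException(
            "Unsupported VIF type %s convert '%s'" % (vif["type"], funcname))
    return func(vif)

print(nova_to_osvif_vif({"type": "ovs", "id": "8d027ff4"}))
```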
[Yahoo-eng-team] [Bug 1797146] [NEW] failed to boot guest with vnic_type direct when rx_queue_size, tx_queue_size and hw_vif_type are set
Public bug reported: Bug #1789074 addressed an issue with booting a guest with vnic_type direct when rx_queue_size and tx_queue_size are set. However, this failed to address an additional permutation: the user specifying hw_vif_type=virtio. If the user does this, the problem occurs once again.

Reproduction steps are the same as noted in bug #1789074 with one additional step needed:

openstack image set --property hw_vif_type=virtio $IMAGE

Once configured, boot an instance with this image and an SR-IOV (PF or VF) interface and the instance will fail to spawn.

This is because we first read and set the VIF model from the image metadata property:

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L134-L135

Which means a later check passes:

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L172

Without setting this property, that check would fail as we never configure the model for direct SR-IOV interfaces.

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L139

** Affects: nova Importance: Medium Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1797146 Title: failed to boot guest with vnic_type direct when rx_queue_size, tx_queue_size and hw_vif_type are set Status in OpenStack Compute (nova): Confirmed Bug description: Bug #1789074 addressed an issue with booting a guest with vnic_type direct when rx_queue_size and tx_queue_size are set. However, this failed to address an additional permutation: the user specifying hw_vif_type=virtio. If the user does this, the problem occurs once again.
Reproduction steps are the same as noted in bug #1789074 with one additional step needed:

openstack image set --property hw_vif_type=virtio $IMAGE

Once configured, boot an instance with this image and an SRIOV (PF or VF) interface and the instance will fail to spawn. This is because we first read and set the VIF model from the image metadata property:

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L134-L135

Which means a later check passes:

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L172

Without setting this property, that check would fail as we never configure the model for direct SR-IOV interfaces.

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L139

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1797146/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
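A compressed illustration of the logic the report points at: the model is taken from image metadata before the vnic_type is considered, so an explicit hw_vif_type=virtio makes the virtio-only queue-size branch apply even to a direct (SR-IOV) interface. The helper below is hypothetical, for illustration only, not nova's code:

```python
def virtio_queue_sizes_applied(vnic_type, image_hw_vif_model, rx_queue_size):
    # The model comes from image metadata first (vif.py L134-L135);
    # only if unset does it fall back, and direct SR-IOV interfaces are
    # deliberately left without a model (L139).
    model = image_hw_vif_model
    if model is None and vnic_type not in ("direct", "direct-physical"):
        model = "virtio"
    # The later check (L172) only asks "is the model virtio?", which an
    # explicit hw_vif_type=virtio satisfies even for SR-IOV, producing
    # invalid device XML for those interfaces.
    return rx_queue_size is not None and model == "virtio"

assert not virtio_queue_sizes_applied("direct", None, 1024)  # correct: skipped
assert virtio_queue_sizes_applied("direct", "virtio", 1024)  # the bug: applied
```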
[Yahoo-eng-team] [Bug 1614092] Re: SRIOV - PF / VM that assign to PF does not get vlan tag
As noted, this is resolved in Ocata. There is an issue with this currently but that's being tracked in #1743458 ** Changed in: nova Status: Confirmed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1614092 Title: SRIOV - PF / VM that assign to PF does not get vlan tag Status in neutron: Invalid Status in OpenStack Compute (nova): Fix Released Bug description: During RFE testing Manage SR-IOV PFs as Neutron ports, I found VM booted with Neutron port vnic_type direct-physical but it does not get access to DHCP server. The problem is that the PF / VM does not get VLAN tag with the internal vlan. Workaround : Enter to the VM via console and set vlan interface. version RHOS 10 python-neutronclient-4.2.1-0.20160721230146.3b1c538.el7ost.noarch openstack-neutron-common-9.0.0-0.20160726001729.6a23add.el7ost.noarch python-neutron-9.0.0-0.20160726001729.6a23add.el7ost.noarch openstack-neutron-fwaas-9.0.0-0.20160720211704.c3e491c.el7ost.noarch openstack-neutron-metering-agent-9.0.0-0.20160726001729.6a23add.el7ost.noarch openstack-neutron-openvswitch-9.0.0-0.20160726001729.6a23add.el7ost.noarch puppet-neutron-9.1.0-0.20160725142451.4061b39.el7ost.noarch python-neutron-lib-0.2.1-0.20160726025313.405f896.el7ost.noarch openstack-neutron-ml2-9.0.0-0.20160726001729.6a23add.el7ost.noarch openstack-neutron-9.0.0-0.20160726001729.6a23add.el7ost.noarch openstack-neutron-sriov-nic-agent-9.0.0-0.20160726001729.6a23add.el7ost.noarch To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1614092/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1549915] Re: Lots of "NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported" observed in gate-cinder-python27 logs
These occur on the latest DevStack deploy. The opt and the warning both originate in glance so I'm reassigning. ** Changed in: cinder Status: Invalid => Confirmed ** Project changed: cinder => glance ** Project changed: glance => oslo.db -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1549915 Title: Lots of "NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported" observed in gate-cinder-python27 logs Status in oslo.db: Confirmed Bug description: There are lots of instances of "NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported" observed in gate-cinder-python27 logs. eg: http://logs.openstack.org/02/282002/1/check/gate-cinder-python27/332a226/console.html.gz ... 2016-02-18 22:42:12.214 | /home/jenkins/workspace/gate-cinder-python27/.tox/py27/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:241: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported 2016-02-18 22:42:12.214 | exception.NotSupportedWarning 2016-02-18 22:42:12.214 | 2016-02-18 22:42:12.224 | {3} cinder.tests.unit.api.contrib.test_admin_actions.AdminActionsAttachDetachTest.test_volume_force_detach_raises_remote_error [3.892236s] ... ok 2016-02-18 22:42:12.224 | 2016-02-18 22:42:12.224 | Captured stderr: 2016-02-18 22:42:12.224 | 2016-02-18 22:42:12.224 | /home/jenkins/workspace/gate-cinder-python27/.tox/py27/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:241: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported 2016-02-18 22:42:12.224 | exception.NotSupportedWarning ... To manage notifications about this bug go to: https://bugs.launchpad.net/oslo.db/+bug/1549915/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1670628] Re: nova-compute will try to re-plug the vif even if it exists for vhostuser port.
** Changed in: nova Status: Opinion => Confirmed ** Changed in: nova Importance: Undecided => High -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1670628 Title: nova-compute will try to re-plug the vif even if it exists for vhostuser port. Status in OpenStack Compute (nova): Confirmed Bug description:

Description
===========
In the mitaka version, deploy neutron with ovs-dpdk. If we stop the ovs-agent, then restart nova-compute, the VMs on the host lose network connectivity.

Steps to reproduce
==================
Deploy mitaka with neutron and ovs-dpdk enabled. Choose one compute node where a VM has network connectivity and run this on the host:

1. #systemctl stop neutron-openvswitch-agent.service
2. #systemctl restart openstack-nova-compute.service

then ping $VM_IN_THIS_HOST

Expected result
===============
ping $VM_IN_THIS_HOST would succeed.

Actual result
=============
ping $VM_IN_THIS_HOST failed.

Environment
===========
Centos7 ovs2.5.1 dpdk 2.2.0 openstack-nova-compute-13.1.1-1

Reason: after some digging, I found that nova-compute will try to plug the vif every time it boots. Specifically for vhostuser ports, nova-compute does not check whether the port already exists, as it does for legacy ovs, and it will re-plug the port with vsctl args like "--if-exists del-port vhu". (refer https://github.com/openstack/nova/blob/stable/mitaka/nova/virt/libvirt/vif.py#L679-L683) After recreating the ovs vhostuser port, it will not get the right vlan tag, which is set by the ovs agent. In the test environment, after restarting the ovs agent, the agent sets a proper vlan id for the port and network connectivity is restored. Not sure if it's a bug or a config issue; am I missing something? There is also an fp_plug type for vhostuser ports; how could we specify it?
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1670628/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
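An idempotency guard of the kind the reporter is implying, only creating the vhost-user port when it is absent so an existing port keeps the VLAN tag the agent gave it, might look like the sketch below (hypothetical helper, not nova/os-vif code; the port name is illustrative):

```python
def ensure_vhostuser_port(bridge, port, run):
    # run(cmd) executes an ovs-vsctl command and returns its stdout.
    # Only create the port if it does not already exist; recreating it
    # would wipe agent-applied state such as the VLAN tag.
    existing = run(["ovs-vsctl", "list-ports", bridge]).split()
    if port in existing:
        return "kept"
    run(["ovs-vsctl", "add-port", bridge, port,
         "--", "set", "Interface", port, "type=dpdkvhostuser"])
    return "created"

# Demo with a stubbed runner (no OVS needed):
state = {"ports": ["vhu1"]}
def fake_run(cmd):
    if cmd[1] == "list-ports":
        return " ".join(state["ports"])
    state["ports"].append(cmd[3])
    return ""

print(ensure_vhostuser_port("br-int", "vhu1", fake_run))  # kept
print(ensure_vhostuser_port("br-int", "vhu2", fake_run))  # created
```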
[Yahoo-eng-team] [Bug 1744965] [NEW] 'emulator_threads_policy' doesn't work with 'vcpu_pin_set'
Public bug reported: When hyper-threading is enabled, the way emulator_threads_policy allocates the extra CPU resource for the emulator is not optimal.

The instance I use for testing is a 6-vcpu VM; before enabling emulator_threads_policy, I reserve 6 CPUs (actually 6 threads, since hyper-threading is enabled) in the nova config:

vcpu_pin_set=8,10,12,32,34,36

Now when we enable emulator_threads_policy, instead of adding one more thread to this vcpu pin list in the nova config, I end up adding two more sibling threads (on the same core):

vcpu_pin_set=8,10,12,16,32,34,36,40

So I ended up using 2 more threads, but only one of them is used for the emulator and the other thread is wasted.

Originally reported on Bugzilla - https://bugzilla.redhat.com/1534669

** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1744965 Title: 'emulator_threads_policy' doesn't work with 'vcpu_pin_set' Status in OpenStack Compute (nova): New Bug description: When hyper-threading is enabled, the way emulator_threads_policy allocates the extra CPU resource for the emulator is not optimal. The instance I use for testing is a 6-vcpu VM; before enabling emulator_threads_policy, I reserve 6 CPUs (actually 6 threads, since hyper-threading is enabled) in the nova config: vcpu_pin_set=8,10,12,32,34,36 Now when we enable emulator_threads_policy, instead of adding one more thread to this vcpu pin list in the nova config, I end up adding two more sibling threads (on the same core): vcpu_pin_set=8,10,12,16,32,34,36,40 So I ended up using 2 more threads, but only one of them is used for the emulator and the other thread is wasted.
Originally reported on Bugzilla - https://bugzilla.redhat.com/1534669 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1744965/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
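The waste the reporter describes can be seen with simple sibling accounting: if the emulator thread must land on a fully free core, a hyper-threaded host gives up a whole sibling pair for a single emulator thread. A toy model, using the illustrative thread numbering from the report (not nova code):

```python
def reserve_for_emulator(sibling_pairs, pinned):
    # Find a core whose threads are all unpinned; the emulator uses one
    # thread, but the whole sibling pair leaves the usable pool.
    for core in sibling_pairs:
        if all(t not in pinned for t in core):
            return core[0], set(core)
    raise ValueError("no fully free core for the emulator thread")

sibling_pairs = [(8, 32), (10, 34), (12, 36), (16, 40)]
pinned = {8, 10, 12, 32, 34, 36}   # the guest's 6 vCPUs

used, consumed = reserve_for_emulator(sibling_pairs, pinned)
print(used, consumed)  # emulator on thread 16; both 16 and 40 leave the pool
```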
[Yahoo-eng-team] [Bug 1743728] Re: giturl not working for api-ref (nova, neutron-lib)
** Also affects: nova Importance: Undecided Status: New ** No longer affects: openstack-doc-tools ** Also affects: neutron Importance: Undecided Status: New ** Tags added: api-ref -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1743728 Title: giturl not working for api-ref (nova, neutron-lib) Status in neutron: New Status in OpenStack Compute (nova): New Bug description: The report a bug link does not have a valid giturl for: https://developer.openstack.org/api-ref/network/ https://developer.openstack.org/api-ref/compute/ Note https://developer.openstack.org/api-ref/baremetal/ works fine. Did not check more. Note https://review.openstack.org/534666 might be part of the solution. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1743728/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp