[Yahoo-eng-team] [Bug 1811870] Re: libvirt reporting incorrect value of 4k (small) pages

2023-11-13 Thread Stephen Finucane
** Changed in: nova
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1811870

Title:
  libvirt reporting incorrect value of 4k (small) pages

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  libvirt < 4.3.0 had an issue whereby assigning more than 4 GB of huge
  pages would result in an incorrect value for the number of 4k (small)
  pages. This was tracked and fixed via rhbz#1569678 [1] and the fixes
  appear to have been backported to the libvirt versions for RHEL 7.4+.
  However, this is still an issue with the versions of libvirt available
  on Ubuntu 16.04, 18.04 and possibly elsewhere. We should either alert
  the user that the bug exists or, better yet, work around the issue
  using the rest of the (correct) values for the different page sizes.
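
  A workaround could derive the small-page count from the cell's total
  memory and the remaining (correct) per-size page counts. A minimal
  sketch, assuming the values below have already been parsed out of the
  capabilities XML (names are illustrative, not nova's actual code):

    def corrected_small_pages(total_memory_kib, pages_by_size_kib):
        # pages_by_size_kib maps page size (KiB) -> page count, e.g.
        # {4: 3075208, 2048: 4000, 1048576: 0}
        large_kib = sum(size * count
                        for size, count in pages_by_size_kib.items()
                        if size != 4)
        # Whatever memory is not covered by large pages must be 4k pages
        return (total_memory_kib - large_kib) // 4

    # e.g. corrected_small_pages(16298528, {4: 3075208, 2048: 4000, 1048576: 0})
    # returns 2026632, the corrected count for the Ubuntu example below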

  # Incorrect value (Ubuntu 16.04, libvirt 4.0.0)

  $ virsh capabilities | xmllint --xpath '/capabilities/host/topology/cells/cell[1]' -
  <cell id='0'>
    <memory unit='KiB'>16298528</memory>
    <pages unit='KiB' size='4'>3075208</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>

  (3075208 * 4) + (4000 * 2048) != 16298528

  # Correct values (Fedora ??, libvirt 4.10)

  $ virsh capabilities | xmllint --xpath '/capabilities/host/topology/cells/cell[1]' -
  <cell id='0'>
    <memory unit='KiB'>32359908</memory>
    <pages unit='KiB' size='4'>8038777</pages>
    <pages unit='KiB' size='2048'>100</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>

  (8038777 * 4) + (100 * 2048) == 32359908

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1569678

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1811870/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1821088] Re: Virtual Interface creation failed due to duplicate entry

2023-11-13 Thread Stephen Finucane
** Changed in: nova/wallaby
   Status: New => Won't Fix

** Changed in: nova/victoria
   Status: New => Won't Fix

** Changed in: nova/train
   Status: New => Won't Fix

** Changed in: nova/ussuri
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1821088

Title:
  Virtual Interface creation failed due to duplicate entry

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  Won't Fix
Status in OpenStack Compute (nova) ussuri series:
  Won't Fix
Status in OpenStack Compute (nova) victoria series:
  Won't Fix
Status in OpenStack Compute (nova) wallaby series:
  Won't Fix
Status in OpenStack Compute (nova) xena series:
  Fix Released

Bug description:
  Seen once in a test on stable/rocky:

  http://logs.openstack.org/48/638348/1/gate/heat-functional-convg-
  mysql-lbaasv2-py35/9d70590/logs/screen-n-api.txt.gz?level=ERROR

  The traceback appears to be similar to the one reported in bug 1602357
  (which raises the possibility that
  https://bugs.launchpad.net/nova/+bug/1602357/comments/8 is relevant
  here):

  ERROR nova.api.openstack.wsgi [None req-e05ce059-71c4-437d-91e0-e4bc896acca6 demo demo] Unexpected exception in API method: nova.exception_Remote.VirtualInterfaceCreateException_Remote: Virtual Interface creation failed
  pymysql.err.IntegrityError: (1062, "Duplicate entry 'fa:16:3e:9d:18:a6/aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16-0' for key 'uniq_virtual_interfaces0address0deleted'")
  oslo_db.exception.DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'fa:16:3e:9d:18:a6/aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16-0' for key 'uniq_virtual_interfaces0address0deleted'") [SQL: 'INSERT INTO virtual_interfaces (created_at, updated_at, deleted_at, deleted, address, network_id, instance_uuid, uuid, tag) VALUES (%(created_at)s, %(updated_at)s, %(deleted_at)s, %(deleted)s, %(address)s, %(network_id)s, %(instance_uuid)s, %(uuid)s, %(tag)s)'] [parameters: {'created_at': datetime.datetime(2019, 3, 20, 16, 11, 27, 753079), 'tag': None, 'uuid': 'aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16', 'deleted_at': None, 'deleted': 0, 'address': 'fa:16:3e:9d:18:a6/aac0ca83-b3d2-4b28-ab15-de2d3a3e6e16', 'network_id': None, 'instance_uuid': '890675f9-3a1e-4a07-8bed-8648cea9fbb9', 'updated_at': None}] (Background on this error at: http://sqlalche.me/e/gkpj)

  (This sequence of exceptions occurs 3 times, I assume because retrying
  is normally sufficient to fix a duplicate entry problem.)
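
  For illustration, the retry behaviour described here amounts to
  something like the following sketch (the names are ours, not nova's
  actual implementation):

    from oslo_db import exception as db_exc

    def create_vif_with_retry(create_vif, max_attempts=3):
        for attempt in range(max_attempts):
            try:
                return create_vif()
            except db_exc.DBDuplicateEntry:
                # A colliding address was generated; retry. If the final
                # attempt still collides, the error surfaces as the
                # VirtualInterfaceCreateException seen above.
                if attempt == max_attempts - 1:
                    raise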

  The test was
  
heat_integrationtests.functional.test_cancel_update.CancelUpdateTest.test_cancel_update_server_with_port

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1821088/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1441419] Re: port 'binding:host_id' can't be removed when VM is deleted

2022-05-19 Thread Stephen Finucane
This was fixed in neutron. There's no bug against nova here.

** Changed in: nova
   Status: Confirmed => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1441419

Title:
  port 'binding:host_id' can't be removed when VM is deleted

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  To reproduce this problem:
  1. create a neutron port
  2. use this port to boot a VM
  3. delete this VM
  4. the port still exists, but its 'binding:host_id' has not been cleared

  The reason is that _unbind_ports, when updating the port, sets
  'port_req_body['port']['binding:host_id'] = None', but neutron ignores
  attributes whose value is None during a port update, so the binding is
  never cleared:

  def _unbind_ports(self, context, ports, neutron, port_client=None):
      port_binding = self._has_port_binding_extension(
          context, refresh_cache=True, neutron=neutron)
      if port_client is None:
          # Requires admin creds to set port bindings
          port_client = (neutron if not port_binding else
                         get_client(context, admin=True))
      for port_id in ports:
          port_req_body = {'port': {'device_id': '', 'device_owner': ''}}
          if port_binding:
              port_req_body['port']['binding:host_id'] = None

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1441419/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1974173] [NEW] Remaining ports are not unbound if one port is missing

2022-05-19 Thread Stephen Finucane
Public bug reported:

As part of the instance deletion process, we must unbind ports
associated with said instance. To do this, we loop over all ports
currently attached to an instance. However, if neutron returns HTTP 404
(Not Found) for any of these ports, we will return early and fail to
unbind the remaining ports. We've seen the problem in the context of
Kubernetes on OpenStack. Our deinstaller is brute-force, deleting ports
and servers at the same time, so a race means a port can get deleted
early. This normally wouldn't be an issue, as we'd just "untrunk" the
port and proceed to delete it, but that won't work for SR-IOV ports
since you cannot "untrunk" bound ports in that case.

The solution here is obvious: if we fail to find a port, we should
simply skip it and continue unbinding everything else.
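
A minimal sketch of that fix, assuming the neutronclient exception type
(the names approximate nova's code rather than quoting it):

  from neutronclient.common import exceptions as neutron_exc

  def _unbind_ports(port_client, port_ids):
      for port_id in port_ids:
          body = {'port': {'device_id': '', 'device_owner': '',
                           'binding:host_id': None}}
          try:
              port_client.update_port(port_id, body)
          except neutron_exc.PortNotFoundClient:
              # The port was deleted out from under us; skip it and
              # keep unbinding the remaining ports instead of
              # returning early.
              continue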

** Affects: nova
 Importance: Medium
     Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed


** Tags: neutron

** Tags added: neutron

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1974173

Title:
  Remaining ports are not unbound if one port is missing

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  As part of the instance deletion process, we must unbind ports
  associated with said instance. To do this, we loop over all ports
  currently attached to an instance. However, if neutron returns HTTP
  404 (Not Found) for any of these ports, we will return early and fail
  to unbind the remaining ports. We've seen the problem in the context
  of Kubernetes on OpenStack. Our deinstaller is brute-force, deleting
  ports and servers at the same time, so a race means a port can get
  deleted early. This normally wouldn't be an issue, as we'd just
  "untrunk" the port and proceed to delete it, but that won't work for
  SR-IOV ports since you cannot "untrunk" bound ports in that case.

  The solution here is obvious: if we fail to find a port, we should
  simply skip it and continue unbinding everything else.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1974173/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1934770] [NEW] Mismatch between forced host and AZ prevents move operations

2021-07-06 Thread Stephen Finucane
Public bug reported:

When spawning a new instance, it's possible to force the instance to a
specific host by using a special 'availability_zone[:host[:node]]'
syntax for the 'availability_zone' field in the request. For example,
when using OSC:

  openstack server create --availability-zone my-az:my-host ... my-
server

Doing so bypasses the scheduler, which means the
'AvailabilityZoneFilter' never runs to validate the availability zone-
host combo. As a result, the availability zone portion of this value is
effectively ignored and the host will be used regardless of the
availability zone requested. This has some nasty side-effects. For one,
the availability zone information stored on the instance is generated
from the availability zone of the host the instance boots on, *not* the
availability zone requested alongside the host. This means that when a
user runs 'openstack server show' or 'openstack server list --long',
they'll see different availability zone information to what they
requested. However, the value requested *is* recorded in the
'RequestSpec' object created for the instance. This is reused for
future move operations, and because the availability zone information
was never verified, it's possible to end up with an instance that can't
be moved since no host with matching availability zone information
exists. The two issues collide with each other since the failure logs
in the latter case will reference one availability zone value, while
inspecting the instance record will show another. This is seriously
confusing.

The solution seems to be to either (a) error out when an invalid
availability zone-host combo is requested or (b) simply ignore the
availability zone aspect of the request, opting to use the value of the
host instead (with a warning, ideally). Note that microversion 2.74
introduced a better way of requesting a specific host without bypassing
the scheduler, using 'host' and 'hypervisor_hostname' fields in the body
of the instance create request. However, the old way of doing things is
not yet deprecated and even if it were, we'd still have to support this
for older microversions. We should fix this DB discrepancy one way or
the other.
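
For comparison, the scheduler-aware flow available since microversion
2.74 looks something like this (assuming a recent OSC; the exact option
spelling here is an assumption on our part):

  openstack --os-compute-api-version 2.74 server create \
      --host my-host ... my-server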

** Affects: nova
 Importance: Medium
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed


** Tags: availability-zones scheduler

** Tags added: availability-zones

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1934770

Title:
  Mismatch between forced host and AZ prevents move operations

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  When spawning a new instance, it's possible to force the instance to a
  specific host by using a special 'availability_zone[:host[:node]]'
  syntax for the 'availability_zone' field in the request. For example,
  when using OSC:

openstack server create --availability-zone my-az:my-host ... my-
  server

  Doing so bypasses the scheduler, which means the
  'AvailabilityZoneFilter' never runs to validate the availability zone-
  host combo. As a result, the availability zone portion of this value
  is effectively ignored and the host will be used regardless of the
  availability zone requested. This has some nasty side-effects. For
  one, the availability zone information stored on the instance is
  generated from the availability zone of the host the instance boots
  on, *not* the availability zone requested alongside the host. This
  means that when a user runs 'openstack server show' or 'openstack
  server list --long', they'll see different availability zone
  information to what they requested. However, the value requested *is*
  recorded in the 'RequestSpec' object created for the instance. This is
  reused for future move operations, and because the availability zone
  information was never verified, it's possible to end up with an
  instance that can't be moved since no host with matching availability
  zone information exists. The two issues collide with each other since
  the failure logs in the latter case will reference one availability
  zone value, while inspecting the instance record will show another.
  This is seriously confusing.

  The solution seems to be to either (a) error out when an invalid
  availability zone-host combo is requested or (b) simply ignore the
  availability zone aspect of the request, opting to use the value of
  the host instead (with a warning, ideally). Note that microversion
  2.74 introduced a better way of requesting a specific host without
  bypassing the scheduler, using 'host' and 'hypervisor_hostname' fields
  in the body of the instance create request. However, the old way of
  doing things is not yet deprecated and even if it were, we'd still
  have to support this for older microversions. We should fix this DB
  discrepancy one way or the other.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1934770/+subscriptions

[Yahoo-eng-team] [Bug 1933954] Re: The binding-extended extension is no longer reported for ML2/OVN

2021-06-29 Thread Stephen Finucane
This appears to have been introduced with [1]. The solution is likely to
add the missing attribute to the list of supported extensions reported
for this backend [2].

[1] https://review.opendev.org/c/openstack/neutron/+/793141
[2] 
https://github.com/openstack/neutron/blob/cbbab2fac5ae85d049a8201c06b58f4d7cb33495/neutron/common/ovn/extensions.py#L85
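
A minimal sketch of that change, assuming the alias constant from
neutron-lib and a list name along the lines of the one in [2] ('...'
stands for the existing entries):

  # neutron/common/ovn/extensions.py (sketch)
  from neutron_lib.api.definitions import portbindings_extended

  ML2_SUPPORTED_API_EXTENSIONS = [
      ...
      portbindings_extended.ALIAS,  # 'binding-extended'
  ]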

** Summary changed:

- test_live_migration_with_trunk failing due to Call _is_port_status_active 
returns false in 60.00 seconds
+ The binding-extended extension is no longer reported for ML2/OVN

** No longer affects: nova

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1933954

Title:
  The binding-extended extension is no longer reported for ML2/OVN

Status in neutron:
  In Progress

Bug description:
  https://zuul.opendev.org/t/openstack/builds?job_name=nova-live-migration&branch=master

  Started failing on the 28th, I assume because of changes outside of
  Nova?

  
https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8f7/771362/28/check/nova-live-migration/8f76ccd/testr_results.html

  2021-06-29 06:35:24,460 125131 DEBUG [tempest.lib.common.utils.test_utils] Call _is_port_status_active returns false in 60.00 seconds
  }}}

  Traceback (most recent call last):
    File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper
      return func(*func_args, **func_kwargs)
    File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 70, in wrapper
      return f(*func_args, **func_kwargs)
    File "/opt/stack/tempest/tempest/api/compute/admin/test_live_migration.py", line 285, in test_live_migration_with_trunk
      self.assertTrue(
    File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/unittest2/case.py", line 702, in assertTrue
      raise self.failureException(msg)
  AssertionError: False is not true

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1933954/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1791243] Re: launch-instance-from-volume.rst is not latest version

2021-06-01 Thread Stephen Finucane
This doc needs to be reworked, but I think we should do so from scratch
rather than copying the (now very old) stuff from the manuals.

** Changed in: nova
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1791243

Title:
  launch-instance-from-volume.rst is not latest version

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  We lost some changes to doc/source/user/launch-instance-from-
  volume.rst in openstack-manuals after Ocata.

  We need to upload the latest doc from the manuals repo [1] and
  merge all later changes [2][3][4] into this doc.

  [1] I4a556b6a596a28c0350c7411c147459c3f06d084
  [2] Ifa2e2bbb4c5f51f13d1a5832bd7dbf9f690fcad7
  [3] Ida4cf70a7e53fd37ceeadb5629e3221072219689
  [4] Ifb99e727110c4904a85bc4a13366c2cae300b8df

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1791243/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1930448] [NEW] 'VolumeNotFound' exception is not handled

2021-06-01 Thread Stephen Finucane
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi return self._cs_request(url, 'GET', **kwargs)
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/client.py", line 206, in _cs_request
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi return self.request(url, method, **kwargs)
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/client.py", line 192, in request
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi raise exceptions.from_response(resp, body)
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi nova.exception.VolumeNotFound: Volume 44d317a3-6183-4063-868b-aa0728576f5f could not be found.
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: INFO nova.api.openstack.wsgi [None req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3 demo admin] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]:
  Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: DEBUG nova.api.openstack.wsgi [None req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3 demo admin] Returning 500 to user: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.

** Affects: nova
 Importance: Medium
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1930448

Title:
  'VolumeNotFound' exception is not handled

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Attempting to attach a volume using an invalid ID currently results in
  a HTTP 500 error. This error should be handled and a HTTP 4xx error
  returned instead.

  $ openstack server create ... \
      --block-device source_type=volume,uuid=44d317a3-6183-4063-868b-aa0728576f5f,destination_type=volume,delete_on_termination=true \
      --wait test-server
  Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
  <class 'nova.exception.VolumeNotFound'> (HTTP 500) (Request-ID: req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3)

  where '44d317a3-6183-4063-868b-aa0728576f5f' is not a UUID
  corresponding to a valid volume.
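
  A minimal sketch of the missing handling, assuming it lands in the
  API layer that builds the server create response (the handler
  placement and names are assumptions, not nova's exact code):

    from webob import exc

    from nova import exception

    def create(self, req, body):
        try:
            return self._do_create(req, body)
        except exception.VolumeNotFound as e:
            # Translate the missing volume into a client error rather
            # than letting it escape as an unexpected HTTP 500
            raise exc.HTTPBadRequest(explanation=e.format_message())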

  A full traceback from nova-compute is provided below.

Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi [None req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3 demo admin] Unexpected exception in API method: nova.exception.VolumeNotFound: Volume 44d317a3-6183-4063-868b-aa0728576f5f could not be found.
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi Traceback (most recent call last):
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/opt/stack/nova/nova/volume/cinder.py", line 432, in wrapper
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi res = method(self, ctx, volume_id, *args, **kwargs)
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/opt/stack/nova/nova/volume/cinder.py", line 498, in get
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi item = cinderclient(
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/v2/volumes.py", line 281, in get
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi return self._get("/volumes/%s" % volume_id, "volume")
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/base.py", line 293, in _get
Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi resp, body = self.api.client.get(url)
Jun 01 15:05:14 devstack-ubuntu2004 devstack@

[Yahoo-eng-team] [Bug 1914592] Re: oslo.policy 3.6.1 breaks nova

2021-02-09 Thread Stephen Finucane
** Changed in: oslo.policy
   Status: Confirmed => Fix Released

** Changed in: nova
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1914592

Title:
  oslo.policy 3.6.1 breaks nova

Status in OpenStack Compute (nova):
  Fix Released
Status in oslo.policy:
  Fix Released

Bug description:
  As seen on the requirements change [1], a recently introduced version
  of oslo.policy appears to be breaking nova [2]. Initial investigations
  suggest both oslo.policy and nova are partially to blame.

  [1] https://review.opendev.org/c/openstack/requirements/+/773779
  [2] 
https://d138d4f526b4feb9aa23-c0b1a48165a1318087e38ccc28dcb2b0.ssl.cf5.rackcdn.com/773779/1/check/cross-nova-functional/d9729b8/testr_results.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1914592/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1914592] [NEW] oslo.policy 3.6.1 breaks nova

2021-02-04 Thread Stephen Finucane
Public bug reported:

As seen on the requirements change [1], a recently introduced version of
oslo.policy appears to be breaking nova [2]. Initial investigations
suggest both oslo.policy and nova are partially to blame.

[1] https://review.opendev.org/c/openstack/requirements/+/773779
[2] 
https://d138d4f526b4feb9aa23-c0b1a48165a1318087e38ccc28dcb2b0.ssl.cf5.rackcdn.com/773779/1/check/cross-nova-functional/d9729b8/testr_results.html

** Affects: nova
 Importance: High
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed

** Affects: oslo.policy
 Importance: Critical
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Also affects: oslo.policy
   Importance: Undecided
   Status: New

** Changed in: oslo.policy
   Status: New => Confirmed

** Changed in: oslo.policy
   Importance: Undecided => Critical

** Changed in: oslo.policy
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1914592

Title:
  oslo.policy 3.6.1 breaks nova

Status in OpenStack Compute (nova):
  Confirmed
Status in oslo.policy:
  Confirmed

Bug description:
  As seen on the requirements change [1], a recently introduced version
  of oslo.policy appears to be breaking nova [2]. Initial investigations
  suggest both oslo.policy and nova are partially to blame.

  [1] https://review.opendev.org/c/openstack/requirements/+/773779
  [2] 
https://d138d4f526b4feb9aa23-c0b1a48165a1318087e38ccc28dcb2b0.ssl.cf5.rackcdn.com/773779/1/check/cross-nova-functional/d9729b8/testr_results.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1914592/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1912167] Re: Mistake in unit test: test_get_pinning_isolate_policy_bug_1889633

2021-02-02 Thread Stephen Finucane
That is incorrect. The pcpuset field was added to the InstanceNUMACell
object in commit 867d4471013bf6a70cd3e9e809daf80ea358df92 [1].

[1]
https://github.com/openstack/nova/commit/867d4471013bf6a70cd3e9e809daf80ea358df92

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1912167

Title:
  Mistake in unit test: test_get_pinning_isolate_policy_bug_1889633

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===

  objects.InstanceNUMACell does not have a pcpuset attribute.

  In test_get_pinning_isolate_policy_bug_1889633,
  objects.InstanceNUMACell has both cpuset and pcpuset passed to its
  constructor. Only cpuset is valid, but it has the wrong value (it
  should contain the CPUs to pin if pinning is required).

  This test may mislead developers into thinking that pcpuset is
  available on objects.InstanceNUMACell, and it may not pass on custom
  code that requires a proper cpuset value.

  
  Fix for Nova Train:

  diff --git a/nova/tests/unit/virt/test_hardware.py b/nova/tests/unit/virt/test_hardware.py
  index 8e6c049f04..5a153f7480 100644
  --- a/nova/tests/unit/virt/test_hardware.py
  +++ b/nova/tests/unit/virt/test_hardware.py
  @@ -3247,8 +3247,7 @@ class CPUPinningCellTestCase(test.NoDBTestCase, _CPUPinningTestCaseBase):
               mempages=[],
           )
           inst_pin = objects.InstanceNUMACell(
  -            cpuset=set(),
  -            pcpuset={0, 1},
  +            cpuset={0, 1},
               memory=2048,
               cpu_policy=fields.CPUAllocationPolicy.DEDICATED,
               cpu_thread_policy=fields.CPUThreadAllocationPolicy.ISOLATE,

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1912167/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1914259] [NEW] Disabled USB controller breaks PPC64LE hosts

2021-02-02 Thread Stephen Finucane
Public bug reported:

As discussed on the mailing list [1], a recent change disabling the USB
controller when no USB devices are found [2] has broken the PPC64LE
third party CI job. It seems libvirt will add an implicit USB keyboard
and mouse on PPC64 and PPC64LE architectures [3]. We probably need to
special case this architecture.

[1] 
http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020153.html
[2] https://review.opendev.org/c/openstack/nova/+/756549
[3] 
https://github.com/libvirt/libvirt/blob/3d42a57666/src/qemu/qemu_domain.c#L3559-L3560

** Affects: nova
 Importance: High
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed


** Tags: libvirt ppc64

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => High

** Tags added: libvirt ppc64

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1914259

Title:
  Disabled USB controller breaks PPC64LE hosts

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  As discussed on the mailing list [1], a recent change disabling the
  USB controller when no USB devices are found [2] has broken the
  PPC64LE third party CI job. It seems libvirt will add an implicit USB
  keyboard and mouse on PPC64 and PPC64LE architectures [3]. We probably
  need to special case this architecture.

  [1] 
http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020153.html
  [2] https://review.opendev.org/c/openstack/nova/+/756549
  [3] 
https://github.com/libvirt/libvirt/blob/3d42a57666/src/qemu/qemu_domain.c#L3559-L3560

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1914259/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1909269] Re: I create a server_groups vm , but server_group_members doesn't add one.

2021-01-15 Thread Stephen Finucane
This is one of two per-user quotas, the other being 'key_pairs', where
usages are not considered when validating limit create/update. They're
always at zero. You can find more information at [1]

[1]
https://github.com/openstack/nova/blob/7527fdf6eafe47f0f783e9cdae8b79b76d6ca6b3/nova/quota.py#L178-L182

** Tags added: quotas

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1909269

Title:
  I create a server_groups vm , but server_group_members  doesn't add
  one.

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I created a VM in a server group, but the server_group_members usage
  doesn't increase by one.

  After the virtual machine is created successfully, "nova quota-show
  --detail" is executed on the compute node, and the
  "server_group_members" parameter's "in-use" value has not increased.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1909269/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1909972] Re: a number of tests fail under ppc64el arch

2021-01-08 Thread Stephen Finucane
As noted in the libvirt driver [1], we only test against x86 and x86_64.
While this would be relatively easy to fix, the lack of a gate job means
it would likely regress again in the future, and it also means we can't
justifiably mark this architecture as supported. I think the more likely
issue is this:

  I'm marking this bug as severity:serious since your package has only
  Architecture:all binary packages, and should thus, in theory, build
  everywhere. Failure to build on ppc64el might indicate a serious issue
  in this package or in another package.

Setting Architecture to indicate support for x86 and x86_64 only would
seem far more sensible to me.

[1]
https://github.com/openstack/nova/blob/46899968619e4ea0ff2ab380977619bb29578d43/nova/virt/libvirt/driver.py#L572-L581

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1909972

Title:
  a number of tests fail under ppc64el arch

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Hi,

  As per this Debian bug entry:
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=976954

  a number of unit tests are failing under ppc64el arch. Please fix
  these or exclude the tests on this arch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1909972/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1908507] Re: vif quotas not set for tap interface

2020-12-17 Thread Stephen Finucane
This feature is essentially deprecated, given it is only supported by
specific backends, and it is unlikely that we will extend it any
further. As a result, I'm marking this as WONTFIX and suggest you
investigate neutron's native QoS support instead. You can find the
documentation for this at [1]. To the best of my knowledge, QoS
support is available for any ML2-based backend, including the Calico
plugin, but this will require some background reading.

[1] https://docs.openstack.org/neutron/latest/admin/config-qos.html

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1908507

Title:
  vif quotas not set for tap interface

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Description
  ===
  Despite vif_inbound_average and vif_outbound_average being set, bandwidth
  settings are not propagated to the instance XML config in libvirt when using
  a tap interface.

  Steps to reproduce
  ==
  - nova flavor-key network_test set quota:vif_inbound_average=10240
  - nova flavor-key network_test set quota:vif_outbound_average=10240
  - create VM with said flavor
  - verify vm libvirt xml config

  Expected result
  ===
  - <bandwidth> tag is present in the instance-id.xml config
  - bandwidth via iperf test is being shaped

  Actual result
  ===
  - <bandwidth> tag is not set
  - traffic is not limited

  Environment
  ===
  - nova-compute 2:21.1.0-0ubuntu1~cloud0
  - libvirt-daemon 6.0.0-0ubuntu8.4~cloud0
  - Calico neutron plugin with network_type set to flat
  - Libvirt + KVM

  Proposed fix
  ===
  Probably missing "designer.set_vif_bandwidth_config(conf, inst_type)" in 
method get_config_tap(..)
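
  A sketch of what that fix might look like (the method and helper
  signatures are approximated from nova's libvirt vif driver; treat this
  as illustrative, not a tested patch):

    def get_config_tap(self, instance, vif, image_meta, inst_type,
                       virt_type):
        conf = self.get_base_config(instance, vif['address'], image_meta,
                                    inst_type, virt_type, vif['vnic_type'])
        designer.set_vif_host_backend_ethernet_config(
            conf, self.get_vif_devname(vif))
        # Proposed addition: apply the flavor's vif_* quota settings so
        # the <bandwidth> element is rendered for tap interfaces too
        designer.set_vif_bandwidth_config(conf, inst_type)
        return conf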

  Logs
  ===
  nova-compute.log
  2020-12-16 13:13:10.202 74913 DEBUG nova.virt.hardware 
[req-66bf23dd-7486-4e3d-9bda-1f23943f2379 2f8c89255e23468bbd2bd0ea6391a3cd 
c9604e4b7a0c443eb451181727e4e00a - default default] Getting desirable 
topologies for flavor 
Flavor(created_at=2020-12-16T10:20:59Z,deleted=False,deleted_at=None,description=None,disabled=False,ephemeral_gb=0,extra_specs={quota:vif_inbound_average='10240',quota:vif_inbound_peak='10240',quota:vif_outbound_average='10240',quota:vif_outbound_peak='10240'},flavorid='89c4daca-4ef3-4835-83e5-891f8e3c2664',id=204,is_public=True,memory_mb=4096,name='network_test',projects=,root_gb=10,rxtx_factor=1.0,swap=0,updated_at=None,vcpu_weight=0,vcpus=4)
 and image_meta 
ImageMeta(checksum='ecf90ee0a6b453638f95c7bfba9d17e2',container_format='bare',created_at=2020-10-07T09:17:40Z,direct_url=,disk_format='qcow2',id=f619fd08-3e7e-4ab8-a9b4-a8a13e575863,min_disk=0,min_ram=0,name='centos-7-chef',owner='c9604e4b7a0c443eb451181727e4e00a',properties=ImageMetaProps,protected=,size=1470693376,status='active',tags=,updated_at=2020-10-07T09:19:45Z,virtual_size=,visibility=),
 allow threads: True _get_desirable_cpu_topologies 
/usr/lib/python3/dist-packages/nova/virt/hardware.py:594
  ...
  2020-12-16 13:13:10.235 74913 DEBUG nova.virt.libvirt.vif 
[req-66bf23dd-7486-4e3d-9bda-1f23943f2379 2f8c89255e23468bbd2bd0ea6391a3cd 
c9604e4b7a0c443eb451181727e4e00a - default default] vif_type=tap 
instance=Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,
  
auto_disk_config=False,availability_zone='dc2',cell_name=None,cleaned=False,config_drive='',created_at=2020-12-16T12:13:30Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,device_metadata=None,disable_terminate=False,display_descript
  
ion='martin-net-test-1',display_name='martin-net-test-1',ec2_ids=EC2Ids,ephemeral_gb=0,ephemeral_key_uuid=None,fault=,flavor=Flavor(204),hidden=False,host='cmp08-dc2.ost.mall.local',hostname='martin-net-test-1',id=35253,image_ref='f619fd08-3e7e-4ab8-a9b4-a8a13e575863
  
',info_cache=InstanceInfoCache,instance_type_id=204,kernel_id='',key_data='abc123',key_name='molexa',keypairs=KeyPairList,launch_index=0,launched_at=None,launched_on='cmp08-dc2.ost.mall.local',locked=False,locked_by=None,memory_mb=4096,metadata={},migration_context=None,new_flavor=None,node='cmp08-dc2.ost.mall.local',numa_topology=None,old_fl
  
avor=None,os_type=None,pci_devices=,pci_requests=InstancePCIRequests,power_state=0,progress=0,project_id='c9604e4b7a0c443eb451181727e4e00a',ramdisk_id='',reservation_id='r-48w9d5fs',resources=None,root_device_name='/dev/vda',root_gb=10,security_groups=SecurityGroupLi
  
st,services=,shutdown_terminate=False,system_metadata={boot_roles='admin,member,reader,heat_stack_owner',image_base_image_ref='f619fd08-3e7e-4ab8-a9b4-a8a13e575863',image_container_format='bare',image_disk_format='qcow2',image_min_disk='10',image_min_ram='0',image_ow
  

[Yahoo-eng-team] [Bug 1906266] Re: After upgrade: "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified"

2020-12-17 Thread Stephen Finucane
Given the above, the solution here seems to be to update your version of
libvirt to >= 6.1.0. I'm going to mark this as WONTFIX. If this does not
resolve the issue, please reset to new and provide information on the
version of libvirt you've tested with and detailed logs from nova-
compute showing the error.

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1906266

Title:
  After upgrade: "libvirt.libvirtError: Requested operation is not
  valid: format of backing image %s of image %s was not specified"

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  In a site upgraded to Ussuri we are getting faults starting instances

  2020-11-30 13:41:40.586 232871 ERROR oslo_messaging.rpc.server
  libvirt.libvirtError: Requested operation is not valid: format of
  backing image '/var/lib/nova/instances/_base/xxx' of image
  '/var/lib/nova/instances/xxx' was not specified in the image metadata
  (See https://libvirt.org/kbase/backing_chains.html for
  troubleshooting)

  Bug #1864020 reports similar symptoms, where due to an upstream change
  in Libvirt v6.0.0+ images need the backing format specified.

  The fix for Bug #1864020 handles the case for new instances. However,
  for upgraded instances we're hitting the same problem, as those still
  don't have backing format specified.
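
  A possible workaround for such upgraded instances (our assumption,
  not a nova-verified procedure) is to record the backing format
  manually with qemu-img while the instance is shut off; '-F raw'
  assumes the usual raw _base image:

    $ qemu-img rebase -u -F raw -b /var/lib/nova/instances/_base/xxx /var/lib/nova/instances/xxx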

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1906266/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1907216] Re: Wrong image ref after unshelve

2020-12-15 Thread Stephen Finucane
Moving this to invalid based on the comments from Lucian on [1]

  I think this should be fixed at the Hyper-V driver level. The stashed image 
will be removed
  from Glance once the instance is unstashed, so there's no value in updating 
the Nova
  instance db record to point to it. In fact, users are probably interested in 
the original
  image.

[1] https://review.opendev.org/c/openstack/nova/+/765924

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1907216

Title:
  Wrong image ref after unshelve

Status in compute-hyperv:
  New
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  After an instance is unshelved, the instance image ref will point to
  the original image instead of the snapshot created during the shelving
  [1][2].

  Subsequent instance operations will use the wrong image id. For
  example, in case of cold migrations, Hyper-V instances will be unable
  to boot since the differencing images will have the wrong base [3].
  Other image related operations might be affected as well.

  As pointed out by Matt Riedemann on the patch [1], Nova shouldn't set
  back the original image id, instead it should use the snapshot id.

  [1] I3bba0a230044613e07122a6d122597e5b8d43438
  [2] 
https://github.com/openstack/nova/blob/22.0.1/nova/compute/manager.py#L6625
  [3] http://paste.openstack.org/raw/800822/

To manage notifications about this bug go to:
https://bugs.launchpad.net/compute-hyperv/+bug/1907216/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1908133] Re: Nova does not track shared ceph pools across multiple nodes

2020-12-15 Thread Stephen Finucane
*** This bug is a duplicate of bug 1522307 ***
https://bugs.launchpad.net/bugs/1522307

This is a well-known issue. Closing as a duplicate.

** This bug has been marked a duplicate of bug 1707256
   Scheduler report client does not account for shared resource providers

** This bug is no longer a duplicate of bug 1707256
   Scheduler report client does not account for shared resource providers

** This bug has been marked a duplicate of bug 1522307
   Disk usage not work for shared storage

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1908133

Title:
  Nova does not track shared ceph pools across multiple nodes

Status in OpenStack Compute (nova):
  New

Bug description:
  Environment:
  - tested in focal-victoria and bionic-stein

  ==

  Steps to reproduce:
  1) Deploy OpenStack having 2 nova-compute nodes
  2) Configure both compute nodes to have a RBD backend pointing to the same 
pool in ceph as below:

  [libvirt]
  images_type = rbd
  images_rbd_pool = nova

  3) run "openstack hypervisor show" on each node. Both will show the
  full pool capacity:

  local_gb | 29
  local_gb_used| 0
  free_disk_gb | 29
  disk_available_least | 15

  4) create a 20gb instance and run "openstack hypervisor show" again on
  the node it landed:

  local_gb | 29
  local_gb_used| 20
  free_disk_gb | 9
  disk_available_least | 15

  5) create another 20GB one. It will land on the other hypervisor
  6) try to create a third 20GB one, it will fail because placement will not 
return an allocation candidate. This is correct.
  7) Now ssh to both the instances and fill their disk (actually based on 
disk_available_least that is read from ceph df, only one may need to be filled)
  8) I/O for all instances will be frozen as the ceph pool runs out of space, 
and the nova-compute service freezes on "create_image" whenever a new instance 
is attempted to be created there, causing it to be reported as "down".
  9) disk_available_least will be updated to 0, but that doesn't prevent new 
instances from being scheduled.

  This is the first problem as both compute nodes have their tracking
  disconnected from the ceph pool on "free_disk_gb" and "local_gb_used",
  while "disk_available_least" is not used by the scheduler to prevent
  the problem while disk_allocation_ratio is 1.0 (it is used by live-
  migration appropriately though).

  Alternatively (as a possible solution/fix/workaround), following the
  steps in [0] and [1] to have placement as a centralized place for the
  shared ceph pool. I ran the following steps:

  10) openstack resource provider create ceph_nova_pool

  11) openstack resource provider inventory set --os-placement-api-version 1.19 --resource DISK_GB=30 <ceph_nova_pool uuid>

  12) openstack resource provider trait set --os-placement-api-version 1.19 <ceph_nova_pool uuid> --trait MISC_SHARES_VIA_AGGREGATE

  13) openstack resource provider aggregate set <ceph_nova_pool uuid> --aggregate <aggregate uuid> --aggregate <aggregate uuid> --generation 2 --os-placement-api-version 1.19

  14) Deleted all instances and repeated steps 4, 5 and 6 but same
  result

  15) openstack resource provider set --name <compute node 1 name> --parent-provider <ceph_nova_pool uuid> <compute node 1 uuid> --os-placement-api-version 1.19

  16) openstack resource provider set --name <compute node 2 name> --parent-provider <ceph_nova_pool uuid> <compute node 2 uuid> --os-placement-api-version 1.19

  17) Deleted all instances and repeated steps 4, 5 and 6. Now I was
  able to create 3 instances, where 1 of them had allocations from the
  ceph_nova_pool resource provider. The created resource_provider is
  being treated as an "extra" resource provider.

  18) Deleted 2 instances that had allocations from the compute nodes

  19) openstack resource provider inventory delete <compute node 1 uuid> --resource-class DISK_GB

  20) openstack resource provider inventory delete <compute node 2 uuid> --resource-class DISK_GB

  21) watch openstack allocation candidate list --resource DISK_GB=20
  --os-placement-api-version 1.19

  Now, the list would be empty, until nova-compute periodically updates
  the inventory with its local_gb value and we go back to the state at
  step 17.

  
  ==

  Expected result:
  - For the first approach, it is expected that scheduling would take the
  disk_available_least value into account (along with
  disk_allocation_ratio) to avoid allowing the creation of instances when
  there is no space.
  - For the second approach, it is expected that there is a way to prevent
  nova-compute from periodically overwriting a specific inventory, or to
  guarantee that its inventory is shared with another resource provider
  instead of an "extra" one.


  [0] 
https://github.com/openstack/placement/blob/c02a073c523d363d7136677ab12884dc4ec03e6f/placement/objects/research_context.py#L1107
  [1] https://docs.openstack.org/placement/latest/user/provider-tree.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1908133/+subscriptions

[Yahoo-eng-team] [Bug 1907179] Re: resize revert will let new data lost!

2020-12-11 Thread Stephen Finucane
I assume you're referring to instances with ephemeral or local storage?
If so, this is expected behaviour. Resize revert deletes the instance on
the destination host and resumes the old instance on the source host.
You can work around this by using boot from volume or RBD.

** Changed in: nova
   Status: New => Opinion

** Changed in: nova
   Status: Opinion => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1907179

Title:
  resize revert will let new data lost!

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  
  * Description:
    Hi all, I found a serious problem with the revert resize action for
  a VM: the revert resize operation loses data written between the
  resize and the revert. The reproduction steps are shown below.

  
  * Step-by-step reproduction steps:
  1. Create a new VM with flavor C1-R2-D10 (Core: 1, RAM: 2G, Disk: 10),
  image: CentOS 7.6

  2. Log in to the VM and create a new file, 1.txt, writing some data to
  it, e.g.: echo 'aaa' > 1.txt

  3. Resize the VM, changing the flavor to C2-R4-D20, but do not confirm
  or revert the operation yet.

  4. Log in to the VM again and create another new file, 2.txt, writing
  some data to it, e.g.: echo 'bbb' > 2.txt

  5. Now revert the resize operation performed in step 3.

  6. When the revert resize is done, log in to the VM again: 2.txt is
  gone.

  
  * Expected output:

  I think the new file 2.txt, added between the resize and the revert,
  should be preserved.

  * Actual output:

  The 2.txt file is lost!

  * Version:
    ** OpenStack version: Rocky
    ** Linux distro, kernel: CentOS Linux release 7.8.2003 (Core),
  3.10.0-1127.el7.x86_64

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1907179/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1906781] Re: One of the image's name is Chinese, the execution of glance image-list shows the error "ascii codec can't encode characters in position 953-954: ordinal not in range

2020-12-10 Thread Stephen Finucane
I don't know how this is related to nova. Moving to glanceclient.

** Project changed: nova => python-glanceclient

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1906781

Title:
  One of the image's name is Chinese, the execution of glance image-list
  shows the error "ascii codec can't encode characters in position
  953-954: ordinal not in range(128)".

Status in Glance Client:
  New

Bug description:
  One of the images has a Chinese name; executing "glance image-list"
  shows the error "ascii codec can't encode characters in position
  953-954: ordinal not in range(128)".

  1. Create an image whose name is Chinese.
  2. Execute "glance image-list".
  3. Error: ascii codec can't encode characters in position 953-954:
  ordinal not in range(128)

To manage notifications about this bug go to:
https://bugs.launchpad.net/python-glanceclient/+bug/1906781/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1903879] Re: Server Remove Fixed IP is not working in the Rocky

2020-12-10 Thread Stephen Finucane
This is an issue with OSC, not nova [1]. The fix should be relatively
easy but no one has had a chance to address it yet, unfortunately.

[1] https://storyboard.openstack.org/#!/story/2002925

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1903879

Title:
  Server Remove Fixed IP is not working in the Rocky

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===
  I am testing features of the Rocky release before we upgrade from Queens,
and I have possibly found a bug, but maybe I am just doing something wrong.

  Steps to reproduce
  ==
  1. create a server instance (IP given by DHCP = 10.244.255.28)
  2. add a new fixed IP address to this instance (IP given by DHCP = 
10.244.255.22)
  3. list instance to see the result

  os1-lab1:~ # openstack server list --long
  | ID   | Name  | Status | Task State 
| Power State | Networks
  | Image Name  | Image ID | 
Flavor Name | Flavor ID| Availability Zone | Host   
 | Properties |
  | 70f85125-d90f-4eba-899d-3c89e2bea697 | lwq-test-snap | ACTIVE | None   
| Running | lab-net=10.244.255.28, 2aff:::::3b, 10.244.255.22, 
2aff:::::26 | lwq-test-snap   | 
554564df-245c-4e6b-8a02-47556e684c0b | t1.2c2r10d  | 
a905bd3c-db79-415f-abd8-29666db713b4 | az2   | os1-lab10 |  
  |
  4. try to remove last added IP
  os1-lab1.ko:~ # openstack server remove fixed ip lwq-test-snap 10.244.255.22
  remove_fixed_ip
  5. nothing happened, even logs are clear and the server list shows the exact 
same output as posted above

  
  Expected result
  ===
  The specified IP address should be removed. I've used this procedure on the
Queens release and earlier countless times.

  Actual result
  =
  Nothing happened as shown above.

  Environment
  ===
  1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/

  os1-lab1.ko:~ # dpkg -l | grep nova
  ii  nova-api   2:18.3.0-0ubuntu1~cloud1   
 all  OpenStack Compute - API frontend
  ii  nova-common2:18.3.0-0ubuntu1~cloud1   
 all  OpenStack Compute - common files
  ii  nova-conductor 2:18.3.0-0ubuntu1~cloud1   
 all  OpenStack Compute - conductor service
  ii  nova-novncproxy2:18.3.0-0ubuntu1~cloud1   
 all  OpenStack Compute - NoVNC proxy
  ii  nova-placement-api 2:18.3.0-0ubuntu1~cloud1   
 all  OpenStack Compute - placement API frontend
  ii  nova-scheduler 2:18.3.0-0ubuntu1~cloud1   
 all  OpenStack Compute - virtual machine scheduler
  ii  python3-nova   2:18.3.0-0ubuntu1~cloud1   
 all  OpenStack Compute Python 3 libraries
  ii  python3-novaclient 2:11.0.0-0ubuntu1~cloud0   
 all  client library for OpenStack Compute API - 3.x

  2. Which hypervisor did you use?
 - Libvirt + KVM
  os1-lab1.ko:~ # dpkg -l | grep libvirt
  ii  libvirt0:amd64 4.0.0-1ubuntu8.17  
 amd64

  2. Which storage type did you use?
 - local storage + raw qcow2

  3. Which networking type did you use?
 - nova-network + calico

  Logs & Configs
  ==
     - I did not find anything useful; please specify what you would like me
to collect.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1903879/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1581977] Re: Invalid input for dns_name when spawning instance with .number at the end

2020-11-27 Thread Stephen Finucane
I disagree. We already do sanitization of the hostname and fallback to a
hostname 'Server-{instance.uuid}' if that returns an empty string. I
think we should also do this fallback if the hostname is not a valid
FQDN. Personally, I'd rather we provided a mechanism to set hostnames
that was entirely decoupled from the instance name, like below, but
that's a lot of work and I don't want to do it :)

  openstack server create --hostname foo.bar ...

Until someone puts in the effort to do that, extending what we have will
do just fine.
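
To illustrate the suggested fallback, a minimal standalone sketch (not
nova's actual sanitization code):

  import re

  def hostname_for_instance(display_name, instance_uuid):
      # Crude stand-in for nova's hostname sanitization
      hostname = re.sub(r'[^\w.-]+', '-', display_name).strip('-.').lower()
      # Reject names whose TLD is all-numeric (the check neutron
      # applies to dns_name) and fall back to a safe name
      tld = hostname.rsplit('.', 1)[-1]
      if not hostname or tld.isdigit():
          hostname = 'Server-%s' % instance_uuid
      return hostname

With this, 'networking-ovn (Ubuntu 16.04)' sanitizes to
'networking-ovn-ubuntu-16.04', whose apparent TLD '04' is all-numeric,
so the 'Server-{instance.uuid}' fallback would be used instead.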

** Changed in: nova
   Status: Opinion => Triaged

** Changed in: nova
   Importance: Wishlist => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1581977

Title:
  Invalid input for dns_name when spawning instance with .number at the
  end

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  When attempting to deploy an instance with a name which ends in a dot
  followed by a number (e.g. .123, as in an all-numeric TLD), or simply a
  name that, after conversion to dns_name, ends as .<number>, nova
  conductor fails with the following error:

  2016-05-15 13:15:04.824 ERROR nova.scheduler.utils [req-4ce865cd-e75b-
  4de8-889a-ed7fc7fece18 admin demo] [instance:
  c4333432-f0f8-4413-82e8-7f12cdf3b5c8] Error from last host:
  silpixa00394065 (node silpixa00394065): [u'Traceback (most recent call
  last):\n', u'  File "/opt/stack/nova/nova/compute/manager.py", line
  1926, in _do_build_and_run_instance\nfilter_properties)\n', u'
  File "/opt/stack/nova/nova/compute/manager.py", line 2116, in
  _build_and_run_instance\ninstance_uuid=instance.uuid,
  reason=six.text_type(e))\n', u"RescheduledException: Build of instance
  c4333432-f0f8-4413-82e8-7f12cdf3b5c8 was re-scheduled: Invalid input
  for dns_name. Reason: 'networking-ovn-ubuntu-16.04' not a valid PQDN
  or FQDN. Reason: TLD '04' must not be all numeric.\nNeutron server
  returns request_ids: ['req-7317c3e3-2875-4073-8076-40e944845b69']\n"]

  This throws one instance of the infamous Horizon message: Error: No
  valid host was found. There are not enough hosts available.

  
  This issue was observed using stable/mitaka via DevStack (nova commit 
fb3f1706c68ea5b58f05ea810c6339f2449959de).

  In the above example, the instance name is "networking-ovn (Ubuntu
  16.04)", which resulted in an attempted dns_name="networking-ovn-
  ubuntu-16.04", where the 04 was interpreted as a TLD and,
  consequently, an invalid TLD.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1581977/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1852727] Re: PCI passthrough documentation does not describe the steps necessary to passthrough PFs

2020-11-26 Thread Stephen Finucane
** Also affects: nova/trunk
   Importance: Undecided
   Status: New

** Changed in: nova/trunk
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova/trunk
   Importance: Undecided => Low

** Changed in: nova/trunk
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** No longer affects: nova/trunk

** Also affects: nova/train
   Importance: Undecided
   Status: New

** Changed in: nova/train
   Importance: Undecided => Low

** Changed in: nova/train
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1852727

Title:
  PCI passthrough documentation does not describe the steps necessary to
  passthrough PFs

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  Confirmed

Bug description:
  This came up on IRC [1]. By default, nova will not allow you to use PF
  devices unless you specifically request this type of device. This is
  intentional behavior to allow users to whitelist all devices from a
  particular vendor and avoid passing through the PF device when they
  meant to only consume the VFs. In the future, we might want to prevent
  whitelisting of both PF and VFs, but for now we should document the
  current behavior.

  [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova
  /%23openstack-nova.2019-11-15.log.html#t2019-11-15T08:39:17

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1852727/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1904446] [NEW] 'GetPMEMNamespacesFailed' is not a valid exception

2020-11-16 Thread Stephen Finucane
Public bug reported:

Attempting to retrieve a non-existent PMEM device results in the
following traceback:

./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova During handling 
of the above exception, another exception occurred:
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova Traceback (most 
recent call last):
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/bin/nova-compute", line 10, in 
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova 
sys.exit(main())
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line 57, in main
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova 
topic=compute_rpcapi.RPC_TOPIC)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/service.py", line 271, in create
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova 
periodic_interval_max=periodic_interval_max)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/service.py", line 129, in __init__
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova self.manager 
= manager_class(host=self.host, *args, **kwargs)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 571, in 
__init__
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova self.driver = 
driver.load_compute_driver(self.virtapi, compute_driver)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/virt/driver.py", line 1911, in 
load_compute_driver
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova virtapi)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/oslo_utils/importutils.py", line 44, in 
import_object
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova return 
import_class(import_str)(*args, **kwargs)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 446, in 
__init__
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova 
vpmem_conf=CONF.libvirt.pmem_namespaces)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 477, in 
_discover_vpmems
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova vpmems_host = 
self._get_vpmems_on_host()
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 512, in 
_get_vpmems_on_host
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova raise 
exception.GetPMEMNamespacesFailed(reason=reason)
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova AttributeError: 
module 'nova.exception' has no attribute 'GetPMEMNamespacesFailed'
./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova

It seems a typo was introduced when this code was added. The code
referenced 'GetPMEMNamespacesFailed', but the exception, which has since
been removed as "unused", was called 'GetPMEMNamespaceFailed'.
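
A minimal sketch of the failure mode (module and attribute names are
stand-ins): looking up an attribute that no longer exists on a module
raises AttributeError before the intended exception is ever constructed.

  import types

  # stand-in for the real nova.exception module
  exception = types.ModuleType('nova.exception')

  class GetPMEMNamespaceFailed(Exception):  # note: singular 'Namespace'
      pass

  exception.GetPMEMNamespaceFailed = GetPMEMNamespaceFailed

  try:
      # the plural spelling is the typo; the lookup fails first
      raise exception.GetPMEMNamespacesFailed(reason='...')
  except AttributeError as exc:
      print(exc)
      # module 'nova.exception' has no attribute 'GetPMEMNamespacesFailed'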

** Affects: nova
 Importance: Medium
 Status: Confirmed


** Tags: libvirt

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
   Status: New => Confirmed

** Tags added: libvirt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1904446

Title:
  'GetPMEMNamespacesFailed' is not a valid exception

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Attempting to retrieve a non-existent PMEM device results in the
  following traceback:

  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova During handling 
of the above exception, another exception occurred:
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova Traceback (most 
recent call last):
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/bin/nova-compute", line 10, in 
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova 
sys.exit(main())
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line 57, in main
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova 
topic=compute_rpcapi.RPC_TOPIC)
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova   File 
"/usr/lib/python3.6/site-packages/nova/service.py", line 271, in create
  ./nova-compute.log.1:2020-11-16 16:01:22.704 7 ERROR nova 

[Yahoo-eng-team] [Bug 1904051] [NEW] Intermittent failures in cross-cell functional tests

2020-11-12 Thread Stephen Finucane
Public bug reported:

Some functional tests are failing due to the following error:

Captured traceback:
~~~
Traceback (most recent call last):

  File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/test_cross_cell_migrate.py",
 line 1076, in test_resize_revert_from_stopped
self.api.post_server_action(server['id'], {'migrate': None})

  File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py",
 line 268, in post_server_action
return self.api_post(

  File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py",
 line 210, in api_post
return APIResponse(self.api_request(relative_uri, **kwargs))

  File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py",
 line 186, in api_request
raise OpenStackApiException(

nova.tests.functional.api.client.OpenStackApiException: Unexpected
status code: {"conflictingRequest": {"code": 409, "message": "Cannot
'migrate' instance 8841d71c-c29d-4dc8-9736-98dbc6ee221f while it is in
task_state resize_reverting"}}

This appears to be because we're not waiting for the resize-revert
operation to fully complete before attempting other operations. We need
to wait for the versioned notification emitted on the source compute,
which occurs after the instance's task state has been updated, as
opposed to simply waiting for the migration record to change status,
which occurs before (and on the destination node).

** Affects: nova
 Importance: Medium
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress


** Tags: gate-failure

** Changed in: nova
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1904051

Title:
  Intermittent failures in cross-cell functional tests

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Some functional tests are failing due to the following error:

  Captured traceback:
  ~~~
  Traceback (most recent call last):

File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/test_cross_cell_migrate.py",
 line 1076, in test_resize_revert_from_stopped
  self.api.post_server_action(server['id'], {'migrate': None})

File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py",
 line 268, in post_server_action
  return self.api_post(

File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py",
 line 210, in api_post
  return APIResponse(self.api_request(relative_uri, **kwargs))

File 
"/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/api/client.py",
 line 186, in api_request
  raise OpenStackApiException(

  nova.tests.functional.api.client.OpenStackApiException: Unexpected
  status code: {"conflictingRequest": {"code": 409, "message": "Cannot
  'migrate' instance 8841d71c-c29d-4dc8-9736-98dbc6ee221f while it is in
  task_state resize_reverting"}}

  This appears to be because we're not waiting for the resize-revert
  operation to fully complete before attempting other operations. We
  need to wait for the versioned notification emitted on the source
  compute, which occurs after the instance's task state has been
  updated, as opposed to simply waiting for the migration record to
  change status, which occurs before (and on the destination node).

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1904051/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1898554] [NEW] Legacy 'InstanceNUMACell' with 'mixed' policy results in 'TypeError'

2020-10-05 Thread Stephen Finucane
Public bug reported:

We added support for the 'mixed' CPU policy in Victoria. This required
changes to the 'cpu_policy' field of the 'InstanceNUMACell' object. As
part of that change, we had to check that the consumer of the o.vo
supported the 'mixed' policy and, if not, raise an 'ObjectActionError'.

Unfortunately we're attempting to use a tuple as a string in the string
formatting for that exception's error message. As a result, if you
attempt to actually raise it, you see the following:

  TypeError: not all arguments converted during string formatting
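
A minimal reproduction of this class of bug, with illustrative values:
%-formatting treats a bare tuple as multiple arguments, not one value.

  supported = ('dedicated', 'shared')
  try:
      # the tuple is consumed as two format arguments for one '%s' slot
      msg = 'cpu_policy must be one of %s' % supported
  except TypeError as exc:
      print(exc)  # not all arguments converted during string formatting

  # the fix: wrap the tuple so it is formatted as a single value
  print('cpu_policy must be one of %s' % (supported,))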

** Affects: nova
 Importance: Low
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed


** Tags: libvirt numa

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Tags added: libvirt numa

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1898554

Title:
  Legacy 'InstanceNUMACell' with 'mixed' policy results in 'TypeError'

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  We added support for the 'mixed' CPU policy in Victoria. This required
  changes to the 'cpu_policy' field of the 'InstanceNUMACell' object. As
  part of that change, we had to check that the consumer of the o.vo
  supported the 'mixed' policy and, if not, raise an
  'ObjectActionError'.

  Unfortunately we're attempting to use a tuple as a string in the
  string formatting for that exception's error message. As a result, if
  you attempt to actually raise it, you see the following:

TypeError: not all arguments converted during string formatting

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1898554/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1898272] [NEW] "mixed" policy calculations don't account for host cells with no free shared CPUs

2020-10-02 Thread Stephen Finucane
Public bug reported:

The 'mixed' CPU policy allows us to use both shared and dedicated CPUs
(VCPU and PCPU) in the same instance. The expectation is that both sets
of CPUs will use host cores from the same NUMA node(s). The current code
does appear to be doing this, at least for single NUMA nodes; however,
it does not account for NUMA nodes without any shared CPUs.

# Steps to reproduce

Configure a dual NUMA node host so that all cores from one node are
assigned to '[compute] cpu_shared_set', while all the cores from the
other node are assigned to '[compute] cpu_dedicated_set'. For example,
on a host where cores 0-5 are on node 0, while cores 6-11 are on node 1:

  [compute]
  cpu_shared_set = 0-5
  cpu_dedicated_set = 6-11

 Now attempt to boot a guest using the mixed policy, e.g.

  $ openstack flavor create --vcpu 4 --ram 512 --disk 1 \
  --property 'hw:cpu_policy=mixed' --property 'hw:cpu_dedicated_mask=^0' \
  test.mixed
  $ openstack server create --os-compute-api-version=2.latest \
  --flavor test.mixed --image cirros-0.5.1-x86_64-disk --nic none --wait \
  test-server

# Expected result

The instance should fail to schedule as the 'NUMATopologyFilter' should
reject the host.
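
A rough sketch of the missing check (illustrative only, not nova's
code): for a 'mixed' instance, a candidate host NUMA cell must have both
shared and dedicated CPUs available.

  def cell_can_fit_mixed(shared_free, dedicated_free,
                         want_shared, want_dedicated):
      # a 'mixed' instance cell needs both kinds of CPU on the same node
      return (shared_free >= want_shared and
              dedicated_free >= want_dedicated)

  # the host above: node 0 is all shared, node 1 all dedicated, and the
  # flavor wants 1 shared plus 3 dedicated CPUs
  print(cell_can_fit_mixed(6, 0, 1, 3))  # False
  print(cell_can_fit_mixed(0, 6, 1, 3))  # False -> no valid host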

# Actual result

The instance is scheduled but fails to boot since the following invalid
XML snippet is generated:

  
  [XML snippet garbled in the archive: the element tags were stripped,
  leaving only a value of 4096 and the '# <--- here' marker pointing at
  the invalid, empty element.]

This results in the following traceback in the nova-compute logs.

  ERROR nova.compute.manager [instance: ...] Traceback (most recent call last):
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/compute/manager.py", line 2625, in _build_resources
  ERROR nova.compute.manager [instance: ...] yield resources
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/compute/manager.py", line 2398, in _build_and_run_instance
  ERROR nova.compute.manager [instance: ...] accel_info=accel_info)
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 3752, in spawn
  ERROR nova.compute.manager [instance: ...] 
cleanup_instance_disks=created_disks)
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 6749, in 
_create_guest_with_network
  ERROR nova.compute.manager [instance: ...] 
cleanup_instance_disks=cleanup_instance_disks)
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  ERROR nova.compute.manager [instance: ...] self.force_reraise()
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, 
self.value, self.tb)
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
  ERROR nova.compute.manager [instance: ...] raise value
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 6718, in 
_create_guest_with_network
  ERROR nova.compute.manager [instance: ...] 
post_xml_callback=post_xml_callback)
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 6643, in _create_guest
  ERROR nova.compute.manager [instance: ...] guest = 
libvirt_guest.Guest.create(xml, self._host)
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/guest.py", line 145, in create
  ERROR nova.compute.manager [instance: ...] encodeutils.safe_decode(xml))
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  ERROR nova.compute.manager [instance: ...] self.force_reraise()
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, 
self.value, self.tb)
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
  ERROR nova.compute.manager [instance: ...] raise value
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/guest.py", line 141, in create
  ERROR nova.compute.manager [instance: ...] guest = 
host.write_instance_config(xml)
  ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/host.py", line 1144, in write_instance_config
  ERROR nova.compute.manager [instance: ...] domain = 
self.get_connection().defineXML(xml)
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 190, in doit
  ERROR nova.compute.manager [instance: ...] result = 
proxy_call(self._autowrap, f, *args, **kwargs)
  

[Yahoo-eng-team] [Bug 1896496] [NEW] Combination of 'hw_video_ram' image metadata prop, 'hw_video:ram_max_mb' extra spec raises error

2020-09-21 Thread Stephen Finucane
  ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/libvirt.py", line 3703, in defineXML
  ERROR nova.compute.manager [instance: ...] if ret is None:raise 
libvirtError('virDomainDefineXML() failed', conn=self)
  ERROR nova.compute.manager [instance: ...] libvirt.libvirtError: XML error: 
cannot parse video vram '8192.0'
  ERROR nova.compute.manager [instance: ...]

This appears to be a Python 3 issue, introduced by true division of ints
now returning a float.
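
A minimal illustration of the suspected cause, assuming the unit
conversion goes through oslo_utils.units (illustrative, not nova's exact
code): the float ends up serialized as vram='8192.0', which libvirt
rejects.

  from oslo_utils import units

  video_ram = 8                            # hw_video_ram, in MiB
  vram = video_ram * units.Mi / units.Ki   # true division -> float
  print(vram)                              # 8192.0, rejected by libvirt
  print(video_ram * units.Mi // units.Ki)  # 8192; floor division keeps an int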

Steps to reproduce:

1. Set the 'hw_video_ram' image metadata property on an image:

   $ openstack image set --property hw_video_ram=8 $IMAGE

2. Set the 'hw_video:ram_max_mb' flavor extra spec on a flavor:

   $ openstack flavor set --property hw_video:ram_max_mb=16384
$FLAVOR

3. Create a server using this flavor and image:

   $ openstack server create --image $IMAGE --flavor $FLAVOR ... test-
server

Expected result:

Instance should be created with 8MB of VRAM.

Actual result:

Instance fails to create.

** Affects: nova
 Importance: Low
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed


** Tags: libvirt

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Tags added: libvirt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1896496

Title:
  Combination of 'hw_video_ram' image metadata prop,
  'hw_video:ram_max_mb' extra spec raises error

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  The 'hw_video_ram' image metadata property is used to configure the
  amount of memory allocated to VRAM. Using it requires specifying the
  'hw_video:ram_max_mb' extra spec or you'll get the following error:

nova.exception.RequestedVRamTooHigh: The requested amount of video
  memory 8 is higher than the maximum allowed by flavor 0.

  However, specifying these currently results in a libvirt failure.

ERROR nova.compute.manager [None ...] [instance: 
11a71ae4-e410-4856-aeab-eea6ca4784c5] Failed to build and run instance: 
libvirt.libvirtError: XML error: cannot parse video vram '8192.0'
ERROR nova.compute.manager [instance: ...] Traceback (most recent call 
last):
ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/compute/manager.py", line 2333, in _build_and_run_instance
ERROR nova.compute.manager [instance: ...] accel_info=accel_info)
ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 3632, in spawn
ERROR nova.compute.manager [instance: ...] 
cleanup_instance_disks=created_disks)
ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 6527, in 
_create_domain_and_network
ERROR nova.compute.manager [instance: ...] 
cleanup_instance_disks=cleanup_instance_disks)
ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
ERROR nova.compute.manager [instance: ...] self.force_reraise()
ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, 
self.value, self.tb)
ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
ERROR nova.compute.manager [instance: ...] raise value
ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 6496, in 
_create_domain_and_network
ERROR nova.compute.manager [instance: ...] 
post_xml_callback=post_xml_callback)
ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 6425, in _create_domain
ERROR nova.compute.manager [instance: ...] guest = 
libvirt_guest.Guest.create(xml, self._host)
ERROR nova.compute.manager [instance: ...]   File 
"/opt/stack/nova/nova/virt/libvirt/guest.py", line 127, in create
ERROR nova.compute.manager [instance: ...] encodeutils.safe_decode(xml))
ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
ERROR nova.compute.manager [instance: ...] self.force_reraise()
ERROR nova.compute.manager [instance: ...]   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
ERROR nova.compute.manager [instance: ...] six.reraise(self.type_, 
self.value, self.tb)
ERROR nova.compute.manager

[Yahoo-eng-team] [Bug 1599400] Re: nova boot has unexpected API error

2020-09-21 Thread Stephen Finucane
The move to validate these parameters at the API layer introduced in
Stein combined with the flavor extra spec validation work in Ussuri (API
microversion 2.86 or later) should have seen off this issue.
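
A rough sketch of the kind of validation that work added (not nova's
actual schema): known extra specs are checked against an enum up front,
turning the late 500 into an early 400.

  VALIDATORS = {
      'hw:cpu_policy': {'dedicated', 'shared', 'mixed'},
      'hw:cpu_thread_policy': {'prefer', 'isolate', 'require'},
  }

  def validate_extra_spec(key, value):
      allowed = VALIDATORS.get(key)
      if allowed is not None and value not in allowed:
          raise ValueError('%r is not a valid value for %s; expected one '
                           'of %s' % (value, key, sorted(allowed)))

  try:
      validate_extra_spec('hw:cpu_thread_policy', 'shared')
  except ValueError as exc:
      print(exc)  # rejected at the API layer, not at boot time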

** Changed in: nova
   Status: In Progress => Won't Fix

** Changed in: nova
 Assignee: Ken'ichi Ohmichi (oomichi) => (unassigned)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1599400

Title:
  nova boot has unexpected API error

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Description:
  =

  Nova allows users to set free-form flavor extra specs "hw:cpu_policy"
  and "hw:cpu_thread_policy". However, these values are not truly free-
  form, but rather enum values. Specifying an invalid value for one of
  these, and booting an instance with the invalid flavor, will result in
  an uncaught ValueError in Nova and an HTTP 500 code being returned to
  the user.

  Reproduce:
  =

  # 1. create flavor 11 with an illegal extra_spec
  "hw:cpu_thread_policy=shared"

  $ nova flavor-create test 11 128 1 3
  $ nova flavor-key 11 set hw:cpu_policy=dedicated
  $ nova flavor-key 11 set hw:cpu_thread_policy=shared

  # 2. boot an instance from that malformed flavor 11

  $ nova boot --image  --flavor 11 test

  Output:
  =

  ERROR (ClientException): Unexpected API Error. Please report this at 
http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
   <class 'ValueError'> (HTTP 500) (Request-ID: 
req-a26ad5f3-7982-4361-8817-0ab111ac9ab1)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1599400/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1616539] Re: architecture not validated in "openstack image create"

2020-09-21 Thread Stephen Finucane
This bug should be filed against glance, rather than nova. From what I
can tell, glance provides a config option to allow users to opt in to
only allowing valid image metadata properties.

** Changed in: nova
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1616539

Title:
  architecture not validated in "openstack image create"

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  On Liberty

   

  $ openstack image create \
--public \
--container-format bare \
--disk-format qcow2 \
--min-disk 2 --min-ram 512 \
--file /home/images/SLES12SP1-cloudimage.qcow2 \
SLES12SP1-x86_64
  +--+--+
  | Field| Value|
  +--+--+
  | checksum | fcdeb8b10730ac96bccc9a121ee030f4 |
  | container_format | bare |
  | created_at   | 2016-08-02T20:51:45Z |
  | disk_format  | qcow2|
  | file | /v2/images/e7f289aa-e689-4f0a-a0a0-43f341986fd5/file |
  | id   | e7f289aa-e689-4f0a-a0a0-43f341986fd5 |
  | min_disk | 2|
  | min_ram  | 512  |
  | name | SLES12SP1-x86_64 |
  | owner| f7ed231f244b4b2db8b1e580f36e1580 |
  | protected| False|
  | schema   | /v2/schemas/image|
  | size | 362847744|
  | status   | active   |
  | updated_at   | 2016-08-02T20:51:51Z |
  | virtual_size | None |
  | visibility   | public   |
  +--+--+

  # openstack image set \
--name SLES12-SP1 \
--architecture x96_64 \  # <-- the problem: typo for x86_64
--os-distro sles \
--os-version 12.1 \
SLES12SP1-x86_64
  +--+--+
  | Field| Value|
  +--+--+
  | architecture | x96_64   |
  | checksum | fcdeb8b10730ac96bccc9a121ee030f4 |
  | container_format | bare |
  | created_at   | 2016-08-02T20:51:45Z |
  | disk_format  | qcow2|
  | file | /v2/images/e7f289aa-e689-4f0a-a0a0-43f341986fd5/file |
  | id   | e7f289aa-e689-4f0a-a0a0-43f341986fd5 |
  | min_disk | 2|
  | min_ram  | 512  |
  | name | SLES12-SP1   |
  | os_distro| sles |
  | os_version   | 12.1 |
  | owner| f7ed231f244b4b2db8b1e580f36e1580 |
  | protected| False|
  | schema   | /v2/schemas/image|
  | size | 362847744|
  | status   | active   |
  | tags | []   |
  | updated_at   | 2016-08-02T20:53:00Z |
  | virtual_size | None |
  | visibility   | public   |
  +--+--+

  $ openstack server create \
--flavor m1.smaller \
--image SLES12-SP1 \
vm01
  Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ 
and attach the Nova API log if possible.
   (HTTP 500) (Request-ID: 
req-5fc1ad67-86ad-4a69-b7f4-905861a8f2fc)

  
   

  # openstack image set \
--name SLES12-SP1 \
--architecture x86_64 \
--os-distro sles \
--os-version 12.1 \
SLES12-SP1
  

[Yahoo-eng-team] [Bug 1466451] Re: Nova should verify that devname in pci_passthrough_whitelist is not empty

2020-09-21 Thread Stephen Finucane
** Changed in: nova
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1466451

Title:
  Nova should verify that devname in pci_passthrough_whitelist is not
  empty

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  According to 
https://wiki.openstack.org/wiki/SR-IOV-Passthrough-For-Networking:
  "The devname can be a valid PCI device name. The only device names that are 
supported are those displayed by the Linux utility ifconfig -a and correspond 
to either a PF or a VF on a vNIC"

  However it's possible to supply an empty string as devname
  e.g. pci_passthrough_whitelist = {"devname": "", 
"physical_network":"physnet2"}

  It's also possible to have an entry:
  pci_passthrough_whitelist = {"physical_network":"physnet2"} 
  which shouldn't be valid.

  Nova should verify that devname is not an empty string and that
  devname, address, or product_id/vendor_id are supplied.
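
A minimal sketch of the validation the reporter asks for (not nova's
actual whitelist parser): reject an empty devname and require at least
one device selector per pci_passthrough_whitelist entry.

  def validate_whitelist_entry(entry):
      if 'devname' in entry and not entry['devname']:
          raise ValueError('devname must not be an empty string')
      selectors = ('devname', 'address', 'vendor_id', 'product_id')
      if not any(key in entry for key in selectors):
          raise ValueError('entry must contain one of: %s'
                           % ', '.join(selectors))

  for entry in ({'devname': '', 'physical_network': 'physnet2'},
                {'physical_network': 'physnet2'}):
      try:
          validate_whitelist_entry(entry)
      except ValueError as exc:
          print(exc)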

  Version
  ==
  python-nova-2015.1.0-4.el7ost.noarch

  Expected result
  =
  Nova compute should fail to start when specifying an empty string for
devname when using physical_network, or when not specifying devname,
address, or product_id/vendor_id.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1466451/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1299151] Re: nova-consoleauth processes requests when disabled

2020-09-21 Thread Stephen Finucane
As noted in the review, nova-consoleauth has been removed so this bug no
longer makes sense.

** Changed in: nova
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1299151

Title:
  nova-consoleauth processes requests when disabled

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Not sure if this is a bug or not, but nova-consoleauth will process
  requests even if it is listed as disabled in the service list.

  | nova-consoleauth | u9-p| internal | disabled | up  |
  | nova-consoleauth | u10-p   | internal | enabled  | up|
  | nova-consoleauth | u11-p   | internal | enabled  | up|
  In this case I can watch as u9-p continues to process requests from the 
message bus.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1299151/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1810490] Re: wrong link of gabbits

2020-09-21 Thread Stephen Finucane
** Changed in: nova
   Status: In Progress => Fix Released

** Changed in: nova
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1810490

Title:
  wrong link of gabbits

Status in OpenStack Compute (nova):
  Fix Released

Bug description:

  This bug tracker is for errors with the documentation, use the
  following as a template and remove or add fields as you see fit.
  Convert [ ] into [x] to check boxes:

  - [x] This doc is inaccurate in this way: The "gabbits" url is incorrect at 
https://docs.openstack.org/placement/latest/#rest-api
  - [ ] This is a doc addition request.
  - [x] I have a fix to the document that I can paste below including example: 
input and output. 
  The correct link is 
http://git.openstack.org/cgit/openstack/placement/tree/placement/tests/functional/gabbits
  If you have a troubleshooting or support issue, use the following resources:

   - Ask OpenStack: http://ask.openstack.org
   - The mailing list: http://lists.openstack.org
   - IRC: 'openstack' channel on Freenode

  ---
  Release: 0.0.1.dev10886 on 2018-11-19 19:26:28
  SHA: 9d42491910e66ecd15767238bb617ed5984283f2
  Source: 
https://git.openstack.org/cgit/openstack/placement/tree/doc/source/index.rst
  URL: https://docs.openstack.org/placement/latest/

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1810490/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1840139] Re: Libvirt: Correct usage _guest_add_memory_balloon

2020-09-21 Thread Stephen Finucane
I'm not entirely sure what the issue is here, but this doesn't sound
like a bug per se. At least, it's not something an end user will
encounter.

** Changed in: nova
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1840139

Title:
  Libvirt: Correct usage _guest_add_memory_balloon

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  From the code, in function _guest_add_memory_balloon in [1], if
mem_stats_period_seconds is set to 0 or a negative value, memory usage
statistics will be disabled.
  Can mem_stats_period_seconds control whether the virtual memory
balloon device is added? Doesn't it only control memory usage statistics?

  The virtual memory balloon device will be added by libvirt as a
  default behavior. [2]

  So the name "_guest_add_memory_balloon" may be misleading.

  [1] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py
  [2] https://libvirt.org/formatdomain.html#elementsMemBalloon
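
A sketch of the semantics the reporter describes (hypothetical helper,
not nova's code): libvirt adds a default memballoon device either way,
so the option effectively only toggles the stats polling period.

  def effective_balloon_config(mem_stats_period_seconds):
      # the device is present regardless; only the stats period varies
      device = {'model': 'virtio'}
      if mem_stats_period_seconds > 0:
          device['stats_period'] = mem_stats_period_seconds
      return device

  print(effective_balloon_config(0))   # {'model': 'virtio'}
  print(effective_balloon_config(10))  # adds 'stats_period': 10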

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1840139/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1806079] Re: revert use of stestr in stable/pike

2020-09-21 Thread Stephen Finucane
** No longer affects: nova

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1806079

Title:
  revert use of stestr in stable/pike

Status in Ubuntu Cloud Archive:
  Fix Released

Bug description:
  The following commit changed dependencies of nova in the stable/pike
  branch and switched it to use stestr. There aren't any other projects
  (as far as I can tell) that use stestr in pike. This causes issues;
  for example, the Ubuntu cloud archive for pike doesn't have stestr. If
  possible I think this should be reverted.

  
  commit 5939ae995fdeb2746346ebd81ce223e4fe891c85
  Date:   Thu Jul 5 16:09:17 2018 -0400

  Backport tox.ini to switch to stestr
  
  The pike branch was still using ostestr (instead of stestr) which makes
  running tests significantly different from queens or master. To make
  things behave the same way this commit backports most of the tox.ini
  from queens so that pike will behave the same way for running tests.
  This does not use the standard backport mechanism because it involves a
  lot of different commits over time. It's also not a functional change
  for nova itself, so the proper procedure is less important here.
  
  Change-Id: Ie207afaf8defabc1d1eb9332f43a9753a00f784d

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1806079/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1855934] Re: new versions of flake8 parse typing comments

2020-09-21 Thread Stephen Finucane
This was fixed in 26c1567a16d0bbf9ae19327aeafaa7ebc4394946.

** Changed in: nova
   Status: In Progress => Invalid

** Changed in: nova
   Status: Invalid => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1855934

Title:
  new versions of flake8 parse typing comments

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  While playing with pre-commit I noticed that new versions of flake8 parse
type annotation comments. If you have not imported the relevant typing
module then it fails with 'F821 undefined name':

  nova/virt/hardware.py:1396:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1396:5: F821 undefined name 'List'
  nova/virt/hardware.py:1396:5: F821 undefined name 'Set'
  nova/virt/hardware.py:1426:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1426:5: F821 undefined name 'List'
  nova/virt/hardware.py:1426:5: F821 undefined name 'Set'
  nova/virt/hardware.py:1456:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1483:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1525:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1624:5: F821 undefined name 'Tuple'
  nova/virt/hardware.py:1646:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1658:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1674:5: F821 undefined name 'List'
  nova/virt/hardware.py:1696:5: F821 undefined name 'Optional'
  nova/virt/hardware.py:1920:29: F821 undefined name 'List'
  nova/virt/hardware.py:1939:31: F821 undefined name 'Set'

  While this is not an issue today, because we pin to an old version of
  flake8, we should still fix this as a code hygiene issue. Given this
  has no impact on the running code, I'm going to triage this as Low and
  push a trivial patch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1855934/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1847095] Re: The Nova Quobyte driver should use the LibvirtMountedFileSystemVolumeDriver parent class

2020-09-21 Thread Stephen Finucane
** Changed in: nova
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1847095

Title:
  The Nova Quobyte driver should use the
  LibvirtMountedFileSystemVolumeDriver parent class

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  See note at [1] stating that all LibvirtBaseFileSystemVolumeDriver
  children should subclass LibvirtMountedFileSystemVolumeDriver instead.

  
  [1] 
https://github.com/openstack/nova/blob/c6218428e9b29a2c52808ec7d27b4b21aadc0299/nova/virt/libvirt/volume/fs.py#L101

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1847095/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1741810] Re: Filter AggregateImagePropertiesIsolation doesn't Work

2020-09-16 Thread Stephen Finucane
We discussed this on IRC today [1]. In short, we realize that this was a
change in behaviour introduced in Liberty that should have been better
discussed at the time. However, Liberty was many years ago and it's
genuinely debatable whether this was ever intended behaviour, let alone
something we'd want to reintroduce support for.

Having discussed this, we're going to document this change in behaviour
in the docs and leave it there. If this (support for arbitrary image
metadata properties in this filter) is something you still see value in,
we'd probably have to treat it as a new feature. I'd encourage you to
file a spec [2] so we can evaluate the idea. If not, hopefully the
documentation change helps clarify things.

[1] 
http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2020-09-16.log.html#t2020-09-16T13:18:37
[2] https://specs.openstack.org/openstack/nova-specs/readme.html
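
A rough sketch of the post-Liberty behaviour being documented (not the
actual filter code): only recognised image properties are compared, so
an arbitrary aggregate key like 'os' never matches and the filter passes
every host.

  # illustrative subset of recognised image properties
  KNOWN_PROPS = {'hw_architecture', 'img_hv_type', 'os_distro'}

  def host_passes(aggregate_metadata, image_props):
      for key, values in aggregate_metadata.items():
          if key not in KNOWN_PROPS:
              continue  # arbitrary keys such as 'os' are skipped
          if image_props.get(key) not in values:
              return False
      return True

  # the 'os' key from this report never matches, so every host passes
  print(host_passes({'os': {'windows'}}, {'os': 'linux'}))  # True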

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1741810

Title:
  Filter AggregateImagePropertiesIsolation doesn't Work

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Description
  ===
  I tried to use the AggregateImagePropertiesIsolation filter to isolate
Windows instances in order to reduce the number of Windows licenses.

  With the nova scheduler in the Pike release, the
  AggregateImagePropertiesIsolation filter always returns all hosts. If
  this is a bug, the filter needs to be fixed.

  
  Steps to reproduce
  ==
  # add filter to nova.conf and restart nova scheduler
  [filter_scheduler]
  enabled_filters = 
AggregateImagePropertiesIsolation,RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

  # image create with os property
  openstack image create --min-disk 3 --min-ram 512 --disk-format qcow2 
--public --file windows.img img_windows
  openstack image create --min-disk 1 --min-ram 64 --disk-format qcow2 --public 
--file cirros-0.3.5-x86_64-disk.img img_linux
  openstack image set --property os=windows img_windows
  openstack image set --property os=linux img_linux

  # host aggregate create with os property
  openstack aggregate create os_win
  openstack aggregate add host os_win compute01
  openstack aggregate add host os_win compute02
  openstack aggregate set --property os=windows os_win
   
  openstack aggregate create os_linux
  openstack aggregate add host os_linux compute03
  openstack aggregate add host os_linux compute04
  openstack aggregate add host os_linux compute05
  openstack aggregate set --property os=linux os_linux

  # create flavor
  openstack flavor create --ram 1024 --disk 1 --vcpus 1 --public small
  openstack flavor create --ram 4096 --disk 20 --vcpus 2 --public medium

  # create windows instances
  openstack server create --image img_windows --network test-net --flavor 
medium --max 10 test-win

  
  Expected result
  ===
  Windows instances can be found on compute01 and compute02 only

  Actual result
  =
  Windows instances were found on every host.


  Environment
  ===
  1. Nova's version
  (nova-scheduler)[nova@control01 /]$ rpm -qa | grep nova
  python-nova-17.0.0-0.20171206190932.cbdc893.el7.centos.noarch
  openstack-nova-scheduler-17.0.0-0.20171206190932.cbdc893.el7.centos.noarch
  openstack-nova-common-17.0.0-0.20171206190932.cbdc893.el7.centos.noarch
  python2-novaclient-9.1.0-0.20170804194758.0a53d19.el7.centos.noarch

  2. hypervisor
  (nova-libvirt)[root@compute01 /]# rpm -qa | grep kvm
  qemu-kvm-common-ev-2.9.0-16.el7_4.11.1.x86_64
  libvirt-daemon-kvm-3.2.0-14.el7_4.5.x86_64
  qemu-kvm-ev-2.9.0-16.el7_4.11.1.x86_64

  2. Storage
  ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous 
(stable)

  3. Networking
  Neutron with OpenVSwitch

  
  Logs & Configs
  ==
  $ tail -f nova-scheduler.log | grep AggregateImagePropertiesIsolation
  2018-01-08 11:52:53.964 6 DEBUG nova.filters 
[req-3828686f-1d46-407a-bebb-14f7a573c52e 9b1f4f0bcea2428c93b8b4276ba67cb7 
188be4011b2b49529cbdd6eade152233 - default default] Filter 
AggregateImagePropertiesIsolation returned 5 host(s) get_filtered_objects 
/usr/lib/python2.7/site-packages/nova/filters.py:104

  # add filter to nova.conf and restart nova scheduler
  [filter_scheduler]
  enabled_filters = 
AggregateImagePropertiesIsolation,RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1741810/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : 

[Yahoo-eng-team] [Bug 1728600] Re: Test test_network_basic_ops fails from time to time, port doesn't become ACTIVE quickly

2020-09-15 Thread Stephen Finucane
** Changed in: nova
   Status: New => Incomplete

** No longer affects: nova

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1728600

Title:
  Test test_network_basic_ops fails from time to time, port doesn't
  become ACTIVE quickly

Status in tempest:
  In Progress

Bug description:
  Test test_network_basic_ops fails from time to time; the port doesn't
  become ACTIVE quickly.

  Trace:
  Traceback (most recent call last):
File "tempest/scenario/test_security_groups_basic_ops.py", line 185, in 
setUp
  self._deploy_tenant(self.primary_tenant)
File "tempest/scenario/test_security_groups_basic_ops.py", line 349, in 
_deploy_tenant
  self._set_access_point(tenant)
File "tempest/scenario/test_security_groups_basic_ops.py", line 316, in 
_set_access_point
  self._assign_floating_ips(tenant, server)
File "tempest/scenario/test_security_groups_basic_ops.py", line 322, in 
_assign_floating_ips
  client=tenant.manager.floating_ips_client)
File "tempest/scenario/manager.py", line 836, in create_floating_ip
  port_id, ip4 = self._get_server_port_id_and_ip4(thing)
File "tempest/scenario/manager.py", line 814, in _get_server_port_id_and_ip4
  "No IPv4 addresses found in: %s" % ports)
File "/usr/local/lib/python2.7/dist-packages/unittest2/case.py", line 845, 
in assertNotEqual
  raise self.failureException(msg)
  AssertionError: 0 == 0 : No IPv4 addresses found in: 
[{u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'updated_at': 
u'2017-10-30T10:04:41Z', u'device_owner': u'compute:None', u'revision_number': 
9, u'port_security_enabled': True, u'binding:profile': {}, u'fixed_ips': 
[{u'subnet_id': u'd522b2e5-7e56-4d08-843c-c434c3c2af97', u'ip_address': 
u'10.100.0.12'}], u'id': u'20d59775-906d-4390-b193-a8ec81817ddb', 
u'security_groups': [u'908eb03d-2477-49ab-ab9a-fcfae454', 
u'cf62ee1b-eb73-44d0-9ad8-65bb32885505'], u'binding:vif_details': 
{u'port_filter': True, u'ovs_hybrid_plug': True}, u'binding:vif_type': u'ovs', 
u'mac_address': u'fa:16:3e:02:f3:e8', u'project_id': 
u'0a8532fba2194d32996c3ba46ae35c96', u'status': u'BUILD', u'binding:host_id': 
u'cfg01', u'description': u'', u'tags': [], u'device_id': 
u'5ad8f2be-3cbb-49aa-8d72-e81ca6789665', u'name': u'', u'admin_state_up': True, 
u'network_id': u'49491fd4-2c1e-4c46-8166-b4648eb75f84', u'tenant_id': 
u'0a8532fba2194d32996c3ba46ae35c96', u'created_at': u'2017-10-30T10:04:37Z', 
u'binding:vnic_type': u'normal'}]

  Ran 1 test in 25.096s
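
A sketch of the kind of wait the test arguably needs (a hypothetical
helper following the shape of tempest's ports client, not its actual
API): poll the port until it leaves BUILD before deriving the IPv4
address for the floating IP.

  import time

  def wait_for_port_active(ports_client, port_id, timeout=60, interval=2):
      deadline = time.time() + timeout
      port = ports_client.show_port(port_id)['port']
      while port['status'] != 'ACTIVE':
          if time.time() > deadline:
              raise AssertionError('port %s stuck in status %s'
                                   % (port_id, port['status']))
          time.sleep(interval)
          port = ports_client.show_port(port_id)['port']
      return port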

To manage notifications about this bug go to:
https://bugs.launchpad.net/tempest/+bug/1728600/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1892562] Re: Choose a security group when creating an instance with a port that has disabled port security

2020-09-15 Thread Stephen Finucane
This is expected behavior. From the API reference:

  One or more security groups. Specify the name of the security group in the 
name attribute. If you
  omit this attribute, the API creates the server in the default security 
group. Requested security
  groups are not applied to pre-existing ports.

This is a pre-existing port so the security groups will not apply.

[1] https://docs.openstack.org/api-ref/compute/?expanded=create-server-
detail#id11

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1892562

Title:
  Choose a security group when creating an instance with a port that has
  disabled port security

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===
  When creating an instance using a port that has port security disabled,
choosing a security group should be expected to raise an exception, but the
creation actually succeeds. Although the instance, as expected, has no
security group, I think the instance creation process should raise an
exception and give some hint, instead of succeeding without displaying the
security group.

  Steps to reproduce
  ==
  * Create a port with port security disabled
  * Use the port from the previous step to create an instance, and select a
security group when creating the instance.

  Expected result
  ===
  Instance creation failed and an exception was thrown.

  Actual result
  =
  Successfully created instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1892562/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1894771] Re: Hypervisor shows negative numbers after launching instances on baremetal nodes

2020-09-15 Thread Stephen Finucane
This is expected behavior. The ironic driver does not report free disk,
RAM or CPU resources via the 'get_available_resource' driver API [1],
which means the resource tracker is essentially subtracting usage from
0. That's considered okay though [2].

In general, the 'os-hypervisors' API, which the 'nova hypervisor-show'
command uses, is considered very broken and will likely be removed in a
future release. You should rely on placement for an authoritative view
on resource consumption.

[1] 
https://github.com/openstack/nova/blob/e0f088c95d05e9cf32d4af4c7cfc20566b17f8e1/nova/virt/ironic/driver.py#L355-L357
[2] 
https://github.com/openstack/nova/blob/e0f088c95d05e9cf32d4af4c7cfc20566b17f8e1/nova/compute/resource_tracker.py#L1255

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1894771

Title:
  Hypervisor shows negative numbers after launching instances on
  baremetal nodes

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Testing with the Train release and the ironic driver.
  Before launching instances on baremetal nodes, the '# nova hypervisor-show'
command shows 0 for the vcpus, memory and disk fields, which are set to zero
in the ironic driver code.
  This is still acceptable, as baremetal resources are counted via resource
classes; however, after launching an instance on a baremetal node, the
vcpu/mem/disk fields appear as negative in the hypervisor-show details, and
the negative numbers correlate with the flavor's vcpu/mem/disk fields.
  [root@train ~(keystone_admin)]#  nova hypervisor-show 
e12c91fb-4c73-406f-8b9e-b0ef3c9c829a
  +-+--+
  | Property| Value|
  +-+--+
  | cpu_info| {}   |
  | current_workload| 0|
  | disk_available_least| 0|
  | free_disk_gb| -100 |
  | free_ram_mb | -16384   |
  | host_ip | 192.168.10.111   |
  | hypervisor_hostname | e12c91fb-4c73-406f-8b9e-b0ef3c9c829a |
  | hypervisor_type | ironic   |
  | hypervisor_version  | 1|
  | id  | e12c91fb-4c73-406f-8b9e-b0ef3c9c829a |
  | local_gb| 0|
  | local_gb_used   | 100  |
  | memory_mb   | 0|
  | memory_mb_used  | 16384|
  | running_vms | 1|
  | service_disabled_reason | None |
  | service_host| train.ironic|
  | service_id  | 23464515-e938-47b1-807e-fb0e3d8250e3 |
  | state   | up   |
  | status  | enabled  |
  | vcpus   | 0|
  | vcpus_used  | 8|
  +-+--+
  The hypervisor detail does not affect the functioning of baremetal
instances, but it is quite confusing.
  Besides, nova quotas and usages are also affected by the baremetal flavor's
vcpu/mem/disk fields, which may not accurately describe the resources that
the instance occupies.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1894771/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1892033] Re: Failed to start nova-compute with libvirt-xen

2020-09-15 Thread Stephen Finucane
The libvirt+xen driver has been untested for many cycles and has been
deprecated in Victoria, with an eye on removal in Wallaby or later. I
don't think this warrants being fixed.

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1892033

Title:
  Failed to start nova-compute with libvirt-xen

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Description
  ===
  I deployed a ussuri env from ubuntu-cloud:ussuri.
  I configured one compute node with xen and libvirt,
  but then the nova-compute service cannot be started.
  It fails with the error 'libvirt.libvirtError: this function is not 
supported by the connection driver: virNodeGetCPUMap'.

  Steps to reproduce
  ==
  1. Install nova-compute
  2. Configure nova.conf as below:
  [libvirt]
  virt_type = xen
  3. Start nova-compute service

  Expected result
  ===
  Nova-compute starts successfully

  Actual result
  =
  The error above is raised

  Environment
  ===
  root@xen-cmp01:~# dpkg -l | grep nova-compute
  ii  nova-compute 2:21.0.0-0ubuntu0.20.04.1~cloud0 
   all  OpenStack Compute - compute node base
  ii  nova-compute-kvm 2:21.0.0-0ubuntu0.20.04.1~cloud0 
   all  OpenStack Compute - compute node (KVM)
  ii  nova-compute-libvirt 2:21.0.0-0ubuntu0.20.04.1~cloud0 
   all  OpenStack Compute - compute node libvirt 
support
  root@xen-cmp01:~# dpkg -l | grep libvirt
  ii  libvirt-clients  6.0.0-0ubuntu8.2~cloud0  
   amd64Programs for the libvirt library
  ii  libvirt-daemon   6.0.0-0ubuntu8.2~cloud0  
   amd64Virtualization daemon
  ii  libvirt-daemon-driver-qemu   6.0.0-0ubuntu8.2~cloud0  
   amd64Virtualization daemon QEMU connection driver
  ii  libvirt-daemon-driver-storage-rbd6.0.0-0ubuntu8.2~cloud0  
   amd64Virtualization daemon RBD storage driver
  ii  libvirt-daemon-driver-xen6.0.0-0ubuntu8.2~cloud0  
   amd64Virtualization daemon Xen connection driver
  ii  libvirt-daemon-system6.0.0-0ubuntu8.2~cloud0  
   amd64Libvirt daemon configuration files
  ii  libvirt-daemon-system-systemd6.0.0-0ubuntu8.2~cloud0  
   amd64Libvirt daemon configuration files (systemd)
  ii  libvirt0:amd64   6.0.0-0ubuntu8.2~cloud0  
   amd64library for interfacing with different 
virtualization systems
  ii  nova-compute-libvirt 2:21.0.0-0ubuntu0.20.04.1~cloud0 
   all  OpenStack Compute - compute node libvirt 
support
  ii  python3-libvirt  6.1.0-1~cloud0   
   amd64libvirt Python 3 bindings
  root@xen-cmp01:~# dpkg -l | grep xen
  ii  grub-xen-bin 2.02-2ubuntu8.17 
   amd64GRand Unified Bootloader, version 2 (Xen 
binaries)
  ii  grub-xen-host2.02-2ubuntu8.17 
   amd64GRand Unified Bootloader, version 2 (Xen 
host version)
  ii  libvirt-daemon-driver-xen6.0.0-0ubuntu8.2~cloud0  
   amd64Virtualization daemon Xen connection driver
  ii  libxen-4.9:amd64 4.9.2-0ubuntu1   
   amd64Public libs for Xen
  ii  libxenstore3.0:amd64 4.9.2-0ubuntu1   
   amd64Xenstore communications library for Xen
  ii  python3-os-xenapi0.3.4-0ubuntu3~cloud0
   all  XenAPI library for OpenStack projects - 
Python 3.x
  ii  xen-hypervisor-4.9-amd64 4.9.2-0ubuntu1   
   amd64Xen Hypervisor on AMD64
  ii  xen-utils-4.94.9.2-0ubuntu1   
   amd64XEN administrative tools
  ii  xen-utils-common 4.9.2-0ubuntu1   
   all  Xen administrative tools - common files
  ii  xenstore-utils   4.9.2-0ubuntu1   
   amd64Xenstore command line utilities for Xen

  Logs & Configs
  ==
  2020-08-18 12:23:30.739 12029 ERROR nova.compute.manager 

[Yahoo-eng-team] [Bug 1685152] Re: [RFE] SR-IOV - HotPlug support

2020-09-08 Thread Stephen Finucane
*** This bug is a duplicate of bug 1499269 ***
https://bugs.launchpad.net/bugs/1499269

** This bug has been marked a duplicate of bug 1499269
   cannot attach direct type port (sr-iov) to existing instance

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1685152

Title:
  [RFE] SR-IOV - HotPlug support

Status in OpenStack Compute (nova):
  Expired

Bug description:
  The Nova interface-attach API needs to be enhanced to support SR-IOV.

  There is a Newton blueprint for this:
  https://review.openstack.org/#/c/139910/

  It has been abandoned and needs to be picked up for ocata/pike.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1685152/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1894095] [NEW] Running periodic task during live migration results in incorrect usage

2020-09-03 Thread Stephen Finucane
Public bug reported:

With the introduction of NUMA-aware live migration in Train, we now do
proper claiming and, if necessary, unclaiming of resources at the
destination host. However, the latter uses the same mechanism as
resize/cold migrate confirm/revert, which means it's subject to the same
races as those highlighted in bug 1879878. This bug tracks the live
migration side of the work to fix that.

** Affects: nova
 Importance: Medium
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed

** Affects: nova/train
 Importance: Undecided
 Status: New

** Affects: nova/ussuri
 Importance: Undecided
 Status: New


** Tags: libvirt live-migration numa

** Tags added: numa

** Tags added: libvirt live-migration

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Also affects: nova/train
   Importance: Undecided
   Status: New

** Also affects: nova/ussuri
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1894095

Title:
  Running periodic task during live migration results in incorrect usage

Status in OpenStack Compute (nova):
  Confirmed
Status in OpenStack Compute (nova) train series:
  New
Status in OpenStack Compute (nova) ussuri series:
  New

Bug description:
  With the introduction of NUMA-aware live migration in Train, we now do
  proper claiming and, if necessary, unclaiming of resources at the
  destination host. However, the latter uses the same mechanism as
  resize/cold migrate confirm/revert, which means it's subject to the
  same races as those highlighted in bug 1879878. This bug tracks the
  live migration side of the work to fix that.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1894095/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1893864] Re: resolve ResourceProviderSyncFailed issue in python3

2020-09-02 Thread Stephen Finucane
** Also affects: nova/trunk
   Importance: Undecided
   Status: New

** Also affects: nova/ussuri
   Importance: Undecided
   Status: New

** Also affects: nova/victoria
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1893864

Title:
  resolve ResourceProviderSyncFailed issue in python3

Status in OpenStack Compute (nova):
  New
Status in OpenStack Compute (nova) trunk series:
  New
Status in OpenStack Compute (nova) ussuri series:
  New
Status in OpenStack Compute (nova) victoria series:
  New

Bug description:
  Description
  ===
  In recent Train code, booted VMs run into an ERROR state because
synchronizing the placement service with the resource provider information
supplied by the compute host fails.

  This is caused by the "/" operation difference between python 2.x and
  python 3.x

  In Python 2.x, int / int returns an int (floor division), while
  in Python 3.x, int / int returns a float (true division).
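
  A minimal illustration of the semantic change (illustrative only, not
the actual nova code); the usual fix is to spell floor division
explicitly:

    # Python 2: '/' between two ints floors: 7 / 2 == 3.
    # Python 3: '/' is true division: 7 / 2 == 3.5; code that relied on
    # an int result can end up sending placement a max_unit value the
    # schema rejects.
    assert 7 // 2 == 3   # '//' floors on both Python 2 and Python 3
    assert 7 / 2 == 3.5  # true division, Python 3 behaviour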

  Environment
  ===
  vmware setup

  Logs & Configs
  ==

  ERROR nova.scheduler.client.report [None 
req-030a4d73-5bf1-4080-8b39-d637270055e0 admin admin] 
[req-b1da7c1f-6f45-412c-9740-0f27495b1f23] Failed to update 
inventory to [{'VCPU': {'total': 12, 'reserved': 0, 'min_unit': 1, 'max_unit': 
4, 'step_size': 1, 'allocation_ratio': 100.0}, 'MEMORY_MB': {'total': 49149, 
'reserved': 512, 'min_unit': 1, 'max_unit': 16383, 'step_size': 1, 
'allocation_ratio': 1.5}, 'DISK_GB': {'total': 3025, 'reserved': 0, 'min_unit': 
1, 'max_unit': 0, 'step_size': 1, 'allocation_ratio': 1.0}}] for resource 
provider with UUID 33c124a0-1ebc-4a36-a1fd-b6cd9d104c49.  Got 400: {"errors": 
[{"status": 400, "title": "Bad Request", "detail": "The server could not comply 
with the request since it is either malformed or otherwise incorrect.\n\n JSON 
does not validate: 0 is less than the minimum of 1  Failed validating 'minimum' 
in 
schema['properties']['inventories']['patternProperties']['^[A-Z0-9_]+$']['properties']['max_unit']:
 {'maximum': 2147483647, 'minimum': 1, 'type': 'integer'}  On 
instance['inventories']['DISK_GB']['max_unit']: 0  ", "code": 
"placement.undefined_code", "request_id": 
"req-b1da7c1f-6f45-412c-9740-0f27495b1f23"}]}
  DEBUG oslo_concurrency.lockutils [None 
req-030a4d73-5bf1-4080-8b39-d637270055e0 admin admin] 
Lock "compute_resources" released by 
"nova.compute.resource_tracker.ResourceTracker.instance_claim" :: held 
1.016s {{(pid=6347) inner 
/usr/local/lib/python3.6/dist-packages/oslo_concurrency/lockutils.py:339}}
  ERROR nova.compute.manager [None 
req-030a4d73-5bf1-4080-8b39-d637270055e0 admin admin] 
[instance: e5083b01-f490-4031-b25c-edd6d86dd62f] Failed to 
build and run instance: nova.exception.ResourceProviderSyncFailed: Failed 
to synchronize the placement service with resource provider information 
supplied by the compute host.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1893864/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1892961] Re: set different VirtualDevice.key

2020-08-26 Thread Stephen Finucane
** Also affects: nova/stein
   Importance: Undecided
   Status: New

** Also affects: nova/train
   Importance: Undecided
   Status: New

** Also affects: nova/victoria
   Importance: Undecided
 Assignee: Yingji Sun (yingjisun)
   Status: In Progress

** Also affects: nova/ussuri
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1892961

Title:
   set different VirtualDevice.key

Status in OpenStack Compute (nova):
  In Progress
Status in OpenStack Compute (nova) stein series:
  New
Status in OpenStack Compute (nova) train series:
  New
Status in OpenStack Compute (nova) ussuri series:
  New
Status in OpenStack Compute (nova) victoria series:
  In Progress

Bug description:
  When creating an instance with multiple nics on vsphere 7, such as,

  Create server using ports: "networks": [{"port": "1ff1fd0e-a7c1-400d-
  8ee4-d8b6c94ed33b"}, {"port": "87aee6b2-c76a-4f10-9eab-
  a23ff9694b34"}],

  it will report error as below.

  2020-02-03 22:56:02.654 13279 ERROR nova.compute.manager [req-
  b1ec16f5-e529-4c98-9a4c-4cb8782489d2 a2fa852dc11546dfaf4bb2d9c0460dcf
  ee69d923dc594779a5775abd2077bea8 - default default] [instance:
  a80f85ce-2c16-4022-b01e-fb6953243fc0] Instance failed to spawn:
  VimFaultException: Network interface 'VirtualE1000' uses network
  'nsx.LogicalSwitch:3a603a1c-4df4-4b09-afd1-ac9b56979f5e', which is not
  accessible.

  The root cause is that, starting from vSphere 7, VirtualDevice.key
  values can no longer be the same.

  Originally, the request to vCenter was

  -->   deviceChange = (vim.vm.device.VirtualDeviceSpec) [
  -->  (vim.vm.device.VirtualDeviceSpec) {
  --> operation = "add", 
  --> device = (vim.vm.device.VirtualE1000) {
  -->key = -47, 
  -->backing = 
(vim.vm.device.VirtualEthernetCard.OpaqueNetworkBackingInfo) {
  -->   opaqueNetworkId = 
"9c0d11f9-8388-465f-9a78-988134d44ab7", 
  -->   opaqueNetworkType = "nsx.LogicalSwitch"
  -->}, 
  -->connectable = (vim.vm.device.VirtualDevice.ConnectInfo) {
  -->   startConnected = true, 
  -->   allowGuestControl = true, 
  -->   connected = true, 
  -->}, 
  -->addressType = "manual", 
  -->macAddress = "fa:16:3e:58:a9:24", 
  -->wakeOnLanEnabled = true, 
  -->externalId = "1ff1fd0e-a7c1-400d-8ee4-d8b6c94ed33b", 
  --> }, 
  -->  }, 
  -->  (vim.vm.device.VirtualDeviceSpec) {
  --> operation = "add", 
  --> device = (vim.vm.device.VirtualE1000) {
  -->key = -47, 
  -->backing = 
(vim.vm.device.VirtualEthernetCard.OpaqueNetworkBackingInfo) {
  -->   opaqueNetworkId = 
"00b14b88-4650-40c9-8216-f188b3f865cf", 

  -->   opaqueNetworkType = "nsx.LogicalSwitch"
  -->}, 
  -->connectable = (vim.vm.device.VirtualDevice.ConnectInfo) {
  -->   startConnected = true, 
  -->   allowGuestControl = true, 
  -->   connected = true, 
  -->}, 
  -->addressType = "manual", 
  -->macAddress = "fa:16:3e:fc:d7:20", 
  -->wakeOnLanEnabled = true, 
  -->externalId = "87aee6b2-c76a-4f10-9eab-a23ff9694b34", 
  --> }, 
  -->  }, 

  We need to change 'key' to different values.
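
  A hedged sketch of the kind of fix needed (plain Python stand-ins, not
the actual nova.virt.vmwareapi code): every device spec gets a distinct
temporary key.

    class FakeDevice:
        def __init__(self, mac):
            self.mac = mac
            self.key = None

    def assign_temporary_keys(devices, base=-47):
        # vSphere 7 rejects duplicate keys; earlier releases tolerated
        # a shared -47. Negative keys mean "to be assigned by vCenter".
        for offset, device in enumerate(devices):
            device.key = base - offset
        return devices

    nics = assign_temporary_keys([FakeDevice("fa:16:3e:58:a9:24"),
                                  FakeDevice("fa:16:3e:fc:d7:20")])
    assert len({n.key for n in nics}) == len(nics)  # keys now unique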

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1892961/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1889633] [NEW] Pinned instance with thread policy can consume VCPU

2020-07-30 Thread Stephen Finucane
Public bug reported:

In Train, we introduced the concept of the 'PCPU' resource type to track
pinned instance CPU usage. The '[compute] cpu_dedicated_set' is used to
indicate which host cores should be used by pinned instances and, once
this config option was set, nova would start reporting 'PCPU' resource
types in addition to (or entirely instead of, if 'cpu_shared_set' was
unset) 'VCPU'. Requests for pinned instances (via the
'hw:cpu_policy=dedicated' flavor extra spec or equivalent image metadata
property) would result in a query for 'PCPU' inventory rather than
'VCPU', as previously done.

We anticipated some upgrade issues with this change, whereby there could
be a period during an upgrade in which some hosts would have the new
configuration, meaning they'd be reporting PCPU, but the remainder would
still be on legacy config and therefore would continue reporting just
VCPU. An instance could be reasonably expected to land on any host, but
since only the hosts with the new configuration were reporting 'PCPU'
inventory and the 'hw:cpu_policy=dedicated' extra spec was resulting in
a request for 'PCPU', the hosts with legacy configuration would never be
consumed.

We worked around this issue by adding support for a fallback placement
query, enabled by default, which would make a second request using
'VCPU' inventory instead of 'PCPU'. The idea behind this was that the
hosts with 'PCPU' inventory would be preferred, meaning we'd only try
the 'VCPU' allocation if the preferred path failed. Crucially, we
anticipated that if a host with new style configuration was picked up by
this second 'VCPU' query, an instance would never actually be able to
build there. This is because the new-style configuration would be
reflected in the 'numa_topology' blob of the 'ComputeNode' object,
specifically via the 'cpuset' (for cores allocated to 'VCPU') and
'pcpuset' (for cores allocated to 'PCPU') fields. With new-style
configuration, both of these are set to unique values. If the scheduler
had determined that there wasn't enough 'PCPU' inventory available for
the instance, that would implicitly mean there weren't enough of the
cores listed in the 'pcpuset' field still available.

Turns out there's a gap in this thinking: thread policies. The 'isolate'
CPU thread policy previously meant "give me a host with no hyperthreads,
else a host with hyperthreads but mark the thread siblings of the cores
used by the instance as reserved". This didn't translate to a new 'PCPU'
world where we needed to know how many cores we were consuming up front
before landing on the host. To work around this, we removed support for
the latter case and instead relied on a trait, 'HW_CPU_HYPERTHREADING',
to indicate whether a host had hyperthread support or not. Using the
'isolate' policy meant that trait could not be defined on the host, or
the trait was "forbidden". The gap comes via a combination of this trait
request and the fallback query. If we request the isolate thread policy,
hosts with new-style configuration and sufficient PCPU inventory would
nonetheless be rejected if they reported the 'HW_CPU_HYPERTHREADING'
trait. However, these could get picked up in the fallback query and the
instance would not fail to build on the host because of lack of 'PCPU'
inventory. This means we end up with a pinned instance on a host using
new-style configuration that is consuming 'VCPU' inventory. Boo.

# Steps to reproduce

1. Using a host with hyperthreading support enabled, configure both
'[compute] cpu_dedicated_set' and '[compute] cpu_shared_set' (see the
example fragment after these steps)

2. Boot an instance with the 'hw:cpu_thread_policy=isolate' extra spec.
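
For step 1, a nova.conf fragment along these lines (the core ranges are
illustrative, not from the report):

  [compute]
  cpu_dedicated_set = 2-7
  cpu_shared_set = 0-1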

# Expected result

Instance should not boot since the host has hyperthreads.

# Actual result

Instance boots.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1889633

Title:
  Pinned instance with thread policy can consume VCPU

Status in OpenStack Compute (nova):
  New

Bug description:
  In Train, we introduced the concept of the 'PCPU' resource type to
  track pinned instance CPU usage. The '[compute] cpu_dedicated_set' is
  used to indicate which host cores should be used by pinned instances
  and, once this config option was set, nova would start reporting
  'PCPU' resource types in addition to (or entirely instead of, if
  'cpu_shared_set' was unset) 'VCPU'. Requests for pinned instances (via
  the 'hw:cpu_policy=dedicated' flavor extra spec or equivalent image
  metadata property) would result in a query for 'PCPU' inventory rather
  than 'VCPU', as previously done.

  We anticipated some upgrade issues with this change, whereby there
  could be a period during an upgrade in which some hosts would have the
  new configuration, meaning they'd be reporting PCPU, but the remainder
  would still be on legacy config and therefore would continue 

[Yahoo-eng-team] [Bug 1889257] [NEW] Live migration of realtime instances is broken

2020-07-28 Thread Stephen Finucane
Public bug reported:

Attempting to live migrate an instance with realtime enabled fails on
master (commit d4c857dfcb1). This appears to be a bug with the live
migration of pinned instances feature introduced in Train.

# Steps to reproduce

Create a server using realtime attributes and then attempt to live
migrate it. For example:

  $ openstack flavor create --ram 1024 --disk 0 --vcpu 4 \
--property 'hw:cpu_policy=dedicated' \
--property 'hw:cpu_realtime=yes' \
--property 'hw:cpu_realtime_mask=^0-1' \
realtime

  $ openstack server create --os-compute-api-version=2.latest \
--flavor realtime --image cirros-0.5.1-x86_64-disk --nic none \
--boot-from-volume 1 --wait \
test.realtime

  $ openstack server migrate --live-migration test.realtime

# Expected result

Instance should be live migrated.

# Actual result

The live migration never happens. Looking at the logs we see the
following error:

  Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/eventlet/hubs/hub.py", line 
461, in fire_timers
  timer()
File "/usr/local/lib/python3.6/dist-packages/eventlet/hubs/timer.py", line 
59, in __call__
  cb(*args, **kw)
File "/usr/local/lib/python3.6/dist-packages/eventlet/event.py", line 175, 
in _do_send
  waiter.switch(result)
File "/usr/local/lib/python3.6/dist-packages/eventlet/greenthread.py", line 
221, in main
  result = function(*args, **kwargs)
File "/opt/stack/nova/nova/utils.py", line 670, in context_wrapper
  return func(*args, **kwargs)
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 8966, in 
_live_migration_operation
  # is still ongoing, or failed
File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 
220, in __exit__
  self.force_reraise()
File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 
196, in force_reraise
  six.reraise(self.type_, self.value, self.tb)
File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
  raise value
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 8959, in 
_live_migration_operation
  #  2. src==running, dst==paused
File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 658, in migrate
  destination, params=params, flags=flags)
File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 190, 
in doit
  result = proxy_call(self._autowrap, f, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 148, 
in proxy_call
  rv = execute(f, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 129, 
in execute
  six.reraise(c, e, tb)
File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
  raise value
File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 83, 
in tworker
  rv = meth(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/libvirt.py", line 1745, in 
migrateToURI3
  if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', 
dom=self)
  libvirt.libvirtError: vcpussched attributes 'vcpus' must not overlap

Looking further, we see there are issues with the XML we are generating
for the destination. Compare what we have on the source before updating
the XML for the destination:

  DEBUG nova.virt.libvirt.migration [-] _update_numa_xml input xml=
...
  4096
  [the vcpusched elements and the rest of the XML were stripped by the
archive]
   {{(pid=12600) _update_numa_xml 
/opt/stack/nova/nova/virt/libvirt/migration.py:97}}

To what we have after the update:

  DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=
...
  4096
  [the duplicated vcpusched elements and the rest of the XML were
stripped by the archive]
...
   {{(pid=12600) _update_numa_xml 
/opt/stack/nova/nova/virt/libvirt/migration.py:131}}

The issue is the 'vcpusched' elements. We're assuming there is only one
of these elements when updating the XML for the destination [1]. We have
to figure out why there are multiple elements and how best to handle
this (likely by deleting and recreating everything).

I suspect the reason we didn't spot this is because libvirt is rewriting
the XML on us. This is what nova is providing libvirt upon boot:

  DEBUG nova.virt.libvirt.driver [...] [instance: ...] End _get_guest_xml xml=
...
  4096
  [the single vcpusched element and the rest of the XML were stripped by
the archive]
...
   {{(pid=12600) _get_guest_xml 
/opt/stack/nova/nova/virt/libvirt/driver.py:6331}}

but that's changed by the time we get to recalculating things.

The solution is probably to remove all 'vcpusched' elements and recreate
them, rather than trying to update stuff inline.
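
A hedged sketch of that approach using lxml (illustrative values; not
the actual nova.virt.libvirt.migration code):

  from lxml import etree

  def rebuild_vcpusched(xml, vcpus=("2", "3"), scheduler="fifo", priority=1):
      tree = etree.fromstring(xml)
      cputune = tree.find("./cputune")
      # Drop every existing vcpusched element rather than editing one
      # inline, so libvirt's rewritten per-vCPU entries cannot overlap.
      for elem in cputune.findall("vcpusched"):
          cputune.remove(elem)
      # Recreate one entry per realtime vCPU from the destination pinning.
      for vcpu in vcpus:
          etree.SubElement(cputune, "vcpusched", vcpus=vcpu,
                           scheduler=scheduler, priority=str(priority))
      return etree.tostring(tree)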

[1]
https://github.com/openstack/nova/blob/21.0.0/nova/virt/libvirt/migration.py#L152-L155

** Affe

[Yahoo-eng-team] [Bug 1888414] Re: Snapshot of stopped, suspended instance fails

2020-07-21 Thread Stephen Finucane
I also see that when this fails, there are leftover base files in
'/opt/stack/data/nova/instances/_base'.

** Changed in: nova
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1888414

Title:
  Snapshot of stopped, suspended instance fails

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Attempting to create a snapshot of a shutdown instance fails. It seems
  nova assumes the instance exists and is running when attempting to
  create the snapshot.

  # Steps to reproduce

    $ openstack server create \
    --os-compute-api-version=2.latest --flavor m1.tiny --image 
cirros-0.5.1-x86_64-disk \
    --nic none --wait test.server
    $ openstack server stop test.server
    $ openstack server image create test.server

  # Expected result

  A snapshot of the instance root disk should be created.

  # Actual result

  The snapshot is not created. Attempts to resume the instance fail:

    $ openstack server start test.server
    Cannot 'start' instance aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda while it is in 
task_state image_pending_upload (HTTP 409) (Request-ID: 
req-39d4bd58-366b-4b93-8d7d-72a487183088)

  # Additional details

  I see the following in the logs:

    nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest 
agent is not enabled.
    nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance instance-000c disappeared 
while taking snapshot of it: [Error Code 42] Domain not found: no domain with 
matching uuid 'aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda' (instance-000c)
    nova-compute[20898]: DEBUG nova.compute.manager [None 
req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance disappeared during snapshot 
{{(pid=20898) _snapshot_instance /opt/stack/nova/nova/compute/manager.py:3874}}

  Compare with logs from snapshot of a running guest:

    nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest 
agent is not enabled.
    nova-compute[20898]: DEBUG nova.privsep.utils [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Path 
'/opt/stack/data/nova/instances' supports direct I/O {{(pid=20898) 
supports_direct_io /opt/stack/nova/nova/privsep/utils.py:64}}
    nova-compute[20898]: DEBUG oslo_concurrency.processutils [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Running cmd (subprocess): 
qemu-img convert -t none -O qcow2 -f qcow2 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta
 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581
 {{(pid=20898) execute 
/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:371}}
    nova-compute[20898]: DEBUG oslo_concurrency.processutils [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] CMD "qemu-img convert -t 
none -O qcow2 -f qcow2 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta
 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581"
 returned: 0 in 0.403s {{(pid=20898) execute 
/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:408}}
    nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot extracted, beginning image upload
    nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot image upload complete
    nova-compute[20898]: INFO nova.compute.manager [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Took 2.44 seconds to snapshot the 
instance on the hypervisor.

  We see the same issue if the instance is suspended ('openstack server
  suspend'). There are no issues if the instance is paused, however
  ('openstack server pause') but

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1888414/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1888414] [NEW] Snapshot of stopped, suspended instance fails

2020-07-21 Thread Stephen Finucane
Public bug reported:

Attempting to create a snapshot of a shutdown instance fails. It seems
nova assumes the instance exists and is running when attempting to
create the snapshot.

# Steps to reproduce

  $ openstack server create \
  --os-compute-api-version=2.latest --flavor m1.tiny --image 
cirros-0.5.1-x86_64-disk \
  --nic none --wait test.server
  $ openstack server stop test.server
  $ openstack server image create test.server

# Expected result

A snapshot of the instance root disk should be created.

# Actual result

The snapshot is not created. Attempts to resume the instance fail:

  $ openstack server start test.server
  Cannot 'start' instance aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda while it is in 
task_state image_pending_upload (HTTP 409) (Request-ID: 
req-39d4bd58-366b-4b93-8d7d-72a487183088)

# Additional details

I see the following in the logs:

  nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest 
agent is not enabled.
  nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance instance-000c disappeared 
while taking snapshot of it: [Error Code 42] Domain not found: no domain with 
matching uuid 'aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda' (instance-000c)
  nova-compute[20898]: DEBUG nova.compute.manager [None 
req-0b7dfe74-d465-4c2b-90e3-54ed26ea4244 demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Instance disappeared during snapshot 
{{(pid=20898) _snapshot_instance /opt/stack/nova/nova/compute/manager.py:3874}}

Compare with logs from snapshot of a running guest:

  nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Skipping quiescing instance: QEMU guest 
agent is not enabled.
  nova-compute[20898]: DEBUG nova.privsep.utils [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Path 
'/opt/stack/data/nova/instances' supports direct I/O {{(pid=20898) 
supports_direct_io /opt/stack/nova/nova/privsep/utils.py:64}}
  nova-compute[20898]: DEBUG oslo_concurrency.processutils [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] Running cmd (subprocess): 
qemu-img convert -t none -O qcow2 -f qcow2 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta
 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581
 {{(pid=20898) execute 
/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:371}}
  nova-compute[20898]: DEBUG oslo_concurrency.processutils [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] CMD "qemu-img convert -t 
none -O qcow2 -f qcow2 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581.delta
 
/opt/stack/data/nova/instances/snapshots/tmpal0gmbcx/d64bd94655da448495d69b274ca14581"
 returned: 0 in 0.403s {{(pid=20898) execute 
/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:408}}
  nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot extracted, beginning image upload
  nova-compute[20898]: INFO nova.virt.libvirt.driver [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Snapshot image upload complete
  nova-compute[20898]: INFO nova.compute.manager [None 
req-79e59ed6-4558-42be-a016-09ff2c4d60cb demo admin] [instance: 
aa82c7a9-dbc1-4c7b-b3f1-8dc6b83e1bda] Took 2.44 seconds to snapshot the 
instance on the hypervisor.

We see the same issue if the instance is suspended ('openstack server
suspend'). There are no issues if the instance is paused, however
('openstack server pause') but

** Affects: nova
 Importance: Medium
 Status: Confirmed


** Tags: libvirt snapshot

** Tags added: libvirt snapshot

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Description changed:

  Attempting to create a snapshot of a shutdown instance fails. It seems
  nova assumes the instance exists and is running when attempting to
  create the snapshot.
  
  # Steps to reproduce
  
-   $ openstack server create \
-   --os-compute-api-version=2.latest --flavor m1.tiny --image 
cirros-0.5.1-x86_64-disk \
-   --nic none --wait test.server
-   $ openstack server stop test.server
-   $ openstack server image create test.server
+   $ openstack server create \
+   --os-compute-api-version=2.latest --flavor m1.tiny --image 
cirros-0.5.1-x86_64-disk \
+   --nic none --wait test.server
+   $ openstack server stop test.server
+   $ openstack server image create test.server
  
  # Expected result
  
  A 

[Yahoo-eng-team] [Bug 1884231] [NEW] 'hw:realtime_mask' extra spec is not validated

2020-06-19 Thread Stephen Finucane
or thread policy. For example:

  openstack flavor create --ram 512 --disk 1 --vcpus 2 \
--property 'hw:cpu_policy=dedicated' \
--property 'hw:emulator_threads_policy=isolate' \
--property 'hw:cpu_realtime=true' \
--property 'hw:cpu_realtime_mask=^2' \
test.rt

Similarly, they could ensure at least one core in the range is valid:

  openstack flavor create --ram 512 --disk 1 --vcpus 2 \
--property 'hw:cpu_policy=dedicated' \
--property 'hw:emulator_threads_policy=isolate' \
--property 'hw:cpu_realtime=true' \
--property 'hw:cpu_realtime_mask=^1-5' \
test.rt

However, both cases are still wrong and the 'hw:cpu_realtime_mask' value
is almost certainly user error. Nova should be validating things
properly and rejecting invalid values. We could probably also look at
dropping the requirement to specify 'hw:cpu_realtime_mask' if
'hw:emulator_threads_policy' is configured; however, that's more of a
feature than a bug.
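
A hedged sketch of the missing validation (the function name and shape
are assumptions):

  def validate_realtime_mask(flavor_vcpus, excluded):
      # 'excluded' is the set of vCPUs named by hw:cpu_realtime_mask,
      # e.g. {2} for '^2'; each must actually exist in the flavor.
      vcpus = set(range(flavor_vcpus))
      bogus = excluded - vcpus
      if bogus:
          raise ValueError(
              "mask excludes nonexistent vCPUs: %s" % sorted(bogus))
      # Mirror the existing requirement of >= 1 RT and >= 1 ordinary vCPU.
      if not excluded or excluded == vcpus:
          raise ValueError(
              "need at least 1 RT vCPU and 1 ordinary vCPU")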

** Affects: nova
 Importance: Medium
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Changed in: nova
   Status: New => Confirmed

** Description changed:

  The 'hw:realtime_mask' extra spec is (currently) used to specify what
  cores in a host should *not* be part of the realtime set of cores on the
  host. Currently, this is mandatory and omitting it will cause a HTTP 400
  error. For example:
  
-   $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
-   --property hw:cpu_policy=dedicated
-   --property hw:cpu_realtime=yes \
-   test.rt
+   $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
+   --property hw:cpu_policy=dedicated
+   --property hw:cpu_realtime=yes \
+   test.rt
  
  will fail with:
  
-   Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
+   Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
  and 1 ordinary vCPU. See hw:cpu_realtime_mask or hw_cpu_realtime_mask
  
  Similarly, attempting to mask *all* values will result in a failure. For
  example:
  
-   $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
-   --property hw:cpu_policy=dedicated
-   --property hw:cpu_realtime=yes \
-   --property hw:cpu_realtime_mask=^0-1
-   test.rt
+   $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
+   --property hw:cpu_policy=dedicated
+   --property hw:cpu_realtime=yes \
+   --property hw:cpu_realtime_mask=^0-1
+   test.rt
  
  will also fail with:
  
-   Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
+   Realtime policy needs vCPU(s) mask configured with at least 1 RT vCPU
  and 1 ordinary vCPU. See hw:cpu_realtime_mask or hw_cpu_realtime_mask
  
  However, the value is otherwise unvalidated by nova, which can cause
  libvirt to explode when specific values are passed. For example,
  consider the following flavor:
  
-   $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
-   --property hw:cpu_policy=dedicated
-   --property hw:cpu_realtime=yes \
-   --property hw:cpu_realtime_mask='^2' \
-   test.rt
+   $ openstack flavor create --ram 512 --disk 1 --vcpus 2 \
+   --property hw:cpu_policy=dedicated
+   --property hw:cpu_realtime=yes \
+   --property hw:cpu_realtime_mask='^2' \
+   test.rt
  
  This states that the instances should have two cores, and some imaginary
  third core (masks are 0-indexed) will be the non-realtime one. This is
  clearly nonsensical and, surely enough, creating an instance using this
  core causes things to go bang:
  
-   Failed to build and run instance: libvirt.libvirtError: invalid argument: 
Failed to parse bitmap ''
-   Traceback (most recent call last):
- File "/opt/stack/nova/nova/compute/manager.py", line 2378, in 
_build_and_run_instance
-   accel_info=accel_info)
- File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3702, in spawn
-   cleanup_instance_disks=created_disks)
- File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6664, in 
_create_domain_and_network
-   cleanup_instance_disks=cleanup_instance_disks)
- File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", 
line220, in __exit__
-   self.force_reraise()
- File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", 
line196, in force_reraise
-   six.reraise(self.type_, self.value, self.tb)
- File "/usr/local/lib/python3.7/site-packages/six.py", line 703, in reraise
-   raise value
- File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6633, in 
_create_domain_and_network
-   post_xml_callback=post_xml_callback)
- File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6559, in 
_create_domain
-   guest = libvirt

[Yahoo-eng-team] [Bug 1879969] Re: confusing error message

2020-06-17 Thread Stephen Finucane
*** This bug is a duplicate of bug 1879964 ***
https://bugs.launchpad.net/bugs/1879964

** This bug has been marked a duplicate of bug 1879964
   Invalid value for 'hw:mem_page_size' raises confusing error

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1879969

Title:
  confusing error message

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Description

  When booting an instance using a flavor where hugepages are activated with 
an incorrect value, you get this error:
  Invalid memory page size '0' (HTTP 400) (Request-ID: 
req-338bf619-3a54-45c5-9c59-ad8c1d425e91)

  Steps to reproduce

  openstack flavor create hugepage --ram 1024 --disk 10 --vcpus 1
  openstack flavor set hugepage --property hw:mem_page_size=2M
  openstack server create --flavor hugepage ..
  Invalid memory page size '0' (HTTP 400) (Request-ID: 
req-338bf619-3a54-45c5-9c59-ad8c1d425e91)

  Expected result
  A correct message indicating that hugepages was wrongly set to '2M' instead 
of '2MB'

  Actual result
  Invalid memory page size '0' (HTTP 400) (Request-ID: 
req-338bf619-3a54-45c5-9c59-ad8c1d425e91

  Environment
  deployment tool : kolla-ansible 
  https://github.com/openstack/kolla
  https://github.com/openstack/kolla-ansible

  Train + Centos8 + Libvirt/KVM + ZFS + Neutron/OVS

  Logs
  Output after trying boot instance:
  Invalid memory page size '0' (HTTP 400) (Request-ID: 
req-338bf619-3a54-45c5-9c59-ad8c1d425e91)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1879969/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1882919] [NEW] e1000e interface reported as unsupported

2020-06-10 Thread Stephen Finucane
Public bug reported:

Per this downstream bug [1], attempting to boot a Windows Server 2012 or
2016 image will fail because something (libosinfo?) is attempting to
configure an e1000e VIF which nova does not explicitly support. There
doesn't appear to be any reason not to support this, since libvirt, and
specifically QEMU/KVM, support it.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1839808

** Affects: nova
 Importance: Medium
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1882919

Title:
  e1000e interface reported as unsupported

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Per this downstream bug [1], attempting to boot a Windows Server 2012
  or 2016 image will fail because something (libosinfo?) is attempting
  to configure an e1000e VIF which nova does not explicitly support.
  There doesn't appear to be any reason not to support this, since
  libvirt, and specifically QEMU/KVM, support it.

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1839808

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1882919/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1882821] [NEW] '[libvirt] file_backed_memory' and '[DEFAULT] reserved_host_memory_mb' are incompatible

2020-06-09 Thread Stephen Finucane
Public bug reported:

Per title, the '[libvirt] file_backed_memory' and '[DEFAULT]
reserved_host_memory_mb' config options are incompatible. Not only does
'[DEFAULT] reserved_host_memory_mb' not really make sense for file
backed memory (if you want to reserve "memory", configure a lower
'[libvirt] file_backed_memory' value) but configuring a value for
'[libvirt] file_backed_memory' that is lower than the value for
'[DEFAULT] reserved_host_memory_mb', which currently defaults to 512MB,
will break nova's resource reporting to placement:

  nova.exception.ResourceProviderUpdateFailed: Failed to update resource
provider via URL
/resource_providers/f39bde61-6f73-4ccb-9488-6efb9689730f/inventories:
{"errors": [{"status": 400, "title": "Bad Request", "detail": "The
server could not comply with the request since it is either malformed or
otherwise incorrect.\n\n Unable to update inventory for resource
provider f39bde61-6f73-4ccb-9488-6efb9689730f: Invalid inventory for
'MEMORY_MB' on resource provider 'f39bde61-6f73-4ccb-9488-6efb9689730f'.
The reserved value is greater than total.  ", "code":
"placement.undefined_code", "request_id": "req-977e43e7-1a7c-4309-96ec-
49a75bdea58a"}]}

Ideally we should error out if both values are configured, however,
doing so would be a breaking change. Instead, we can warn if these are
incompatible and then error out in a future release.
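
A hedged sketch of the proposed warning (the option names are real; the
check itself is an assumption):

  import logging

  LOG = logging.getLogger(__name__)

  def warn_if_incompatible(file_backed_memory, reserved_host_memory_mb):
      # '[libvirt] file_backed_memory' is expressed in MiB; 0 disables
      # the feature entirely.
      if file_backed_memory and reserved_host_memory_mb:
          LOG.warning(
              "'[libvirt] file_backed_memory' and '[DEFAULT] "
              "reserved_host_memory_mb' are incompatible; this will "
              "become an error in a future release.")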

** Affects: nova
 Importance: Medium
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

** Changed in: nova
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1882821

Title:
  '[libvirt] file_backed_memory' and '[DEFAULT] reserved_host_memory_mb'
  are incompatible

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Per title, the '[libvirt] file_backed_memory' and '[DEFAULT]
  reserved_host_memory_mb' config options are incompatible. Not only
  does '[DEFAULT] reserved_host_memory_mb' not really make sense for
  file backed memory (if you want to reserve "memory", configure a lower
  '[libvirt] file_backed_memory' value) but configuring a value for
  '[libvirt] file_backed_memory' that is lower than the value for
  '[DEFAULT] reserved_host_memory_mb', which currently defaults to
  512MB, will break nova's resource reporting to placement:

nova.exception.ResourceProviderUpdateFailed: Failed to update
  resource provider via URL
  /resource_providers/f39bde61-6f73-4ccb-9488-6efb9689730f/inventories:
  {"errors": [{"status": 400, "title": "Bad Request", "detail": "The
  server could not comply with the request since it is either malformed
  or otherwise incorrect.\n\n Unable to update inventory for resource
  provider f39bde61-6f73-4ccb-9488-6efb9689730f: Invalid inventory for
  'MEMORY_MB' on resource provider
  'f39bde61-6f73-4ccb-9488-6efb9689730f'. The reserved value is greater
  than total.  ", "code": "placement.undefined_code", "request_id":
  "req-977e43e7-1a7c-4309-96ec-49a75bdea58a"}]}

  Ideally we should error out if both values are configured, however,
  doing so would be a breaking change. Instead, we can warn if these are
  incompatible and then error out in a future release.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1882821/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1882233] [NEW] Libvirt driver always reports 'memory_mb_used' of 0

2020-06-05 Thread Stephen Finucane
Public bug reported:

The nova-compute service periodically logs a summary of the free RAM,
disk and vCPUs as reported by the hypervisor. For example:

  Hypervisor/Node resource view: name=vtpm-f31.novalocal free_ram=7960MB
free_disk=11.379043579101562GB free_vcpus=7 pci_devices=[{...}]

On a recent deployment using the libvirt driver, it's observed that the
'free_ram' value never changes despite instances being created and
destroyed. This is because the 'get_memory_mb_used' function in
'nova.virt.libvirt.host' always returns 0 unless the host platform, as
reported by 'sys.platform', is either 'linux2' or 'linux3'. Since Python
3.3, the major version is no longer included in this value, as it was
misleading [1].
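
A minimal illustration of the broken check and a portable alternative
(not the exact nova code):

  import sys

  # Broken on Python >= 3.3, where sys.platform is just 'linux':
  if sys.platform in ('linux2', 'linux3'):
      print('would report real memory usage')

  # Portable spelling:
  if sys.platform.startswith('linux'):
      print('reports real memory usage on any Linux')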

This is low priority because the value only appears to be used for
logging purposes and the values stored in e.g. the 'ComputeNode' object
and reported to placement are calculated based on config options and
number of instances on the node. We may wish to stop reporting this
information instead.

[1] https://stackoverflow.com/a/10429736/613428

** Affects: nova
 Importance: Low
 Assignee: Stephen Finucane (stephenfinucane)
 Status: Confirmed

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1882233

Title:
  Libvirt driver always reports 'memory_mb_used' of 0

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  The nova-compute service periodically logs a summary of the free RAM,
  disk and vCPUs as reported by the hypervisor. For example:

Hypervisor/Node resource view: name=vtpm-f31.novalocal
  free_ram=7960MB free_disk=11.379043579101562GB free_vcpus=7
  pci_devices=[{...}]

  On a recent deployment using the libvirt driver, it's observed that
  the 'free_ram' value never changes despite instances being created and
  destroyed. This is because the 'get_memory_mb_used' function in
  'nova.virt.libvirt.host' always returns 0 unless the host platform, as
  reported by 'sys.platform', is either 'linux2' or 'linux3'. Since
  Python 3.3, the major version is no longer included in this value, as
  it was misleading [1].

  This is low priority because the value only appears to be used for
  logging purposes and the values stored in e.g. the 'ComputeNode'
  object and reported to placement are calculated based on config
  options and number of instances on the node. We may wish to stop
  reporting this information instead.

  [1] https://stackoverflow.com/a/10429736/613428

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1882233/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1736920] Re: Glance images are loaded into memory

2020-05-28 Thread Stephen Finucane
I finally got around to investigating this today. tl;dr: there does not
appear to be an issue here.

The return value of 'glanceclient.Client.images.data' is
'glanceclient.common.utils.RequestIdProxy', owing to the use of the
'add_req_id_to_object' decorator [2]. This is *not* a generator, which
means the 'inspect.isgenerator' conditional at [1] is False and we will
never convert these large images to a list. In fact, there appears to be
only one case that does trigger this: the
'glanceclient.Client.images.list' case, which returns a
'glanceclient.common.utils.GeneratorProxy' object due to the use of the
'add_req_id_to_generator' decorator. This is the function at the root of
bug #1557584. As such, the fix is correct and there's nothing to do here
besides possibly documenting things better in the code.

[1] https://github.com/openstack/nova/blob/16.0.0/nova/image/glance.py#L167
[2] 
https://github.com/openstack/python-glanceclient/blob/3.1.1/glanceclient/v2/images.py#L200
[3] 
https://github.com/openstack/python-glanceclient/blob/3.1.1/glanceclient/v2/images.py#L85
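
A small illustration of why the conditional never fires for image data
(a stand-in proxy class, not glanceclient's actual implementation):

  import inspect

  class RequestIdProxy:
      # Stand-in for glanceclient.common.utils.RequestIdProxy: it wraps
      # the underlying iterable but is a plain object, not a generator.
      def __init__(self, wrapped):
          self._wrapped = iter(wrapped)

      def __iter__(self):
          return self._wrapped

  chunks = RequestIdProxy([b'chunk-1', b'chunk-2'])
  assert not inspect.isgenerator(chunks)  # so nova never list()s it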

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1736920

Title:
  Glance images are loaded into memory

Status in OpenStack Compute (nova):
  Invalid
Status in OpenStack Security Advisory:
  Incomplete

Bug description:
  Nova appears to be loading entire responses from glance into memory
  [1]. This is generally not an issue but these responses could be an
  entire images [2]. Given a large enough image, this seems like a
  potential avenue for DoS, not to mention being highly inefficient.

  [1] 
https://github.com/openstack/nova/blob/16.0.0/nova/image/glance.py#L167-L170
  [2] 
https://github.com/openstack/nova/blob/16.0.0/nova/image/glance.py#L292-L295

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1736920/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1879964] [NEW] Invalid value for 'hw:mem_page_size' raises confusing error

2020-05-21 Thread Stephen Finucane
Public bug reported:

Configure a flavor like so:

  openstack flavor create hugepage --ram 1024 --disk 10 --vcpus 1
  openstack flavor set hugepage --property hw:mem_page_size=2M

Attempt to boot an instance. It will fail with the following error
message:

  Invalid memory page size '0' (HTTP 400) (Request-ID: req-
338bf619-3a54-45c5-9c59-ad8c1d425e91)

You wouldn't know from reading it, but this is because the property
should read 'hw:mem_page_size=2MB' (note the extra 'B').
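
The working spelling, for comparison:

  openstack flavor set hugepage --property hw:mem_page_size=2MB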

** Affects: nova
 Importance: Low
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1879964

Title:
  Invalid value for 'hw:mem_page_size' raises confusing error

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Configure a flavor like so:

openstack flavor create hugepage --ram 1024 --disk 10 --vcpus 1
openstack flavor set hugepage --property hw:mem_page_size=2M

  Attempt to boot an instance. It will fail with the following error
  message:

Invalid memory page size '0' (HTTP 400) (Request-ID: req-
  338bf619-3a54-45c5-9c59-ad8c1d425e91)

  You wouldn't know from reading it, but this is because the property
  should read 'hw:mem_page_size=2MB' (note the extra 'B').

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1879964/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1689753] Re: ram_filter ignores hugepages which can create unstable guests

2020-05-21 Thread Stephen Finucane
The RAM filter has been removed in recent versions of nova so there is
nothing to resolve on master now and it's unlikely to be resolved for
past releases. Closed as won't fix.

** Changed in: nova
   Status: Confirmed => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1689753

Title:
  ram_filter ignores hugepages which can create unstable guests

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  environment info:
  OS:centos7.1
  nova:15.0.2

  problem description:
  There is 220G of memory in the compute node. 200G of it is hugepages 
(page_size 1G); the other 20G is normal memory.

  When I boot a normal instance with a flavor of 30G memory and no hugepages,
  the instance is created successfully, but the OS becomes unstable and can 
even OOM because memory is exhausted.

  I think the boot should fail, with ram_filter returning 0 hosts, rather 
than treating the memory as sufficient
  and spawning the instance on that compute node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1689753/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1803575] Re: RFE: Add an option to enable virtio-scsi for new Nova instances by default

2020-04-17 Thread Stephen Finucane
While the issue you highlight here is real, this wouldn't be a good
solution. The primary issue with this approach is the same issue we have
with the '[libvirt] rx_queue_size' and '[libvirt] tx_queue_size' options,
namely that it can break live migration as two hosts with different
values will result in a change in the instance XML. If we were to take
this approach, we'd have to store the information as part of the
instance and we don't have a way to do this other than via the flavor or
image metadata. As such, I'm closing this as WONTFIX.

With that said, we do recognize that there is a definite usability issue
here. For context, the libosinfo integration in the libvirt driver [1]
was supposed to resolve this kind of issue for us but the implementation
of that feature is fundamentally broken and it will probably be ripped
out in a future release. We're now working on an improved solution to
this broader issue, but it will take a different form to this.

[1] https://specs.openstack.org/openstack/nova-
specs/specs/liberty/approved/libvirt-hardware-policy-from-libosinfo.html

** Changed in: nova
   Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1803575

Title:
  RFE: Add an option to enable virtio-scsi for new Nova instances by
  default

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Description
  ===

  Currently virtio-scsi is used only for libvirt instances created from
  the images with properties hw_scsi_model=virtio-scsi and
  hw_disk_bus=scsi set or from the volumes with the same image metadata.
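
  For reference, the existing per-image route looks like:

    $ openstack image set \
        --property hw_scsi_model=virtio-scsi \
        --property hw_disk_bus=scsi \
        <image>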

  What is requested: a config option in the [libvirt] section of nova.conf
  to enable virtio-scsi for all new instances by default, even if the
  hw_scsi_model and hw_disk_bus properties are not set on the image.

  Why: we want virtio-scsi to be enabled for VMs created from users
  images even if they don't set this property explicitly, because we
  want to have most of the vms be able to issue BLKDISCARD with fstrim.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1803575/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1858019] Re: The flavor id is not limited when creating a flavor

2020-03-20 Thread Stephen Finucane
Also agree. If we're going to do anything, it should be done on the
client side. It should be possible to add a flag stating what field we
wish to filter on (name or ID), if needed. Since there's nothing to do
here from the server side, I'm going to close this as WONTFIX.

** Changed in: nova
   Status: Triaged => Won't Fix

** Changed in: nova
 Assignee: Choi-Sung-Hoon (knu-cse) => (unassigned)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1858019

Title:
  The flavor id is not limited when creating a flavor

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  When creating a flavor with 'openstack flavor create --id  --vcpus  
--ram  --disk  ',
  the id parameter is not validated. This can lead to ambiguities when the id 
is set to an existing flavor's name.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1858019/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1816454] Re: hw:mem_page_size is not respecting all documented values

2020-03-13 Thread Stephen Finucane
Looks like this was resolved in https://review.opendev.org/#/c/673252/

** Changed in: nova
   Status: New => Fix Released

** Tags added: doc

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1816454

Title:
  hw:mem_page_size is not respecting all documented values

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Per the Rocky documentation for hugepages:
  https://docs.openstack.org/nova/rocky/admin/huge-pages.html

  2MB hugepages can be specified either as:
  --property hw:mem_page_size=2Mb, or
  --property hw:mem_page_size=2048

  However, whenever I use the former notation (2Mb), conductor fails
  with the misleading NUMA error below... whereas with the latter
  notation (2048), allocation succeeds and the resulting instance is
  backed with 2MB hugepages on an x86_64 platform (as verified by
  checking `/proc/meminfo | grep HugePages_Free` before/after stopping
  the created instance).
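
  For anyone hitting this before the documentation fix, the numeric form
  works. A minimal sketch, assuming a hypothetical flavor 'm1.huge' and
  hypothetical image/network names:

  $ openstack flavor set --property hw:mem_page_size=2048 m1.huge
  $ openstack server create --flavor m1.huge --image cirros --network private vm1
  $ grep HugePages_Free /proc/meminfo   # on the compute node, before and after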

  ERROR nova.scheduler.utils [req-de6920d5-829b-411c-acd7-1343f48824c9
  cb2abbb91da54209a5ad93a845b4cc26 cb226ff7932d40b0a48ec129e162a2fb -
  default default] [instance: 5b53d1d4-6a16-4db9-ab52-b267551c6528]
  Error from last host: node1 (node FQDN-REDACTED): ['Traceback (most
  recent call last):\n', '  File "/usr/lib/python3/dist-
  packages/nova/compute/manager.py", line 2106, in
  _build_and_run_instance\nwith rt.instance_claim(context, instance,
  node, limits):\n', '  File "/usr/lib/python3/dist-
  packages/oslo_concurrency/lockutils.py", line 274, in inner\n
  return f(*args, **kwargs)\n', '  File "/usr/lib/python3/dist-
  packages/nova/compute/resource_tracker.py", line 217, in
  instance_claim\npci_requests, overhead=overhead,
  limits=limits)\n', '  File "/usr/lib/python3/dist-
  packages/nova/compute/claims.py", line 95, in __init__\n
  self._claim_test(resources, limits)\n', '  File "/usr/lib/python3
  /dist-packages/nova/compute/claims.py", line 162, in _claim_test\n
  "; ".join(reasons))\n', 'nova.exception.ComputeResourcesUnavailable:
  Insufficient compute resources: Requested instance NUMA topology
  cannot fit the given host NUMA topology.\n', '\nDuring handling of the
  above exception, another exception occurred:\n\n', 'Traceback (most
  recent call last):\n', '  File "/usr/lib/python3/dist-
  packages/nova/compute/manager.py", line 1940, in
  _do_build_and_run_instance\nfilter_properties, request_spec)\n', '
  File "/usr/lib/python3/dist-packages/nova/compute/manager.py", line
  2156, in _build_and_run_instance\ninstance_uuid=instance.uuid,
  reason=e.format_message())\n', 'nova.exception.RescheduledException:
  Build of instance 5b53d1d4-6a16-4db9-ab52-b267551c6528 was re-
  scheduled: Insufficient compute resources: Requested instance NUMA
  topology cannot fit the given host NUMA topology.\n']

  Additional info:
  I am using Debian testing (buster) and all OpenStack packages included 
therein.

  $ dpkg -l | grep nova
  ii  nova-common   2:18.1.0-2  
all  OpenStack Compute - common files
  ii  nova-compute  2:18.1.0-2  
all  OpenStack Compute - compute node
  ii  nova-compute-kvm  2:18.1.0-2  
all  OpenStack Compute - compute node (KVM)
  ii  python3-nova  2:18.1.0-2  
all  OpenStack Compute - libraries
  ii  python3-novaclient2:11.0.0-2  
all  client library for OpenStack Compute API - 3.x

  $ dpkg -l | grep qemu
  ii  ipxe-qemu 1.0.0+git-20161027.b991c67-1
all  PXE boot firmware - ROM images for qemu
  ii  qemu-block-extra:amd641:3.1+dfsg-2+b1 
amd64extra block backend modules for qemu-system and qemu-utils
  ii  qemu-kvm  1:3.1+dfsg-2+b1 
amd64QEMU Full virtualization on x86 hardware
  ii  qemu-system-common1:3.1+dfsg-2+b1 
amd64QEMU full system emulation binaries (common files)
  ii  qemu-system-data  1:3.1+dfsg-2
all  QEMU full system emulation (data files)
  ii  qemu-system-gui   1:3.1+dfsg-2+b1 
amd64QEMU full system emulation binaries (user interface and audio 
support)
  ii  qemu-system-x86   1:3.1+dfsg-2+b1 
amd64QEMU full system emulation binaries (x86)
  ii  qemu-utils1:3.1+dfsg-2+b1 
amd64QEMU utilities

  * I forced nova to allocate on the same hypervisor (node1) when
  checking for the issue and can 

[Yahoo-eng-team] [Bug 1863058] Re: Arm64 CI for Nova

2020-03-12 Thread Stephen Finucane
Awesome! Could you bring this up on openstack-discuss [1]? It's far more
likely to get eyes (and volunteers to help with issues you'll have)
there.

[1] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-
discuss

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1863058

Title:
  Arm64 CI for Nova

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Linaro has donated a cluster for OpenStack CI on Arm64.
  Now the cluster is ready, 
https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl03.openstack.org.yaml#L414

  We'd like to setup CI for Nova first.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1863058/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1864279] Re: Unable to attach more than 6 scsi volumes

2020-03-12 Thread Stephen Finucane
Looks like it's been fixed on RHEL 7.7 too [1]. If you're on a different
OS, I'd suggest opening a bug against the libvirt component for same and
requesting a backport. I don't think there's much to do here from a nova
perspective.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1741782

** Bug watch added: Red Hat Bugzilla #1741782
   https://bugzilla.redhat.com/show_bug.cgi?id=1741782

** Changed in: nova
   Status: Confirmed => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1864279

Title:
  Unable to attach more than 6 scsi volumes

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Scsi volume with unit number 7 can not be attached because of this
  libvirt check:
  
https://github.com/libvirt/libvirt/blob/89237d534f0fe950d06a2081089154160c6c2224/src/conf/domain_conf.c#L4796

  
  Nova automatically increase volume unit number by 1, and when I attach 7th 
volume to vm I've got this error:
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver 
[req-156a4725-279d-4173-9f11-85125e4a3e47] [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] Failed to attach volume at mountpoint: 
/dev/sdh: libvirt.libvirtError: Requested operation is not valid: Domain 
already contains a disk with that address
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] Traceback (most recent call last):
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 1810, in 
attach_volume
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] guest.attach_device(conf, 
persistent=True, live=live)
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/nova/virt/libvirt/guest.py", line 305, in 
attach_device
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] 
self._domain.attachDeviceFlags(device_xml, flags=flags)
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/eventlet/tpool.py", line 190, in doit
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] result = proxy_call(self._autowrap, 
f, *args, **kwargs)
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/eventlet/tpool.py", line 148, in proxy_call
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] rv = execute(f, *args, **kwargs)
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/eventlet/tpool.py", line 129, in execute
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] six.reraise(c, e, tb)
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/six.py", line 693, in reraise
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] raise value
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/eventlet/tpool.py", line 83, in tworker
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] rv = meth(*args, **kwargs)
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]   File 
"/usr/lib/python3/dist-packages/libvirt.py", line 605, in attachDeviceFlags
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] if ret == -1: raise libvirtError 
('virDomainAttachDeviceFlags() failed', dom=self)
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f] libvirt.libvirtError: Requested operation 
is not valid: Domain already contains a disk with that address
  2020-02-21 09:12:53.309 3572 ERROR nova.virt.libvirt.driver [instance: 
3532baf6-a0a4-4a81-84f9-3622c713435f]

  After patching libvirt driver to skip unit 7 I can attach more than 6
  volumes.
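
  A reproduction sketch with hypothetical server/volume names; the
  seventh attachment fails with the error above:

  $ for i in $(seq 1 7); do
  >   openstack volume create --size 1 vol$i
  >   openstack server add volume myserver vol$i
  > done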

  
  ii  nova-compute  2:20.0.0-0ubuntu1~cloud0
  ii  nova-compute-kvm  2:20.0.0-0ubuntu1~cloud0
  ii  nova-compute-libvirt  2:20.0.0-0ubuntu1~cloud0
  ii 

[Yahoo-eng-team] [Bug 1863757] Re: Insufficient memory for guest pages when using NUMA

2020-03-12 Thread Stephen Finucane
*** This bug is a duplicate of bug 1734204 ***
https://bugs.launchpad.net/bugs/1734204

Yes, this has been resolved since Stein as noted at 1734204.
Unfortunately Queens is in Extended Maintenance and we no longer release
new versions, so this is not likely to be fixed there.

** This bug has been marked a duplicate of bug 1734204
Insufficient free host memory pages available to allocate guest RAM with 
Open vSwitch DPDK in Newton

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1863757

Title:
  Insufficient memory for guest pages when using NUMA

Status in OpenStack Compute (nova):
  New

Bug description:
  This is a Queens / Bionic openstack deploy.

  Compute nodes are using hugepages for nova instances (reserved at boot
  time):

  root@compute1:~# cat /proc/meminfo | grep -i huge
  AnonHugePages: 0 kB
  ShmemHugePages:0 kB
  HugePages_Total: 332
  HugePages_Free:  184
  HugePages_Rsvd:0
  HugePages_Surp:0
  Hugepagesize:1048576 kB

  There are two numa nodes, as follows:

  root@compute1:~# lscpu | grep -i numa
  NUMA node(s):2
  NUMA node0 CPU(s):   0-19,40-59
  NUMA node1 CPU(s):   20-39,60-79

  Compute nodes are using DPDK, and memory for it has been reserved with
  the following directive:

  reserved-huge-pages: "node:0,size:1GB,count:8;node:1,size:1GB,count:8"

  A number of instances have already been created on node "compute1",
  until the point that current memory usage is as follows:

  root@compute1:~# cat /sys/devices/system/node/node*/meminfo  | grep -i huge
  Node 0 AnonHugePages: 0 kB
  Node 0 ShmemHugePages:0 kB
  Node 0 HugePages_Total:   166
  Node 0 HugePages_Free: 26
  Node 0 HugePages_Surp:  0
  Node 1 AnonHugePages: 0 kB
  Node 1 ShmemHugePages:0 kB
  Node 1 HugePages_Total:   166
  Node 1 HugePages_Free:158
  Node 1 HugePages_Surp:  0

  Problem:

  When a new instance is created (8 cores and 32 GB RAM), nova tries to
  schedule it on NUMA node 0 and fails with "Insufficient free host
  memory pages available to allocate guest RAM", even though there is
  enough memory available on NUMA node 1.
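
  A quick way to check whether either node can fit the 32 x 1 GiB pages
  this guest needs, reading the same per-node counters that libvirt
  reports up to nova (a sketch; the sysfs path assumes a 1 GiB hugepage
  size):

  $ want=32
  $ for n in /sys/devices/system/node/node[01]; do
  >   free=$(cat $n/hugepages/hugepages-1048576kB/free_hugepages)
  >   echo "$(basename $n): free=$free fits=$([ "$free" -ge "$want" ] && echo yes || echo no)"
  > done

  With node0 at 26 free pages and node1 at 158, only node1 fits, which
  matches the failure whenever nova picks node0.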

  This behavior has been seen by other users as well (although the
  solution on that bug seems to be more of a coincidence than a proper
  fix -- it was then classified as not a bug, which I don't believe is
  the case):

  https://bugzilla.redhat.com/show_bug.cgi?id=1517004

  The flavor being used has nothing special except the property
  hw:mem_page_size='large'.

  The instance is being forced onto "zone1::compute1"; otherwise there is
  no pinning of CPUs or other resources. The placement of the VM on
  node0 appears to be entirely nova's decision when instantiating it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1863757/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1864422] Re: can the instance support online key updates?

2020-03-12 Thread Stephen Finucane
To the best of my knowledge, this is not currently supported. We only
support it for rebuild [1], which is a destructive operation.

[1] https://specs.openstack.org/openstack/nova-
specs/specs/queens/implemented/rebuild-keypair-reset.html
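
For completeness, the supported (destructive) path from [1] looks
roughly like this; a sketch assuming hypothetical names and a
python-novaclient new enough to support microversion 2.54:

  $ openstack keypair create newkey > newkey.pem
  $ nova --os-compute-api-version 2.54 rebuild --key-name newkey myserver myimage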

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1864422

Title:
  can the instance support online key updates?

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===
As a tenant, the private key of a keypair may be lost, and the user needs to 
update the keypair without disrupting the workload or stopping the instance.

Do you have any suggestions or ideas?

Thank you very much.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1864422/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1864160] Re: meta_data shows region information

2020-03-12 Thread Stephen Finucane
Please bring questions and support requests to either the openstack-
discuss mailing list or the #openstack-nova IRC channel.

** Changed in: nova
   Status: New => Opinion

** Changed in: nova
   Status: Opinion => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1864160

Title:
  meta_data shows region information

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===
I want to show region_name in the instance metadata. Do you have any 
suggestions or ideas?

When executing 'curl
  http://169.254.169.254/openstack/2013-04-04/meta_data.json' inside the
  instance, I would like the current instance's region (region_name) to
  be displayed. Do you have any good ideas?

Thank you very much.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1864160/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1866288] Re: tox pep8 fails on ubuntu 18.04.3

2020-03-12 Thread Stephen Finucane
Yup, Rocky was tested on Xenial (16.04), not Bionic (18.04). Bionic
doesn't provide a suitable Python interpreter for this older code. This
is expected.
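
One possible workaround, sketched under the assumption that pyenv is
installed, is to provide a Python 3.5 interpreter on Bionic before
re-running the env:

  $ pyenv install 3.5.10
  $ pyenv shell 3.5.10
  $ tox -e pep8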

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1866288

Title:
  tox pep8 fails on ubuntu 18.04.3

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  pep8 checking fails for rocky branch on ubuntu 18.04.3

  root@mgt02:~/src/nova# tox -epep8 -vvv
    removing /root/src/nova/.tox/log
  using tox.ini: /root/src/nova/tox.ini
  using tox-3.1.0 from /usr/local/lib/python2.7/dist-packages/tox/__init__.pyc
  skipping sdist step
  pep8 start: getenv /root/src/nova/.tox/shared
  pep8 recreate: /root/src/nova/.tox/shared
  ERROR: InterpreterNotFound: python3.5
  pep8 finish: getenv after 0.00 seconds
  
__
 summary 
___
  ERROR:  pep8: InterpreterNotFound: python3.5

  
  root@mgt02:~/src/nova# uname -a
  Linux mgt02 4.15.0-88-generic #88-Ubuntu SMP Tue Feb 11 20:11:34 UTC 2020 
x86_64 x86_64 x86_64 GNU/Linux

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1866288/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1866373] [NEW] URLS in os-keypairs 'links' body are incorrect

2020-03-06 Thread Stephen Finucane
Public bug reported:

Similar to https://bugs.launchpad.net/nova/+bug/1864428, the URLs in the
'links' element of the response are incorrect. They read '/keypairs',
not '/os-keypairs'. From the current api-ref (2020-03-06):

{
"keypairs": [
{
"keypair": {
"fingerprint": 
"7e:eb:ab:24:ba:d1:e1:88:ae:9a:fb:66:53:df:d3:bd",
"name": "keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3",
"type": "ssh",
"public_key": "ssh-rsa 
B3NzaC1yc2EDAQABAAABAQCkF3MX59OrlBs3dH5CU7lNmvpbrgZxSpyGjlnE8Flkirnc/Up22lpjznoxqeoTAwTW034k7Dz6aYIrZGmQwe2TkE084yqvlj45Dkyoj95fW/sZacm0cZNuL69EObEGHdprfGJQajrpz22NQoCD8TFB8Wv+8om9NH9Le6s+WPe98WC77KLw8qgfQsbIey+JawPWl4O67ZdL5xrypuRjfIPWjgy/VH85IXg/Z/GONZ2nxHgSShMkwqSFECAC5L3PHB+0+/12M/iikdatFSVGjpuHvkLOs3oe7m6HlOfluSJ85BzLWBbvva93qkGmLg4ZAc8rPh2O+YIsBUHNLLMM/oQp
 Generated-by-Nova\n"
}
}
],
"keypairs_links": [
{
"href": 
"http://openstack.example.com/v2.1/6f70656e737461636b20342065766572/keypairs?limit=1=keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3;,
"rel": "next"
}
]
}
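
The incorrect link can be observed directly against a deployment. A
sketch, assuming a valid token in $TOKEN and the compute endpoint from
the example above:

  $ curl -s -H "X-Auth-Token: $TOKEN" \
      "http://openstack.example.com/v2.1/os-keypairs?limit=1" | python3 -m json.tool

The 'href' under 'keypairs_links' comes back with '/keypairs' where
'/os-keypairs' is expected.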

** Affects: nova
 Importance: Low
 Status: New

** Changed in: nova
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1866373

Title:
  URLS in os-keypairs 'links' body are incorrect

Status in OpenStack Compute (nova):
  New

Bug description:
  Similar to https://bugs.launchpad.net/nova/+bug/1864428, the URLs in
  the 'links' element of the response are incorrect. They read
  '/keypairs', not '/os-keypairs'. From the current api-ref
  (2020-03-06):

  {
  "keypairs": [
  {
  "keypair": {
  "fingerprint": 
"7e:eb:ab:24:ba:d1:e1:88:ae:9a:fb:66:53:df:d3:bd",
  "name": "keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3",
  "type": "ssh",
  "public_key": "ssh-rsa 
B3NzaC1yc2EDAQABAAABAQCkF3MX59OrlBs3dH5CU7lNmvpbrgZxSpyGjlnE8Flkirnc/Up22lpjznoxqeoTAwTW034k7Dz6aYIrZGmQwe2TkE084yqvlj45Dkyoj95fW/sZacm0cZNuL69EObEGHdprfGJQajrpz22NQoCD8TFB8Wv+8om9NH9Le6s+WPe98WC77KLw8qgfQsbIey+JawPWl4O67ZdL5xrypuRjfIPWjgy/VH85IXg/Z/GONZ2nxHgSShMkwqSFECAC5L3PHB+0+/12M/iikdatFSVGjpuHvkLOs3oe7m6HlOfluSJ85BzLWBbvva93qkGmLg4ZAc8rPh2O+YIsBUHNLLMM/oQp
 Generated-by-Nova\n"
  }
  }
  ],
  "keypairs_links": [
  {
  "href": 
"http://openstack.example.com/v2.1/6f70656e737461636b20342065766572/keypairs?limit=1=keypair-5d935425-31d5-48a7-a0f1-e76e9813f2c3;,
  "rel": "next"
  }
  ]
  }

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1866373/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1533087] Re: there is useless 'u' in the wrong info when execute a wrong nova command

2020-02-20 Thread Stephen Finucane
This should not be an issue with Python 3, which is all we support now.
Closing as a result.
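
The stray 'u' was simply Python 2's repr of a unicode string leaking
into the error message; Python 3's repr carries no prefix, which is why
the problem disappears. A quick demonstration, assuming both
interpreters are installed:

  $ python2 -c "print(repr(u'wrongcmd'))"
  u'wrongcmd'
  $ python3 -c "print(repr('wrongcmd'))"
  'wrongcmd'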

** Changed in: python-novaclient
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1533087

Title:
  there is useless 'u' in the wrong info when execute a wrong nova
  command

Status in OpenStack Compute (nova):
  Invalid
Status in python-novaclient:
  Won't Fix

Bug description:
  [Summary]
  there is useless 'u' in the wrong info when execute a wrong nova command

  [Topo]
  devstack all-in-one node

  [Description and expect result]
  no useless 'u' in the wrong info when execute a wrong nova command

  [Reproduceable or not]
  reproduceable 

  [Recreate Steps]
  1) there is useless 'u' in the wrong info when execute a wrong nova command:
  root@45-59:/opt/stack/devstack# nova wrongcmd
  usage: nova [--version] [--debug] [--os-cache] [--timings]
  [--os-region-name ] [--service-type ]
  [--service-name ]
  [--volume-service-name ]
  [--os-endpoint-type ]
  [--os-compute-api-version ]
  [--bypass-url ] [--insecure]
  [--os-cacert ] [--os-cert ]
  [--os-key ] [--timeout ] [--os-auth-type ]
  [--os-auth-url OS_AUTH_URL] [--os-domain-id OS_DOMAIN_ID]
  [--os-domain-name OS_DOMAIN_NAME] [--os-project-id OS_PROJECT_ID]
  [--os-project-name OS_PROJECT_NAME]
  [--os-project-domain-id OS_PROJECT_DOMAIN_ID]
  [--os-project-domain-name OS_PROJECT_DOMAIN_NAME]
  [--os-trust-id OS_TRUST_ID]
  [--os-default-domain-id OS_DEFAULT_DOMAIN_ID]
  [--os-default-domain-name OS_DEFAULT_DOMAIN_NAME]
  [--os-user-id OS_USER_ID] [--os-user-name OS_USERNAME]
  [--os-user-domain-id OS_USER_DOMAIN_ID]
  [--os-user-domain-name OS_USER_DOMAIN_NAME]
  [--os-password OS_PASSWORD]
   ...
  error: argument : invalid choice: u'wrongcmd'  ISSUE
  Try 'nova help ' for more information.
  root@45-59:/opt/stack/devstack# 

  2)below is a correct example for reference:
  root@45-59:/opt/stack/devstack# keystone wrongcmd
  usage: keystone [--version] [--debug] [--os-username ]
  [--os-password ]
  [--os-tenant-name ]
  [--os-tenant-id ] [--os-auth-url ]
  [--os-region-name ]
  [--os-identity-api-version ]
  [--os-token ]
  [--os-endpoint ] [--os-cache]
  [--force-new-token] [--stale-duration ] [--insecure]
  [--os-cacert ] [--os-cert ]
  [--os-key ] [--timeout ]
   ...
  keystone: error: argument : invalid choice: 'wrongcmd' 

  [Configration]
  reproduceable bug, no need

  [logs]
  reproduceable bug, no need

  [Root cause anlyze or debug inf]
  reproduceable bug

  [Attachment]
  None

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1533087/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1860021] Re: nova-live-migration fails 100% with "mysql: command not found" on subnode

2020-01-21 Thread Stephen Finucane
Marking as invalid for nova since the change needed was in DevStack, not
nova.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1860021

Title:
  nova-live-migration fails 100% with "mysql: command not found" on
  subnode

Status in devstack:
  Fix Released
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Since [1] nova-live-migration failures can be seen in devstack-
  subnodes-early.txt.gz like

   + ./stack.sh:main:1158 :   is_glance_enabled
   + lib/glance:is_glance_enabled:90  :   [[ , =~ ,glance ]]
   + lib/glance:is_glance_enabled:91  :   [[ 
,c-bak,c-vol,dstat,g-api,n-cpu,peakmem_tracker,placement-client,q-agt =~ ,g- ]]
   + lib/glance:is_glance_enabled:91  :   return 0
   + ./stack.sh:main:1159 :   echo_summary 'Configuring 
Glance'
   + ./stack.sh:echo_summary:452  :   [[ -t 3 ]]
   + ./stack.sh:echo_summary:458  :   echo -e Configuring Glance
   + ./stack.sh:main:1160 :   init_glance
   + lib/glance:init_glance:276   :   rm -rf 
/opt/stack/data/glance/images
   + lib/glance:init_glance:277   :   mkdir -p 
/opt/stack/data/glance/images
   + lib/glance:init_glance:280   :   recreate_database glance
   + lib/database:recreate_database:110   :   local db=glance
   + lib/database:recreate_database:111   :   recreate_database_mysql glance
   + lib/databases/mysql:recreate_database_mysql:63 :   local db=glance
   + lib/databases/mysql:recreate_database_mysql:64 :   mysql -uroot 
-psecretmysql -h127.0.0.1 -e 'DROP DATABASE IF EXISTS glance;'
  /opt/stack/new/devstack/lib/databases/mysql: line 64: mysql: command not found
   + lib/databases/mysql:recreate_database_mysql:1 :   exit_trap

  [1] https://review.opendev.org/#/c/702707/

To manage notifications about this bug go to:
https://bugs.launchpad.net/devstack/+bug/1860021/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1860417] [NEW] Use of randint in functional tests is racey

2020-01-21 Thread Stephen Finucane
Public bug reported:

In change I475ea0fa5f2d5b197118f0ced5a0ff6907411972, we switched to
using 'random.randint' to generate flavor.id values in functional tests.
This has proven racey, as seen at [1].

[1]
https://zuul.opendev.org/t/openstack/build/c308dab9bd2d43d0b40cf999a34af0f7/console
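
The collision odds are the classic birthday problem: even a modest
number of draws from a small range collides most of the time. A quick
illustration (the range here is made up for demonstration):

  $ python3 -c "import random; ids=[random.randint(1, 1000) for _ in range(100)]; print(100 - len(set(ids)))"

For 100 draws from 1..1000 the expected number of collisions is about
five, so the printed count is nonzero in the vast majority of runs.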

** Affects: nova
 Importance: Undecided
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress


** Tags: testing

** Tags added: testing

** Changed in: nova
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1860417

Title:
  Use of randint in functional tests is racey

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  In change I475ea0fa5f2d5b197118f0ced5a0ff6907411972, we switched to
  using 'random.randint' to generate flavor.id values in functional
  tests. This has proven racey, as seen at [1].

  [1]
  
https://zuul.opendev.org/t/openstack/build/c308dab9bd2d43d0b40cf999a34af0f7/console

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1860417/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1858091] Re: Nova compute api v2.1/servers in train

2020-01-06 Thread Stephen Finucane
The main change I can see in stable/train is the inclusion of API
microversion 2.75, which made the API stricter and means you will now
get a "400 error response for an unknown parameter in the querystring or
request body" [1]. This is correct behavior from nova's perspective and
it's Rancher that needs to be fixed. You can get more information by
looking at the body for the 4xx responses and the logs for the nova-api
services.

[1] https://docs.openstack.org/nova/latest/reference/api-microversion-
history.html#id68
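
If the client cannot be fixed immediately, one mitigation (a sketch, and
only relevant if the client negotiates 'latest') is to pin requests to a
pre-2.75 microversion so the stricter validation never applies, e.g.:

  $ curl -s "http://10.0.225.254:8774/v2.1/servers" \
      -H "X-Auth-Token: $TOKEN" \
      -H "OpenStack-API-Version: compute 2.74"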

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1858091

Title:
  Nova compute api v2.1/servers in train

Status in kolla-ansible:
  New
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  **Environment**:
  * OS (e.g. from /etc/os-release): Ubuntu
  * Kernel (e.g. `uname -a`): Linux host 4.15.0-55-generic #60-Ubuntu SMP Tue 
Jul 2 18:22:20 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  * Docker version if applicable (e.g. `docker version`): 19.03.2
  * Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip 
package version if using release): 9.0.0
  * Docker image Install type (source/binary): source
  * Docker image distribution: train
  * Are you using official images from Docker Hub or self built? official
  * If self built - Kolla version and environment used to build:
  * Share your inventory file, globals.yml and other configuration files if 
relevant

  -

  I have updated kolla-ansible(to 9.0.0) and openstack images(to train)
  recently. Thus, I was using Rancher node driver to provision openstack
  instances and use it to deploy k8s cluster. With Stein everything was
  working smoothly. However, after I updated to Train version, Rancher
  started getting 400-403 error codes:

  ```
  Error creating machine: Error in driver during machine creation: Expected 
HTTP response code [200] when accessing [POST 
http://10.0.225.254:8774/v2.1/os-keypairs], but got 403 instead

  or

  Error creating machine: Error in driver during machine creation: Expected 
HTTP response code [200] when accessing [POST 
http://10.0.225.254:8774/v2.1/servers], but got 400 instead
  ```

  Thus, I am wondering whether anything changed in the nova compute APIs
  in the Train version, and what can be done to fix this issue.
  I have reported that bug on Rancher github as well:
  https://github.com/rancher/rancher/issues/24813 cause I am not sure if
  its fully openstack-version related issue.

  Regards

To manage notifications about this bug go to:
https://bugs.launchpad.net/kolla-ansible/+bug/1858091/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1852727] [NEW] PCI passthrough documentation does not describe the steps necessary to passthrough PFs

2019-11-15 Thread Stephen Finucane
Public bug reported:

This came up on IRC [1]. By default, nova will not allow you to use PF
devices unless you specifically request this type of device. This is
intentional behavior to allow users to whitelist all devices from a
particular vendor and avoid passing through the PF device when they
meant to only consume the VFs. In the future, we might want to prevent
whitelisting of both PF and VFs, but for now we should document the
current behavior.

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova
/%23openstack-nova.2019-11-15.log.html#t2019-11-15T08:39:17
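
A sketch of the configuration the documentation should spell out (the
vendor/product IDs here are hypothetical): to consume the PF itself, the
alias must explicitly request a 'device_type' of 'type-PF', and the
flavor then references that alias. In nova.conf on the compute node:

  [pci]
  passthrough_whitelist = { "vendor_id": "8086", "product_id": "10fb" }
  alias = { "vendor_id": "8086", "product_id": "10fb", "device_type": "type-PF", "name": "my-pf" }

and on the API side:

  $ openstack flavor set --property pci_passthrough:alias=my-pf:1 m1.pci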

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1852727

Title:
  PCI passthrough documentation does not describe the steps necessary to
  passthrough PFs

Status in OpenStack Compute (nova):
  New

Bug description:
  This came up on IRC [1]. By default, nova will not allow you to use PF
  devices unless you specifically request this type of device. This is
  intentional behavior to allow users to whitelist all devices from a
  particular vendor and avoid passing through the PF device when they
  meant to only consume the VFs. In the future, we might want to prevent
  whitelisting of both PF and VFs, but for now we should document the
  current behavior.

  [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova
  /%23openstack-nova.2019-11-15.log.html#t2019-11-15T08:39:17

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1852727/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1847229] Re: Boot from volume docs don't work

2019-10-08 Thread Stephen Finucane
Looks like mriedem tackled this already in commit
16027094ebabc5cd9f2e766431f18aadeff54a40. Excellent.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1847229

Title:
  Boot from volume docs don't work

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  This bug tracker is for errors with the documentation, use the
  following as a template and remove or add fields as you see fit.
  Convert [ ] into [x] to check boxes:

  - [x] This doc is inaccurate in this way: The commands referenced don't work. 
Looks like a dodgy translation from novaclient to osc.
  - [ ] This is a doc addition request.
  - [ ] I have a fix to the document that I can paste below including example: 
input and output. 

  If you have a troubleshooting or support issue, use the following
  resources:

   - Ask OpenStack: http://ask.openstack.org
   - The mailing list: http://lists.openstack.org
   - IRC: 'openstack' channel on Freenode

  ---
  Release:  on 2019-02-21 00:29:11
  SHA: 19b757a4ba52363607965900fe74533bb2db92a7
  Source: 
https://opendev.org/openstack/nova/src/doc/source/user/launch-instance-from-volume.rst
  URL: 
https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html
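
  For reference, the working OSC form for the common case (a sketch with
  hypothetical names: create a bootable volume from an image, then boot
  from it):

  $ openstack volume create --size 10 --image cirros bootvol
  $ openstack server create --volume bootvol --flavor m1.small --network private myserver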

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1847229/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1847229] [NEW] Boot from volume docs don't work

2019-10-08 Thread Stephen Finucane
Public bug reported:

This bug tracker is for errors with the documentation, use the following
as a template and remove or add fields as you see fit. Convert [ ] into
[x] to check boxes:

- [x] This doc is inaccurate in this way: The commands referenced don't work. 
Looks like a dodgy translation from novaclient to osc.
- [ ] This is a doc addition request.
- [ ] I have a fix to the document that I can paste below including example: 
input and output. 

If you have a troubleshooting or support issue, use the following
resources:

 - Ask OpenStack: http://ask.openstack.org
 - The mailing list: http://lists.openstack.org
 - IRC: 'openstack' channel on Freenode

---
Release:  on 2019-02-21 00:29:11
SHA: 19b757a4ba52363607965900fe74533bb2db92a7
Source: 
https://opendev.org/openstack/nova/src/doc/source/user/launch-instance-from-volume.rst
URL: 
https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html

** Affects: nova
 Importance: Undecided
 Status: Invalid


** Tags: doc

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1847229

Title:
  Boot from volume docs don't work

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  This bug tracker is for errors with the documentation, use the
  following as a template and remove or add fields as you see fit.
  Convert [ ] into [x] to check boxes:

  - [x] This doc is inaccurate in this way: The commands referenced don't work. 
Looks like a dodgy translation from novaclient to osc.
  - [ ] This is a doc addition request.
  - [ ] I have a fix to the document that I can paste below including example: 
input and output. 

  If you have a troubleshooting or support issue, use the following
  resources:

   - Ask OpenStack: http://ask.openstack.org
   - The mailing list: http://lists.openstack.org
   - IRC: 'openstack' channel on Freenode

  ---
  Release:  on 2019-02-21 00:29:11
  SHA: 19b757a4ba52363607965900fe74533bb2db92a7
  Source: 
https://opendev.org/openstack/nova/src/doc/source/user/launch-instance-from-volume.rst
  URL: 
https://docs.openstack.org/nova/latest/user/launch-instance-from-volume.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1847229/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1731865] Re: Use defusedxml function instead of lxml.etree.parse

2019-10-07 Thread Stephen Finucane
As noted in the review, this isn't necessarily a huge issue and I'm not
sure it's worth investing time in it.

** Changed in: nova
   Status: In Progress => Won't Fix

** Changed in: nova
 Assignee: Spencer Yu (yushb) => (unassigned)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1731865

Title:
  Use defusedxml function instead of lxml.etree.parse

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Due to 
https://docs.openstack.org/bandit/latest/blacklists/blacklist_calls.html#b313-b320-xml,
  we should use defusedxml function instead of lxml.etree.parse to prevent XML 
attacks.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1731865/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1781558] Re: Change default video model from cirrus to vga

2019-10-07 Thread Stephen Finucane
I'm having a hard time understanding why we should change behavior here.
There was a bug in QEMU, but that bug has been fixed for over two years, and
the fix should be present in QEMU packaged by any self-respecting
distro. I'm going to mark this as wontfix. If you disagree, let me know.

** Changed in: nova
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1781558

Title:
  Change default video model from cirrus to vga

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Change default video model from cirrus to vga

  Because of the bug of qemu[1], using cirrus video model
  is dangerous. To fix this problem, change the default
  video model, and disable cirrus forever.

  [1]: CVE-2017-2615 https://lists.gnu.org/archive/html/qemu-
  devel/2017-02/msg00015.html
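
  Note that a per-image override already exists, so deployments concerned
  about cirrus can opt into another model today. A minimal sketch with a
  hypothetical image name:

  $ openstack image set --property hw_video_model=vga myimage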

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1781558/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1843836] [NEW] Failure to schedule if flavor contains non-CPU flag traits

2019-09-12 Thread Stephen Finucane
Public bug reported:

I'm seeing the following error locally:

Sep 12 18:52:25 compute-small nova-conductor[28968]: ERROR
nova.scheduler.utils [None req-b86b25c8-c89e-4449-bec3-c94948402e02 demo
admin] [instance: a4056430-ed06-4cea-91b9-e15fd4b1979f] Error from last
host: compute-small (node compute-small): [u'Traceback (most recent call
last):\n', u'  File "/opt/stack/nova/nova/compute/manager.py", line
2038, in _do_build_and_run_instance\nfilter_properties,
request_spec)\n', u'  File "/opt/stack/nova/nova/compute/manager.py",
line 2408, in _build_and_run_instance\ninstance_uuid=instance.uuid,
reason=six.text_type(e))\n', u"RescheduledException: Build of instance
a4056430-ed06-4cea-91b9-e15fd4b1979f was re-scheduled: No CPU model
match traits, models: ['IvyBridge-IBRS'], required flags:
set([None])\n"]

This is affecting me when testing the 'PCPU' feature because we're
rewriting the 'hw:cpu_thread_policy' to add a 'HW_CPU_HYPERTHREADING'
trait, however, this can happen with any non-CPU flag trait (e.g.
COMPUTE_SUPPORTS_MULTIATTACH) because of the following code:

https://github.com/openstack/nova/blob/7a18209a8/nova/virt/libvirt/utils.py#L600

That will mean we can return a set contains 'None', which causes this
later check to fail:

https://github.com/openstack/nova/blob/7a18209a81539217a95ab7daad6bc67002768950/nova/virt/libvirt/driver.py#L4083

Since no CPU model will report a 'None' feature flag.
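
The failure mode is easy to reproduce in isolation: a trait-to-flag
mapping that returns None for unknown (non-CPU) traits leaks None into
the required-flags set. An illustrative snippet (the mapping here is
made up):

  $ python3 -c "m = {'HW_CPU_X86_AVX': 'avx'}; traits = ['HW_CPU_X86_AVX', 'COMPUTE_SUPPORTS_MULTIATTACH']; print({m.get(t) for t in traits})"

This prints a set containing 'avx' and None, mirroring the
'required flags: set([None])' in the traceback.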

** Affects: nova
 Importance: Undecided
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress


** Tags: libvirt

** Description changed:

  I'm seeing the following error locally:
  
  Sep 12 18:52:25 compute-small nova-conductor[28968]: ERROR
  nova.scheduler.utils [None req-b86b25c8-c89e-4449-bec3-c94948402e02 demo
  admin] [instance: a4056430-ed06-4cea-91b9-e15fd4b1979f] Error from last
  host: compute-small (node compute-small): [u'Traceback (most recent call
  last):\n', u'  File "/opt/stack/nova/nova/compute/manager.py", line
  2038, in _do_build_and_run_instance\nfilter_properties,
  request_spec)\n', u'  File "/opt/stack/nova/nova/compute/manager.py",
  line 2408, in _build_and_run_instance\ninstance_uuid=instance.uuid,
  reason=six.text_type(e))\n', u"RescheduledException: Build of instance
  a4056430-ed06-4cea-91b9-e15fd4b1979f was re-scheduled: No CPU model
  match traits, models: ['IvyBridge-IBRS'], required flags:
  set([None])\n"]
  
  This is affecting me when testing the 'PCPU' feature because we're
  rewriting the 'hw:cpu_thread_policy' to add a 'HW_CPU_HYPERTHREADING'
- trait, however, this can happen with any non-CPU flag trait because of
- the following code:
+ trait, however, this can happen with any non-CPU flag trait (e.g.
+ COMPUTE_SUPPORTS_MULTIATTACH) because of the following code:
  
  
https://github.com/openstack/nova/blob/7a18209a8/nova/virt/libvirt/utils.py#L600
  
  That will mean we can return a set contains 'None', which causes this
  later check to fail:
  
  
https://github.com/openstack/nova/blob/7a18209a81539217a95ab7daad6bc67002768950/nova/virt/libvirt/driver.py#L4083
  
  Since no CPU model will report a 'None' feature flag.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1843836

Title:
  Failure to schedule if flavor contains non-CPU flag traits

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  I'm seeing the following error locally:

  Sep 12 18:52:25 compute-small nova-conductor[28968]: ERROR
  nova.scheduler.utils [None req-b86b25c8-c89e-4449-bec3-c94948402e02
  demo admin] [instance: a4056430-ed06-4cea-91b9-e15fd4b1979f] Error
  from last host: compute-small (node compute-small): [u'Traceback (most
  recent call last):\n', u'  File
  "/opt/stack/nova/nova/compute/manager.py", line 2038, in
  _do_build_and_run_instance\nfilter_properties, request_spec)\n',
  u'  File "/opt/stack/nova/nova/compute/manager.py", line 2408, in
  _build_and_run_instance\ninstance_uuid=instance.uuid,
  reason=six.text_type(e))\n', u"RescheduledException: Build of instance
  a4056430-ed06-4cea-91b9-e15fd4b1979f was re-scheduled: No CPU model
  match traits, models: ['IvyBridge-IBRS'], required flags:
  set([None])\n"]

  This is affecting me when testing the 'PCPU' feature because we're
  rewriting the 'hw:cpu_thread_policy' to add a 'HW_CPU_HYPERTHREADING'
  trait, however, this can happen with any non-CPU flag trait (e.g.
  COMPUTE_SUPPORTS_MULTIATTACH) because of the following code:

  
https://github.com/openstack/nova/blob/7a18209a8/nova/virt/libvirt/utils.py#L600

  That will mean we can return a set contains 'None', which causes this
  later check to fail:

  
https://github.com/openstack/nova/blob/7a18209a81539217a95ab7daad6bc67002768950/nova/virt/libvirt/driver.py#L4083

  Since no CPU model will report a 'None' feature flag.

[Yahoo-eng-team] [Bug 1843714] Re: nova-status documentation not in the list of man-pages

2019-09-12 Thread Stephen Finucane
** Changed in: nova
   Status: Won't Fix => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1843714

Title:
  nova-status documentation not in the list of man-pages

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  When running "sphinx-build -b man doc/source doc/build/man", the nova-
  status man page is not built. It's missing from the man_pages list in
  doc/source/conf.py

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1843714/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1843714] Re: nova-status documentation not in the list of man-pages

2019-09-12 Thread Stephen Finucane
** Changed in: nova
   Status: In Progress => Won't Fix

** Changed in: nova
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1843714

Title:
  nova-status documentation not in the list of man-pages

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  When running "sphinx-build -b man doc/source doc/build/man", the nova-
  status man page is not built. It's missing from the man_pages list in
  doc/source/conf.py

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1843714/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1840807] [NEW] nova-manage man page refers to non-existent option

2019-08-20 Thread Stephen Finucane
Public bug reported:

The 'nova-manage db sync' command used to take a '--version' option but
this was deprecated some time ago and recently removed. However, the man
page for this command still references the old option:

https://docs.openstack.org/nova/rocky/cli/nova-manage.html#nova-database

** Affects: nova
 Importance: Low
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress


** Tags: doc

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1840807

Title:
  nova-manage man page refers to non-existent option

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  The 'nova-manage db sync' command used to take a '--version' option
  but this was deprecated some time ago and recently removed. However,
  the man page for this command still references the old option:

  https://docs.openstack.org/nova/rocky/cli/nova-manage.html#nova-
  database

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1840807/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1831771] [NEW] UnexpectedDeletingTaskStateError exception can leave traces of VIFs on host

2019-06-05 Thread Stephen Finucane
Public bug reported:

This was originally reported in Bugzilla

https://bugzilla.redhat.com/show_bug.cgi?id=1668159

The 'UnexpectedDeletingTaskStateError' exception can be raised by
something like aborting a large heat stack, where the instance hasn't
finished setting up before the stack is aborted and the instances
deleted.

https://github.com/openstack/nova/blob/19.0.0/nova/db/sqlalchemy/api.py#L2864

We handle this in the compute manager and as part of that handling, we
clean up the resource tracking of network interfaces.

https://github.com/openstack/nova/blob/19.0.0/nova/compute/manager.py#L2034-L2040

However, we don't unplug these interfaces. This can result in things
being left over on the host.

We should attempt to unplug VIFs as part of this cleanup.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1831771

Title:
  UnexpectedDeletingTaskStateError exception can leave traces of VIFs on
  host

Status in OpenStack Compute (nova):
  New

Bug description:
  This was originally reported in Bugzilla

  https://bugzilla.redhat.com/show_bug.cgi?id=1668159

  The 'UnexpectedDeletingTaskStateError' exception can be raised by
  something like aborting a large heat stack, where the instance hasn't
  finished setting up before the stack is aborted and the instances
  deleted.

  https://github.com/openstack/nova/blob/19.0.0/nova/db/sqlalchemy/api.py#L2864

  We handle this in the compute manager and as part of that handling, we
  clean up the resource tracking of network interfaces.

  
https://github.com/openstack/nova/blob/19.0.0/nova/compute/manager.py#L2034-L2040

  However, we don't unplug these interfaces. This can result in things
  being left over on the host.

  We should attempt to unplug VIFs as part of this cleanup.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1831771/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1831269] [NEW] Resize ignores mem_page_size in new flavor

2019-05-31 Thread Stephen Finucane
Public bug reported:

This was originally reported in Bugzilla.

When attempting to resize a instance where the new flavor uses a
different pagesize to the old flavor, the 'NUMATopologyFilter' evaluates
hosts using the original pagesize value rather than the new one.

Steps to Reproduce:

1. Create an instance with 2M hugepage size by setting flavor property: 
hw:mem_page_size=2048
2. Make sure every other compute node is configured with 1G huge pages
3. Create a new flavor with the property: hw:mem_page_size=1048576
4. Resize the instance as " openstack server resize --flavor  
"

Expected results:

Resize operation rebuilds the instance just like cold-migration. It
should be able to apply all aspects of the new flavor.

Actual results:

Resize will fail with the error message: "Host does not support requested
memory pagesize. Requested: 2048 kB _numa_fit_instance_cell
/usr/lib/python2.7/site-packages/nova/virt/hardware.py:936"
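
A reproduction sketch of steps 1, 3 and 4 with hypothetical flavor
names:

  $ openstack flavor create --vcpus 2 --ram 4096 --disk 20 \
      --property hw:mem_page_size=2048 pages-2m
  $ openstack flavor create --vcpus 2 --ram 4096 --disk 20 \
      --property hw:mem_page_size=1048576 pages-1g
  $ openstack server resize --flavor pages-1g myserver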

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1831269

Title:
  Resize ignores mem_page_size in new flavor

Status in OpenStack Compute (nova):
  New

Bug description:
  This was originally reported in Bugzilla.

  When attempting to resize an instance where the new flavor uses a
  different pagesize to the old flavor, the 'NUMATopologyFilter'
  evaluates hosts using the original pagesize value rather than the new
  one.

  Steps to Reproduce:

  1. Create an instance with 2M hugepage size by setting flavor property: 
hw:mem_page_size=2048
  2. Make sure every other compute node is configured with 1G huge pages
  3. Create a new flavor with the property: hw:mem_page_size=1048576
  4. Resize the instance as " openstack server resize --flavor  "

  Expected results:

  Resize operation rebuilds the instance just like cold-migration. It
  should be able to apply all aspects of the new flavor.

  Actual results:

  Resize will fail with the error message: "Host does not support
  requested memory pagesize. Requested: 2048 kB _numa_fit_instance_cell
  /usr/lib/python2.7/site-packages/nova/virt/hardware.py:936"

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1831269/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1830926] [NEW] Links to reno are incorrect

2019-05-29 Thread Stephen Finucane
Public bug reported:

There are multiple links to reno in the "release notes" section of the
contributor guide:

https://docs.openstack.org/nova/stein/contributor/releasenotes.html

These are versioned links but reno is unversioned. This is resulting in
breaking links when on stable branches like the above.

** Affects: nova
 Importance: Undecided
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1830926

Title:
  Links to reno are incorrect

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  There are multiple links to reno in the "release notes" section of the
  contributor guide:

  https://docs.openstack.org/nova/stein/contributor/releasenotes.html

  These are versioned links but reno is unversioned. This is resulting
  in breaking links when on stable branches like the above.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1830926/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1744455] Re: New instance on compute fails

2019-05-17 Thread Stephen Finucane
*** This bug is a duplicate of bug 1672041 ***
https://bugs.launchpad.net/bugs/1672041

** Project changed: nova-solver-scheduler => nova

** This bug has been marked a duplicate of bug 1672041
   nova.scheduler.client.report  409 Conflict

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1744455

Title:
  New instance on compute fails

Status in OpenStack Compute (nova):
  New

Bug description:
  I am installing the OpenStack Pike release. While trying to create an
  instance on the compute node, I see the errors below in the
  nova-scheduler logs:

  2018-01-20 16:17:31.864 1011 INFO nova.scheduler.host_manager [req-f0b60f13-637f-4856-a321-76914742652c - - - - -] Successfully synced instances from host 'compute'.
  2018-01-20 16:18:28.287 1011 WARNING nova.scheduler.client.report [req-07c5ee94-dd71-4328-8f63-f24550f16e37 c8e5bcf05f67431ba5c89518238ef4d7 6a17e79098ab478fa728b4ace304d591 - default default] Unable to submit allocation for instance c9120f12-02b7-4515-ba9f-37faca050cc3 (409 <html>
   <head>
    <title>409 Conflict</title>
   </head>
   <body>
    <h1>409 Conflict</h1>
    There was a conflict when trying to complete your request.
  Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider '5841aceb-452b-44b2-b96d-653c394a543c'. The requested amount would violate inventory constraints.
   </body>
  </html>)
  2018-01-20 16:18:28.919 1011 WARNING nova.scheduler.client.report [req-07c5ee94-dd71-4328-8f63-f24550f16e37 c8e5bcf05f67431ba5c89518238ef4d7 6a17e79098ab478fa728b4ace304d591 - default default] Unable to submit allocation for instance c9120f12-02b7-4515-ba9f-37faca050cc3 (409 <html>
   <head>
    <title>409 Conflict</title>
   </head>
   <body>
    <h1>409 Conflict</h1>
    There was a conflict when trying to complete your request.
  Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider '5841aceb-452b-44b2-b96d-653c394a543c'. The requested amount would violate inventory constraints.
   </body>
  </html>)

  
  While checking the resources available on the compute node registered
  with nova, I don't see a problem.

  MariaDB [nova]> select id,vcpus,vcpus_used,hypervisor_type,host,uuid,memory_mb,memory_mb_used from compute_nodes;
  +----+-------+------------+-----------------+---------+--------------------------------------+-----------+----------------+
  | id | vcpus | vcpus_used | hypervisor_type | host    | uuid                                 | memory_mb | memory_mb_used |
  +----+-------+------------+-----------------+---------+--------------------------------------+-----------+----------------+
  |  1 |     1 |          0 | QEMU            | compute | 5841aceb-452b-44b2-b96d-653c394a543c |      3723 |            696 |
  +----+-------+------------+-----------------+---------+--------------------------------------+-----------+----------------+
  1 row in set (0.00 sec)

  MariaDB [nova]>
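
  For anyone landing here, a rough sketch of the capacity check placement
  applies when it returns these 409s (simplified; 'reserved' and the
  allocation ratio come from the provider's inventory, and the 16.0 VCPU
  ratio below is only nova's default, an assumption for this host):

    # Simplified placement inventory constraint: an allocation is
    # rejected with 409 Conflict when it would push usage past
    # (total - reserved) * allocation_ratio.
    def allocation_allowed(total, reserved, used, requested,
                           allocation_ratio=16.0):
        return used + requested <= (total - reserved) * allocation_ratio

    # With the 1-VCPU compute node above, allocations that are never
    # cleaned up would eventually exhaust capacity:
    print(allocation_allowed(total=1, reserved=0, used=16, requested=1))  # False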

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1744455/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1824813] [NEW] Unsetting '[DEFAULT] dhcp_domain' results in hostname corruption

2019-04-15 Thread Stephen Finucane
Public bug reported:

Unsetting '[DEFAULT] dhcp_domain' will result in the metadata
service/config drive reporting an instance hostname of '${hostname}None'
instead of '${hostname}'. This is clearly incorrect behavior.
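
A minimal, self-contained sketch of the suspected failure mode (the
function name is illustrative, not nova's actual code): naively
interpolating a None domain produces the corrupted value, while
guarding against falsy values avoids it.

  # Illustrative reproduction: interpolating a None dhcp_domain corrupts
  # the hostname; treating a falsy domain as empty avoids it.
  def build_hostname(hostname, dhcp_domain):
      broken = '%s%s' % (hostname, dhcp_domain)        # 'myhostNone' when None
      fixed = '%s%s' % (hostname, dhcp_domain or '')   # 'myhost'
      return broken, fixed

  print(build_hostname('myhost', None))  # ('myhostNone', 'myhost')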

** Affects: nova
 Importance: Low
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova
 Assignee: (unassigned) => Stephen Finucane (stephenfinucane)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1824813

Title:
  Unsetting '[DEFAULT] dhcp_domain' results in hostname corruption

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Unsetting '[DEFAULT] dhcp_domain' will result in the metadata
  service/config drive reporting an instance hostname of
  '${hostname}None' instead of '${hostname}'. This is clearly incorrect
  behavior.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1824813/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1822355] [NEW] Incomplete stubbing of os-vif in libvirt functional tests

2019-03-29 Thread Stephen Finucane
Public bug reported:

If a functional test fails, we see the following in the logs:

2019-03-29 17:37:10,856 INFO [nova.compute.manager] Terminating instance
2019-03-29 17:37:10,859 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2/6f70656e737461636b20342065766572/servers/detail" status: 200 len: 1569 microversion: 2.1 time: 0.289134
2019-03-29 17:37:10,867 INFO [nova.virt.libvirt.driver] Instance destroyed successfully.
2019-03-29 17:37:10,869 ERROR [vif_plug_ovs.ovsdb.impl_vsctl] Unable to execute ['ovs-vsctl', '--timeout=120', '--oneline', '--format=json', '--db=tcp:127.0.0.1:6640', '--', '--if-exists', 'del-port', u'br-int', u'tap88dae9fa-0d']. Exception: You have attempted to start a privsep helper. This is not allowed in the gate, and indicates a failure to have mocked your tests.
2019-03-29 17:37:10,870 ERROR [os_vif] Failed to unplug vif VIFOpenVSwitch(active=True,address=00:0c:29:0d:11:74,bridge_name='br-int',has_traffic_filtering=False,id=88dae9fa-0dc6-49e3-8c29-3abc41e99ac9,network=Network(3cb9bc59-5699-4588-a4b1-b87f96708bc6),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap88dae9fa-0d')
Traceback (most recent call last):
  File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/os_vif/__init__.py", line 110, in unplug
    plugin.unplug(vif, instance_info)
  File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/vif_plug_ovs/ovs.py", line 344, in unplug
    self._unplug_vif_generic(vif, instance_info)
  File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/vif_plug_ovs/ovs.py", line 318, in _unplug_vif_generic
    self.ovsdb.delete_ovs_vif_port(vif.network.bridge, vif.vif_name)
  File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/vif_plug_ovs/ovsdb/ovsdb_lib.py", line 90, in delete_ovs_vif_port
    linux_net.delete_net_dev(dev)
  File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 240, in _wrap
    self.start()
  File "/home/sfinucan/Development/openstack/nova/.tox/functional/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 251, in start
    channel = daemon.RootwrapClientChannel(context=self)
  File "nova/tests/fixtures.py", line 2018, in __init__
    raise Exception('You have attempted to start a privsep helper. '
Exception: You have attempted to start a privsep helper. This is not allowed in the gate, and indicates a failure to have mocked your tests.

As that suggests, we have a problem we should rectify.
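
A minimal sketch of the kind of stubbing that would rectify it (the
fixture name is illustrative; nova's real fixtures live in
nova/tests/fixtures.py): patch os-vif's entry points so VIF teardown
never reaches a real privsep helper.

  # Illustrative fixture: stub os_vif.plug/os_vif.unplug so functional
  # tests never invoke a privsep helper during VIF plug/unplug.
  import fixtures

  class StubOsVif(fixtures.Fixture):
      def _setUp(self):
          self.plug = self.useFixture(
              fixtures.MockPatch('os_vif.plug')).mock
          self.unplug = self.useFixture(
              fixtures.MockPatch('os_vif.unplug')).mock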

** Affects: nova
 Importance: 

[Yahoo-eng-team] [Bug 1821733] [NEW] Failed to compute_task_build_instances: local variable 'sibling_set' referenced before assignment

2019-03-26 Thread Stephen Finucane
Public bug reported:

Reproduced from rhbz#1686511
(https://bugzilla.redhat.com/show_bug.cgi?id=1686511)

When spawning an OpenStack instance, this error is received:

2019-03-07 08:07:38.499 3124 WARNING nova.scheduler.utils [req-e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4 8b869a98a43e4fc48001e0ff6d149fe6 - - -] Failed to compute_task_build_instances: local variable 'sibling_set' referenced before assignment
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
    res = self.dispatcher.dispatch(message)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
    result = func(ctxt, **new_args)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 199, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 104, in select_destinations
    dests = self.driver.select_destinations(ctxt, spec_obj)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 53, in select_destinations
    selected_hosts = self._schedule(context, spec_obj)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 113, in _schedule
    spec_obj, index=num)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py", line 576, in get_filtered_hosts
    hosts, spec_obj, index)

  File "/usr/lib/python2.7/site-packages/nova/filters.py", line 89, in get_filtered_objects
    list_objs = list(objs)

  File "/usr/lib/python2.7/site-packages/nova/filters.py", line 44, in filter_all
    if self._filter_one(obj, spec_obj):

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filters/__init__.py", line 44, in _filter_one
    return self.host_passes(obj, spec)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filters/numa_topology_filter.py", line 123, in host_passes
    pci_stats=host_state.pci_stats))

  File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 1297, in numa_fit_instance_to_host
    host_cell, instance_cell, limits)

  File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 906, in _numa_fit_instance_cell
    host_cell, instance_cell)

  File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 854, in _numa_fit_instance_cell_with_pinning
    max(map(len, host_cell.siblings)))

  File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 805, in _pack_instance_onto_cores
    itertools.chain(*sibling_set)))

UnboundLocalError: local variable 'sibling_set' referenced before
assignment

2019-03-07 08:07:38.500 3124 WARNING nova.scheduler.utils [req-
e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4
8b869a98a43e4fc48001e0ff6d149fe6 - - -] [instance: 5bca186a-5a36-4b0f-
8b7a-f2f3bc168b29] Setting instance to ERROR state.

This issue appears to be because of:

https://github.com/openstack/nova/blob/da9f9c962fe00dbfc9c8fe9c47e964816d67b773/nova/virt/hardware.py#L875

This normally works because loop variables in Python remain available
outside the scope of the loop:

>>> for x in range(5):
... pass
...
>>> print(x)
4

and because there's usually something in sibling_sets. However, this is
presumably failing for this user because there are no free cores at all
on the given host, likely due to a race between the nova-scheduler and
nova-compute services.
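
For reference, a standalone reproduction of the scoping pitfall
(illustrative; it mirrors the pattern in _pack_instance_onto_cores
rather than copying it):

  import itertools

  def pack(sibling_sets):
      # 'sibling_set' is only ever bound inside the loop...
      for threads, sibling_set in sibling_sets.items():
          pass
      # ...so if 'sibling_sets' was empty, this line raises.
      return list(itertools.chain(*sibling_set))

  try:
      pack({})
  except UnboundLocalError as exc:
      print(exc)  # local variable 'sibling_set' referenced before assignment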

** Affects: nova
 Importance: Undecided
 Status: New

** Description changed:

- Reproduced from rhbz#1686511.
+ Reproduced from rhbz#1686511
+ (https://bugzilla.redhat.com/show_bug.cgi?id=1686511)
  
  When spawning an Openstack instance, this error is received:
  
+ 2019-03-07 08:07:38.499 3124 WARNING nova.scheduler.utils 
[req-e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4 
8b869a98a43e4fc48001e0ff6d149fe6 - - -] Failed to compute_task_build_instances: 
local variable 'sibling_set' referenced before assignment
+ Traceback (most recent call last):
  
- 2019-03-07 08:07:38.499 3124 WARNING nova.scheduler.utils 
[req-e577cf31-7a58-420f-8ba5-3f962569ab08 0c90c8d8b42c42e883d2135cc733cac4 
8b869a98a43e4fc48001e0ff6d149fe6 - - -] Failed to compute_task_build_instances: 
local variable 'sibling_set' referenced before assignment
- Traceback (most recent call last):
+   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", 
line 133, in _process_incoming
+ res = self.dispatcher.dispatch(message)
  
-   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", 

[Yahoo-eng-team] [Bug 1815591] [NEW] Out-of-date configuration options and no cross-referencing in scheduler filter guide

2019-02-12 Thread Stephen Finucane
Public bug reported:

We document all the scheduler filters in [1]. Most of these take some
kind of configuration option, which is also documented. However, there
is no cross-referencing between the two. This lack of cross-referencing
tends to lead to outdated docs as options get moved around, and I
suspect there are at least a few typos in here. We should address this
here at least.

[1] https://docs.openstack.org/nova/rocky/user/filter-scheduler.html

** Affects: nova
 Importance: Low
 Assignee: Alexandra Settle (alexandra-settle)
 Status: New


** Tags: doc

** Changed in: nova
   Importance: Undecided => Low

** Tags added: doc

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1815591

Title:
  Out-of-date configuration options and no cross-referencing in
  scheduler filter guide

Status in OpenStack Compute (nova):
  New

Bug description:
  We document all the scheduler filters in [1]. Most of these take some
  kind of configuration option, which is also documented. However, there
  is no cross-referencing between the two. This lack of cross-referencing
  tends to lead to outdated docs as options get moved around, and I
  suspect there are at least a few typos in here. We should address this
  here at least.

  [1] https://docs.openstack.org/nova/rocky/user/filter-scheduler.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1815591/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1814882] [NEW] Bandwidth limits specified in flavors are not applied to generic vHost User interfaces

2019-02-06 Thread Stephen Finucane
Public bug reported:

Libvirt supports setting bandwidth limits for various VIF types.

https://github.com/openstack/nova/blob/bcfd2439bab7cfad942d7e6a187df6edb1d1bf09/nova/virt/libvirt/vif.py#L576

This is supported by pretty much all VIF types including vHost User
interfaces defined by os-vif. However, generic vHost user interfaces do
not set this field. This is a mistake and should be corrected.
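
As a rough, self-contained sketch of what the fix amounts to (the
function, class, and attribute names here are illustrative; nova's
real config objects and the full set of quota:vif_* extra specs
differ): copy the flavor's bandwidth extra specs onto the interface
config, as is already done for other VIF types.

  # Illustrative only: apply flavor quota:vif_* extra specs to an
  # interface config object, mirroring what other VIF types get.
  def set_vif_bandwidth(conf, extra_specs):
      mapping = {
          'quota:vif_inbound_average': 'vif_inbound_average',
          'quota:vif_outbound_average': 'vif_outbound_average',
      }
      for spec, attr in mapping.items():
          if spec in extra_specs:
              setattr(conf, attr, int(extra_specs[spec]))
      return conf

  class Conf(object):  # stand-in for the libvirt interface config class
      pass

  conf = set_vif_bandwidth(Conf(), {'quota:vif_inbound_average': 10240})
  print(conf.vif_inbound_average)  # 10240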

** Affects: nova
 Importance: Undecided
 Status: New

** Description changed:

  Libvirt supports setting bandwidth limits for various VIF types.
  
  
https://github.com/openstack/nova/blob/bcfd2439bab7cfad942d7e6a187df6edb1d1bf09/nova/virt/libvirt/vif.py#L576
  
- This is supported by pretty much all VIF types except one: generic vHost
- user interfaces. This is a mistake and should be corrected.
+ This is supported by pretty much all VIF types including vHost User
+ interfaces defined by os-vif. However, generic vHost user interfaces do
+ not set this field. This is a mistake and should be corrected.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1814882

Title:
  Bandwidth limits specified in flavors are not applied to generic vHost
  User interfaces

Status in OpenStack Compute (nova):
  New

Bug description:
  Libvirt supports setting bandwidth limits for various VIF types.

  
https://github.com/openstack/nova/blob/bcfd2439bab7cfad942d7e6a187df6edb1d1bf09/nova/virt/libvirt/vif.py#L576

  This is supported by pretty much all VIF types including vHost User
  interfaces defined by os-vif. However, generic vHost user interfaces
  do not set this field. This is a mistake and should be corrected.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1814882/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1811886] [NEW] Overcommit allowed for pinned instances when using hugepages

2019-01-15 Thread Stephen Finucane
Public bug reported:

When working on a fix for bug 1810977, it was noted that the check to
ensure pinned instances do not overcommit was not pagesize aware. This
means if an instance without hugepages boots on a host with a large
number of hugepages allocated, it may not get all of the memory
allocated to it. The solution seems to be to make the check pagesize
aware. Test cases to prove this is the case are provided below.

---

# Host information

The memory capacity (and some other stuff) for our node:

$ virsh capabilities | xmllint --xpath '/capabilities/host/topology/cells' -
<cells num='2'>
  <cell id='0'>
    <memory unit='KiB'>16298528</memory>
    <pages unit='KiB' size='4'>3075208</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>
  <cell id='1'>
    <memory unit='KiB'>16512884</memory>
    <pages unit='KiB' size='4'>3128797</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>
</cells>

Clearly there are not 3075208 and 3128797 4k pages on NUMA nodes 0 and 1,
respectively, since, for NUMA node 0, (3075208 * 4) + (4000 * 2048) != 16298528.
We use [1] to resolve this. Instead we have 16298528 - (4000 * 2048) = 8106528
KiB of memory (or ~7.73 GiB) for NUMA cell 0 and something similar for cell 1.

To make things easier, cell 1 is totally disabled by adding the
following to 'nova-cpu.conf':

[DEFAULT]
vcpu_pin_set = 0-5,12-17

[1] https://review.openstack.org/631038

For all test cases I create the flavor then try to create two servers
with the same flavor.

# Test A, unpinned, implicit small pages, oversubscribed.

This should work because we're not using a specific page size.

$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=1

$ openstack server create --flavor test.numa --image 
cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.numa --image 
cirros-0.3.6-x86_64-disk --wait test2

Expect: SUCCESS
Actual: SUCCESS

# Test B, unpinned, explicit small pages, oversubscribed

This should fail because we request a specific page size, even though
that size is small pages (4k).

$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.numa
$ openstack flavor set test.numa --property hw:numa_nodes=1
$ openstack flavor set test.numa --property hw:mem_page_size=small

$ openstack server create --flavor test.numa --image 
cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.numa --image 
cirros-0.3.6-x86_64-disk --wait test2

Expect: FAILURE
Actual: FAILURE

# Test C, pinned, implicit small pages, oversubscribed

This should fail because we don't allow oversubscription with CPU
pinning.

$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.pinned
$ openstack flavor set test.pinned --property hw:cpu_policy=dedicated

$ openstack server create --flavor test.pinned --image 
cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.pinned --image 
cirros-0.3.6-x86_64-disk --wait test2

Expect: FAILURE
Actual: SUCCESS

Interestingly, this fails on the third VM. This is likely because the total
memory for that cell, 16298528 KiB, is sufficient to handle two instances
but not three.

# Test D, pinned, explicit small pages, oversubscribed

This should fail because we don't allow oversubscription with CPU
pinning.

$ openstack flavor create --vcpu 2 --disk 0 --ram 7168 test.pinned
$ openstack flavor set test.pinned --property hw:cpu_policy=dedicated
$ openstack flavor set test.pinned --property hw:mem_page_size=small

$ openstack server create --flavor test.pinned --image 
cirros-0.3.6-x86_64-disk --wait test1
$ openstack server create --flavor test.pinned --image 
cirros-0.3.6-x86_64-disk --wait test2

Expect: FAILURE
Actual: FAILURE
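
A sketch of what a pagesize-aware version of the check might look like
(names are illustrative, not the actual hardware.py code): compare the
request against the memory actually backed by the selected page size,
rather than against the cell total.

  # Illustrative pagesize-aware fit check.
  def fits(requested_mb, page_size_kib, pages_free):
      available_mb = page_size_kib * pages_free // 1024
      return requested_mb <= available_mb

  # NUMA cell 0 above has 8106528 KiB (~7.73 GiB) of 4k-backed memory
  # once hugepages are subtracted, i.e. 2026632 small pages, so one
  # pinned 7168 MB guest fits but a second must not.
  print(fits(7168, 4, 2026632))               # True
  print(fits(7168, 4, 2026632 - 7168 * 256))  # False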

** Affects: nova
 Importance: Undecided
 Assignee: Stephen Finucane (stephenfinucane)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1811886

Title:
  Overcommit allowed for pinned instances when using hugepages

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  When working on a fix for bug 1810977, it was noted that the check to
  ensure pinned instances do not overcommit was not pagesize aware. This
  means if an instance without hugepages boots on a host with a large
  number of hugepages allocated, it may not get all of the memory
  allocated to it. The solution seems to be to make the check pagesize
  aware. Test cases to prove this is the case are provided below.

  ---

  # Host information

  The memory capacity (and some other stuff) for our node:

  $ virsh capabilities | xmllint --xpath '/capabilities/host/topology/cells' -
  <cells num='2'>
    <cell id='0'>
      <memory unit='KiB'>16298528</memory>
      <pages unit='KiB' size='4'>3075208</pages>
      <pages unit='KiB' size='2048'>4000</pages>
      <pages unit='KiB' size='1048576'>0</pages>
      ...
    </cell>
    <cell id='1'>
      <memory unit='KiB'>16512884</memory>
      <pages unit='KiB' size='4'>3128797</pages>
      <pages unit='KiB' size='2048'>4000</pages>
      <pages unit='KiB' size='1048576'>0</pages>
      ...
    </cell>
  </cells>

  Clearly there are not 3075208 and 3128797 4k pages on NUMA nodes 0 and 1

[Yahoo-eng-team] [Bug 1811870] [NEW] libvirt reporting incorrect value of 4k (small) pages

2019-01-15 Thread Stephen Finucane
Public bug reported:

libvirt < 4.3.0 had an issue whereby assigning more than 4 GB of huge
pages would result in an incorrect value for the number of 4k (small)
pages. This was tracked and fixed via rhbz#1569678 and the fixes appear
to have been backported to the libvirt versions for RHEL 7.4+. However,
this is still an issue with the versions of libvirt available on Ubuntu
16.04, 18.04 and who knows what else. We should either alert the user
that the bug exists or, better again, work around the issue using the
rest of the (correct) values for different page sizes.

# Incorrect value (Ubuntu 16.04, libvirt 4.0.0)

$ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
<cell id='0'>
  <memory unit='KiB'>16298528</memory>
  <pages unit='KiB' size='4'>3075208</pages>
  <pages unit='KiB' size='2048'>4000</pages>
  <pages unit='KiB' size='1048576'>0</pages>
  ...
</cell>

(3075208 * 4) + (4000 * 2048) != 16298528

# Correct values (Fedora ??, libvirt 4.10)

$ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
<cell id='0'>
  <memory unit='KiB'>32359908</memory>
  <pages unit='KiB' size='4'>8038777</pages>
  <pages unit='KiB' size='2048'>100</pages>
  <pages unit='KiB' size='1048576'>0</pages>
  ...
</cell>

(8038777 * 4) + (100 * 2048) == 32359908

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1569678
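
A short sketch of the suggested workaround (assuming the total memory
and the large-page counts are trustworthy, which rhbz#1569678 suggests
they are):

  # Derive the 4k page count from the total memory and the (correct)
  # counts for larger page sizes, instead of trusting the buggy figure.
  def derive_small_pages(total_kib, large_pages):
      """large_pages maps page size in KiB to page count."""
      large_kib = sum(size * count for size, count in large_pages.items())
      return (total_kib - large_kib) // 4

  # Ubuntu 16.04 cell above: not the bogus 3075208 reported by libvirt.
  print(derive_small_pages(16298528, {2048: 4000, 1048576: 0}))  # 2026632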

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1811870

Title:
  libvirt reporting incorrect value of 4k (small) pages

Status in OpenStack Compute (nova):
  New

Bug description:
  libvirt < 4.3.0 had an issue whereby assigning more than 4 GB of huge
  pages would result in an incorrect value for the number of 4k (small)
  pages. This was tracked and fixed via rhbz#1569678 and the fixes
  appear to have been backported to the libvirt versions for RHEL 7.4+.
  However, this is still an issue with the versions of libvirt available
  on Ubuntu 16.04, 18.04 and who knows what else. We should either alert
  the user that the bug exists or, better again, work around the issue
  using the rest of the (correct) values for different page sizes.

  # Incorrect value (Ubuntu 16.04, libvirt 4.0.0)

  $ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
  <cell id='0'>
    <memory unit='KiB'>16298528</memory>
    <pages unit='KiB' size='4'>3075208</pages>
    <pages unit='KiB' size='2048'>4000</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>

  (3075208 * 4) + (4000 * 2048) != 16298528

  # Correct values (Fedora ??, libvirt 4.10)

  $ virsh capabilities | xmllint --xpath /capabilities/host/topology/cells/cell[1] -
  <cell id='0'>
    <memory unit='KiB'>32359908</memory>
    <pages unit='KiB' size='4'>8038777</pages>
    <pages unit='KiB' size='2048'>100</pages>
    <pages unit='KiB' size='1048576'>0</pages>
    ...
  </cell>

  (8038777 * 4) + (100 * 2048) == 32359908

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1569678

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1811870/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1810977] [NEW] Oversubscription broken for instances with NUMA topologies

2019-01-08 Thread Stephen Finucane
Public bug reported:

As described in [1], the fix to [2] appears to have inadvertently broken
oversubscription of memory for instances with a NUMA topology but no
hugepages.

Steps to reproduce:

1. Create a flavor that will consume > 50% available memory for your
host(s) and specify an explicit NUMA topology. For example, on my all-
in-one deployment where the host has 32GB RAM, we will request a 20GB
instance:

   $ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa
   $ openstack flavor set test.numa --property hw:numa_nodes=2

2. Boot an instance using this flavor:

   $ openstack server create --flavor test.numa --image
cirros-0.3.6-x86_64-disk --wait test

3. Boot another instance using this flavor:

   $ openstack server create --flavor test.numa --image
cirros-0.3.6-x86_64-disk --wait test2

# Expected result:

The second instance should boot.

# Actual result:

The second instance fails to boot. We see the following error message in
the logs.

  nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
  nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}

If we revert the patch that addressed the bug [3] then we revert to the
correct behaviour and the instance boots. With this though, we obviously
lose whatever benefits that change gave us.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html
[2] https://bugs.launchpad.net/nova/+bug/1734204
[3] https://review.openstack.org/#/c/532168
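
For context, a simplified sketch of the expected behaviour
(illustrative; the real logic lives in hardware.py): when no page size
is requested, the fit check should honour the RAM allocation ratio
rather than comparing against raw available memory. The 1.5 ratio
below is nova's default for ram_allocation_ratio.

  # Illustrative: with oversubscription the limit is total * ratio, so
  # a second 10240 MB per-cell request should still fit a 15916 MB cell
  # even though raw available memory (5676 MB) is smaller.
  def cell_fits(requested_mb, total_mb, used_mb, ram_allocation_ratio=1.5):
      return used_mb + requested_mb <= total_mb * ram_allocation_ratio

  print(cell_fits(10240, total_mb=15916, used_mb=10240))  # True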

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1810977

Title:
  Oversubscription broken for instances with NUMA topologies

Status in OpenStack Compute (nova):
  New

Bug description:
  As described in [1], the fix to [2] appears to have inadvertently
  broken oversubscription of memory for instances with a NUMA topology
  but no hugepages.

  Steps to reproduce:

  1. Create a flavor that will consume > 50% available memory for your
  host(s) and specify an explicit NUMA topology. For example, on my all-
  in-one deployment where the host has 32GB RAM, we will request a 20GB
  instance:

 $ openstack flavor create --vcpu 2 --disk 0 --ram 20480 test.numa
 $ openstack flavor set test.numa --property hw:numa_nodes=2

  2. Boot an instance using this flavor:

 $ openstack server create --flavor test.numa --image
  cirros-0.3.6-x86_64-disk --wait test

  3. Boot another instance using this flavor:

 $ openstack server create --flavor test.numa --image
  cirros-0.3.6-x86_64-disk --wait test2

  # Expected result:

  The second instance should boot.

  # Actual result:

  The second instance fails to boot. We see the following error message
  in the logs.

    nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] No specific pagesize requested for instance, selected pagesize: 4 {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1045}}
    nova-scheduler[18295]: DEBUG nova.virt.hardware [None req-f7a6594b-8d25-424c-9c6e-8522f66ffd22 demo admin] Not enough available memory to schedule instance with pagesize 4. Required: 10240, available: 5676, total: 15916. {{(pid=18318) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1055}}

  If we revert the patch that addressed the bug [3] then we revert to
  the correct behaviour and the instance boots. With this though, we
  obviously lose whatever benefits that change gave us.

  [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001459.html
  [2] https://bugs.launchpad.net/nova/+bug/1734204
  [3] https://review.openstack.org/#/c/532168

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1810977/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1809136] [NEW] Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound' on compute restart

2018-12-19 Thread Stephen Finucane
Public bug reported:

This is a variant of an existing bug:

- https://bugs.launchpad.net/nova/+bug/1738373 tracks a similar
exception ('_nova_to_osvif_vif_binding_failed') on compute startup.

There are also two other closely related bugs:

- https://bugs.launchpad.net/nova/+bug/1783917 tracks this same exception 
('_nova_to_osvif_vif_unbound') but for live migrations
- https://bugs.launchpad.net/nova/+bug/1784579 tracks a similar exception 
('_nova_to_osvif_vif_binding_failed') but for live migration

In addition, there are a few bugs which are likely the root cause of all
of the above issues (and this one) in the first place:

- https://bugs.launchpad.net/nova/+bug/1751923

In this instance, as with bug 1738373, we are unable to start the
nova-compute service on the compute node due to an os-vif invoked error.

nova-compute.log on compute shows:

2018-05-12 16:42:47.323 305978 INFO os_vif [req-0a72cdea-843a-4932-b8a0-bc24c2f21d9f - - - - -] Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:41:a9:2c,bridge_name='qbr8d027ff4-23',has_traffic_filtering=True,id=8d027ff4-2328-47df-9f9a-2c1a9914a83b,network=Network(9a98b244-b1d2-46b3-ab0e-be8456e3a984),plugin='ovs',port_profile=VIFPortProfileBase,preserve_on_delete=False,vif_name='tap8d027ff4-23')
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service [req-0a72cdea-843a-4932-b8a0-bc24c2f21d9f - - - - -] Error starting thread.
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service Traceback (most recent call last):
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 708, in run_service
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     service.start()
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/service.py", line 117, in start
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self.manager.init_host()
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1154, in init_host
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self._init_instance(context, instance)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 957, in _init_instance
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self.driver.plug_vifs(instance, net_info)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 703, in plug_vifs
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     self.vif_driver.plug(instance, vif)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 771, in plug
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     vif_obj = os_vif_util.nova_to_osvif_vif(vif)
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service   File "/usr/lib/python2.7/site-packages/nova/network/os_vif_util.py", line 408, in nova_to_osvif_vif
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service     {'type': vif['type'], 'func': funcname})
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service NovaException: Unsupported VIF type unbound convert '_nova_to_osvif_vif_unbound'
2018-05-12 16:42:47.369 305978 ERROR oslo_service.service

Inspecting the available ports shows the port does exist, so this looks
like a caching issue.

[stack@director:~]$ neutron port-list | grep fa:16:3e:41:a9:2c
| 8d027ff4-2328-47df-9f9a-2c1a9914a83b |  | fa:16:3e:41:a9:2c | {"subnet_id": "1f5ed9bc-aa7d-49bd-ac48-23b430fc0eb4", "ip_address": "172.19.9.17"} |
[stack@director:~]$ neutron port-show 8d027ff4-2328-47df-9f9a-2c1a9914a83b
+-----------------------+------------------------------------------------+
| Field                 | Value                                          |
+-----------------------+------------------------------------------------+
| admin_state_up        | True                                           |
| allowed_address_pairs |                                                |
| binding:host_id       | overcloud-compute-7.localdomain                |
| binding:profile       | {}                                             |
| binding:vif_details   | {"port_filter": true, "ovs_hybrid_plug": true} |
| binding:vif_type      | ovs                                            |
| binding:vnic_type     | normal

[Yahoo-eng-team] [Bug 1797146] [NEW] failed to boot guest with vnic_type direct when rx_queue_size, tx_queue_size and hw_vif_type are set

2018-10-10 Thread Stephen Finucane
Public bug reported:

Bug #1789074 addressed an issue with booting a guest with vnic_type
direct when rx_queue_size and tx_queue_size are set. However, this
failed to address an additional permutation: the user specifying
hw_vif_type=virtio. If the user does this, the problem occurs once
again.

Reproduction steps are the same noted in bug #1789074 with one
additional step needed:

  openstack image set --property hw_vif_type=virtio $IMAGE

Once configured, boot an instance with this image and an SRIOV (PF or
VF) interface and the instance will fail to spawn. This is because we
first read and set the VIF model from the image metadata property:

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L134-L135

Which means a later check passes:

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L172

Without setting this property, that check would fail as we never
configure the model for direct SR-IOV interfaces.

https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L139
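
A condensed, self-contained sketch of the logic described above (names
are simplified from vif.py, not the actual source):

  # Seeding the model from the image's hw_vif_type defeats the later
  # virtio-only guard, so direct SR-IOV VIFs wrongly get rx/tx queue
  # sizes that libvirt then rejects.
  def pick_model(image_vif_model, vif_is_direct):
      if image_vif_model:        # hw_vif_type=virtio always wins here...
          return image_vif_model
      if vif_is_direct:
          return None            # ...but direct VIFs should have no model
      return 'virtio'

  def sets_queue_sizes(model):
      return model == 'virtio'   # the guard that now passes incorrectly

  print(sets_queue_sizes(pick_model('virtio', vif_is_direct=True)))  # True (bug)
  print(sets_queue_sizes(pick_model(None, vif_is_direct=True)))      # False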

** Affects: nova
 Importance: Medium
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1797146

Title:
  failed to boot guest with vnic_type direct when rx_queue_size,
  tx_queue_size and hw_vif_type are set

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Bug #1789074 addressed an issue with booting a guest with vnic_type
  direct when rx_queue_size and tx_queue_size are set. However, this
  failed to address an additional permutation: the user specifying
  hw_vif_type=virtio. If the user does this, the problem occurs once
  again.

  Reproduction steps are the same noted in bug #1789074 with one
  additional step needed:

openstack image set --property hw_vif_type=virtio $IMAGE

  Once configured, boot an instance with this image and an SRIOV (PF or
  VF) interface and the instance will fail to spawn. This is because we
  first read and set the VIF model from the image metadata property:

  
https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L134-L135

  Which means a later check passes:

  
https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L172

  Without setting this property, that check would fail as we never
  configure the model for direct SR-IOV interfaces.

  
https://github.com/openstack/nova/blob/622ebf2fab0a9bf75ee12437bef28f60e083f849/nova/virt/libvirt/vif.py#L139

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1797146/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1614092] Re: SRIOV - PF / VM that assign to PF does not get vlan tag

2018-05-07 Thread Stephen Finucane
As noted, this is resolved in Ocata. There is an issue with this
currently but that's being tracked in #1743458

** Changed in: nova
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1614092

Title:
  SRIOV - PF / VM that assign to PF  does not get vlan tag

Status in neutron:
  Invalid
Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  During RFE testing of 'Manage SR-IOV PFs as Neutron ports', I found that a VM
  booted with a Neutron port of vnic_type direct-physical does not get access
  to the DHCP server.
  The problem is that the PF/VM is not tagged with the internal VLAN.
  Workaround:
  Enter the VM via the console and configure a VLAN interface.


  version RHOS 10 
  python-neutronclient-4.2.1-0.20160721230146.3b1c538.el7ost.noarch
  openstack-neutron-common-9.0.0-0.20160726001729.6a23add.el7ost.noarch
  python-neutron-9.0.0-0.20160726001729.6a23add.el7ost.noarch
  openstack-neutron-fwaas-9.0.0-0.20160720211704.c3e491c.el7ost.noarch
  openstack-neutron-metering-agent-9.0.0-0.20160726001729.6a23add.el7ost.noarch
  openstack-neutron-openvswitch-9.0.0-0.20160726001729.6a23add.el7ost.noarch
  puppet-neutron-9.1.0-0.20160725142451.4061b39.el7ost.noarch
  python-neutron-lib-0.2.1-0.20160726025313.405f896.el7ost.noarch
  openstack-neutron-ml2-9.0.0-0.20160726001729.6a23add.el7ost.noarch
  openstack-neutron-9.0.0-0.20160726001729.6a23add.el7ost.noarch
  openstack-neutron-sriov-nic-agent-9.0.0-0.20160726001729.6a23add.el7ost.noarch

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1614092/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1549915] Re: Lots of "NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported" observed in gate-cinder-python27 logs

2018-03-09 Thread Stephen Finucane
These occur on the latest DevStack deploy. The opt and the warning both
originate in glance so I'm reassigning.

** Changed in: cinder
   Status: Invalid => Confirmed

** Project changed: cinder => glance

** Project changed: glance => oslo.db

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1549915

Title:
  Lots of "NotSupportedWarning: Configuration option(s) ['use_tpool']
  not supported" observed in gate-cinder-python27 logs

Status in oslo.db:
  Confirmed

Bug description:
  There are lots of instances of "NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported" observed in gate-cinder-python27 logs, e.g.:

  http://logs.openstack.org/02/282002/1/check/gate-cinder-python27/332a226/console.html.gz

  ...
  2016-02-18 22:42:12.214 | /home/jenkins/workspace/gate-cinder-python27/.tox/py27/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:241: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported
  2016-02-18 22:42:12.214 |   exception.NotSupportedWarning
  2016-02-18 22:42:12.214 |
  2016-02-18 22:42:12.224 | {3} cinder.tests.unit.api.contrib.test_admin_actions.AdminActionsAttachDetachTest.test_volume_force_detach_raises_remote_error [3.892236s] ... ok
  2016-02-18 22:42:12.224 |
  2016-02-18 22:42:12.224 | Captured stderr:
  2016-02-18 22:42:12.224 |
  2016-02-18 22:42:12.224 | /home/jenkins/workspace/gate-cinder-python27/.tox/py27/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:241: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported
  2016-02-18 22:42:12.224 |   exception.NotSupportedWarning
  ...

To manage notifications about this bug go to:
https://bugs.launchpad.net/oslo.db/+bug/1549915/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1670628] Re: nova-compute will try to re-plug the vif even if it exists for vhostuser port.

2018-03-05 Thread Stephen Finucane
** Changed in: nova
   Status: Opinion => Confirmed

** Changed in: nova
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1670628

Title:
  nova-compute will try to re-plug the vif even if it exists for
  vhostuser port.

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Description
  ===
  In the Mitaka version, deploy neutron with ovs-dpdk.
  If we stop the ovs-agent and then restart nova-compute, the VMs on the host
  lose network connectivity.

  Steps to reproduce
  ==
  Deploy Mitaka with neutron and ovs-dpdk enabled, then choose a compute node
  where a VM has network connectivity.
  Run this on the host:
  1. #systemctl stop neutron-openvswitch-agent.service
  2. #systemctl restart openstack-nova-compute.service

  then ping $VM_IN_THIS_HOST

  Expected result
  ===
  ping $VM_IN_THIS_HOST would succeed

  Actual result
  =
  ping $VM_IN_THIS_HOST failed.

  Environment
  ===
  Centos7
  ovs2.5.1
  dpdk 2.2.0
  openstack-nova-compute-13.1.1-1

  Reason:
  After some digging, I found that nova-compute will try to plug the vif
  every time it boots.
  Specifically, for vhostuser ports, nova-compute does not check whether the
  port already exists as it does for legacy OVS, and it will re-plug the port
  with vsctl args like "--if-exists del-port vhu".
  (refer to
  https://github.com/openstack/nova/blob/stable/mitaka/nova/virt/libvirt/vif.py#L679-L683)
  After recreating the OVS vhostuser port, it will not have the right VLAN
  tag, which is set by the OVS agent.

  In the test environment, after restarting the ovs agent, the agent will
  set a proper VLAN id for the port and the network connection is
  restored.

  Not sure if this is a bug or a config issue; am I missing something?
  There is also an fp_plug type for vhostuser ports; how could we specify it?

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1670628/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1744965] [NEW] 'emulator_threads_policy' doesn't work with 'vcpu_pin_set'

2018-01-23 Thread Stephen Finucane
Public bug reported:

When hyper-threading is enabled, the way emulator_threads_policy
allocates the extra CPU resource for the emulator is not optimal.

The instance I use for testing is a 6-vCPU VM; before enabling
emulator_threads_policy, I reserve 6 CPUs (actually 6 threads, since we
have hyper-threading enabled) in the nova config:

 vcpu_pin_set=8,10,12,32,34,36

Now when we enable emulator_threads_policy, instead of adding one more
thread to this vCPU pin list in the nova config, I end up adding two
more sibling threads (on the same core):

 vcpu_pin_set=8,10,12,16,32,34,36,40

So I ended up using 2 more threads, but only one of them is used for
the emulator and the other thread is wasted.

Originally reported on Bugzilla - https://bugzilla.redhat.com/1534669
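
The arithmetic of the waste, as a quick sketch (thread IDs are taken
from the report; the assumption that threads N and N+24 are SMT
siblings is inferred from the pin sets, not confirmed):

  # The two extra IDs added for the emulator thread look like the two
  # siblings of a single core, so one of the pair goes unused.
  old = {8, 10, 12, 32, 34, 36}
  new = {8, 10, 12, 16, 32, 34, 36, 40}
  extra = sorted(new - old)
  print(extra)                      # [16, 40]
  print(extra[1] - extra[0] == 24)  # True: matches the 8/32, 10/34, ... pairing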

** Affects: nova
 Importance: Undecided
 Status: New

** Description changed:

  When hyper threading is enabled, the way emulator_threads_policy
  allocates the extra cpu resource for emulator is not optimal.
  
- The instance I use for testing is a 6-vcpu VM; before enable this 
emulator_threads_policy, I reserve 6 cpu (actually 6 threads since we enable 
hyper threading) in nova config,
- vcpu_pin_set=8,10,12,32,34,36
+ The instance I use for testing is a 6-vcpu VM; before enable this
+ emulator_threads_policy, I reserve 6 cpu (actually 6 threads since we
+ enable hyper threading) in nova config,
  
- Now when we enable emulator_threads_policy, in stead of adding one more 
thread to this vcpu pin list in the nova config, I end up adding two more 
sibling threads (on the same core)
-  vcpu_pin_set=8,10,12,16,32,34,36,40
+  vcpu_pin_set=8,10,12,32,34,36
+ 
+ Now when we enable emulator_threads_policy, in stead of adding one more
+ thread to this vcpu pin list in the nova config, I end up adding two
+ more sibling threads (on the same core)
+ 
+  vcpu_pin_set=8,10,12,16,32,34,36,40
  
  So I ended up using 2 more threads, but only of them is used for
  emulator and the other thread is wasted.
  
  Originally reported on Bugzilla - https://bugzilla.redhat.com/534669

** Description changed:

  When hyper threading is enabled, the way emulator_threads_policy
  allocates the extra cpu resource for emulator is not optimal.
  
  The instance I use for testing is a 6-vcpu VM; before enable this
  emulator_threads_policy, I reserve 6 cpu (actually 6 threads since we
  enable hyper threading) in nova config,
  
-  vcpu_pin_set=8,10,12,32,34,36
+  vcpu_pin_set=8,10,12,32,34,36
  
  Now when we enable emulator_threads_policy, in stead of adding one more
  thread to this vcpu pin list in the nova config, I end up adding two
  more sibling threads (on the same core)
  
   vcpu_pin_set=8,10,12,16,32,34,36,40
  
  So I ended up using 2 more threads, but only of them is used for
  emulator and the other thread is wasted.
  
- Originally reported on Bugzilla - https://bugzilla.redhat.com/534669
+ Originally reported on Bugzilla - https://bugzilla.redhat.com/1534669

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1744965

Title:
  'emulator_threads_policy' doesn't work with 'vcpu_pin_set'

Status in OpenStack Compute (nova):
  New

Bug description:
  When hyper-threading is enabled, the way emulator_threads_policy
  allocates the extra CPU resource for the emulator is not optimal.

  The instance I use for testing is a 6-vCPU VM; before enabling
  emulator_threads_policy, I reserve 6 CPUs (actually 6 threads, since
  we have hyper-threading enabled) in the nova config:

   vcpu_pin_set=8,10,12,32,34,36

  Now when we enable emulator_threads_policy, instead of adding one
  more thread to this vCPU pin list in the nova config, I end up adding
  two more sibling threads (on the same core):

   vcpu_pin_set=8,10,12,16,32,34,36,40

  So I ended up using 2 more threads, but only one of them is used for
  the emulator and the other thread is wasted.

  Originally reported on Bugzilla - https://bugzilla.redhat.com/1534669

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1744965/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1743728] Re: giturl not working for api-ref (nova, neutron-lib)

2018-01-17 Thread Stephen Finucane
** Also affects: nova
   Importance: Undecided
   Status: New

** No longer affects: openstack-doc-tools

** Also affects: neutron
   Importance: Undecided
   Status: New

** Tags added: api-ref

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1743728

Title:
  giturl not working for api-ref (nova, neutron-lib)

Status in neutron:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  The report a bug link does not have a valid giturl for:

  https://developer.openstack.org/api-ref/network/
  https://developer.openstack.org/api-ref/compute/

  Note https://developer.openstack.org/api-ref/baremetal/ works fine.

  Did not check more.

  Note https://review.openstack.org/534666 might be part of the
  solution.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1743728/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

