[Yahoo-eng-team] [Bug 1771700] Re: nova-lvm tempest job failing with InvalidDiskInfo

2018-05-30 Thread OpenStack Infra
Reviewed:  https://review.openstack.org/569062
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=fda48219a378d09a9a363078ba161d7f54e32c0a
Submitter: Zuul
Branch: master

commit fda48219a378d09a9a363078ba161d7f54e32c0a
Author: Lee Yarwood 
Date:   Thu May 17 09:47:58 2018 +0100

libvirt: Skip fetching the virtual size of block devices

In this latest episode of `Which CI job has lyarwood broken today?!` we
find that I464bc2b88123a012cd12213beac4b572c3c20a56 introduced a
regression in the nova-lvm experimental job as n-cpu attempted to run
qemu-img info against block devices as an unprivileged user.

For the time being we should skip any attempt to use this command
against block devices until the disk_api layer can make privileged
calls using privsep.

Closes-bug: #1771700
Change-Id: I9653f81ec716f80eb638810f65e2d3cdfeedaa22
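
A minimal sketch of the skip described above (illustrative only, not
the committed patch; it assumes a plain os.stat() check on the path):

    import os
    import stat

    from nova.virt import images  # module shown in the traceback below

    def get_disk_size(path):
        # qemu-img info on a block device fails for an unprivileged
        # user, so skip it until disk_api can make privsep calls.
        if stat.S_ISBLK(os.stat(path).st_mode):
            return 0
        return images.qemu_img_info(path).virtual_size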


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1771700

Title:
  nova-lvm tempest job failing with InvalidDiskInfo

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed

Bug description:
  There has been a recent regression in the nova-lvm tempest job. The
  most recent passing run was on 2018-05-11 [1][2], so something
  regressed it between then and yesterday 2018-05-15.

  The build fails and the following trace is seen in the n-cpu log:

  May 15 23:01:40.174233 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager Traceback (most recent call last):
  May 15 23:01:40.174457 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 7343, in 
update_available_resource_for_node
  May 15 23:01:40.174699 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager rt.update_available_resource(context, nodename)
  May 15 23:01:40.174922 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/compute/resource_tracker.py", line 664, in 
update_available_resource
  May 15 23:01:40.175170 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager resources = 
self.driver.get_available_resource(nodename)
  May 15 23:01:40.175414 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 6391, in 
get_available_resource
  May 15 23:01:40.175641 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager disk_over_committed = 
self._get_disk_over_committed_size_total()
  May 15 23:01:40.175868 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 7935, in 
_get_disk_over_committed_size_total
  May 15 23:01:40.176091 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager config, block_device_info)
  May 15 23:01:40.176333 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 7852, in 
_get_instance_disk_info_from_config
  May 15 23:01:40.176555 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager virt_size = disk_api.get_disk_size(path)
  May 15 23:01:40.176773 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/virt/disk/api.py", 
line 99, in get_disk_size
  May 15 23:01:40.176994 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager return images.qemu_img_info(path).virtual_size
  May 15 23:01:40.177215 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/virt/images.py", 
line 87, in qemu_img_info
  May 15 23:01:40.177452 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager raise exception.InvalidDiskInfo(reason=msg)
  May 15 23:01:40.177674 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager InvalidDiskInfo: Disk info file is invalid: qemu-img 
failed to execute on 
/dev/stack-volumes-default/8a1d5912-13e1-4583-876e-a04396b6b712_disk : 
Unexpected error while running command.
  May 15 23:01:40.177902 ubuntu-xenial-rax-dfw-0004040560 nova-compute[28718]: 
ERROR nova.compute.manager Command: /usr/bin/python -m oslo_concurrency.prlimit 
--as=1073741824 --cpu=30 -- env LC_ALL=C LANG=C qemu-img info 
/dev/stack-volumes-default/8a1d5912-13e1-4583-876e-a04396b6b712_disk 

[Yahoo-eng-team] [Bug 1771885] Re: bionic: lxd containers missing search domain in systemd-resolve configuration

2018-05-30 Thread David Ames
Discussed this with roaksoax, and it seems likely this is a
cloud-init / netplan problem. I have added cloud-init and maas just
to be thorough.

When bionic is deployed with MAAS 2.3.0 using a static network config,
the DNS search domain is missing from the netplan configuration and/or
systemd-resolve.

I am attaching three sets of data (bionic-maas, bionic-dhcp and
xenial-maas) to show the differences.

Cloud-init reports in cloud-init.log that it has the information. See
the 'search' entry below:

config= {
'config':
[{'id': 'eno1', 'mac_address': 'd4:be:d9:a8:44:ff', 'mtu': 1500, 'name': 
'eno1', 'subnets': [{'address': '10.245.168.26/21', 'dns_nameservers': 
['10.245.168.6'], 'gateway': '10.245.168.1', 'type': 'static'}], 'type': 
'physical'},
 {'id': 'eno2', 'mac_address': 'd4:be:d9:a8:45:01', 'mtu': 1500, 'name': 
'eno2', 'subnets': [{'type': 'manual'}], 'type': 'physical'},
 {'id': 'eno3', 'mac_address': 'd4:be:d9:a8:45:03', 'mtu': 1500, 'name': 
'eno3', 'subnets': [{'type': 'manual'}], 'type': 'physical'},
 {'id': 'eno4', 'mac_address': 'd4:be:d9:a8:45:05', 'mtu': 1500, 'name': 
'eno4', 'subnets': [{'type': 'manual'}], 'type': 'physical'},
 {'address': ['10.245.168.6'], 'search': ['maas'], 'type': 'nameserver'}],
'version': 1}

But the /etc/netplan/50-cloud-init.yaml configuration is missing this
information, leading to resolution failures.
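
As a quick way to confirm what is missing, a sketch along these lines
(assuming PyYAML and the file layout above) can compare the rendered
netplan against the 'search' entry cloud-init logged:

    import yaml

    with open('/etc/netplan/50-cloud-init.yaml') as f:
        netplan = yaml.safe_load(f)

    nameservers = netplan['network']['ethernets']['eno1'].get(
        'nameservers', {})
    print(nameservers.get('addresses'))  # ['10.245.168.6'], as expected
    print(nameservers.get('search'))     # None here, but should be ['maas']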

By contrast, the xenial /etc/network/interfaces.d/50-cloud-init.conf has
the correct information when the logging shows the same input from MAAS.

See also the bionic DHCP example, which gets the search domain
information from DHCP.

** Also affects: cloud-init
   Importance: Undecided
   Status: New

** Also affects: maas
   Importance: Undecided
   Status: New

** Summary changed:

- bionic: lxd containers missing search domain in systemd-resolve configuration
+ bionic: manual maas missing search domain in systemd-resolve configuration

** Summary changed:

- bionic: manual maas missing search domain in systemd-resolve configuration
+ bionic: static maas missing search domain in systemd-resolve configuration

** Attachment added: "bionic-maas.tar.gz"
   
https://bugs.launchpad.net/maas/+bug/1771885/+attachment/5146674/+files/bionic-maas.tar.gz

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1771885

Title:
  bionic: static maas missing search domain in systemd-resolve
  configuration

Status in cloud-init:
  New
Status in juju:
  Fix Committed
Status in juju 2.3 series:
  Fix Released
Status in MAAS:
  New

Bug description:
  juju: 2.4-beta2  
  MAAS: 2.3.0

  Testing deployment of LXD containers on bionic (specifically for an
  openstack deployment) lead to this problem:

  https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1765405

  Summary:

  previously, the DNS config in the LXD containers was the same as on
  the host machines

  now, the DNS config is in systemd; the DNS server is set correctly,
  but the search domain is missing, so hostnames won't resolve.

  Working resolv.conf on xenial lxd container:

  nameserver 10.245.168.6
  search maas

  Non-working "systemd-resolve --status":

  ...
  Link 21 (eth0)
Current Scopes: DNS
 LLMNR setting: yes
  MulticastDNS setting: no
DNSSEC setting: no
  DNSSEC supported: no
   DNS Servers: 10.245.168.6

  Working (now able to resolve hostnames after modifying netplan and
  adding search domain):

  Link 21 (eth0)
Current Scopes: DNS
 LLMNR setting: yes
  MulticastDNS setting: no
DNSSEC setting: no
  DNSSEC supported: no
   DNS Servers: 10.245.168.6
DNS Domain: maas

  ubuntu@juju-6406ff-2-lxd-2:/etc$ host node-name
  node-name.maas has address 10.245.168.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1771885/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1768876] Re: metadata-api fails to get availability zone for instances created before pike

2018-05-30 Thread OpenStack Infra
Reviewed:  https://review.openstack.org/567878
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=6b4c38c04177ff194d05368cd4aff69958075167
Submitter: Zuul
Branch: master

commit 6b4c38c04177ff194d05368cd4aff69958075167
Author: Surya Seetharaman 
Date:   Fri May 11 17:12:34 2018 +0200

Metadata-API fails to retrieve avz for instances created before Pike

In Pike (through change: I8d426f2635232ffc4b510548a905794ca88d7f99)
we started setting instance.availability_zone at schedule time by
calculating the AZ of the host onto which the instance was scheduled.
After this change was introduced, the metadata request for the AZ of
the instance (through change: I73c3b10e52ab4cfda9dacc0c0ba92d1fcb60bcc9)
started using instance.get('availability_zone') instead of doing the
upcall. However, this returns None for instances older than Pike whose
availability_zone was not specified at boot time, since it would have
been set to CONF.default_schedule_zone, whose default value is None.

This patch adds an online_migration tool to populate missing
instance.availability_zone values.

Change-Id: I2a1d81bfeb1ea006c16d8f403e045e9acedcbe57
Closes-Bug: #1768876
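
A rough sketch of the shape of such an online data migration, following
nova's (found, done) convention; the two helper callables here are
hypothetical stand-ins, not the actual patch:

    def backfill_availability_zone(context, max_count,
                                   get_instances_missing_az, az_for_host):
        instances = get_instances_missing_az(context, max_count)
        done = 0
        for instance in instances:
            # Derive the AZ from the instance's host, mirroring what
            # the scheduler has recorded at schedule time since Pike.
            instance.availability_zone = az_for_host(context, instance.host)
            instance.save()
            done += 1
        return len(instances), done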


** Changed in: nova
   Status: In Progress => Fix Released

** Changed in: nova/pike
   Status: Triaged => In Progress

** Changed in: nova/pike
 Assignee: (unassigned) => Matt Riedemann (mriedem)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1768876

Title:
  metadata-api fails to get availability zone for instances created
  before pike

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) pike series:
  In Progress
Status in OpenStack Compute (nova) queens series:
  In Progress

Bug description:
  Can't get AVZ for old instances:

  curl http://169.254.169.254/latest/meta-data/placement/availability-zone 
  None#

  This is because the upcall to the nova_api DB was removed in commit
  9f7bac2, and old instances may not have the AVZ defined.
  Previously, the AVZ in the instance was only set if explicitly defined
  by the user.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1768876/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1726310] Re: nova doesn't list services if it can't connect to a cell DB

2018-05-30 Thread OpenStack Infra
Reviewed:  https://review.openstack.org/568271
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=64e76de43dc55e584c15fa60da50dd06d352
Submitter: Zuul
Branch: master

commit 64e76de43dc55e584c15fa60da50dd06d352
Author: Surya Seetharaman 
Date:   Mon May 14 13:50:12 2018 +0200

Make nova service-list use scatter-gather routine

This patch makes nova service-list use the scatter-gather routine
so that if a cell is down, the services from the other cells are
still listed: the down cell is ignored instead of the whole command
failing with an API exception, as is currently the case. Making
this query parallel across all cells is also more efficient.

Depends-On: https://review.openstack.org/569112/

Change-Id: I90b488102eb265d971cade29892279a22d3b5273
Closes-Bug: #1726310
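
A hedged sketch of the scatter-gather pattern the commit describes
(nova.context's scatter_gather_all_cells and did_not_respond_sentinel
are real helpers, call shapes approximated; list_services is a
hypothetical per-cell query function):

    from nova import context as nova_context

    def list_all_services(ctxt, list_services):
        results = nova_context.scatter_gather_all_cells(ctxt, list_services)
        services = []
        for cell_uuid, result in results.items():
            if result is nova_context.did_not_respond_sentinel:
                # The cell is down: skip it instead of failing the
                # whole request with an API exception.
                continue
            services.extend(result)
        return services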


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1726310

Title:
  nova doesn't list services if it can't connect to a cell DB

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===
  nova doesn't list services if it can't connect to a child cell DB.

  I would expect nova to show the services from all child DBs that it
  can connect to.
  For the child DBs it can't connect to, it could show the mandatory
  services (nova-conductor) with the status "not available" and explain
  why in the disabled reason ("can't connect to the DB").

  
  Steps to reproduce
  ==
  Have at least 2 child cells.
  Stop the DB in one of them.

  "nova service-list" fails with "ERROR (ClientException): Unexpected API 
Error."
  Not given any information about what's causing the problem.

  Expected result
  ===
  List the services of the available cells and list the status of the mandatory 
services of the affected cells as "not available".

  
  Actual result
  =
  $nova service-list
  fails.

  
  Environment
  ===
  nova master (commit: 8d21d711000fff80eb367692b157d09b6532923f)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1726310/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1762842] Re: Compute API guide: Faults in nova - Instance Faults section is wrong

2018-05-30 Thread OpenStack Infra
Reviewed:  https://review.openstack.org/560178
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=88f86e24e121ba545012051b6587d7475bfc5cec
Submitter: Zuul
Branch: master

commit 88f86e24e121ba545012051b6587d7475bfc5cec
Author: Matt Riedemann 
Date:   Tue Apr 10 19:30:51 2018 -0400

doc: cleanup API guide about instance faults

The compute API guide section on instance faults
is updated to point out that server details contain
fault information for servers in ERROR or DELETED status
along with a simple non-admin scenario example.

Change-Id: Idc725a594b67b5f6e45c6f161f6e92c0601761a8
Closes-Bug: #1762842


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1762842

Title:
  Compute API guide: Faults in nova - Instance Faults section is wrong

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  - [x] This doc is inaccurate in this way:

  This section about instance faults:

  https://developer.openstack.org/api-guide/compute/faults.html
  #instance-faults

  says:

  "However, there is currently no API to retrieve this information."

  This is wrong, as the GET /servers/{server_id} response has a 'fault'
  entry if there was a fault for the server:

  https://developer.openstack.org/api-ref/compute/#id27

  "A fault object. Only displayed in the failed response. Default keys
  are code, created, and message (response code, created time, and
  message respectively). In addition, the key details (stack trace) is
  available if you have the administrator privilege."
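
  Illustrative only (shape inferred from the description quoted above),
  a fault entry in the server details might look like:

    fault = {
        "code": 500,
        "created": "2018-05-30T12:00:00Z",
        "message": "No valid host was found.",
        # "details": "<stack trace>",  # only with administrator privilege
    }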

  ---
  Release: 17.0.0.0rc2.dev613 on 2018-04-10 15:08
  SHA: 836c3913cc382428625a5e7502a4c807b8136d0a
  Source: 
https://git.openstack.org/cgit/openstack/nova/tree/api-guide/source/faults.rst
  URL: https://developer.openstack.org/api-guide/compute/faults.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1762842/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1773420] Re: Networking Option 1: Provider networks in Neutron

2018-05-30 Thread Brian Haley
This bug was already fixed on the master branch by
https://review.openstack.org/#/c/566491/ so please just cherry-pick that
to stable/queens.  I'm closing this bug as it's one of many duplicates.

** Changed in: neutron
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1773420

Title:
  Networking Option 1: Provider networks in Neutron

Status in neutron:
  Invalid

Bug description:

  This bug tracker is for errors with the documentation, use the
  following as a template and remove or add fields as you see fit.
  Convert [ ] into [x] to check boxes:

  - [X] This doc is inaccurate in this way: __
  - [ ] This is a doc addition request.
  - [X] I have a fix to the document that I can paste below including example: 
input and output. 

  
  When configuring /etc/neutron/neutron.conf, the guide says to set the
  following:

  [keystone_authtoken]
  # ...
  auth_uri = http://controller:5000
  auth_url = http://controller:35357

  However, port 35357 has been dropped in OpenStack Queens. If you set
  auth_url = http://controller:35357, then attempts to use neutron will
  fail, e.g.

  root@controller:/home/user# openstack network list
  HttpException: Unknown error

  and /var/log/neutron/neutron-server.log will report something
  like:

  WARNING keystoneauth.identity.generic.base [-] Failed to discover
  available identity versions when contacting http://host:35357.
  Attempting to parse version from URL.: ConnectFailure: Unable to
  establish connection to http://host:35357:
  HTTPConnectionPool(host='host', port=35357): Max retries exceeded with
  url: / (Caused by
  NewConnectionError(': Failed to establish a new connection: [Errno 111]
  ECONNREFUSED',))

  I have replaced port :35357 with :5000 and it seems to work, but I am
  not sure if this is best practice.
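
  For reference, the reporter's working configuration keeps both
  settings on port 5000; a minimal sketch, assuming the same controller
  host:

    [keystone_authtoken]
    # ...
    auth_uri = http://controller:5000
    auth_url = http://controller:5000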

  
  If you have a troubleshooting or support issue, use the following  resources:

   - Ask OpenStack: http://ask.openstack.org
   - The mailing list: http://lists.openstack.org
   - IRC: 'openstack' channel on Freenode

  ---
  Release: 12.0.3.dev16 on 2018-05-24 02:39
  SHA: 85de06e2c40bfdc8dee80506f8d1d809a93b900e
  Source: 
https://git.openstack.org/cgit/openstack/neutron/tree/doc/source/install/controller-install-option1-ubuntu.rst
  URL: 
https://docs.openstack.org/neutron/queens/install/controller-install-option1-ubuntu.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1773420/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1757482] Re: IP address for a router interface allowed outside the allocation range of subnet

2018-05-30 Thread Brian Haley
Re-opened since bug 1774019 seems to be a duplicate.  In that case a
user was able to add a router to a shared external network and it got
the .1 address.  Looks like there is an edge case here we need to cover.

** Changed in: neutron
   Status: Expired => Confirmed

** Changed in: neutron
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1757482

Title:
  IP address for a router interface allowed outside the allocation range
  of subnet

Status in neutron:
  Confirmed

Bug description:
  Currently running Queens on Ubuntu 16.04 with the linuxbridge ml2
  plugin with vxlan overlays.  We have a single, large provider network
  that we have set to 'shared' and 'external', so people who need to do
  things that don't work well with NAT can connect their instances
  directly to the provider network.  Our 'allocation range' as defined
  in our provider subnet is dedicated to tenants, so there should be no
  conflicts.

  One of our users connected a neutron router to the provider network
  (not via the 'external network' option, but rather via the normal 'add
  interface' option) and neglected to specify an IP address.  The
  neutron router decided that it was now the gateway for the entire
  provider network and began arp'ing.

  This seems like it should be disallowed inside of neutron (you
  shouldn't be able to specify an IP address for a router interface that
  isn't explicitly part of your allocation range on said subnet).
  Unless neutron just expects issues like this to be handled by the
  physical provider infrastructure (spoofing prevention, etc.)?
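
  A sketch of the check being suggested, using netaddr (names are
  illustrative, not neutron's actual code paths):

    import netaddr

    def ip_in_allocation_pools(ip, pools):
        # pools: e.g. [{'start': '10.0.0.10', 'end': '10.0.0.200'}]
        addr = netaddr.IPAddress(ip)
        return any(addr in netaddr.IPRange(p['start'], p['end'])
                   for p in pools)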

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1757482/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774257] [NEW] neutron-openvswitch-agent RuntimeError: Switch connection timeout

2018-05-30 Thread Gaëtan Trellu
Public bug reported:

In neutron-openvswitch-agent.log I see a lot of timeout messages.

  RuntimeError: Switch connection timeout

This timeout sometimes prevents neutron-openvswitch-agent from coming UP.
We are running Pike and we have ~1000 ports in Open vSwitch.

I'm able to run ovs-vsctl, ovs-ofctl, etc. commands, which means that
Open vSwitch (vswitchd + db) is working fine.

This is the full TRACE of neutron-openvswitch-agent log:

2018-05-30 19:22:42.353 7 WARNING ovsdbapp.backend.ovs_idl.vlog [-] 
tcp:127.0.0.1:6640: receive error: Connection reset by peer
2018-05-30 19:22:42.358 7 WARNING ovsdbapp.backend.ovs_idl.vlog [-] 
tcp:127.0.0.1:6640: connection dropped (Connection reset by peer)
2018-05-30 19:24:17.626 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch 
[req-3c335d47-9b3e-4f18-994b-afca7d7d70be - - - - -] Switch connection timeout: 
RuntimeError: Switch connection timeout
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
[req-3c335d47-9b3e-4f18-994b-afca7d7d70be - - - - -] Error while processing VIF 
ports: RuntimeError: Switch connection timeout
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most 
recent call last):
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py",
 line 2066, in rpc_loop
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
ofport_changed_ports = self.update_stale_ofport_rules()
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/osprofiler/profiler.py", 
line 153, in wrapper
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent return 
f(*args, **kwargs)
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py",
 line 1210, in update_stale_ofport_rules
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
self.int_br.delete_arp_spoofing_protection(port=ofport)
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py",
 line 255, in delete_arp_spoofing_protection
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent match=match)
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py",
 line 111, in uninstall_flows
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent (dp, ofp, 
ofpp) = self._get_dp()
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py",
 line 67, in _get_dp
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
self._cached_dpid = new_dpid
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", 
line 220, in __exit__
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
self.force_reraise()
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", 
line 196, in force_reraise
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
six.reraise(self.type_, self.value, self.tb)
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py",
 line 50, in _get_dp
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent dp = 
self._get_dp_by_dpid(self._cached_dpid)
2018-05-30 19:24:17.628 7 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py",
 line 69, in 

[Yahoo-eng-team] [Bug 1774252] [NEW] Resize confirm fails if nova-compute is restarted after resize

2018-05-30 Thread Matthew Booth
Public bug reported:

Originally reported in RH bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1584315

Reproduced on OSP12 (Pike).

After resizing an instance but before confirm, update_available_resource
will fail on the source compute due to bug 1774249. If nova-compute is
restarted at this point, before the resize is confirmed, the
update_available_resource periodic task will never have succeeded, and
therefore ResourceTracker's compute_nodes dict will not be populated at
all.

When confirm calls _delete_allocation_after_move() it will fail with
ComputeHostNotFound because there is no entry for the current node in
ResourceTracker. The error looks like:

2018-05-30 13:42:19.239 1 ERROR nova.compute.manager 
[req-4f7d5d63-fc05-46ed-b505-41050d889752 09abbd4893bb45eea8fb1d5e40635339 
d4483d13a6ef41b2ae575ddbd0c59141 - default default] [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] Setting instance vm_state to ERROR: 
ComputeHostNotFound: Compute host compute-1.localdomain could not be found.
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] Traceback (most recent call last):
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0]   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7445, in 
_error_out_instance_on_exception
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] yield
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0]   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3757, in 
_confirm_resize
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] migration.source_node)
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0]   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3790, in 
_delete_allocation_after_move
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] cn_uuid = rt.get_node_uuid(nodename)
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0]   File 
"/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 155, 
in get_node_uuid
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] raise 
exception.ComputeHostNotFound(host=nodename)
2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] ComputeHostNotFound: Compute host 
compute-1.localdomain could not be found.
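
The failing lookup is easy to see in isolation; a simplified
illustration modelled on the traceback above (not verbatim nova code):

    from nova import exception

    def get_node_uuid(compute_nodes, nodename):
        # compute_nodes is only populated by the update_available_resource
        # periodic task, which never succeeded after the restart.
        try:
            return compute_nodes[nodename].uuid
        except KeyError:
            raise exception.ComputeHostNotFound(host=nodename)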

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774252

Title:
  Resize confirm fails if nova-compute is restarted after resize

Status in OpenStack Compute (nova):
  New

Bug description:
  Originally reported in RH bugzilla:
  https://bugzilla.redhat.com/show_bug.cgi?id=1584315

  Reproduced on OSP12 (Pike).

  After resizing an instance but before confirm,
  update_available_resource will fail on the source compute due to bug
  1774249. If nova compute is restarted at this point before the resize
  is confirmed, the update_available_resource period task will never
  have succeeded, and therefore ResourceTracker's compute_nodes dict
  will not be populated at all.

  When confirm calls _delete_allocation_after_move() it will fail with
  ComputeHostNotFound because there is no entry for the current node in
  ResourceTracker. The error looks like:

  2018-05-30 13:42:19.239 1 ERROR nova.compute.manager 
[req-4f7d5d63-fc05-46ed-b505-41050d889752 09abbd4893bb45eea8fb1d5e40635339 
d4483d13a6ef41b2ae575ddbd0c59141 - default default] [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] Setting instance vm_state to ERROR: 
ComputeHostNotFound: Compute host compute-1.localdomain could not be found.
  2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] Traceback (most recent call last):
  2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0]   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7445, in 
_error_out_instance_on_exception
  2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] yield
  2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0]   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3757, in 
_confirm_resize
  2018-05-30 13:42:19.239 1 ERROR nova.compute.manager [instance: 
1374133a-2c08-4a8f-94f6-729d4e58d7e0] migration.source_node)
  2018-05-30 13:42:19.239 1 ERROR nova.compute.manager 

[Yahoo-eng-team] [Bug 1774249] [NEW] update_available_resource will raise DiskNotFound after resize but before confirm

2018-05-30 Thread Matthew Booth
Public bug reported:

Original reported in RH Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1584315

Tested on OSP12 (Pike), but the problem appears to still be present on
master. It should only occur if nova-compute is configured to use local
file instance storage.

Create instance A on compute X

Resize instance A to compute Y
  Domain is powered off
  /var/lib/nova/instances/ renamed to _resize on X
  Domain is *not* undefined

On compute X:
  update_available_resource runs as a periodic task
  First action is to update self
  rt calls driver.get_available_resource()
  ...calls _get_disk_over_committed_size_total
  ...iterates over all defined domains, including the ones whose disks we 
renamed
  ...fails because a referenced disk no longer exists

Results in errors in nova-compute.log:

2018-05-30 02:17:08.647 1 ERROR nova.compute.manager 
[req-bd52371f-c6ec-4a83-9584-c00c5377acd8 - - - - -] Error updating resources 
for node compute-0.localdomain.: DiskNotFound: No disk at 
/var/lib/nova/instances/f3ed9015-3984-43f4-b4a5-c2898052b47d/disk
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager Traceback (most recent 
call last):
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6695, in 
update_available_resource_for_node
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager 
rt.update_available_resource(context, nodename)
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 641, 
in update_available_resource
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager resources = 
self.driver.get_available_resource(nodename)
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5892, in 
get_available_resource
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager 
disk_over_committed = self._get_disk_over_committed_size_total()
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7393, in 
_get_disk_over_committed_size_total
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager config, 
block_device_info)
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7301, in 
_get_instance_disk_info_from_config
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager dk_size = 
disk_api.get_allocated_disk_size(path)
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/site-packages/nova/virt/disk/api.py", line 156, in 
get_allocated_disk_size
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager return 
images.qemu_img_info(path).disk_size
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/site-packages/nova/virt/images.py", line 57, in 
qemu_img_info
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager raise 
exception.DiskNotFound(location=path)
2018-05-30 02:17:08.647 1 ERROR nova.compute.manager DiskNotFound: No disk 
at /var/lib/nova/instances/f3ed9015-3984-43f4-b4a5-c2898052b47d/disk

And resource tracker is no longer updated. We can find lots of these in
the gate.

Note that change Icec2769bf42455853cbe686fb30fda73df791b25 nearly
mitigates this, but doesn't because task_state is not set while the
instance is awaiting confirm.
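
A sketch of the gap described above (vm_states constants are nova's;
the guard shape is illustrative): an instance parked awaiting confirm
has vm_state RESIZED but task_state None, so a task_state-based skip
never fires:

    from nova.compute import vm_states

    def awaiting_confirm(instance):
        # Not caught by a task_state check, yet its disks were renamed
        # to *_resize on the source host.
        return (instance.vm_state == vm_states.RESIZED
                and instance.task_state is None)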

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774249

Title:
  update_available_resource will raise DiskNotFound after resize but
  before confirm

Status in OpenStack Compute (nova):
  New

Bug description:
  Original reported in RH Bugzilla:
  https://bugzilla.redhat.com/show_bug.cgi?id=1584315

  Tested on OSP12 (Pike), but the problem appears to still be present
  on master. It should only occur if nova-compute is configured to use
  local file instance storage.

  Create instance A on compute X

  Resize instance A to compute Y
Domain is powered off
/var/lib/nova/instances/ renamed to _resize on X
Domain is *not* undefined

  On compute X:
update_available_resource runs as a periodic task
First action is to update self
rt calls driver.get_available_resource()
...calls _get_disk_over_committed_size_total
...iterates over all defined domains, including the ones whose disks we 
renamed
...fails because a referenced disk no longer exists

  Results in errors in nova-compute.log:

  2018-05-30 02:17:08.647 1 ERROR nova.compute.manager 
[req-bd52371f-c6ec-4a83-9584-c00c5377acd8 - - - - -] Error updating resources 
for node compute-0.localdomain.: DiskNotFound: No disk at 
/var/lib/nova/instances/f3ed9015-3984-43f4-b4a5-c2898052b47d/disk
  2018-05-30 

[Yahoo-eng-team] [Bug 1774234] [NEW] api-ref: cold migrate reference doesn't mention asynchronous post conditions

2018-05-30 Thread Matt Riedemann
Public bug reported:

The API reference for the server 'migrate' (cold migrate) action doesn't
mention any asynchronous post conditions, like that the server goes to
VERIFY_RESIZE status after a successful cold migration and then must be
confirmed or reverted.

https://developer.openstack.org/api-ref/compute/#migrate-server-migrate-
action

We should have something similar to what's in the 'resize' action API
reference:

https://developer.openstack.org/api-ref/compute/#resize-server-resize-
action

** Affects: nova
 Importance: Medium
 Status: Triaged


** Tags: api-ref docs low-hanging-fruit

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774234

Title:
  api-ref: cold migrate reference doesn't mention asynchronous post
  conditions

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  The API reference for the server 'migrate' (cold migrate) action
  doesn't mention any asynchronous post conditions, like that the server
  goes to VERIFY_RESIZE status after a successful cold migration and
  then must be confirmed or reverted.

  https://developer.openstack.org/api-ref/compute/#migrate-server-
  migrate-action

  We should have something similar to what's in the 'resize' action API
  reference:

  https://developer.openstack.org/api-ref/compute/#resize-server-resize-
  action

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774234/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774229] [NEW] api-ref: GET /v3/auth/tokens doesn't mention the optional "system" parameter for a scoped token

2018-05-30 Thread Matt Riedemann
Public bug reported:

This came about from a discussion in IRC:

http://eavesdrop.openstack.org/irclogs/%23openstack-dev/%23openstack-
dev.2018-05-30.log.html#t2018-05-30T16:13:41

The API reference here doesn't mention that for a system-scoped token,
the response body can have a 'system' attribute in the 'token' object:

https://developer.openstack.org/api-ref/identity/v3/index.html#validate-
and-show-information-for-token
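
Illustrative fragment (an assumption based on the IRC discussion, not
copied from the api-ref): a system-scoped token response can carry:

    token_fragment = {
        "token": {
            "system": {"all": True},
            # ...other token fields (methods, roles, expires_at, etc.)
        }
    }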

The example in the API reference is for an unscoped token, but for SDK
writers the API reference should have a description of all the possible
response parameters and indicate if they are optional, and the
description can explain in what cases they would be present in the
response.

For example, GET /servers/{id} can have a "fault" parameter in the
response but only under certain scenarios based on the server status:

https://developer.openstack.org/api-ref/compute/#show-server-details

But we still document the optional parameter.

** Affects: keystone
 Importance: Medium
 Status: Confirmed


** Tags: api-ref documentation

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1774229

Title:
  api-ref: GET /v3/auth/tokens doesn't mention the optional "system"
  parameter for a scoped token

Status in OpenStack Identity (keystone):
  Confirmed

Bug description:
  This came about from a discussion in IRC:

  http://eavesdrop.openstack.org/irclogs/%23openstack-dev/%23openstack-
  dev.2018-05-30.log.html#t2018-05-30T16:13:41

  The API reference here doesn't mention that for a system-scoped token,
  the response body can have a 'system' attribute in the 'token' object:

  https://developer.openstack.org/api-ref/identity/v3/index.html
  #validate-and-show-information-for-token

  The example in the API reference is for an unscoped token, but for SDK
  writers the API reference should have a description of all the
  possible response parameters and indicate if they are optional, and
  the description can explain in what cases they would be present in the
  response.

  For example, GET /servers/{id} can have a "fault" parameter in the
  response but only under certain scenarios based on the server status:

  https://developer.openstack.org/api-ref/compute/#show-server-details

  But we still document the optional parameter.

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1774229/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774205] Re: AggregateMultiTenancyIsolation uses wrong tenant_id during cold migrate

2018-05-30 Thread Matt Riedemann
** Also affects: nova/pike
   Importance: Undecided
   Status: New

** Also affects: nova/ocata
   Importance: Undecided
   Status: New

** Also affects: nova/queens
   Importance: Undecided
   Status: New

** Changed in: nova/ocata
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774205

Title:
  AggregateMultiTenancyIsolation uses wrong tenant_id during cold
  migrate

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  New
Status in OpenStack Compute (nova) queens series:
  New

Bug description:
  The details are in this mailing list thread:

  http://lists.openstack.org/pipermail/openstack-
  operators/2018-May/015347.html

  But essentially the case is:

  * There are 3 compute hosts.
  * compute1 and compute2 are in a host aggregate and a given tenant is 
restricted to that aggregate
  * The user creates a server on compute1
  * The admin attempts to cold migrate the server which fails in the 
AggregateMultiTenancyIsolation filter because it says the tenant_id in the 
request is not part of the matching host aggregate.

  The reason is because the cold migrate task in the conductor replaces
  the original request spec, which had the instance project_id in it,
  and uses the current context, which is the admin (which could be in a
  different project):

  
https://github.com/openstack/nova/blob/stable/ocata/nova/conductor/tasks/migrate.py#L50

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774205/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774217] [NEW] docs: vif_plugging_timeout nova config option should mention running neutron rootwrap in daemon mode for performance

2018-05-30 Thread Matt Riedemann
Public bug reported:

There are notes from this operators mailing list thread about how the
neutron agent needed to be configured for rootwrap daemon mode to hit
scale targets; otherwise servers would fail to create due to vif
plugging timeouts:

http://lists.openstack.org/pipermail/openstack-
operators/2018-May/015364.html

This bug is used to track the documentation updates needed in both nova
and neutron.
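
For context, rootwrap daemon mode is enabled via the agents' root
helper settings; a minimal sketch (paths assumed, check your
distribution's defaults):

    [agent]
    root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf
    root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf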

** Affects: neutron
 Importance: Medium
 Assignee: Matt Riedemann (mriedem)
 Status: Triaged

** Affects: nova
 Importance: Medium
 Assignee: Matt Riedemann (mriedem)
 Status: Triaged


** Tags: config neutron performance

** Changed in: nova
 Assignee: (unassigned) => Matt Riedemann (mriedem)

** Also affects: neutron
   Importance: Undecided
   Status: New

** Changed in: neutron
 Assignee: (unassigned) => Matt Riedemann (mriedem)

** Changed in: neutron
   Importance: Undecided => Medium

** Changed in: neutron
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774217

Title:
  docs: vif_plugging_timeout nova config option should mention running
  neutron rootwrap in daemon mode for performance

Status in neutron:
  Triaged
Status in OpenStack Compute (nova):
  Triaged

Bug description:
  There are notes from this operators mailing list thread about how the
  neutron agent needed to be configured for rootwrap daemon mode to hit
  scale targets; otherwise servers would fail to create due to vif
  plugging timeouts:

  http://lists.openstack.org/pipermail/openstack-
  operators/2018-May/015364.html

  This bug is used to track the documentation updates needed in both
  nova and neutron.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1774217/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774208] [NEW] osprofiler config options don't show up in nova configuration reference

2018-05-30 Thread Matt Riedemann
Public bug reported:

Looking at the nova configuration reference and sample:

https://docs.openstack.org/nova/latest/configuration/config.html

The osprofiler options aren't included:

https://github.com/openstack/osprofiler/blob/master/osprofiler/opts.py

osprofiler is optional but if available we load the config options:

https://github.com/openstack/nova/blob/465051809c1f09417207136e2ac8615838262c1a/nova/config.py#L46
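
The loading referred to above follows the optional-import pattern; a
paraphrased sketch (osprofiler.opts.set_defaults is real, the
surrounding shape is simplified, not nova's exact code):

    try:
        from osprofiler import opts as profiler_opts
    except ImportError:
        profiler_opts = None

    def register_profiler_opts(conf):
        if profiler_opts:
            profiler_opts.set_defaults(conf)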

So we should probably include those in the configuration reference for
nova. For example, we also include config options for the vmware driver
which is optional, and oslo.vmware is in test-requirements.txt like
osprofiler.

** Affects: nova
 Importance: Low
 Status: Triaged


** Tags: config docs

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774208

Title:
  osprofiler config options don't show up in nova configuration
  reference

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  Looking at the nova configuration reference and sample:

  https://docs.openstack.org/nova/latest/configuration/config.html

  The osprofiler options aren't included:

  https://github.com/openstack/osprofiler/blob/master/osprofiler/opts.py

  osprofiler is optional but if available we load the config options:

  
https://github.com/openstack/nova/blob/465051809c1f09417207136e2ac8615838262c1a/nova/config.py#L46

  So we should probably include those in the configuration reference for
  nova. For example, we also include config options for the vmware
  driver which is optional, and oslo.vmware is in test-requirements.txt
  like osprofiler.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774208/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774205] [NEW] AggregateMultiTenancyIsolation uses wrong tenant_id during cold migrate

2018-05-30 Thread Matt Riedemann
Public bug reported:

The details are in this mailing list thread:

http://lists.openstack.org/pipermail/openstack-
operators/2018-May/015347.html

But essentially the case is:

* There are 3 compute hosts.
* compute1 and compute2 are in a host aggregate and a given tenant is 
restricted to that aggregate
* The user creates a server on compute1
* The admin attempts to cold migrate the server which fails in the 
AggregateMultiTenancyIsolation filter because it says the tenant_id in the 
request is not part of the matching host aggregate.

The reason is because the cold migrate task in the conductor replaces
the original request spec, which had the instance project_id in it, and
uses the current context, which is the admin (which could be in a
different project):

https://github.com/openstack/nova/blob/stable/ocata/nova/conductor/tasks/migrate.py#L50
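
A hedged paraphrase of the problematic call in the linked migrate.py
(argument list approximated from the ocata source): rebuilding the
RequestSpec from the request context records the admin's project_id
instead of the instance's:

    from nova import objects

    def build_spec_for_migrate(context, instance, image, flavor,
                               filter_properties):
        # `context` is the API caller's (admin) context, so its
        # project_id, not the instance's, ends up in the new spec.
        return objects.RequestSpec.from_components(
            context, instance.uuid, image, flavor,
            instance.numa_topology, instance.pci_requests,
            filter_properties, None, instance.availability_zone)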

** Affects: nova
 Importance: High
 Assignee: Matt Riedemann (mriedem)
 Status: Triaged


** Tags: cold-migrate openstack-version.ocata regression

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774205

Title:
  AggregateMultiTenancyIsolation uses wrong tenant_id during cold
  migrate

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  The details are in this mailing list thread:

  http://lists.openstack.org/pipermail/openstack-
  operators/2018-May/015347.html

  But essentially the case is:

  * There are 3 compute hosts.
  * compute1 and compute2 are in a host aggregate and a given tenant is 
restricted to that aggregate
  * The user creates a server on compute1
  * The admin attempts to cold migrate the server which fails in the 
AggregateMultiTenancyIsolation filter because it says the tenant_id in the 
request is not part of the matching host aggregate.

  The reason is because the cold migrate task in the conductor replaces
  the original request spec, which had the instance project_id in it,
  and uses the current context, which is the admin (which could be in a
  different project):

  
https://github.com/openstack/nova/blob/stable/ocata/nova/conductor/tasks/migrate.py#L50

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774205/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774189] [NEW] delete stack failed, oslo_db.exception.DBError

2018-05-30 Thread tommaso
Public bug reported:

When I try to delete a stack, I get this error:

DELETE failed: ClientException: resources.server1: Unexpected API Error.
Please report this at http://bugs.launchpad.net/nova/ and attach the
Nova API log if possible.  (HTTP 500)
(Request-ID: req-25b0407d-f31a-4ec1-a6ce-3b2ce04914bc)

After that, OpenStack doesn't work correctly and I can no longer
access the Compute->Instances and Network information.

The Nova Api log is attached.

** Affects: nova
 Importance: Undecided
 Status: New

** Attachment added: "nova-api.log"
   
https://bugs.launchpad.net/bugs/1774189/+attachment/5146427/+files/nova-api.log

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774189

Title:
  delete stack failed, oslo_db.exception.DBError

Status in OpenStack Compute (nova):
  New

Bug description:
  When I try to delete a stack, I get this error:

  DELETE failed: ClientException: resources.server1: Unexpected API
  Error. Please report this at http://bugs.launchpad.net/nova/ and
  attach the Nova API log if possible.  (HTTP 500) (Request-ID: req-25b0407d-
  f31a-4ec1-a6ce-3b2ce04914bc)

  After that, OpenStack doesn't work correctly and I can no longer
  access the Compute->Instances and Network information.

  The Nova Api log is attached.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774189/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1768980] Re: Wrong Port in "Create OpenStack client environment scripts in keystone" document

2018-05-30 Thread Chason Chan
** Changed in: keystone
   Status: Triaged => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1768980

Title:
  Wrong Port in "Create OpenStack client environment scripts in
  keystone" document

Status in OpenStack Identity (keystone):
  Fix Released

Bug description:

  This bug tracker is for errors with the documentation, use the
  following as a template and remove or add fields as you see fit.
  Convert [ ] into [x] to check boxes:

  - [x] This doc is inaccurate in this way: __

  On the admin auth URL, it was supposed to be port 35357 instead of
  5000, as mentioned on the previous page. Even though it works on 5000
  too, the script is not doing the same as the previous page.

  
  If you have a troubleshooting or support issue, use the following  resources:

   - Ask OpenStack: http://ask.openstack.org
   - The mailing list: http://lists.openstack.org
   - IRC: 'openstack' channel on Freenode

  ---
  Release: 13.0.1.dev8 on 2018-05-02 17:02
  SHA: 56d108858a2284516e1cba66a86883ea969755d4
  Source: 
https://git.openstack.org/cgit/openstack/keystone/tree/doc/source/install/keystone-openrc-rdo.rst
  URL: 
https://docs.openstack.org/keystone/queens/install/keystone-openrc-rdo.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1768980/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774137] [NEW] volume is in "in use" state when a vmware instance boot from volume is failed

2018-05-30 Thread zhaodan7597
Public bug reported:

When creating a VMware instance from a volume fails, the VM goes to
the error state and the volume is left "in use"; after deleting the
VM, the volume is still "in use", so it can't be deleted.

** Affects: nova
 Importance: Undecided
 Status: New

** Summary changed:

- volume is  in "in use" state when vm boot from volume is failed
+ volume is  in "in use" state when a vmware instance boot from volume is failed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774137

Title:
  volume is  in "in use" state when a vmware instance boot from volume
  is failed

Status in OpenStack Compute (nova):
  New

Bug description:
  When creating a VMware instance from a volume fails, the VM goes to
  the error state and the volume is left "in use"; after deleting the
  VM, the volume is still "in use", so it can't be deleted.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774137/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774109] [NEW] OrphanedObjectError: Cannot call obj_load_attr on orphaned Instance object

2018-05-30 Thread xiaoxu790
Public bug reported:

OpenStack: Ocata

# tailf /var/log/nova/nova-api.log
2018-05-30 14:00:34.832 170348 ERROR nova.api.openstack.extensions 
self.flavor = instance.flavor
2018-05-30 14:00:34.832 170348 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 67, in 
getter
2018-05-30 14:00:34.832 170348 ERROR nova.api.openstack.extensions 
self.obj_load_attr(name)
2018-05-30 14:00:34.832 170348 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 1029, in 
obj_load_attr
2018-05-30 14:00:34.832 170348 ERROR nova.api.openstack.extensions 
objtype=self.obj_name())
2018-05-30 14:00:34.832 170348 ERROR nova.api.openstack.extensions 
OrphanedObjectError: Cannot call obj_load_attr on orphaned Instance object
2018-05-30 14:00:34.832 170348 ERROR nova.api.openstack.extensions
2018-05-30 14:00:34.834 170348 INFO nova.api.openstack.wsgi 
[req-dde33c8b-7d91-4938-bab2-0410c046dd71 516e4eeaf0614802b3af422f40b140b6 
befcf911008745b69ca9e4c1fd1e868e - default default] HTTP exception thrown: 
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and 
attach the Nova API log if possible.

2018-05-30 14:00:34.836 170348 INFO nova.osapi_compute.wsgi.server 
[req-dde33c8b-7d91-4938-bab2-0410c046dd71 516e4eeaf0614802b3af422f40b140b6 
befcf911008745b69ca9e4c1fd1e868e - default default] 172.20.239.50 "GET 
/v2.1/servers/detail HTTP/1.1" status: 500 len: 577 time: 0.2061019


# nova --debug list
DEBUG (extension:180) found extension EntryPoint.parse('v2token = 
keystoneauth1.loading._plugins.identity.v2:Token')
DEBUG (extension:180) found extension EntryPoint.parse('v3oauth1 = 
keystoneauth1.extras.oauth1._loading:V3OAuth1')
DEBUG (extension:180) found extension EntryPoint.parse('admin_token = 
keystoneauth1.loading._plugins.admin_token:AdminToken')
DEBUG (extension:180) found extension EntryPoint.parse('v3oidcauthcode = 
keystoneauth1.loading._plugins.identity.v3:OpenIDConnectAuthorizationCode')
DEBUG (extension:180) found extension EntryPoint.parse('v2password = 
keystoneauth1.loading._plugins.identity.v2:Password')
DEBUG (extension:180) found extension EntryPoint.parse('v3samlpassword = 
keystoneauth1.extras._saml2._loading:Saml2Password')
DEBUG (extension:180) found extension EntryPoint.parse('v3password = 
keystoneauth1.loading._plugins.identity.v3:Password')
DEBUG (extension:180) found extension EntryPoint.parse('v3oidcaccesstoken = 
keystoneauth1.loading._plugins.identity.v3:OpenIDConnectAccessToken')
DEBUG (extension:180) found extension EntryPoint.parse('v3oidcpassword = 
keystoneauth1.loading._plugins.identity.v3:OpenIDConnectPassword')
DEBUG (extension:180) found extension EntryPoint.parse('v3kerberos = 
keystoneauth1.extras.kerberos._loading:Kerberos')
DEBUG (extension:180) found extension EntryPoint.parse('token = 
keystoneauth1.loading._plugins.identity.generic:Token')
DEBUG (extension:180) found extension EntryPoint.parse('v3oidcclientcredentials 
= keystoneauth1.loading._plugins.identity.v3:OpenIDConnectClientCredentials')
DEBUG (extension:180) found extension EntryPoint.parse('v3tokenlessauth = 
keystoneauth1.loading._plugins.identity.v3:TokenlessAuth')
DEBUG (extension:180) found extension EntryPoint.parse('v3token = 
keystoneauth1.loading._plugins.identity.v3:Token')
DEBUG (extension:180) found extension EntryPoint.parse('v3totp = 
keystoneauth1.loading._plugins.identity.v3:TOTP')
DEBUG (extension:180) found extension EntryPoint.parse('password = 
keystoneauth1.loading._plugins.identity.generic:Password')
DEBUG (extension:180) found extension EntryPoint.parse('v3fedkerb = 
keystoneauth1.extras.kerberos._loading:MappedKerberos')
DEBUG (extension:180) found extension EntryPoint.parse('token_endpoint = 
openstackclient.api.auth_plugin:TokenEndpoint')
DEBUG (extension:180) found extension EntryPoint.parse('v1password = 
swiftclient.authv1:PasswordLoader')
DEBUG (session:347) REQ: curl -g -i -X GET http://controller:35357/v3 -H 
"Accept: application/json" -H "User-Agent: nova keystoneauth1/2.18.0 
python-requests/2.11.1 CPython/2.7.5"
INFO (connectionpool:214) Starting new HTTP connection (1): controller
DEBUG (connectionpool:401) "GET /v3 HTTP/1.1" 200 250
DEBUG (session:395) RESP: [200] Date: Wed, 30 May 2018 06:03:55 GMT Server: 
Apache/2.4.6 (CentOS) mod_wsgi/3.4 Python/2.7.5 Vary: X-Auth-Token 
x-openstack-request-id: req-21122cf8-a4c6-459e-9432-c6715297effd 
Content-Length: 250 Keep-Alive: timeout=5, max=100 Connection: Keep-Alive 
Content-Type: application/json
RESP BODY: {"version": {"status": "stable", "updated": "2017-02-22T00:00:00Z", 
"media-types": [{"base": "application/json", "type": 
"application/vnd.openstack.identity-v3+json"}], "id": "v3.8", "links": 
[{"href": "http://controller:35357/v3/;, "rel": "self"}]}}

DEBUG (session:640) GET call to None for http://controller:35357/v3 used 
request id req-21122cf8-a4c6-459e-9432-c6715297effd
DEBUG (base:165)