[Yahoo-eng-team] [Bug 1820827] Re: neutron-vpnaas :ipsec site connection pending create

2019-05-24 Thread Launchpad Bug Tracker
[Expired for neutron because there has been no activity for 60 days.]

** Changed in: neutron
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1820827

Title:
  neutron-vpnaas :ipsec site connection  pending create

Status in neutron:
  Expired

Bug description:
  The OpenStack release is Pike on Ubuntu 16.04.
  After
  sudo apt-get install python-neutron-vpnaas
  sudo apt-get install strongswan
  I didn't get a file named /usr/lib/neutron-vpn-agent or
  /etc/neutron/vpn-agent.ini.
  Then I edited /etc/neutron/neutron.conf:
  service = vpnaas

  /etc/neutron/neutron_vpnaas.conf
  service_provider = 
VPN:strongswan:neutron_vpnaas.services.vpn.service_drivers.ipsec.IPsecVPNDriver:default

  /etc/neutron/l3-agent.ini
  [AGENT]
  extensions = vpnaas

  [vpnagent]
  vpn_device_driver = 
neutron_vpnaas.services.vpn.device_drivers.strongswan_ipsec.StrongSwanDriver

  systemctl restart neutron-server
  systemctl restart neutron-l3-agent
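
  For comparison, here is a hedged sketch of how these options are usually
  laid out in a Pike-era VPNaaS deployment (section names taken from the
  install guides of that time, not verified against this environment):

  # /etc/neutron/neutron.conf
  [DEFAULT]
  service_plugins = vpnaas

  # /etc/neutron/neutron_vpnaas.conf
  [service_providers]
  service_provider = VPN:strongswan:neutron_vpnaas.services.vpn.service_drivers.ipsec.IPsecVPNDriver:default

  # /etc/neutron/l3_agent.ini
  [AGENT]
  extensions = vpnaas

  [vpnagent]
  vpn_device_driver = neutron_vpnaas.services.vpn.device_drivers.strongswan_ipsec.StrongSwanDriver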

  /var/log/neutron/neutron-server.log

  2019-03-19 17:53:50.988 10977 WARNING stevedore.named [req-
  bfb9dc35-98e2-4b93-9190-fb361ec162a0 - - - - -] Could not load
  neutron_vpnaas.services.vpn.service_drivers.ipsec.IPsecVPNDriver

  /var/log/neutron/neutron-l3-agent.log

  2019-03-19 17:53:13.979 10901 WARNING stevedore.named [req-
  46c236d1-02c2-4d05-a644-b1603f7b73cd - - - - -] Could not load vpnaas

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1820827/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1830456] [NEW] dvr router slow response during port update

2019-05-24 Thread norman shen
Public bug reported:

We have a distributed router used by hundreds of virtual machines
scattered across around 150 compute nodes. When nova sends a port
update request to neutron, it generally takes nearly 4 minutes to
complete.

Neutron version is openstack Queens 12.0.5.

I found the following log entries printed by neutron-server,

2019-05-25 05:24:16,285.285 11834 INFO neutron.wsgi [req- x -
default default] x.x.x.x "PUT
/v2.0/ports/8c252d91-741a-4627-9600-916d1da5178f HTTP/1.1" status: 200
len: 0 time: 233.6103470

You can see it takes around 234 seconds to finish the request.

Right now I suspect this code snippet
https://github.com/openstack/neutron/blob/de59a21754747335d0d9d26082c7f0df105a30c9/neutron/db/l3_dvrscheduler_db.py#L139
leads to the issue.
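
A hedged one-liner for spotting the slow port updates, assuming the default
neutron.wsgi log format shown above (the elapsed time is the last field of
the line):

  awk '/neutron.wsgi/ && /PUT \/v2.0\/ports/ {print $NF, $0}' \
      /var/log/neutron/neutron-server.log | sort -rn | head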

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1830456

Title:
  dvr router slow response during port update

Status in neutron:
  New

Bug description:
  We have a distributed router used by hundreds of virtual machines
  scattered across around 150 compute nodes. When nova sends a port
  update request to neutron, it generally takes nearly 4 minutes to
  complete.

  Neutron version is openstack Queens 12.0.5.

  I found the following log entries printed by neutron-server,

  2019-05-25 05:24:16,285.285 11834 INFO neutron.wsgi [req- x -
  default default] x.x.x.x "PUT
  /v2.0/ports/8c252d91-741a-4627-9600-916d1da5178f HTTP/1.1" status: 200
  len: 0 time: 233.6103470

  You can see it takes around 234 seconds to finish the request.

  Right now I suspect this code snippet
  https://github.com/openstack/neutron/blob/de59a21754747335d0d9d26082c7f0df105a30c9/neutron/db/l3_dvrscheduler_db.py#L139
  leads to the issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1830456/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1824248] Re: Security Group filtering hides rules from user

2019-05-24 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/660174
Committed: 
https://git.openstack.org/cgit/openstack/neutron/commit/?id=1920a37a94b7a9589dcf83f6ff0765068560dbf8
Submitter: Zuul
Branch: master

commit 1920a37a94b7a9589dcf83f6ff0765068560dbf8
Author: Slawek Kaplonski 
Date:   Mon May 20 18:47:18 2019 +0200

Show all SG rules belong to SG in group's details

If a security group contains rule(s) which were created by a different
user (admin), the owner of this security group should see such rules
even if those rules don't belong to him.

This patch changes the get_security_group() method to use admin_context
to get security group rules in order to achieve that.

Test to cover such case is added in neutron-tempest-plugin repo.

Change-Id: I890c81bb6eabc5caa620ed4fcc4dc88ebfa6e1b0
Closes-Bug: #1824248
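
For readers of this thread, a hedged sketch of the idea behind that change
(simplified Python, not the merged neutron code; context.elevated() and the
plugin methods are assumed from neutron's usual API):

    def security_group_details(plugin, context, sg_id):
        # Fetch the group under the caller's context (normal access check),
        # but list its rules with an elevated/admin context so rules an
        # admin created inside the user's group are not filtered out.
        sg = plugin.get_security_group(context, sg_id)
        admin_ctx = context.elevated()
        sg['security_group_rules'] = plugin.get_security_group_rules(
            admin_ctx, filters={'security_group_id': [sg_id]})
        return sg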


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1824248

Title:
  Security Group filtering hides rules from user

Status in neutron:
  Fix Released
Status in OpenStack Security Advisory:
  Won't Fix

Bug description:
  Manage Rules part of the GUI hides the rules currently visible in the
  Launch Instance modal window.

  It allows a malicious admin to add backdoor access rules that might be
  later added to VMs without the knowledge of the owner of those VMs.

  When sending a GET request as below, it responds only with the rules
  created by the user; this is what the Manage Rules part of the GUI
  uses: 

  On the other hand, when sending a GET request as below, it responds
  with the whole SG including all of its rules, with no filtering; this
  is what the Launch Instance modal window uses: 

  Here is example of rules display in Manage Rules part of GUI:

  > 
/opt/stack/horizon/openstack_dashboard/dashboards/project/security_groups/views.py(50)_get_data()
  -> return api.neutron.security_group_get(self.request, sg_id)
  (Pdb) l
   45 @memoized.memoized_method
   46 def _get_data(self):
   47 sg_id = 
filters.get_int_or_uuid(self.kwargs['security_group_id'])
   48 try:
   49 from remote_pdb import RemotePdb; RemotePdb('127.0.0.1', 
444).set_trace()
   50  -> return api.neutron.security_group_get(self.request, sg_id)
   51 except Exception:
   52 redirect = 
reverse('horizon:project:security_groups:index')
   53 exceptions.handle(self.request,
   54   _('Unable to retrieve security group.'),
   55   redirect=redirect)
  (Pdb) p api.neutron.security_group_get(self.request, sg_id)
  , , , ]}>
  (Pdb)

  (Pdb) p self.request
  

  As you might have noticed, there are no rules for ports 44 and 22 (SSH).

  And from the Launch Instance modal window, as well as the CLI, we can
  see that there are two more rules that are invisible to the user, for
  ports 44 and 22 (SSH), as displayed below:

  > /opt/stack/horizon/openstack_dashboard/api/rest/network.py(47)get()
  -> return {'items': [sg.to_dict() for sg in security_groups]}
  (Pdb) l
   42 """
   43
   44 security_groups = api.neutron.security_group_list(request)
   45 from remote_pdb import RemotePdb; RemotePdb('127.0.0.1', 
444).set_trace()
   46
   47  -> return {'items': [sg.to_dict() for sg in security_groups]}
   48
   49
   50 @urls.register
   51 class FloatingIP(generic.View):
   52 """API for a single floating IP address."""
  (Pdb) p security_groups
  [, , , , , ]}>]
  (Pdb)

  (Pdb) p request
  

  Thank you,
  Robin

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1824248/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1829889] Re: _assert_ipv6_accept_ra method should wait until proper settings will be configured

2019-05-24 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/660690
Committed: 
https://git.openstack.org/cgit/openstack/neutron/commit/?id=62b2f2b1b1e2d8c0c2ffc1fd2ae9467eb2c1ef07
Submitter: Zuul
Branch: master

commit 62b2f2b1b1e2d8c0c2ffc1fd2ae9467eb2c1ef07
Author: Slawek Kaplonski 
Date:   Wed May 22 13:49:55 2019 +0200

Wait to ipv6 accept_ra be really changed by L3 agent

In functional tests for L3 HA agent, like e.g.
L3HATestFailover.test_ha_router_failover
it may happen that the L3 agent has not yet changed the ipv6 accept_ra
knob and the test fails because it checks that only once, just
after the router state is changed.

This patch fixes that race by adding a wait of up to 60 seconds for the
ipv6 accept_ra change.

Change-Id: I459ce4b791c27b1e3d977e0de9fbdb21a8a379f5
Closes-Bug: #1829889


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1829889

Title:
  _assert_ipv6_accept_ra method should wait until proper settings will
  be configured

Status in neutron:
  Fix Released

Bug description:
  This method is defined in
  neutron/tests/functional/agent/l3/framework.py and it should use
  wait_until_true to avoid potential race conditions between test
  assertions and what L3 agent is doing.
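
  A hedged sketch of that approach (wait_until_true is the polling helper
  neutron's functional tests already use; the accept_ra read itself is
  passed in as a callable here):

    from neutron.common import utils as common_utils

    def assert_ipv6_accept_ra(read_accept_ra, expected, timeout=60):
        """Poll instead of asserting once; read_accept_ra returns the
        current sysctl value for the gateway device."""
        common_utils.wait_until_true(
            lambda: read_accept_ra() == expected,
            timeout=timeout,
            sleep=1)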

  It seems that e.g. in http://logs.openstack.org/61/659861/1/check
  /neutron-functional/3708673/testr_results.html.gz there was such race
  which caused test failure.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1829889/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1809095] Fix merged to nova (master)

2019-05-24 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/643023
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=5a1c385b996090b80f5881680e04c88abc21828a
Submitter: Zuul
Branch: master

commit 5a1c385b996090b80f5881680e04c88abc21828a
Author: Adrian Chiris 
Date:   Tue Mar 12 14:19:04 2019 +0200

Move get_pci_mapping_for_migration to MigrationContext

In order to fix Bug #1809095, it is required to update
PCI related VIFs with the original PCI address on the source
host to allow virt driver to properly unplug the VIF from hypervisor,
e.g allow the proper VF representor to be unplugged
from the integration bridge in case of a hardware offloaded OVS.

To do so, some preliminary work is needed to allow code-sharing
between nova.network.neutronv2 and nova.compute.manager

This change:
- Moves common logic to retrieve the PCI mapping between
  the source and destination node from nova.network.neutronv2
  to objects.migration_context.
- Makes code adjustments to methods in nova.network.neutronv2
  to accommodate the former.

Change-Id: I9a5118373548c525b2b1c2271e7d210cc92e4f4c
Partial-Bug: #1809095
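
A hedged sketch of the mapping the commit describes (simplified; the real
helper lives on MigrationContext and also handles the revert direction):

    def get_pci_mapping_for_migration(old_pci_devices, new_pci_devices):
        # Map each PCI device claimed on the source host to the device
        # claimed for the same InstancePCIRequest on the destination, so
        # VIF unplug can be done with the source-side PCI address.
        new_by_request = {dev.request_id: dev for dev in new_pci_devices}
        return {old.address: new_by_request[old.request_id]
                for old in old_pci_devices
                if old.request_id in new_by_request}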


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1809095

Title:
  Wrong representor port was unplugged from OVS during cold migration

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===
  Wrong representor port was unplugged from OVS during cold migration.
  This happens when the VM is scheduled to use a different PCI device on
  the target host than the one it was using on the source host. Nova then
  uses the new PCI device information to unplug the representor port on
  the source compute.

  
  Steps to reproduce
  ==
  1. Create representor ports
  $ openstack port create --network private --vnic-type=direct 
--binding-profile '{"capabilities": ["switchdev"]}' direct_port1 
  $ openstack port create --network private --vnic-type=direct 
--binding-profile '{"capabilities": ["switchdev"]}' direct_port2
  2. Create VMs using the ports created above:
  openstack server create --flavor m1.small --image fedora24 --nic 
port-id=direct_port1 --availability-zone=nova:compute-1 vm1 
  openstack server create --flavor m1.small --image fedora24 --nic 
port-id=direct_port2 --availability-zone=nova:compute-2 vm2
  3. Migrate VM2
  $ openstack server migrate vm2
  $ openstack server resize --confirm vm2
  4. VM2 was migrated to compute-1, however representor port is still attached 
to OVS 
  $ sudo ovs-dpctl show
  system@ovs-system:
  lookups: hit:466465 missed:5411 lost:0
  flows: 12
  masks: hit:739146 total:2 hit/pkt:1.57
  port 0: ovs-system (internal)
  port 1: br-pro0.0 (internal)
  port 2: br-pro0 (internal)
  port 3: ens6f0
  port 4: br-int (internal)
  port 5: eth3

  Expected result
  ===
  After cold migration, VM's previously used representor port should be 
unplugged from OVS

  Actual result
  =
  The VM's previously used representor port is still plugged on the source
  host. In some scenarios the wrong representor port was unplugged from the
  source host, thus affecting VMs that were not cold migrated.

  Environment
  ===
  Libvirt+KVM
  $ /usr/libexec/qemu-kvm --version
  QEMU emulator version 2.10.0
  $ virsh --version
  3.9.0
  Neutron+OVS HW Offload 
  Openstack Queens openstack-nova-compute-17.0.7-1

  
  Logs & Configs
  ==
  1. Plug vif device using pci address :81:00.5
  2018-12-15 13:12:04.871 108055 DEBUG os_vif 
[req-cd20d9ab-e880-41fa-aee5-97b920abcf77 dd9f16f6b15740e181c9b7cf8ee5795c 
52298dbce7024cf89ca9e6d7369a67de - default default] Plugging vif 
VIFHostDevice(active=False,address=fa:16:3e:1b:0a:21,dev_address=:81:00.5,dev_type='ethernet',has_traffic_filtering=True,id=38609ab2-cf36-4782-83c7-7ee2d5c1c163,network=Network(bd30c752-4876-498b-9a36-e9733b635f4f),plugin='ovs',port_profile=VIFPortProfileOVSRepresentor,preserve_on_delete=True)
 plug /usr/lib/python2.7/site-packages/os_vif/__init__.py:76

  2. VM was migrated from compute-1 to compute-2. New pci device is now 
:81:00.4
  2018-12-15 13:13:58.721 108055 DEBUG os_vif 
[req-afd99706-cf49-4c20-b85b-ea4d990ffbb4 dd9f16f6b15740e181c9b7cf8ee5795c 
52298dbce7024cf89ca9e6d7369a67de - default default] Unplugging vif 
VIFHostDevice(active=True,address=fa:16:3e:1b:0a:21,dev_address=:81:00.4,dev_type='ethernet',has_traffic_filtering=True,id=38609ab2-cf36-4782-83c7-7ee2d5c1c163,network=Network(bd30c752-4876-498b-9a36-e9733b635f4f),plugin='ovs',port_profile=VIFPortProfileOVSRepresentor,preserve_on_delete=True)
 unplug /usr/lib/python2.7/site-packages/os_vif/__init__.py:109
  2018-12-15 13:13:58.759 108055 INFO os_vif 

[Yahoo-eng-team] [Bug 1645824] Re: NoCloud source doesn't work on FreeBSD

2019-05-24 Thread Server Team CI bot
This bug is fixed with commit 0f869532 to cloud-init on branch master.
To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=0f869532


** Changed in: cloud-init
   Status: Fix Released => Fix Committed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1645824

Title:
  NoCloud source doesn't work on FreeBSD

Status in cloud-init:
  Fix Committed

Bug description:
  Hey guys,

  I'm trying to use cloud-init on FreeBSD using CD to seed metadata, the
  thing is that it had some issues:

  - Mount option 'sync' is not allowed for the cd9660 filesystem (see the
  sketch after this list).
  - I optimized the list of filesystems that needed to be scanned for
  metadata by having three lists (vfat, iso9660, and label list) and then
  checking against them to see which filesystem option needs to be passed
  to the mount command.
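
  A hedged illustration of the first point (not cloud-init's actual code):
  choose the mount options per filesystem type, since FreeBSD's cd9660
  driver rejects the 'sync' option.

    def mount_cmd(device, mountpoint, fstype):
        # cd9660 (ISO images on FreeBSD) refuses the 'sync' mount option
        opts = [] if fstype == 'cd9660' else ['-o', 'sync']
        return ['mount', '-t', fstype] + opts + [device, mountpoint]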

  Additionally I'm going to push some changes to the FreeBSD cloud-init
  package so it can build the latest version. I will open another ticket
  for fixing networking on FreeBSD, as it doesn't support sysfs
  (/sys/class/net/) by default.

  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1645824/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1830438] [NEW] Hard deleting instance does not take into account soft-deleted referential constraints

2019-05-24 Thread Matt Riedemann
Public bug reported:

The instance hard delete code is new in Train but has a bug noted here:

https://review.opendev.org/#/c/570202/8/nova/db/sqlalchemy/api.py@1804

The hard delete of the instance can fail if there are related soft-
deleted records (like detached volumes [bdms]), because I hit this in a
gate run of the cross-cell resize stuff:

http://paste.openstack.org/show/752057/

'Cannot delete or update a parent row: a foreign key constraint fails
(`nova_cell2`.`block_device_mapping`, CONSTRAINT
`block_device_mapping_instance_uuid_fkey` FOREIGN KEY (`instance_uuid`)
REFERENCES `instances` (`uuid`))') [SQL: 'DELETE FROM instances WHERE
instances.uuid = %(uuid_1)s'] [parameters: {'uuid_1': '4b8a12c4-e28a-
49cc-a681-236c1e8a174c'}]
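
A hedged sketch of the ordering the hard delete needs (table list
illustrative, not exhaustive): purge rows that reference the instance,
including soft-deleted ones, before deleting the instances row itself.

    import sqlalchemy as sa

    def hard_delete_instance(connection, instance_uuid):
        for table in ('block_device_mapping', 'instance_extra',
                      'instance_metadata', 'instance_system_metadata',
                      'instance_faults'):
            connection.execute(
                sa.text('DELETE FROM %s WHERE instance_uuid = :uuid' % table),
                {'uuid': instance_uuid})
        connection.execute(
            sa.text('DELETE FROM instances WHERE uuid = :uuid'),
            {'uuid': instance_uuid})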

** Affects: nova
 Importance: Medium
 Assignee: Matt Riedemann (mriedem)
 Status: In Progress


** Tags: db

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1830438

Title:
  Hard deleting instance does not take into account soft-deleted
  referential constraints

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  The instance hard delete code is new in Train but has a bug noted
  here:

  https://review.opendev.org/#/c/570202/8/nova/db/sqlalchemy/api.py@1804

  The hard delete of the instance can fail if there are related soft-
  deleted records (like detached volumes [bdms]), because I hit this in
  a gate run of the cross-cell resize stuff:

  http://paste.openstack.org/show/752057/

  'Cannot delete or update a parent row: a foreign key constraint fails
  (`nova_cell2`.`block_device_mapping`, CONSTRAINT
  `block_device_mapping_instance_uuid_fkey` FOREIGN KEY
  (`instance_uuid`) REFERENCES `instances` (`uuid`))') [SQL: 'DELETE
  FROM instances WHERE instances.uuid = %(uuid_1)s'] [parameters:
  {'uuid_1': '4b8a12c4-e28a-49cc-a681-236c1e8a174c'}]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1830438/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1830417] Re: NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since 5/20

2019-05-24 Thread melanie witt
** Also affects: devstack
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1830417

Title:
  NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since
  5/20

Status in devstack:
  In Progress
Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Ever since we enabled the n-novnc service in the nova-multi-cell job
  on May 20:

  
https://github.com/openstack/nova/commit/c5b83c3fbca83726f4a956009e1788d26bcedde0
  #diff-7415f5ff7beee2cdf9ffe31e12e4c086

  The
  tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc
  test has been intermittently failing like this:

  2019-05-24 01:55:59.786818 | controller | {2} 
tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc 
[0.870805s] ... FAILED
  2019-05-24 01:55:59.787151 | controller |
  2019-05-24 01:55:59.787193 | controller | Captured traceback:
  2019-05-24 01:55:59.787226 | controller | ~~~
  2019-05-24 01:55:59.787271 | controller | b'Traceback (most recent call 
last):'
  2019-05-24 01:55:59.787381 | controller | b'  File 
"/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 194, in 
test_novnc'
  2019-05-24 01:55:59.787450 | controller | b'
self._validate_rfb_negotiation()'
  2019-05-24 01:55:59.787550 | controller | b'  File 
"/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 92, in 
_validate_rfb_negotiation'
  2019-05-24 01:55:59.787643 | controller | b"'Token must be invalid 
because the connection '"
  2019-05-24 01:55:59.787748 | controller | b'  File 
"/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/unittest2/case.py",
 line 696, in assertFalse'
  2019-05-24 01:55:59.787796 | controller | b'raise 
self.failureException(msg)'
  2019-05-24 01:55:59.787894 | controller | b'AssertionError: True is not 
false : Token must be invalid because the connection closed.'
  2019-05-24 01:55:59.787922 | controller | b''

  
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22b'AssertionError%3A%20True%20is%20not%20false%20%3A%20Token%20must%20be%20invalid%20because%20the%20connection%20closed.'%5C%22%20AND%20tags%3A%5C%22console%5C%22=7d

  My guess would be (without checking the test or the code) that
  something isn't properly routing console auth token
  information/requests to the correct cell which is why we don't see
  this in a "single" cell job.

To manage notifications about this bug go to:
https://bugs.launchpad.net/devstack/+bug/1830417/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1830417] [NEW] NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since 5/20

2019-05-24 Thread Matt Riedemann
Public bug reported:

Ever since we enabled the n-novnc service in the nova-multi-cell job on
May 20:

https://github.com/openstack/nova/commit/c5b83c3fbca83726f4a956009e1788d26bcedde0
#diff-7415f5ff7beee2cdf9ffe31e12e4c086

The
tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc
test has been intermittently failing like this:

2019-05-24 01:55:59.786818 | controller | {2} 
tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc 
[0.870805s] ... FAILED
2019-05-24 01:55:59.787151 | controller |
2019-05-24 01:55:59.787193 | controller | Captured traceback:
2019-05-24 01:55:59.787226 | controller | ~~~
2019-05-24 01:55:59.787271 | controller | b'Traceback (most recent call 
last):'
2019-05-24 01:55:59.787381 | controller | b'  File 
"/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 194, in 
test_novnc'
2019-05-24 01:55:59.787450 | controller | b'
self._validate_rfb_negotiation()'
2019-05-24 01:55:59.787550 | controller | b'  File 
"/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 92, in 
_validate_rfb_negotiation'
2019-05-24 01:55:59.787643 | controller | b"'Token must be invalid 
because the connection '"
2019-05-24 01:55:59.787748 | controller | b'  File 
"/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/unittest2/case.py",
 line 696, in assertFalse'
2019-05-24 01:55:59.787796 | controller | b'raise 
self.failureException(msg)'
2019-05-24 01:55:59.787894 | controller | b'AssertionError: True is not 
false : Token must be invalid because the connection closed.'
2019-05-24 01:55:59.787922 | controller | b''

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22b'AssertionError%3A%20True%20is%20not%20false%20%3A%20Token%20must%20be%20invalid%20because%20the%20connection%20closed.'%5C%22%20AND%20tags%3A%5C%22console%5C%22=7d

My guess would be (without checking the test or the code) that something
isn't properly routing console auth token information/requests to the
correct cell which is why we don't see this in a "single" cell job.
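
A hedged sketch of that suspicion (not nova's actual proxy code; the object
and helper names are the ones I believe nova exposes): a console token lives
in the cell database that owns the instance, so a proxy that only queries
one cell will treat tokens from the other cell as invalid and close the
connection.

    from nova import context as nova_context
    from nova import exception
    from nova import objects

    def find_console_token(ctxt, token):
        # Look the token up in every cell until one of them validates it.
        for cell in objects.CellMappingList.get_all(ctxt):
            with nova_context.target_cell(ctxt, cell) as cctxt:
                try:
                    return objects.ConsoleAuthToken.validate(cctxt, token)
                except exception.InvalidToken:
                    continue
        return None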

** Affects: nova
 Importance: Medium
 Status: Confirmed


** Tags: cells consoles gate-failure

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1830417

Title:
  NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since
  5/20

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Ever since we enabled the n-novnc service in the nova-multi-cell job
  on May 20:

  
https://github.com/openstack/nova/commit/c5b83c3fbca83726f4a956009e1788d26bcedde0
  #diff-7415f5ff7beee2cdf9ffe31e12e4c086

  The
  tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc
  test has been intermittently failing like this:

  2019-05-24 01:55:59.786818 | controller | {2} 
tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc 
[0.870805s] ... FAILED
  2019-05-24 01:55:59.787151 | controller |
  2019-05-24 01:55:59.787193 | controller | Captured traceback:
  2019-05-24 01:55:59.787226 | controller | ~~~
  2019-05-24 01:55:59.787271 | controller | b'Traceback (most recent call 
last):'
  2019-05-24 01:55:59.787381 | controller | b'  File 
"/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 194, in 
test_novnc'
  2019-05-24 01:55:59.787450 | controller | b'
self._validate_rfb_negotiation()'
  2019-05-24 01:55:59.787550 | controller | b'  File 
"/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 92, in 
_validate_rfb_negotiation'
  2019-05-24 01:55:59.787643 | controller | b"'Token must be invalid 
because the connection '"
  2019-05-24 01:55:59.787748 | controller | b'  File 
"/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/unittest2/case.py",
 line 696, in assertFalse'
  2019-05-24 01:55:59.787796 | controller | b'raise 
self.failureException(msg)'
  2019-05-24 01:55:59.787894 | controller | b'AssertionError: True is not 
false : Token must be invalid because the connection closed.'
  2019-05-24 01:55:59.787922 | controller | b''

  
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22b'AssertionError%3A%20True%20is%20not%20false%20%3A%20Token%20must%20be%20invalid%20because%20the%20connection%20closed.'%5C%22%20AND%20tags%3A%5C%22console%5C%22=7d

  My guess would be (without checking the test or the code) that
  something isn't properly routing console auth token
  information/requests to the correct cell which is why we don't see
  this in a "single" cell job.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1830417/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : 

[Yahoo-eng-team] [Bug 1830295] Re: devstack py3 get_link_devices() KeyError: 'index'

2019-05-24 Thread iain MacDonnell
Yeah... downgrading oslo.privsep from 1.33.0 to 1.32.1 makes the problem
go away.

** Also affects: oslo.privsep
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1830295

Title:
  devstack py3 get_link_devices() KeyError: 'index'

Status in neutron:
  New
Status in oslo.privsep:
  New

Bug description:
  devstack master with py3. openvswitch agent has suddenly stopped
  working, with no change in config or environment (other than
  rebuilding devstack). Stack trace below. For some reason (yet
  undetermined), privileged.get_link_devices() now seems to be returning
  byte arrays instead of strings as the dict keys:

  >>> from neutron.privileged.agent.linux import ip_lib as privileged
  >>> privileged.get_link_devices(None)[0].keys() 
  dict_keys([b'index', b'family', b'__align', b'header', b'flags', b'ifi_type', 
b'event', b'change', b'attrs'])
  >>> 
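
  A hedged workaround sketch while the root cause is sorted out: normalize
  the byte keys coming back from the privileged call (using the same
  `privileged` import as above).

    def decode_keys(link_device):
        return {k.decode() if isinstance(k, bytes) else k: v
                for k, v in link_device.items()}

    devices = [decode_keys(d) for d in privileged.get_link_devices(None)]
    devices[0]['index']  # no longer raises KeyError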

  
  From agent startup:

  neutron-openvswitch-agent[42936]: ERROR neutron Traceback (most recent call 
last):
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/bin/neutron-openvswitch-agent", line 10, in 
  neutron-openvswitch-agent[42936]: ERROR neutron sys.exit(main())
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/opt/stack/neutron/neutron/cmd/eventlet/plugins/ovs_neutron_agent.py", line 
20, in main
  neutron-openvswitch-agent[42936]: ERROR neutron agent_main.main()
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/main.py", 
line 47, in main
  neutron-openvswitch-agent[42936]: ERROR neutron mod.main()
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/main.py",
 line 35, in main
  neutron-openvswitch-agent[42936]: ERROR neutron 
'neutron.plugins.ml2.drivers.openvswitch.agent.'
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/os_ken/base/app_manager.py", line 375, 
in run_apps
  neutron-openvswitch-agent[42936]: ERROR neutron hub.joinall(services)  
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/os_ken/lib/hub.py", line 102, in joinall
  neutron-openvswitch-agent[42936]: ERROR neutron t.wait()
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/eventlet/greenthread.py", line 180, in 
wait
  neutron-openvswitch-agent[42936]: ERROR neutron return 
self._exit_event.wait()
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/eventlet/event.py", line 132, in wait
  neutron-openvswitch-agent[42936]: ERROR neutron current.throw(*self._exc)
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/eventlet/greenthread.py", line 219, in 
main
  neutron-openvswitch-agent[42936]: ERROR neutron result = function(*args, 
**kwargs)
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/os_ken/lib/hub.py", line 64, in _launch
  neutron-openvswitch-agent[42936]: ERROR neutron raise e
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/os_ken/lib/hub.py", line 59, in _launch
  neutron-openvswitch-agent[42936]: ERROR neutron return func(*args, 
**kwargs)
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_oskenapp.py",
 line 43, in agent_main_wrapper
  neutron-openvswitch-agent[42936]: ERROR neutron LOG.exception("Agent main 
thread died of an exception")
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  neutron-openvswitch-agent[42936]: ERROR neutron self.force_reraise() 
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  neutron-openvswitch-agent[42936]: ERROR neutron six.reraise(self.type_, 
self.value, self.tb)
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
  neutron-openvswitch-agent[42936]: ERROR neutron raise value
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_oskenapp.py",
 line 40, in agent_main_wrapper
  neutron-openvswitch-agent[42936]: ERROR neutron 
ovs_agent.main(bridge_classes)
  neutron-openvswitch-agent[42936]: ERROR neutron   File 
"/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py",
 line 2393, in main
  

[Yahoo-eng-team] [Bug 1751192] Re: nova-manage archive_deleted_rows date limit

2019-05-24 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/556751
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=e822360b6696c492bb583240483ee9593d7d24e1
Submitter: Zuul
Branch: master

commit e822360b6696c492bb583240483ee9593d7d24e1
Author: Jake Yip 
Date:   Tue Feb 20 16:14:10 2018 +1100

Add --before to nova-manage db archive_deleted_rows

Add a parameter to limit the archival of deleted rows by date. That is,
only rows related to instances deleted before the provided date will be
archived.

This option works together with --max_rows; if both are specified, both
will take effect.

Closes-Bug: #1751192
Change-Id: I408c22d8eada0518ec5d685213f250e8e3dae76e
Implements: blueprint nova-archive-before
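
A hedged usage sketch of the new option (flag names as in the change above;
the date value is just an example):

    nova-manage db archive_deleted_rows --before 2018-01-01 --max_rows 1000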


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1751192

Title:
  nova-manage archive_deleted_rows date limit

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===

  Currently we have a large number of rows in our nova databases, which
  will greatly benefit from `nova-manage archive_deleted_rows`
  (thanks!). What we would like to do is to archive all deleted records
  before a certain time (say >1 year ago). This will allow us to
  continue running reports on newly deleted instances, and allow `nova
  list --deleted` to work (up to a certain period).

  Reading the code, however, reveals that there is no ability to do
  that. Currently, it has a --max-rows, but there are certain
  shortcomings with this option

  1) related records are archived inconsistently. Due to foreign keys,
  it has to archive fk tables first. It will take up to `--max-rows`
  from the first table it encounters, working its way through all tables
  and eventually reaching the `instances` table last. What this means is
  that instances are always archived last. An instance might have all of
  its information in fk tables archived before the instance itself is.

  2) there is no ability to limit archiving to records older than a
  certain time.

  We are working on an in-house patch to achieve this. If this is of use
  to the community I'd be happy to work on this to be included upstream.

  Environment
  ===
  We are running Newton Nova

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1751192/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1830232] Re: [Functional tests] Keepalived fails to start when not existing interfaces are set in config file

2019-05-24 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/661042
Committed: 
https://git.openstack.org/cgit/openstack/neutron/commit/?id=959af761cb1197bbeaed4ba1f0c3e5ef4aba3ee1
Submitter: Zuul
Branch: master

commit 959af761cb1197bbeaed4ba1f0c3e5ef4aba3ee1
Author: Slawek Kaplonski 
Date:   Thu May 23 17:08:56 2019 +0200

[Functional tests] Test keepalived in namespaces

Functional tests for keepalived should spawn processes in namespaces
where the dummy interfaces used in the keepalived.conf file exist.
Otherwise keepalived 2.0.10 (the version currently used in RHEL 8)
fails to start and the tests fail.

On older versions of keepalived, like 1.3.9 used in Ubuntu 18.04,
keepalived logs a warning about non-existent interfaces but starts
fine, so the tests pass.

So this patch adds creation of a namespace for each test from the
neutron.tests.functional.agent.linux.test_keepalived module,
creates dummy interfaces with the names used in the keepalived config
file and runs the keepalived process in this namespace.

Change-Id: I54f45b8c52fc1ecce811b028f0f92e0d78d3157b
Closes-Bug: #1830232
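
A hedged sketch of the test setup the patch describes (helper names assumed
from neutron.agent.linux.ip_lib; the actual fixture wiring is elided): give
keepalived a namespace that really contains the interfaces named in its
config file.

    from neutron.agent.linux import ip_lib

    def build_keepalived_namespace(ns_name, device_names):
        ns_ip = ip_lib.IPWrapper().ensure_namespace(ns_name)
        for name in device_names:
            # dummy links standing in for eth0, eth1, ... from the config
            device = ns_ip.add_dummy(name)
            device.link.set_up()
        return ns_ip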


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1830232

Title:
  [Functional tests] Keepalived fails to start when not existing
  interfaces are set in config file

Status in neutron:
  Fix Released

Bug description:
  It looks like keepalived may not start properly when non-existent
  interfaces are given in the keepalived.conf file.

  I saw that when running functional tests from module 
neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase 
on RHEL 8 where keepalived 2.0.10 is used.
  I saw in logs something like:

  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: Registering Kernel netlink reflector
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: Registering Kernel netlink command channel
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: Opening file 
'/tmp/tmpo_he5agd/tmpnhyku1i8/router1/keepalived.conf'.
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 7) WARNING - interface eth0 for vrrp_instance 
VR_1 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 17) (VR_1) tracked interface eth0 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 20) WARNING - interface eth0 for ip address 
169.254.0.1/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 23) WARNING - interface eth1 for ip address 
192.168.1.0/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 24) WARNING - interface eth2 for ip address 
192.168.2.0/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 25) WARNING - interface eth2 for ip address 
192.168.3.0/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 26) WARNING - interface eth10 for ip address 
192.168.55.0/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 29) WARNING - interface eth1 for VROUTE nexthop 
doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 34) WARNING - interface eth4 for vrrp_instance 
VR_2 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 40) (VR_2) tracked interface eth4 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 43) WARNING - interface eth4 for ip address 
169.254.0.2/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 46) WARNING - interface eth2 for ip address 
192.168.2.0/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 47) WARNING - interface eth6 for ip address 
192.168.3.0/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: (Line 48) WARNING - interface eth10 for ip address 
192.168.55.0/24 doesn't exist
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: Non-existent interface specified in configuration
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 
Keepalived_vrrp[11267]: Stopped - used 0.000608 user time, 0.00 system time
  maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived[11266]: 
Keepalived_vrrp exited with permanent error CONFIG. Terminating
  maj 23 10:10:16 

[Yahoo-eng-team] [Bug 1609217] Re: DVR: dvr router ns should not exist in scheduled DHCP agent nodes

2019-05-24 Thread OpenStack Infra
** Changed in: neutron
   Status: Opinion => In Progress

** Changed in: neutron
 Assignee: (unassigned) => LIU Yulong (dragon889)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1609217

Title:
  DVR: dvr router ns should not exist in scheduled DHCP agent nodes

Status in neutron:
  In Progress

Bug description:
  ENV:
  stable/mitaka
  hosts:
  compute1 (nova-compute, l3-agent (dvr), metadata-agent)
  compute2 (nova-compute, l3-agent (dvr), metadata-agent)
  network1 (l3-agent (dvr_snat), metadata-agent, dhcp-agent)
  network2 (l3-agent (dvr_snat), metadata-agent, dhcp-agent)

  How to reproduce? (scenario 1)
  set: dhcp_agents_per_network = 2

  1. create a DVR router:
  neutron router-create --ha False --distributed True test1

  2. Create a network & subnet with dhcp enabled.
  neutron net-create test1
  neutron subnet-create --enable-dhcp test1 --name test1 192.168.190.0/24

  3. Attach the router and subnet
  neutron router-interface-add test1 subnet=test1

  Then the router test1 will exist in both network1 and network2. But in
  the DB routerl3agentbindings, there is only one record for DVR router
  to one l3 agent.

  http://paste.openstack.org/show/547695/

  And for another scenario (2):
  change the network2 node deployment to run only metadata-agent and dhcp-agent.
  The qdhcp namespace and the VM can still ping each other.
  So the qrouter namespace on the network node without a binding is not
  used, and should not exist.

  Code:
  The essential code issue may be that the DHCP port should not be
  considered in the DVR host query.
  https://github.com/openstack/neutron/blob/master/neutron/common/utils.py#L258
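
  A hedged sketch of the suggested direction (constants from neutron_lib;
  not the actual utils.py change): ignore DHCP-owned ports when deciding
  which hosts need a DVR router namespace.

    from neutron_lib import constants

    def port_needs_dvr_namespace(port):
        if port.get('device_owner') == constants.DEVICE_OWNER_DHCP:
            return False
        return bool(port.get('binding:host_id'))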

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1609217/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1830383] [NEW] SRIOV: MAC address in use error

2019-05-24 Thread Oleg Bondarev
Public bug reported:

When using a direct-physical port, the port inherits the physical device's
MAC address on binding.
When the VM is deleted later, the MAC address stays on the port.
If you then try to spawn a VM with another direct-physical port, you get
"Neutron error: MAC address 0c:c4:7a:de:ae:19 is already in use on network
None.: MacAddressInUseClient: Unable to complete operation for network
42915db3-4e46-4150-af9d-86d0c59d765f. The mac address 0c:c4:7a:de:ae:19 is
in use."

The proposal is to reset port's MAC address when unbinding.
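
A hedged illustration of the proposal (it mirrors how neutron derives MACs
from base_mac, but it is not the actual patch): hand the unbound port a
freshly generated address so the physical device's MAC is released.

    import random

    def generate_mac(base_mac='fa:16:3e:00:00:00'):
        prefix = base_mac.split(':')[:3]
        suffix = ['%02x' % random.randint(0, 255) for _ in range(3)]
        return ':'.join(prefix + suffix)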

** Affects: neutron
 Importance: Undecided
 Assignee: Oleg Bondarev (obondarev)
 Status: In Progress


** Tags: sriov-pci-pt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1830383

Title:
  SRIOV: MAC address in use error

Status in neutron:
  In Progress

Bug description:
  When using a direct-physical port, the port inherits the physical
  device's MAC address on binding.
  When the VM is deleted later, the MAC address stays on the port.
  If you then try to spawn a VM with another direct-physical port, you get
  "Neutron error: MAC address 0c:c4:7a:de:ae:19 is already in use on
  network None.: MacAddressInUseClient: Unable to complete operation for
  network 42915db3-4e46-4150-af9d-86d0c59d765f. The mac address
  0c:c4:7a:de:ae:19 is in use."

  The proposal is to reset port's MAC address when unbinding.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1830383/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1829828] Re: instance became error after a set-password failure

2019-05-24 Thread Brin Zhang
** Package changed: nova (Ubuntu) => ubuntu

** Package changed: ubuntu => nova

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1829828

Title:
  instance became error after a set-password failure

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Description
  ===
  Hi guys, I ran into a problem in our OpenStack Ocata/Rocky clusters:
  when I was trying to use the `set-password` subcommand of the nova CLI
  to reset the root password for my VM, it failed and my VM went into the
  ERROR state.

  I searched Launchpad for similar issues, but got nothing. I believe the
  problem may also exist in the latest OpenStack release.

  Steps to reproduce
  ==
  * Upload any image(without QGA inside), e.g: cirros
  * Update the image with property: hw_qemu_guest_agent=yes
$ glance image-update --property hw_qemu_guest_agent=yes 
  * Boot new instance (e.g: QGA) with image cirros and ensure instance is 
active/running.
  * Use cli `nova set-password ` to reset password for the 
instance.

  Expected result
  ===
  An error message like 'QGA not running' occurs.
  The instance becomes active/running again from task_state `updating_password`.

  Actual result
  =
  The CLI returns: Failed to set admin password on XX because error
  setting admin password (HTTP 409) (Request-ID: req-X)
  And the instance went into the ERROR state.
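
  A hedged sketch of the handling the reporter expects (not nova's current
  code; exception and state names are nova's usual ones): if the guest
  agent call fails, report the failure but put the instance back to ACTIVE
  instead of leaving it in ERROR.

    from nova import exception
    from nova.compute import vm_states

    def set_admin_password(self, context, instance, new_pass):
        try:
            self.driver.set_admin_password(instance, new_pass)
        except Exception:
            instance.task_state = None
            instance.vm_state = vm_states.ACTIVE
            instance.save()
            raise exception.InstancePasswordSetFailed(
                instance=instance.uuid, reason='guest agent did not respond')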

  Environment
  ===
  1. version: OpenStack Ocata/Rocky + centOS7
  2. hypervisor: Libvirt + KVM
  3. storage: Ceph
  4. networking Neutron with OpenVSwitch

  Logs & Configs
  ==

   Nova CLI error #
  [root@node159 ~]# nova set-password f355e4d0-574c-4792-bbbd-04ad03ce6066
  New password: 
  Again: 
  ERROR (Conflict): Failed to set admin password on 
f355e4d0-574c-4792-bbbd-04ad03ce6066 because error setting 
  admin password (HTTP 409) (Request-ID: 
req-34715791-f42a-4235-98d5-f69680440fc8)

  # Grep nova-compute errors by Instance UUID #
  23698  2019-05-21 14:53:50.355 7 INFO nova.compute.manager 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager build_and_run_instance
  23699  2019-05-21 14:53:50.521 7 INFO nova.compute.manager 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager _build_and_run_instance
  23700  2019-05-21 14:53:50.546 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Attempting claim: memory 2048 MB, disk 1 
GB, vcpus 2 CPU
  23701  2019-05-21 14:53:50.547 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Total memory: 65417 MB, used: 37568.00 MB
  23702  2019-05-21 14:53:50.548 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] memory limit: 52333.60 MB, free: 14765.60 
MB
  23703  2019-05-21 14:53:50.548 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Total disk: 3719 GB, used: 285.00 GB
  23704  2019-05-21 14:53:50.549 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] disk limit: 3719.00 GB, free: 3434.00 GB
  23705  2019-05-21 14:53:50.550 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Total vcpu: 16 VCPU, used: 41.00 VCPU
  23706  2019-05-21 14:53:50.550 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] vcpu limit not specified, defaulting to 
unlimited
  23707  2019-05-21 14:53:50.552 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Claim successful
  23708  2019-05-21 14:53:50.762 7 INFO nova.scheduler.client.report 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] 

[Yahoo-eng-team] [Bug 1829828] [NEW] instance became error after a set-password failure

2019-05-24 Thread Launchpad Bug Tracker
You have been subscribed to a public bug:

Description
===
Hi guys, I ran into a problem in our OpenStack Ocata/Rocky clusters:
when I was trying to use the `set-password` subcommand of the nova CLI to
reset the root password for my VM, it failed and my VM went into the ERROR
state.

I searched Launchpad for similar issues, but got nothing. I believe the
problem may also exist in the latest OpenStack release.

Steps to reproduce
==
* Upload any image(without QGA inside), e.g: cirros
* Update the image with property: hw_qemu_guest_agent=yes
  $ glance image-update --property hw_qemu_guest_agent=yes 
* Boot new instance (e.g: QGA) with image cirros and ensure instance is 
active/running.
* Use cli `nova set-password ` to reset password for the instance.

Expected result
===
An error message like 'QGA not running' occurs.
The instance becomes active/running again from task_state `updating_password`.

Actual result
=
The CLI returns: Failed to set admin password on XX because error setting
admin password (HTTP 409) (Request-ID: req-X)
And the instance went into the ERROR state.

Environment
===
1. version: OpenStack Ocata/Rocky + centOS7
2. hypervisor: Libvirt + KVM
3. storage: Ceph
4. networking Neutron with OpenVSwitch

Logs & Configs
==

 Nova CLI error #
[root@node159 ~]# nova set-password f355e4d0-574c-4792-bbbd-04ad03ce6066
New password: 
Again: 
ERROR (Conflict): Failed to set admin password on 
f355e4d0-574c-4792-bbbd-04ad03ce6066 because error setting 
admin password (HTTP 409) (Request-ID: req-34715791-f42a-4235-98d5-f69680440fc8)

# Grep nova-compute errors by Instance UUID #
23698  2019-05-21 14:53:50.355 7 INFO nova.compute.manager 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager build_and_run_instance
23699  2019-05-21 14:53:50.521 7 INFO nova.compute.manager 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager _build_and_run_instance
23700  2019-05-21 14:53:50.546 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Attempting claim: memory 2048 MB, disk 1 
GB, vcpus 2 CPU
23701  2019-05-21 14:53:50.547 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Total memory: 65417 MB, used: 37568.00 MB
23702  2019-05-21 14:53:50.548 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] memory limit: 52333.60 MB, free: 14765.60 
MB
23703  2019-05-21 14:53:50.548 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Total disk: 3719 GB, used: 285.00 GB
23704  2019-05-21 14:53:50.549 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] disk limit: 3719.00 GB, free: 3434.00 GB
23705  2019-05-21 14:53:50.550 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Total vcpu: 16 VCPU, used: 41.00 VCPU
23706  2019-05-21 14:53:50.550 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] vcpu limit not specified, defaulting to 
unlimited
23707  2019-05-21 14:53:50.552 7 INFO nova.compute.claims 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Claim successful
23708  2019-05-21 14:53:50.762 7 INFO nova.scheduler.client.report 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Submitted allocation for instance
23709  2019-05-21 14:53:50.913 7 INFO nova.compute.manager 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 
380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager _build_resources
23713  2019-05-21 14:53:51.430 7 INFO nova.virt.libvirt.driver 
[req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 

[Yahoo-eng-team] [Bug 1830349] [NEW] Router external gateway wrongly marked as DOWN

2019-05-24 Thread Giuseppe Petralia
Public bug reported:

neutron version: 2:8.4.0-0ubuntu7.3~cloud0
openstack version: cloud:trusty-mitaka

In bootstack a customer had a non-HA router.
After updating the router to HA mode,
its external gateway is wrongly marked as DOWN,
but we can see traffic going through the interface:

openstack router show  7d7a37e0-33f3-474f-adbf-ab27033c6bc8
+-------------------------+--------------------------------------------------------+
| Field                   | Value                                                  |
+-------------------------+--------------------------------------------------------+
| admin_state_up          | UP                                                     |
| availability_zone_hints |                                                        |
| availability_zones      | nova                                                   |
| created_at              | None                                                   |
| description             |                                                        |
| distributed             | False                                                  |
| external_gateway_info   | {"enable_snat": true, "external_fixed_ips":            |
|                         | [{"subnet_id": "dbfee73f-7094-4596-a79c-e05c2ce7d738", |
|                         | "ip_address": "185.170.7.198"}], "network_id":         |
|                         | "43c6a5c6-d44c-43d9-a0e9-1c0311b41626"}                |
| flavor_id               | None                                                   |