[Yahoo-eng-team] [Bug 1787908] Re: ARP_spoofing on linuxbridge ml2
[Expired for neutron because there has been no activity for 60 days.]

** Changed in: neutron
   Status: Incomplete => Expired

https://bugs.launchpad.net/bugs/1787908

Title: ARP_spoofing on linuxbridge ml2
Status in neutron: Expired

Bug description:

Release: Pike
Environment: Ubuntu 16.04
Neutron ML2 plugin: Linux Bridge, VLAN

Problem: cannot turn off ARP spoofing protection on the linuxbridge ML2 driver.

Background: According to the Pike release notes, the linuxbridge agent parameter "prevent_arp_spoofing" has been deprecated. Instead, to disable ARP spoofing protection (ebtables) on a port, the port's "port_security_enabled" attribute should be set to false and no security group may be attached to that port. Further, the Pike documentation says that the network (and port) "port_security_enabled" attribute is true by default, and that a port's value, if not explicitly set, defaults to the network's value.

Problem: Given the linuxbridge_agent config below, with the network and port attribute "port_security_enabled" set to false and no security group attached, ARP spoofing protection is still being applied to the port. Further, we are finding that the default "port_security_enabled" value for networks and ports is actually false, contrary to the documentation.

Diagnostics: We have traced the port's "port_security_enabled" value through the plugin and found that at some point, which we have not yet identified, the value is changed from false to true. In the module arp_protect.py, in setup_arp_spoofing_protection(), we printed out the port's value just prior to the if statement and found that it had already been set to true, so the code drops through the if statement and programs ebtables rules on the port. We have tried to trace the value back up the stack but have not found where it is being reset. Any thoughts? Is this a bug, or are we missing a configuration somewhere? As a workaround we will set the value to false in our environment.

++ linuxbridge_agent.ini

[linux_bridge]
physical_interface_mappings = physnet1:eth2,physnet2:eth1,physnet3:eth3

[vxlan]
enable_vxlan = false

[agent]
prevent_arp_spoofing = false

[ml2_type_vlan]
network_vlan_ranges = physnet1:55:55,physnet2:50:50,physnet3:210:215

[ml2]
type_drivers = vlan,local
mechanism_drivers = linuxbridge

[securitygroup]
enable_security_group = false
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
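For reference, the API-side change that the release notes prescribe in place of prevent_arp_spoofing can be exercised with a short script. This is a minimal sketch using python-neutronclient; the auth URL, credentials, and port ID are placeholders, not values from the report:

# Sketch: disable port security on an individual port, which is the
# state the linuxbridge agent checks before programming ebtables ARP
# protection. Auth details and the port ID below are placeholders.
from keystoneauth1 import identity, session
from neutronclient.v2_0 import client

auth = identity.Password(auth_url='http://controller:5000/v3',
                         username='admin', password='secret',
                         project_name='admin',
                         user_domain_name='Default',
                         project_domain_name='Default')
neutron = client.Client(session=session.Session(auth=auth))

# Neutron rejects port_security_enabled=False while security groups are
# still attached, so clear them in the same update.
neutron.update_port('PORT_ID', {
    'port': {
        'security_groups': [],
        'port_security_enabled': False,
    },
})

If the agent still programs ebtables rules after an update like this, the port details pushed to the agent are worth inspecting, which is exactly the reporter's diagnostic above.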
[Yahoo-eng-team] [Bug 1783654] Re: DVR process flow not installed on physical bridge for shared tenant network
Reviewed: https://review.openstack.org/609440
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=020d745f5b859f93f0c550be221c350bc14e8d23
Submitter: Zuul
Branch: stable/pike

commit 020d745f5b859f93f0c550be221c350bc14e8d23
Author: Swaminathan Vasudevan
Date: Thu Aug 23 05:54:17 2018 +0000

    Revert "DVR: Inter Tenant Traffic between networks not possible with shared net"

    This reverts commit d019790fe436b72cb05b8d0ff1f3a62ebd9e9bee.

    Closes-Bug: #1783654
    Change-Id: I4fd2610e185fb60cae62693cd4032ab700209b5f
    (cherry picked from commit fd72643a61f726145288b2a468b044e84d02c88e)
    (cherry picked from commit b70afb50138f9588a5165e1ca986f83856d5399d)

** Changed in: cloud-archive/pike
   Status: Invalid => Fix Committed

https://bugs.launchpad.net/bugs/1783654

Title: DVR process flow not installed on physical bridge for shared tenant network

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive pike series: Fix Committed
Status in Ubuntu Cloud Archive queens series: Fix Committed
Status in Ubuntu Cloud Archive rocky series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Bionic: Fix Committed
Status in neutron source package in Cosmic: Fix Released

Bug description:

Seems like collateral from https://bugs.launchpad.net/neutron/+bug/1751396

In DVR, the distributed gateway port's IP and MAC are shared in the qrouter across all hosts. The dvr_process flow on the physical bridge (which replaces the shared distributed router MAC address with the unique per-host MAC when it is the source) is missing, and so is the drop rule which instructs the bridge to drop all traffic destined for the shared distributed MAC. Because of this, we are seeing the router MAC on the network infrastructure, causing it to flap on br-int on every compute host:

root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
   11     4  fa:16:3e:42:a2:ec    1
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
   11     4  fa:16:3e:42:a2:ec    2
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
    1     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
   11     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
   11     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
    1     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
    1     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
    1     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
    1     4  fa:16:3e:42:a2:ec    1
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
   11     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
   11     4  fa:16:3e:42:a2:ec    0
root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
   11     4  fa:16:3e:42:a2:ec    0

Here port 1 is phy-br-vlan, connecting to the physical bridge, and port 11 is the correct local qr-interface. Because these DVR flows are missing on br-vlan, packets with the distributed MAC as source ingress into the host, and br-int learns the MAC from upstream.
The symptom is that when pinging a VM's floating IP we see occasional packet loss (10-30%), and sometimes the responses are sent upstream by br-int instead of via the qrouter, so the ICMP replies arrive with the fixed IP of the replier (since no NATing took place) and on the tenant network rather than the external network.

When I force net_shared_only to False here, the problem goes away:
https://github.com/openstack/neutron/blob/stable/pike/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py#L436

It should be noted we *ONLY* need to do this on our dvr_snat host. The dvr_process flows are missing on every compute host, but if we shut down the qrouter on the snat host, FIP functionality works and the DVR MAC stops flapping on the others. Likewise, applying the fix only to the snat host works. Perhaps there is something unique about the SNAT node.

Ubuntu SRU details:
---
[Impact] See above.
[Test Case] Deploy OpenStack with DVR enabled and then follow the steps above.
[Regression Potential] The patches that are backported have already landed upstream in the corresponding stable branches, helping to minimize any regression potential.
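A small polling helper makes the flapping shown above easier to capture over time. This is a hypothetical diagnostic script (the bridge name and MAC are taken from the report), not part of any fix:

# Sketch: poll the OVS FDB and log whenever the learned port for the
# distributed router MAC changes, i.e. each flap.
import subprocess
import time

BRIDGE = 'br-int'
MAC = 'fa:16:3e:42:a2:ec'

last_port = None
while True:
    out = subprocess.check_output(
        ['ovs-appctl', 'fdb/show', BRIDGE]).decode()
    for line in out.splitlines():
        fields = line.split()
        # fdb/show columns: port, VLAN, MAC, age
        if len(fields) >= 3 and fields[2] == MAC:
            port = fields[0]
            if port != last_port:
                print(time.strftime('%H:%M:%S'),
                      'MAC %s moved to port %s' % (MAC, port))
                last_port = port
    time.sleep(1)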
[Yahoo-eng-team] [Bug 1799340] [NEW] The default_ephemeral_device in instances table is NULL when deploying a VM
Public bug reported:

Description
===========
I deploy a VM with this flavor:

+----------+-----------+------+-----------+------+-------+-------------+-----------+
| Name     | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+----------+-----------+------+-----------+------+-------+-------------+-----------+
| smy_test | 512       | 10   | 5         | 1    | 2     | 1.0         | True      |
+----------+-----------+------+-----------+------+-------+-------------+-----------+

When the VM is deployed successfully and running normally, default_ephemeral_device is not updated in the instances table in the database, while default_swap_device and root_device_name clearly are updated:

+------------------+--------------------------+---------------------+
| root_device_name | default_ephemeral_device | default_swap_device |
+------------------+--------------------------+---------------------+
| /dev/vda         | NULL                     | /dev/vdc            |
+------------------+--------------------------+---------------------+

Steps to reproduce
==================
1. Create a flavor with an ephemeral disk.
2. Deploy a VM with this flavor.

Expected result
===============
Checking the instances table, default_ephemeral_device has a value.

Actual result
=============
default_ephemeral_device is NULL.

Environment
===========
[root@nail1 ~]# rpm -qa | grep nova
openstack-nova-api-18.0.2-1.el7.noarch
openstack-nova-common-18.0.2-1.el7.noarch
python2-novaclient-11.0.0-1.el7.noarch
openstack-nova-placement-api-18.0.2-1.el7.noarch
openstack-nova-scheduler-18.0.2-1.el7.noarch
openstack-nova-conductor-18.0.2-1.el7.noarch
openstack-nova-novncproxy-18.0.2-1.el7.noarch
python-nova-18.0.2-1.el7.noarch
openstack-nova-compute-18.0.2-1.el7.noarch
openstack-nova-console-18.0.2-1.el7.noarch

hypervisor: Libvirt + KVM

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1799340
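To check the symptom directly, the three device columns can be read from the Nova database. A sketch using SQLAlchemy; the connection URL and instance UUID are placeholders:

# Sketch: inspect the device columns for one instance in the Nova DB.
# The connection URL and UUID are placeholders.
from sqlalchemy import create_engine, text

engine = create_engine('mysql+pymysql://nova:secret@controller/nova')
with engine.connect() as conn:
    row = conn.execute(
        text('SELECT root_device_name, default_ephemeral_device, '
             'default_swap_device FROM instances WHERE uuid = :uuid'),
        {'uuid': 'INSTANCE_UUID'}).fetchone()
    # Bug symptom: ('/dev/vda', None, '/dev/vdc'); the ephemeral column
    # should have been populated with something like '/dev/vdb'.
    print(row)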
[Yahoo-eng-team] [Bug 1784353] Re: Rescheduled boot from volume instances fail due to the premature removal of their attachments
Reviewed: https://review.openstack.org/587071
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=41452a5c6adb8cae54eef24803f4adc468131b34
Submitter: Zuul
Branch: master

commit 41452a5c6adb8cae54eef24803f4adc468131b34
Author: Lee Yarwood
Date: Mon Jul 30 13:41:35 2018 +0100

    conductor: Recreate volume attachments during a reschedule

    When an instance with attached volumes fails to spawn, cleanup code
    within the compute manager (_shutdown_instance called from
    _build_resources) will delete the volume attachments referenced by
    the bdms in Cinder. As a result we should check and, if necessary,
    recreate these volume attachments when rescheduling an instance.

    Note that there are a few different ways to fix this bug by making
    changes to the compute manager code, either by not deleting the
    volume attachment on failure before rescheduling [1] or by
    performing the get/create check during each build after the
    reschedule [2].

    The problem with *not* cleaning up the attachments is that if we
    don't reschedule, then we've left orphaned "reserved" volumes in
    Cinder (or we have to add special logic to tell compute when to
    clean up attachments). The problem with checking the existence of
    the attachment on every new host we build on is that we'd be
    needlessly checking that for initial creates even if we don't ever
    need to reschedule, unless again we have special logic against that
    (like checking to see if we've rescheduled at all). Also, either
    option involves changes to the compute manager, which means that
    older computes might not have the fix. So ultimately it seems that
    the best way to handle this is:

    1. Only deal with this on reschedules.
    2. Let the cell conductor orchestrate it, since it's already
       dealing with the reschedule. Then the compute logic doesn't need
       to change.

    [1] https://review.openstack.org/#/c/587071/3/nova/compute/manager.py@1631
    [2] https://review.openstack.org/#/c/587071/4/nova/compute/manager.py@1667

    Change-Id: I739c06bd02336bf720cddacb21f48e7857378487
    Closes-bug: #1784353

** Changed in: nova
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1784353

Title: Rescheduled boot from volume instances fail due to the premature removal of their attachments

Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) queens series: In Progress
Status in OpenStack Compute (nova) rocky series: In Progress

Bug description:

Description
===========
This is caused by the cleanup code within the compute layer (_shutdown_instance) removing all volume attachments associated with an instance, with no attempt being made to recreate them ahead of the instance being rescheduled.

Steps to reproduce
==================
- Attempt to boot an instance with volumes attached.
- Ensure spawn() fails, for example by stopping the l2 network agent services on the compute host.

Expected result
===============
The instance is rescheduled to another compute host and boots correctly.

Actual result
=============
The instance fails to boot on all hosts that it is rescheduled to, due to a missing volume attachment.

Environment
===========
1. Exact version of OpenStack you are running:
   bf497cc47497d3a5603bf60de652054ac5ae1993
2. Which hypervisor did you use?
   Libvirt + KVM; however, this shouldn't matter.
3. Which storage type did you use?
   N/A
4. Which networking type did you use?
   N/A

Logs & Configs
==============
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1] Traceback (most recent call last):
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1579, in _prep_block_device
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     wait_func=self._await_block_device_map_created)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 837, in attach_block_devices
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     _log_and_attach(device)
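The approach the commit message describes (during a reschedule, the conductor verifies each volume BDM's attachment and recreates any that cleanup deleted) can be sketched roughly as below. The cinderclient calls and BDM field names here are assumptions for illustration, not nova's actual code:

# Illustrative sketch of the reschedule-time check described above; the
# cinderclient calls and bdm fields are assumptions, not nova's code.
from cinderclient import exceptions as cinder_exc


def ensure_volume_attachments(cinder, instance_uuid, bdms):
    for bdm in bdms:
        if not bdm.is_volume:
            continue
        try:
            cinder.attachments.show(bdm.attachment_id)
        except cinder_exc.NotFound:
            # _shutdown_instance deleted the attachment on the failed
            # host; recreate it so the next build sees a valid
            # "reserved" attachment.
            attachment = cinder.attachments.create(
                bdm.volume_id, None, instance_uuid)
            bdm.attachment_id = attachment.id
            bdm.save()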
[Yahoo-eng-team] [Bug 1799338] [NEW] cloud-init won't reformat NTFS ephemeral drive on SLES 15
Public bug reported:

Commit aa4eeb808 (Paul Meyer 2018-05-23 15:45:39 -0400 710) detects that the platform doesn't support NTFS volumes by looking for the appropriate error message in the exception it catches. The exact syntax of that error message differs between RHEL (the target distro for Paul's merge) and SUSE:

RHEL: mount: unknown filesystem type 'ntfs'
SUSE: mount: /dev/sdb1: unknown filesystem type 'ntfs'

As a result, cloud-init on SUSE VMs in Azure doesn't properly detect that the distro doesn't support NTFS and thus will not reformat the ephemeral volume on Azure.

** Affects: cloud-init
   Importance: Undecided
   Status: New

** Merge proposal linked:
   https://code.launchpad.net/~jasonzio/cloud-init/+git/cloud-init/+merge/357669

https://bugs.launchpad.net/bugs/1799338
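Since both distros end the error with the same suffix, a distro-agnostic detection only needs a substring match. A minimal sketch of that idea (not necessarily the linked merge proposal's exact code):

# Both mount error variants share a suffix, so match that instead of
# the full RHEL-specific message.
def mount_lacks_ntfs(stderr_output):
    return "unknown filesystem type 'ntfs'" in stderr_output


assert mount_lacks_ntfs("mount: unknown filesystem type 'ntfs'")             # RHEL
assert mount_lacks_ntfs("mount: /dev/sdb1: unknown filesystem type 'ntfs'")  # SUSE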
[Yahoo-eng-team] [Bug 1799337] [NEW] Building cloud-init on SLES 15 complains about missing cloud-id binary
Public bug reported:

Change 6ee8a2c55 (Chad Smith 2018-10-09 22:19:20 +0000 286) added 'cloud-id' to the list of console scripts in setup.py. The packaging spec for SUSE wasn't changed to package that script.

** Affects: cloud-init
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1799337
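For context, after the change the console-script list in setup.py looks roughly like this (abridged sketch; the entry-point paths are assumptions based on the project layout):

# setup.py (abridged sketch): 'cloud-id' ships as a console script, so
# distro packaging specs must list the new binary as well.
from setuptools import setup

setup(
    name='cloud-init',
    # ...
    entry_points={
        'console_scripts': [
            'cloud-init = cloudinit.cmd.main:main',
            'cloud-id = cloudinit.cmd.cloud_id:main',  # added by 6ee8a2c55
        ],
    },
)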
[Yahoo-eng-team] [Bug 1799332] [NEW] Apache WSGI config shipping with Keystone is incompatible with Horizon
Public bug reported:

In keystone/httpd/wsgi-keystone.conf, the following configuration is present:

Alias /identity /usr/local/bin/keystone-wsgi-public
<Location /identity>
    SetHandler wsgi-script
    Options +ExecCGI
    WSGIProcessGroup keystone-public
    WSGIApplicationGroup %{GLOBAL}
    WSGIPassAuthorization On
</Location>

However, it is both harmful and unnecessary. The operative WSGI configuration for Keystone comes from the <VirtualHost>...</VirtualHost> section. In fact, the commit which added the /identity endpoint described it as a documentation example:

"Apache Httpd can be configured to accept keystone requests on all sorts of interfaces. The sample config file is updated to show how to configure Apache Httpd to also send requests on /identity and /identity_admin to keystone."

Leaving it in place, however, causes conflicts when Horizon is concurrently installed:

AH01630: client denied by server configuration: /usr/bin/keystone-wsgi-public

...in responses to Horizon URLs referencing '/identity'. Therefore, I believe keeping this configuration snippet in the shipped WSGI configuration (as opposed to actual documentation) is a defect.

** Affects: keystone
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1799332
[Yahoo-eng-team] [Bug 1799153] Re: Inappropriate behaviour of limits when passing --region None in create and list.
https://review.openstack.org/#/c/612283

** Project changed: keystone => python-openstackclient

https://bugs.launchpad.net/bugs/1799153

Title: Inappropriate behaviour of limits when passing --region None in create and list

Status in python-openstackclient: New

Bug description:

When creating a registered limit and passing --region None to the registered limit create CLI, it gives the error message "More than one resource exist for region", which is clearly the wrong message, as regions with the same name can be neither created nor exist. The correct behaviour should be:

1. If --region is None, it should create the registered limit.
2. If an invalid region 'xyz' is passed while creating, it should report "No region exist with name xyz".

The same holds for registered limit list:

1. If --region is None, it should list all limits, ignoring the region.
2. If an invalid region 'xyz' is passed while listing, it should report "No region exist with name xyz".

The same behaviours apply to limit create and list.
[Yahoo-eng-team] [Bug 1794809] Re: Gateway ports are down after reboot of control plane nodes
Reviewed: https://review.openstack.org/606085
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f787f12aa3441ecffef55f261c4d87dbb12ca6cf
Submitter: Zuul
Branch: master

commit f787f12aa3441ecffef55f261c4d87dbb12ca6cf
Author: Slawek Kaplonski
Date: Fri Sep 28 13:07:28 2018 +0200

    Make port binding attempt after agent is revived

    In some cases it may happen that a port ends up in "binding_failed"
    state because the L2 agent running on the destination host was down,
    even though this is a "temporary" issue. That is the case, for
    example, when using L3 HA and the master and backup network nodes
    are rebooted: the L3 agent might start running before the L2 agent
    on the host, and if it's the new master node, router ports will be
    left in "binding_failed" state.

    When the agent sends a heartbeat and is getting back to live, the
    ML2 plugin will now try to bind all ports with "binding_failed"
    from this host.

    Change-Id: I3bedb7c22312884cc28aa78aa0f8fbe418f97090
    Closes-Bug: #1794809

** Changed in: neutron
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1794809

Title: Gateway ports are down after reboot of control plane nodes

Status in neutron: Fix Released

Bug description:

Sometimes, when control plane nodes go down and come back up, a failover of the active L3 HA router can happen. In that case, if the L3 agent starts running before the openvswitch agent on the host, the gateway port may be left in "binding_failed" state on the new MASTER agent. That causes loss of connectivity to floating IPs on this router.

I tested this on Queens, but it seems that nothing has changed in this area since Queens.

One possible solution might be to trigger another bind attempt for all ports which are in "binding_failed" state on a host when the L2 agent on this host is revived. I will investigate whether that would work.
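The shape of that fix (re-attempt binding for failed ports when their host's agent revives) can be sketched like this. The filter keys and update call are illustrative assumptions, not the merged ML2 code:

# Illustrative sketch only: when a dead agent's heartbeat revives it,
# retry binding for ports that failed to bind on that host. The filter
# keys and update body are assumptions, not the actual neutron code.
def retry_failed_bindings(plugin, context, host):
    failed_ports = plugin.get_ports(context, filters={
        'binding:host_id': [host],
        'binding:vif_type': ['binding_failed'],
    })
    for port in failed_ports:
        # Re-setting the host id prompts ML2 to attempt a fresh binding.
        plugin.update_port(context, port['id'],
                           {'port': {'binding:host_id': host}})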
[Yahoo-eng-team] [Bug 1799328] [NEW] Should not store segmenthostmapping table when segment service plugin disabled
Public bug reported:

Version
=======
OpenStack neutron Ocata

Issue Description
=================
Currently, the default behavior of Neutron is to store segment information at the compute-node level, so the port binding process can know exactly which network plane it can reach on a specific compute node. But some SDN controllers, which are integrated as mechanism drivers of the ml2 core plugin, don't use the segmenthostmapping info because they manage their own mapping; storing it anyway may raise other issues, such as performance problems in a large-scale deployment.

Proposal
========
If the environment doesn't enable the segments service plugin, the neutron server should not inject any records into the segmenthostmapping table. Only when the segments service plugin is enabled, and we want to create routed networks, should the necessary info be stored in the DB table.

** Affects: neutron
   Importance: Undecided
   Status: New

** Tags added: api ocata-backport-potential

https://bugs.launchpad.net/bugs/1799328
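The proposal amounts to guarding the mapping write on the presence of the segments service plugin. A sketch, where _store_mapping is a hypothetical stand-in for the existing DB write and the plugin lookup uses neutron-lib's directory API:

# Sketch of the proposal; _store_mapping is a hypothetical stand-in for
# the existing DB write.
from neutron_lib.plugins import directory


def update_segment_host_mapping(context, host, segments):
    if directory.get_plugin('segments') is None:
        # Segments service plugin disabled: nothing consumes the
        # mapping, so skip the per-host DB writes entirely.
        return
    _store_mapping(context, host, segments)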
[Yahoo-eng-team] [Bug 1796887] Re: Validation of tokens degraded after upgrade to Rocky
Reviewed: https://review.openstack.org/608963
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=d465a58f02f134086d6322c5b858c056a3aea025
Submitter: Zuul
Branch: master

commit d465a58f02f134086d6322c5b858c056a3aea025
Author: Jose Castro Leon
Date: Tue Oct 9 15:11:48 2018 +0200

    Add caching on trust role validation to improve performance

    In the token model, the trust roles are not cached. This behavior
    impacts services that are using trusts heavily, like heat or magnum.
    It introduces new cache data to improve the performance of token
    validation requests on trusts.

    Change-Id: I974907b427c34fd5db3228b6139d93bbcdc38df5
    Closes-Bug: #1796887

** Changed in: keystone
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1796887

Title: Validation of tokens degraded after upgrade to Rocky

Status in OpenStack Identity (keystone): Fix Released

Bug description:

Recently we upgraded Keystone to the Rocky release and saw a quite noticeable increase in the response time for validation of certain types of tokens, specifically tokens that are created from trusts.

In the new token model (keystone/models/token_model.py), which is evaluated several times during token validation, the call to retrieve the roles from the trust fetches the information directly from the DB with no caching whatsoever. For other operations of the token model, this information is only requested once and then cached for subsequent operations.

Since we are using heat and magnum, which make heavy use of trusts, we were impacted by this change in validation response time.
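The idea of the fix is to memoize the per-trust role lookup so repeated evaluations during a single validation hit a cache. Keystone's real fix uses an oslo.cache memoization decorator; this sketch uses functools for brevity, and the DB helper name is hypothetical:

# Minimal illustration of the caching idea (keystone's committed fix
# uses oslo.cache); _load_trust_roles_from_db is hypothetical.
import functools


@functools.lru_cache(maxsize=4096)
def get_trust_roles(trust_id):
    # Previously this DB lookup ran on every access to the token's
    # roles during validation.
    return _load_trust_roles_from_db(trust_id)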
[Yahoo-eng-team] [Bug 1799113] Re: queens/pike compute rpcapi version mismatch
** Changed in: nova
   Importance: Undecided => High

** Also affects: nova/queens
   Importance: Undecided
   Status: New

** Also affects: nova/rocky
   Importance: Undecided
   Status: New

** Changed in: nova/queens
   Importance: Undecided => High

** Changed in: nova/rocky
   Importance: Undecided => High

https://bugs.launchpad.net/bugs/1799113

Title: queens/pike compute rpcapi version mismatch

Status in OpenStack Compute (nova): In Progress
Status in OpenStack Compute (nova) queens series: New
Status in OpenStack Compute (nova) rocky series: New

Bug description:

Doing a live upgrade from pike to queens, we noticed that resizes weren't working. The queens source says the pike version of the compute rpcapi is 4.18:
https://github.com/openstack/nova/blob/eae37a27caa5ca8b0ca50187928bde81f28a24e1/nova/compute/rpcapi.py#L361

Looking at latest stable/pike, the latest version there is 4.17:
https://github.com/openstack/nova/blob/6ef30d5078595108c1c0f2b5c258ae6ef2db1eeb/nova/compute/rpcapi.py#L330
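The two source links point at the release-name alias table in nova's compute RPC API. Abridged, it looks like this (surrounding entries elided); during a rolling upgrade, setting [upgrade_levels]/compute = pike pins RPC messages to the aliased version, which is why the alias must match what stable/pike actually ships:

# nova/compute/rpcapi.py (abridged sketch): the alias for 'pike' must
# be the newest RPC version shipped in stable/pike (4.17, not 4.18).
class ComputeAPI(object):
    VERSION_ALIASES = {
        # ...
        'pike': '4.17',
        # ...
    }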
[Yahoo-eng-team] [Bug 1796976] Re: neutron.conf needs lock_path set for router to operate
Reviewed: https://review.openstack.org/612196
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f4d438019e3bd2f9b6c64badb9533168e583d8af
Submitter: Zuul
Branch: master

commit f4d438019e3bd2f9b6c64badb9533168e583d8af
Author: SapanaJadhav
Date: Sun Oct 21 21:46:32 2018 +0530

    neutron.conf needs lock_path set for router to operate

    This change adds the required configuration in neutron.conf to set
    the lock_path parameter, which was missing in
    compute-install-ubuntu.rst.

    Change-Id: If090bdf060dfe21d11b1a5dfd010dc8167d9e45e
    Closes-Bug: #1796976

** Changed in: neutron
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1796976

Title: neutron.conf needs lock_path set for router to operate

Status in neutron: Fix Released

Bug description:

This bug tracker is for errors with the documentation; use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes:

- [x] This doc is inaccurate in this way: using a self-service network, the router fails to operate if lock_path is not set.
- [ ] This is a doc addition request.
- [ ] I have a fix to the document that I can paste below, including example input and output.

Detail:
- Rocky clean install following the self-service network model.
- While creating the sample networks and router, the following issues arise when running through the verifications:
  1) ip netns shows the qdhcp namespaces but no qrouter.
  2) openstack port list --router router shows the interfaces in DOWN state.

After researching the log files, I see in /var/log/neutron/neutron-l3-agent.log that the required parameter lock_path is missing. I edited /etc/neutron/neutron.conf and in the [oslo_concurrency] section set lock_path = /var/lib/neutron/tmp.

After reinitializing:
- /var/log/neutron/neutron-l3-agent.log is clean,
- ip netns shows two qdhcp namespaces and one qrouter, as expected,
- openstack port list --router router shows the ports as ACTIVE.

Therefore my guess is that the neutron install guide should be updated to reflect this needed parameter. Thanks a lot in advance.

---
Release: on 2018-10-08 10:46
SHA: 6084f10333d7662a4f98994db49fd52bf9bf68f2
Source: https://git.openstack.org/cgit/openstack/openstack-manuals/tree/doc/install-guide/source/launch-instance-networks-selfservice.rst
URL: https://docs.openstack.org/install-guide/launch-instance-networks-selfservice.html
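The stanza the reporter added, and which the committed doc change introduces, is exactly:

# /etc/neutron/neutron.conf
[oslo_concurrency]
lock_path = /var/lib/neutron/tmp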
[Yahoo-eng-team] [Bug 1796854] Re: Neutron doesn't respect advsvc role while creating port
Reviewed: https://review.openstack.org/609633
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=00147a7d700e6d0142161152137bbab0c39ce4c0
Submitter: Zuul
Branch: master

commit 00147a7d700e6d0142161152137bbab0c39ce4c0
Author: Maciej Józefczyk
Date: Thu Oct 11 08:57:29 2018 +0000

    Allow advsvc role to create port in foreign tenant

    Change [1] introduced support for the advsvc role. This added the
    possibility for a user with the advsvc role to make CRUD operations
    on ports, subnets and networks in foreign tenants. Due to the check
    in _validate_privileges() it was not working. This patch fixes that.

    Closes-Bug: #1796854
    [1] https://review.openstack.org/#/c/101281

    Change-Id: I6a3f91337bf8dd32012a75916e3409e30f46b50d

** Changed in: neutron
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1796854

Title: Neutron doesn't respect advsvc role while creating port

Status in neutron: Fix Released

Bug description:

Neutron doesn't allow a user with the 'advsvc' role to add a port in another user's tenant network. The change that introduced the role, https://review.openstack.org/#/c/101281/10, should allow that, but in fact there is no handling of the advsvc role in the neutron-lib validation:
https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/attributes.py#L28

Error:
Specifying 'project_id' or 'tenant_id' other than the authenticated project in request requires admin privileges

Version: Devstack master.

How to reproduce:

1. Set up devstack master, add a new project and a user in this project with the advsvc role:

source devstack/openrc admin demo
openstack project create advsvc-project
openstack user create --project advsvc-project --password test advsvc-project-user
openstack role create advsvc
openstack role add --user advsvc-project-user --project advsvc-project advsvc
openstack role add --user advsvc-project-user --project advsvc-project member

2. Create a network in another project:

openstack project create test-project
openstack user create --project test-project --password test test-project-user
openstack role add --user test-project-user --project test-project member
neutron net-create private-net-test-user --provider:network_type=vxlan --provider:segmentation_id=1234 --project-id [[ test-project-id ]]
neutron subnet-create private-net-test-user --name private-subnet-test-user --allocation-pool start=10.13.12.100,end=10.13.12.130 10.13.12.0/24 --dns-nameserver 8.8.8.8 --project-id [[ test-project-id ]]

3. Create a port in the test-project tenant as the user with the advsvc role:

stack@mjozefcz-devstack:~$ neutron port-create --tenant-id 865073224f7b4e9d9fdd4a446e3a4af4 private-net-test-user
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
Specifying 'project_id' or 'tenant_id' other than the authenticated project in request requires admin privileges
Neutron server returns request_ids: ['req-e841edb1-2cf2-47b6-a493-11a56114a323']
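The failing check lives in neutron-lib's attribute validation. In shape, the corrected logic is that a foreign project_id is acceptable for admin or advsvc contexts; a schematic sketch (simplified, not the literal committed code, which raises an HTTP 400 rather than ValueError):

# Schematic sketch of the corrected privilege check; simplified and not
# the literal neutron-lib code.
def _validate_privileges(context, res_dict):
    if ('project_id' in res_dict and
            res_dict['project_id'] != context.project_id and
            not (context.is_admin or context.is_advsvc)):
        raise ValueError(
            "Specifying 'project_id' or 'tenant_id' other than the "
            "authenticated project in request requires admin privileges")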
[Yahoo-eng-team] [Bug 1799309] [NEW] Migration/Resize fails with Unexpected exception in API method: Circular reference detected
Public bug reported:

Description
===========
Cold migration of a VM failed, so I tried resizing, and the same error occurred. Previous to this, I had disabled the compute service on one node to see if cold migrating VMs would avoid scheduling the VMs on the disabled node. This succeeded a few times, but then this error occurred, and now it happens on the two test VMs that I have running in this environment (there are only two VMs total).

Steps to reproduce
==================
Run as "admin" on a test server created by a domain user in its respective domain project:

openstack server migrate

It continues to be 100% repeatable. VMs are still operational and can be shut down and powered on without issue.

Expected result
===============
As with previous migrations, a graceful shutdown, cold migrate, and power on of the VM.

Actual result
=============
This is returned (note that this is from a different migration attempt than the logs included below):

Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. (HTTP 500) (Request-ID: req-46ce0fd3-5579-48be-9486-499fdea085a1)

Environment
===========
stable/rocky deployed with Kolla-Ansible 7.0.0.0rc3devXX (as of October 15, 2018) with respective Kolla images
Hypervisor: Libvirt + KVM
Storage: iSCSI attached (Blockbridge)
Networking: DVR

Logs & Configs
==============
This is a filtered list of nova log entries from all nova containers running on all controllers, concatenated, sorted, and filtered by the request ID:

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:07.105 26 DEBUG nova.api.openstack.wsgi [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] Action: 'action', calling method: >, body: {"migrate": null} _process_stack /var/lib/kolla/venv/lib/python2.7/site-packages/nova/api/openstack/wsgi.py:615
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:07.106 26 DEBUG nova.compute.api [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] [instance: 2fd8ff29-f64a-4e5b-bfcd-97c52cf6d66d] Fetching instance by UUID get /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/api.py:2402
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:07.117 26 DEBUG oslo_concurrency.lockutils [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] Lock "43894fde-4653-4499-9c83-0e963c974fae" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:273
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:07.118 26 DEBUG oslo_concurrency.lockutils [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] Lock "43894fde-4653-4499-9c83-0e963c974fae" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:285
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:07.174 26 DEBUG nova.objects.instance [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] Lazy-loading 'flavor' on Instance uuid 2fd8ff29-f64a-4e5b-bfcd-97c52cf6d66d obj_load_attr /var/lib/kolla/venv/lib/python2.7/site-packages/nova/objects/instance.py:1109
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:07.212 26 DEBUG nova.compute.api [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] [instance: 2fd8ff29-f64a-4e5b-bfcd-97c52cf6d66d] flavor_id is None. Assuming migration. resize /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/api.py:3448
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:07.213 26 DEBUG nova.compute.api [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] [instance: 2fd8ff29-f64a-4e5b-bfcd-97c52cf6d66d] Old instance type c5.4xlarge, new instance type c5.4xlarge resize /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/api.py:3469
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:09.999 26 ERROR nova.api.openstack.wsgi [req-8a41af8d-23ea-491e-be0a-b0fbe98dd6a7 ebdf7b1025a6464ba150b8ea63bfacb5 a87099d25afd4c599d34b2fae7689dec - default default] Unexpected exception in API method: ValueError: Circular reference detected
/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:2018-10-22 16:05:10.005 26 INFO nova.api.openstack.wsgi [req-8a41af8d-23ea-491e-
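The final error in the log, ValueError: Circular reference detected, is what Python's json encoder raises when asked to serialize a self-referencing structure, which points at a self-referential object graph reaching the serialization layer during the resize/migrate API call. A minimal reproduction of just that exception:

import json

payload = {}
payload['self'] = payload  # build a self-referencing structure

try:
    json.dumps(payload)
except ValueError as exc:
    print(exc)  # Circular reference detected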
[Yahoo-eng-team] [Bug 1799298] Re: Metadata API cross joining instance_metadata and instance_system_metadata
** Tags added: api db metadata performance

** Changed in: nova
   Status: New => Triaged

** Changed in: nova
   Importance: Undecided => Medium

** Also affects: nova/queens
   Importance: Undecided
   Status: New

** Also affects: nova/ocata
   Importance: Undecided
   Status: New

** Also affects: nova/rocky
   Importance: Undecided
   Status: New

** Also affects: nova/pike
   Importance: Undecided
   Status: New

** Changed in: nova/pike
   Importance: Undecided => Medium

** Changed in: nova/queens
   Importance: Undecided => Medium

** Changed in: nova/rocky
   Importance: Undecided => Medium

** Changed in: nova/pike
   Status: New => Triaged

** Changed in: nova/queens
   Status: New => Triaged

** Changed in: nova/rocky
   Status: New => Triaged

** Changed in: nova/ocata
   Status: New => Triaged

** Changed in: nova/ocata
   Importance: Undecided => Medium

https://bugs.launchpad.net/bugs/1799298

Title: Metadata API cross joining instance_metadata and instance_system_metadata

Status in OpenStack Compute (nova): Triaged
Status in OpenStack Compute (nova) ocata series: Triaged
Status in OpenStack Compute (nova) pike series: Triaged
Status in OpenStack Compute (nova) queens series: Triaged
Status in OpenStack Compute (nova) rocky series: Triaged

Bug description: see the full [NEW] report for bug 1799298 later in this digest.
[Yahoo-eng-team] [Bug 1799301] [NEW] SUSE sysconfig renderer enablement incomplete
Public bug reported:

With db50bc0d9 the sysconfig renderer was enabled for openSUSE and SUSE Linux Enterprise. This implementation is incomplete, and network rendering for openSUSE and SLES is now completely broken. Message in cloud-init.log:

stages.py[ERROR]: Unable to render networking. Network config is likely broken: No available network renderers found. Searched through list: ['eni', 'sysconfig', 'netplan']

The issue is that the available() method in sysconfig.py looks for

expected_paths = [
    'etc/sysconfig/network-scripts/network-functions',
    'etc/sysconfig/network-scripts/ifdown-eth']

in addition to ifup and ifdown. While ifup and ifdown are found, the above scripts do not exist on openSUSE and SLES. The equivalent of 'etc/sysconfig/network-scripts/network-functions' would be 'etc/sysconfig/network/functions.netconfig'; there is no default ifdown-eth, and any ifdown scripts would live in 'etc/sysconfig/network/if-down.d', which is empty by default.

One option is of course to not look for such specific locations and "trust" that the necessary scripts for the given distro are installed. We would only check for the ifup and ifdown commands, as those are necessary. The underlying distro implementation for script handling may not be as important here.

** Affects: cloud-init
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1799301
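The suggested relaxation, checking only for the ifup/ifdown commands, would look roughly like this. A sketch under the assumption that cloudinit.util.which behaves like shutil.which; this is not necessarily the fix that was merged:

# Sketch of a distro-agnostic available() for the sysconfig renderer:
# require only the ifup/ifdown commands, not RHEL-specific scripts.
from cloudinit import util


def available(target=None):
    return (util.which('ifup', target=target) is not None and
            util.which('ifdown', target=target) is not None)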
[Yahoo-eng-team] [Bug 1687027] Re: test_walk_versions tests fail with "IndexError: tuple index out of range" after timeout
It looks like in some cases this test can take longer than 300 seconds, and there are still failures there. See:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22in%20test_walk_versions%5C%22%20AND%20filename%3A%5C%22job-output.txt%5C%22

** Changed in: neutron
   Status: Fix Released => Confirmed

https://bugs.launchpad.net/bugs/1687027

Title: test_walk_versions tests fail with "IndexError: tuple index out of range" after timeout

Status in neutron: In Progress

Bug description:

http://logs.openstack.org/99/460399/1/check/gate-neutron-dsvm-functional-ubuntu-xenial/25de43d/testr_results.html.gz

Traceback (most recent call last):
  File "neutron/tests/base.py", line 115, in func
    return f(self, *args, **kwargs)
  File "neutron/tests/base.py", line 115, in func
    return f(self, *args, **kwargs)
  File "neutron/tests/functional/db/test_migrations.py", line 551, in test_walk_versions
    self._migrate_up(config, engine, dest, curr, with_data=True)
  File "neutron/tests/functional/db/test_migrations.py", line 537, in _migrate_up
    migration.do_alembic_command(config, 'upgrade', dest)
  File "neutron/db/migration/cli.py", line 109, in do_alembic_command
    getattr(alembic_command, cmd)(config, *args, **kwargs)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/command.py", line 254, in upgrade
    script.run_env()
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/script/base.py", line 416, in run_env
    util.load_python_file(self.dir, 'env.py')
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/util/pyfiles.py", line 93, in load_python_file
    module = load_module_py(module_id, path)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/util/compat.py", line 75, in load_module_py
    mod = imp.load_source(module_id, path, fp)
  File "neutron/db/migration/alembic_migrations/env.py", line 120, in <module>
    run_migrations_online()
  File "neutron/db/migration/alembic_migrations/env.py", line 114, in run_migrations_online
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/runtime/environment.py", line 817, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/runtime/migration.py", line 323, in run_migrations
    step.migration_fn(**kw)
  File "/opt/stack/new/neutron/neutron/db/migration/alembic_migrations/versions/mitaka/expand/3894bccad37f_add_timestamp_to_base_resources.py", line 36, in upgrade
    sa.Column(column_name, sa.DateTime(), nullable=True)
  File "<string>", line 8, in add_column
  File "<string>", line 3, in add_column
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/operations/ops.py", line 1551, in add_column
    return operations.invoke(op)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/operations/base.py", line 318, in invoke
    return fn(self, operation)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/operations/toimpl.py", line 123, in add_column
    schema=schema
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/ddl/impl.py", line 172, in add_column
    self._exec(base.AddColumn(table_name, column, schema=schema))
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/alembic/ddl/impl.py", line 118, in _exec
    return conn.execute(construct, *multiparams, **params)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 945, in execute
    return meth(self, multiparams, params)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/sqlalchemy/sql/ddl.py", line 68, in _execute_on_connection
    return connection._execute_ddl(self, multiparams, params)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1002, in _execute_ddl
    compiled
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
    context)
  File "/opt/stack/new/neutron/.tox/dsvm-functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1398, in _handle_dbapi_exception
    util.raise_from_cause(newraise, exc_info)
  File
[Yahoo-eng-team] [Bug 1799298] [NEW] Metadata API cross joining instance_metadata and instance_system_metadata
Public bug reported:

Description
===========
While troubleshooting a production issue we identified that the Nova metadata API is fetching a lot more raw data from the database than seems necessary. The problem appears to be caused by the SQL query used to fetch instance data, which joins the "instances" table with, among others, two metadata tables: "instance_metadata" and "instance_system_metadata". Below is a simplified version of this query, which was captured by adding extra logging (the full query is listed at the end of this bug report):

SELECT ...
FROM (SELECT ... FROM `instances`
      WHERE `instances`.`deleted` = ? AND `instances`.`uuid` = ? LIMIT ?) AS `anon_1`
LEFT OUTER JOIN `instance_system_metadata` AS `instance_system_metadata_1`
  ON `anon_1`.`instances_uuid` = `instance_system_metadata_1`.`instance_uuid`
LEFT OUTER JOIN (`security_group_instance_association` AS `security_group_instance_association_1`
                 INNER JOIN `security_groups` AS `security_groups_1`
                   ON `security_groups_1`.`id` = `security_group_instance_association_1`.`security_group_id`
                  AND `security_group_instance_association_1`.`deleted` = ?
                  AND `security_groups_1`.`deleted` = ?)
  ON `security_group_instance_association_1`.`instance_uuid` = `anon_1`.`instances_uuid`
 AND `anon_1`.`instances_deleted` = ?
LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
  ON `security_group_rules_1`.`parent_group_id` = `security_groups_1`.`id`
 AND `security_group_rules_1`.`deleted` = ?
LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
  ON `instance_info_caches_1`.`instance_uuid` = `anon_1`.`instances_uuid`
LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
  ON `instance_extra_1`.`instance_uuid` = `anon_1`.`instances_uuid`
LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
  ON `instance_metadata_1`.`instance_uuid` = `anon_1`.`instances_uuid`
 AND `instance_metadata_1`.`deleted` = ?

The instances table has a 1-to-many relationship to both the "instance_metadata" and "instance_system_metadata" tables, so the query is effectively producing a cross join of both metadata tables.
Steps to reproduce == To illustrate the impact of this query, add 2 properties to a running instance and verify that it has 2 records in "instance_metadata", as well as other records in "instance_system_metadata" such as base image properties:

> select instance_uuid,`key`,value from instance_metadata where instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+--------------------------------------+-----------+--------+
| instance_uuid                        | key       | value  |
+--------------------------------------+-----------+--------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value  |
+--------------------------------------+-----------+--------+
2 rows in set (0.61 sec)

> select instance_uuid,`key`,value from instance_system_metadata where instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+------------------------+--------------------------------------+
| key                    | value                                |
+------------------------+--------------------------------------+
| image_disk_format      | qcow2                                |
| image_min_ram          | 0                                    |
| image_min_disk         | 20                                   |
| image_base_image_ref   | 39cd564f-6a29-43e2-815b-62097968486a |
| image_container_format | bare                                 |
+------------------------+--------------------------------------+
5 rows in set (0.00 sec)

For this particular instance, the generated query used by the metadata API will fetch 10 records from the database:

+--------------------------------------+-------------------------+---------------------------+--------------------------------+----------------------------------+
| anon_1_instances_uuid                | instance_metadata_1_key | instance_metadata_1_value | instance_system_metadata_1_key | instance_system_metadata_1_value |
+--------------------------------------+-------------------------+---------------------------+--------------------------------+----------------------------------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | value1                    | image_disk_format              | qcow2                            |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value                     | image_disk_format              | qcow2                            |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | value1                    | image_min_ram                  | 0                                |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b |
[Yahoo-eng-team] [Bug 1778206] Re: Compute leaks volume attachments if we fail in driver.pre_live_migration
Reviewed: https://review.openstack.org/587439 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1a29248d5e688ba1d4f806895dccd45fcb34b833 Submitter: Zuul Branch:master commit 1a29248d5e688ba1d4f806895dccd45fcb34b833 Author: Matthew Booth Date: Tue Jun 26 14:42:47 2018 +0100 Ensure attachment cleanup on failure in driver.pre_live_migration Previously, if the call to driver.pre_live_migration failed (which in libvirt can happen with a DestinationDiskExists exception), the compute manager wouldn't rollback/cleanup volume attachments, leading to corrupt volume attachment information, and, depending on the backend, the instance being unable to access its volume. This patch moves the driver.pre_live_migration call inside the existing try/except, allowing the compute manager to properly rollback/cleanup volume attachments. The compute manager has its own _rollback_live_migration() cleanup in case the pre_live_migration() RPC call to the destination fails. There should be no conflicts between the cleanup in that and the new volume cleanup in the except block. The remove_volume_connection() -> driver_detach() -> detach_volume() call catches the InstanceNotFound exception and warns about the instance disappearing (it was never really on the destination in the first place). The attachment_delete() in _rollback_live_migration() is contingent on there being an old_vol_attachment in migrate_data, which there isn't because pre_live_migration() raised instead of returning. Change-Id: I67f66e95d69ae6df22e539550a3eac697ea8f5d8 Closes-bug: 1778206 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1778206 Title: Compute leaks volume attachments if we fail in driver.pre_live_migration Status in OpenStack Compute (nova): Fix Released Bug description: ComputeManager.pre_live_migration fails to clean up volume attachments if the call to driver.pre_live_migration() fails. There's a try block in there to clean up attachments, but its scope isn't large enough. The result is a volume in a perpetual attaching state. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1778206/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
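In outline, the change widens the scope of the existing try block. The following is a compressed, hypothetical sketch of the pattern, simplified from nova/compute/manager.py rather than the literal diff (the bdm layout and method shapes here are illustrative):

```
def pre_live_migration(self, context, instance, block_device_info,
                       migrate_data):
    try:
        # The driver call now sits inside the try block, so a
        # DestinationDiskExists (or any other driver error) no longer
        # leaks the attachments created for the destination host.
        return self.driver.pre_live_migration(
            context, instance, block_device_info, migrate_data)
    except Exception:
        # Roll the new volume attachments back so the volume does not
        # stay stuck in an 'attaching' state.
        for bdm in block_device_info.get('block_device_mapping', []):
            self.volume_api.attachment_delete(
                context, bdm['attachment_id'])
        raise
```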
[Yahoo-eng-team] [Bug 1795046] Re: Rocky Openstack CentOS documentation not matching
As Adam said, you need to set OS_IDENTITY_API_VERSION=3 for the openstack client to recognize that it needs to handle this v3-specific subcommand. Marking this as invalid. ** Changed in: keystone Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1795046 Title: Rocky Openstack CentOS documentation not matching Status in OpenStack Identity (keystone): Invalid Bug description: Installation Documentation on site: https://docs.openstack.org/keystone/rocky/install/keystone-users- rdo.html is written to run command below but it is not a valid command. [cmock@controller ~]$ sudo openstack domain create --description "An Example Domain" example openstack: 'domain create --description An Example Domain example' is not an openstack command. See 'openstack --help'. Did you mean one of these? command list container create container delete container list container save container set container show container unset [cmock@controller ~]$ Suggest updating documentation --- Release: on 2018-09-10 22:19 SHA: c5930abc5aa06881f28baa697d8d43a1f25157b8 Source: https://git.openstack.org/cgit/openstack/keystone/tree/doc/source/install/keystone-users-rdo.rst URL: https://docs.openstack.org/keystone/rocky/install/keystone-users-rdo.html To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1795046/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
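Concretely, exporting the identity API version before running the documented command makes the v3-specific subcommand available:

```
$ export OS_IDENTITY_API_VERSION=3
$ openstack domain create --description "An Example Domain" example
```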
[Yahoo-eng-team] [Bug 1783010] Re: Configure the Apache HTTP server (incorrect edit file)
The instructions are correct as-is, /etc/apache2/apache2.conf is a valid place to set the ServerName. ** Changed in: keystone Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1783010 Title: Configure the Apache HTTP server (incorrect edit file) Status in OpenStack Identity (keystone): Invalid Bug description: This bug tracker is for errors with the documentation, use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes: - [ ] This doc is inaccurate in this way: __ - [ ] This is a doc addition request. - [x] I have a fix to the document that I can paste below including example: input and output. Configure the Apache HTTP server 1. Edit the /etc/apache2/sites-enabled/keystone.conf file and add the ServerName option to reference the controller node: ServerName controller ... To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1783010/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
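For reference, the directive the install guide asks for is a single line, and Ubuntu's global configuration file is one valid place to put it:

```
# /etc/apache2/apache2.conf
ServerName controller
```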
[Yahoo-eng-team] [Bug 1784353] Re: Rescheduled boot from volume instances fail due to the premature removal of their attachments
** Also affects: nova/rocky Importance: Undecided Status: New ** Changed in: nova/queens Status: New => Triaged ** Changed in: nova/rocky Status: New => Triaged ** Changed in: nova Assignee: Stephen Finucane (stephenfinucane) => Lee Yarwood (lyarwood) ** Changed in: nova/queens Importance: Undecided => Medium ** Changed in: nova/rocky Importance: Undecided => Medium -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1784353 Title: Rescheduled boot from volume instances fail due to the premature removal of their attachments Status in OpenStack Compute (nova): In Progress Status in OpenStack Compute (nova) queens series: Triaged Status in OpenStack Compute (nova) rocky series: Triaged Bug description: Description === This is caused by the cleanup code within the compute layer (_shutdown_instance) removing all volume attachments associated with an instance, with no attempt being made to recreate these ahead of the instance being rescheduled. Steps to reproduce == - Attempt to boot an instance with volumes attached. - Ensure spawn() fails, for example by stopping the l2 network agent services on the compute host. Expected result === The instance is rescheduled to another compute host and boots correctly. Actual result = The instance fails to boot on every host it is rescheduled to, due to a missing volume attachment. Environment === 1. Exact version of OpenStack you are running. See the following list for all releases: http://docs.openstack.org/releases/ bf497cc47497d3a5603bf60de652054ac5ae1993 2. Which hypervisor did you use? (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...) What's the version of that? Libvirt + KVM, however this shouldn't matter. 3. Which storage type did you use? (For example: Ceph, LVM, GPFS, ...) What's the version of that? N/A 4. Which networking type did you use? (For example: nova-network, Neutron with OpenVSwitch, ...)
N/A Logs & Configs ==

2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1] Traceback (most recent call last):
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1579, in _prep_block_device
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     wait_func=self._await_block_device_map_created)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 837, in attach_block_devices
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     _log_and_attach(device)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 834, in _log_and_attach
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     bdm.attach(*attach_args, **attach_kwargs)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 46, in wrapped
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     ret_val = method(obj, context, *args, **kwargs)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 617, in attach
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     virt_driver, do_driver_attach)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     return f(*args, **kwargs)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 614, in _do_locked_attach
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance: d48c9894-2ba2-4752-bae5-36c437933ff1]     self._do_attach(*args, **_kwargs)
2018-07-04 15:19:43.191 1 ERROR nova.compute.manager [instance:
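The direction of the fix is to recreate the volume attachments that the cleanup path deleted before the instance is handed back to the scheduler. A hypothetical, simplified sketch of that idea (function and field names are illustrative, not the actual patch; 'volume_api' is assumed to be nova.volume.cinder.API and 'bdms' the instance's block device mappings):

```
def recreate_attachments_for_reschedule(context, volume_api, instance_uuid,
                                        bdms):
    """Recreate attachments deleted by _shutdown_instance()-style cleanup."""
    for bdm in bdms:
        if bdm.is_volume and bdm.attachment_id:
            # attachment_create() returns a dict containing a fresh
            # attachment id; store it back on the BDM so the next build
            # attempt finds a valid attachment instead of a stale one.
            attachment = volume_api.attachment_create(
                context, bdm.volume_id, instance_uuid)
            bdm.attachment_id = attachment['id']
            bdm.save()
```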
[Yahoo-eng-team] [Bug 1799186] Re: Queens compute node is not compatible with Pike Controller node
This isn't really supported. You should be configuring [upgrade_levels]/compute to pike: https://docs.openstack.org/nova/latest/configuration/config.html#upgrade_levels.compute Until you get everything upgraded to Queens, at which point you can remove the RPC version pin. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1799186 Title: Queens compute node is not compatible with Pike Controller node Status in OpenStack Compute (nova): Invalid Bug description: Description === We have OpenStack Pike running on Ubuntu 16.04. As per OpenStack documentation, the compute node should support the N+1 version of the controller. But when we upgrade the controller node, we get the below error on the compute side for all actions performed. ERROR oslo_messaging.rpc.server UnsupportedVersion: Endpoint does not support RPC version 5.0. Attempted method: build_and_run_instance The RPC version of Pike compute is not supported with the RPC version of the Queens controller. Thus we are unable to upgrade this setup. Steps to reproduce == 1. Setup openstack Pike on Ubuntu 16.04 2. Upgrade controller node to Queens by adding new keys 3. Sync db, restart (standard upgrade process) 4. After successful upgrade of controller node, check functions of nova (create instance, start/stop instance) Here controller is on Queens and Compute is on Pike 5. You should get an error that RPC versions are not supported Expected result === Compute should be compatible with the N+1 version of OpenStack. In this case, this scenario must be supported and compute functions must work. Actual result = All compute related functions fail. Start/stop/reboot instance fails. Logs ERROR oslo_messaging.rpc.server UnsupportedVersion: Endpoint does not support RPC version 5.0. Attempted method: build_and_run_instance Environment === Ubuntu 16.04 controller and compute with Pike installation. Do let me know if you need more details. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1799186/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
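Concretely, the recommended pin is one stanza in nova.conf on the upgraded Queens controller, removed once every node runs Queens:

```
[upgrade_levels]
compute = pike
```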
[Yahoo-eng-team] [Bug 1799246] [NEW] module level init of db transaction contexts causes failure under mod_wsgi on module reload
Public bug reported: Description === This is related to a downstream bug first reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1630069 wherein it was discovered that, due to how TripleO currently deploys the placement API under mod_wsgi, if the wsgi application exits with an error it is reloaded back into the same python interpreter instance. As a result of this behavior, module level variables have a longer lifetime than the application. When run under uwsgi, a reloaded application gets a new python interpreter, meaning the lifetime of module level variables is scoped to the lifetime of the application. Because of the lifetime semantics of mod_wsgi, the placement API and nova-api must assume that the wsgi application's init can be invoked multiple times on failure. The current use of the sqlalchemy enginefacade transaction_context is not reentrant, resulting in a TypeError being raised on subsequent calls to configure when the wsgi application is reloaded. Expected result === It should be possible to reload the nova and placement api wsgi applications under mod_wsgi on failure. Actual result =

46087 [Wed Oct 10 15:10:49.433284 2018] [:error] [pid 14] [remote 172.25.0.10:208] mod_wsgi (pid=14): Target WSGI script '/var/www/cgi-bin/nova/nova-placement-api' cannot be loaded as Python module.
46088 [Wed Oct 10 15:10:49.433305 2018] [:error] [pid 14] [remote 172.25.0.10:208] mod_wsgi (pid=14): Exception occurred processing WSGI script '/var/www/cgi-bin/nova/nova-placement-api'.
46089 [Wed Oct 10 15:10:49.433320 2018] [:error] [pid 14] [remote 172.25.0.10:208] Traceback (most recent call last):
46090 [Wed Oct 10 15:10:49.43 2018] [:error] [pid 14] [remote 172.25.0.10:208]   File "/var/www/cgi-bin/nova/nova-placement-api", line 54, in <module>
46091 [Wed Oct 10 15:10:49.433354 2018] [:error] [pid 14] [remote 172.25.0.10:208]     application = init_application()
46092 [Wed Oct 10 15:10:49.433361 2018] [:error] [pid 14] [remote 172.25.0.10:208]   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi.py", line 108, in init_application
46093 [Wed Oct 10 15:10:49.433386 2018] [:error] [pid 14] [remote 172.25.0.10:208]     db_api.configure(conf.CONF)
46094 [Wed Oct 10 15:10:49.433392 2018] [:error] [pid 14] [remote 172.25.0.10:208]   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/db_api.py", line 35, in configure
46095 [Wed Oct 10 15:10:49.433403 2018] [:error] [pid 14] [remote 172.25.0.10:208]     **_get_db_conf(conf.placement_database))
46096 [Wed Oct 10 15:10:49.433408 2018] [:error] [pid 14] [remote 172.25.0.10:208]   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 788, in configure
46097 [Wed Oct 10 15:10:49.433420 2018] [:error] [pid 14] [remote 172.25.0.10:208]     self._factory.configure(**kw)
46098 [Wed Oct 10 15:10:49.433425 2018] [:error] [pid 14] [remote 172.25.0.10:208]   File "/usr/lib/python2.7/site-packages/debtcollector/renames.py", line 43, in decorator
46099 [Wed Oct 10 15:10:49.433435 2018] [:error] [pid 14] [remote 172.25.0.10:208]     return wrapped(*args, **kwargs)
46100 [Wed Oct 10 15:10:49.433440 2018] [:error] [pid 14] [remote 172.25.0.10:208]   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 312, in configure
46101 [Wed Oct 10 15:10:49.433449 2018] [:error] [pid 14] [remote 172.25.0.10:208]     self._configure(False, kw)
46102 [Wed Oct 10 15:10:49.433453 2018] [:error] [pid 14] [remote 172.25.0.10:208]   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 317, in _configure
46103 [Wed Oct 10 15:10:49.433462 2018] [:error] [pid 14] [remote 172.25.0.10:208]     raise TypeError("this TransactionFactory is already started")
46104 [Wed Oct 10 15:10:49.433473 2018] [:error] [pid 14] [remote 172.25.0.10:208] TypeError: this TransactionFactory is already started

** Affects: nova Importance: Medium Assignee: sean mooney (sean-k-mooney) Status: In Progress ** Tags: api placement ** Changed in: nova Assignee: (unassigned) => sean mooney (sean-k-mooney) ** Changed in: nova Status: New => In Progress ** Changed in: nova Importance: Undecided => Medium ** Tags added: api placement -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1799246 Title: module level init of db transaction contexts causes failure under mod_wsgi on module reload Status in OpenStack Compute (nova): In Progress Bug description: Description === This is related to a downstream bug first reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1630069 wherein it was discovered that, due to how TripleO currently deploys the placement API under mod_wsgi
[Yahoo-eng-team] [Bug 1799186] [NEW] Queens compute node is not compatible with Pike Controller node
Public bug reported: Description === We have OpenStack Pike running on Ubuntu 16.04. As per OpenStack documentation, the compute node should support the N+1 version of the controller. But when we upgrade the controller node, we get the below error on the compute side for all actions performed. ERROR oslo_messaging.rpc.server UnsupportedVersion: Endpoint does not support RPC version 5.0. Attempted method: build_and_run_instance The RPC version of Pike compute is not supported with the RPC version of the Queens controller. Thus we are unable to upgrade this setup. Steps to reproduce == 1. Setup openstack Pike on Ubuntu 16.04 2. Upgrade controller node to Queens by adding new keys 3. Sync db, restart (standard upgrade process) 4. After successful upgrade of controller node, check functions of nova (create instance, start/stop instance) Here controller is on Queens and Compute is on Pike 5. You should get an error that RPC versions are not supported Expected result === Compute should be compatible with the N+1 version of OpenStack. In this case, this scenario must be supported and compute functions must work. Actual result = All compute related functions fail. Start/stop/reboot instance fails. Logs ERROR oslo_messaging.rpc.server UnsupportedVersion: Endpoint does not support RPC version 5.0. Attempted method: build_and_run_instance Environment === Ubuntu 16.04 controller and compute with Pike installation. Do let me know if you need more details. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1799186 Title: Queens compute node is not compatible with Pike Controller node Status in OpenStack Compute (nova): New Bug description: Description === We have OpenStack Pike running on Ubuntu 16.04. As per OpenStack documentation, the compute node should support the N+1 version of the controller. But when we upgrade the controller node, we get the below error on the compute side for all actions performed. ERROR oslo_messaging.rpc.server UnsupportedVersion: Endpoint does not support RPC version 5.0. Attempted method: build_and_run_instance The RPC version of Pike compute is not supported with the RPC version of the Queens controller. Thus we are unable to upgrade this setup. Steps to reproduce == 1. Setup openstack Pike on Ubuntu 16.04 2. Upgrade controller node to Queens by adding new keys 3. Sync db, restart (standard upgrade process) 4. After successful upgrade of controller node, check functions of nova (create instance, start/stop instance) Here controller is on Queens and Compute is on Pike 5. You should get an error that RPC versions are not supported Expected result === Compute should be compatible with the N+1 version of OpenStack. In this case, this scenario must be supported and compute functions must work. Actual result = All compute related functions fail. Start/stop/reboot instance fails. Logs ERROR oslo_messaging.rpc.server UnsupportedVersion: Endpoint does not support RPC version 5.0. Attempted method: build_and_run_instance Environment === Ubuntu 16.04 controller and compute with Pike installation. Do let me know if you need more details. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1799186/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1799178] [NEW] l2 pop doesn't always provide the whole list of fdb entries on agent restart
Public bug reported: The whole list of fdb entries is provided to the agent when a port from a new network appears, or when the agent is restarted. Currently agent restart is detected by the agent_boot_time option, 180 sec by default. In fact boot time differs depending on port count, and on some loaded clusters it may easily exceed 180 secs on gateway nodes. Changing the boot time in config works, but honestly this is not an ideal solution. There should be a smarter way of detecting agent restart (like the agent itself sending a flag in its state report). ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1799178 Title: l2 pop doesn't always provide the whole list of fdb entries on agent restart Status in neutron: New Bug description: The whole list of fdb entries is provided to the agent when a port from a new network appears, or when the agent is restarted. Currently agent restart is detected by the agent_boot_time option, 180 sec by default. In fact boot time differs depending on port count, and on some loaded clusters it may easily exceed 180 secs on gateway nodes. Changing the boot time in config works, but honestly this is not an ideal solution. There should be a smarter way of detecting agent restart (like the agent itself sending a flag in its state report). To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1799178/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
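As a stop-gap until restart detection is smarter, the window can be widened on the neutron server side. A sketch, assuming the option lives in the l2pop section of the ML2 configuration (the value here is illustrative, sized for slow-booting gateway nodes):

```
# /etc/neutron/plugins/ml2/ml2_conf.ini (on the neutron server)
[l2pop]
# Window during which a just-restarted agent is assumed to be booting
# and therefore receives the full fdb table (default: 180).
agent_boot_time = 600
```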
[Yahoo-eng-team] [Bug 1797309] Re: Every item in navigation bar of workflow form should be hidden if the parameter ready is false
** Also affects: horizon Importance: Undecided Status: New ** Changed in: horizon Status: New => Confirmed ** Changed in: horizon Importance: Undecided => Medium -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1797309 Title: Every item in navigation bar of workflow form should be hidden if the parameter ready is false Status in OpenStack Dashboard (Horizon): Confirmed Status in horizon package in Ubuntu: New Bug description: In the workflow wizard, every navigation item uses the parameter 'ng-show="viewModel.ready"' to determine whether it should be displayed. It should instead use each item's own 'ready' parameter, like this: 'ng-show="step.ready"'. I think that makes sense. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1797309/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
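A hedged sketch of the template change being proposed. The two ng-show attributes are quoted from the bug report itself; the surrounding markup and interpolation delimiters are invented for the example rather than copied from Horizon's actual wizard template:

```
<!-- before: every entry toggles on the wizard-wide flag -->
<li ng-repeat="step in viewModel.steps" ng-show="viewModel.ready">{$ step.title $}</li>

<!-- after: each entry toggles on its own step's readiness -->
<li ng-repeat="step in viewModel.steps" ng-show="step.ready">{$ step.title $}</li>
```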
[Yahoo-eng-team] [Bug 1798806] Re: Race condition between RT and scheduler
*** This bug is a duplicate of bug 1729621 *** https://bugs.launchpad.net/bugs/1729621 I just found that this problem is fixed in the master branch as part of bug #1729621. However, it is not backported to stable releases. ** This bug has been marked a duplicate of bug 1729621 Inconsistent value for vcpu_used -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1798806 Title: Race condition between RT and scheduler Status in OpenStack Compute (nova): In Progress Bug description: The HostState object which is used by the scheduler is using the 'stats' property of the compute node to derive its own values, e.g.:

self.stats = compute.stats or {}
self.num_instances = int(self.stats.get('num_instances', 0))
self.num_io_ops = int(self.stats.get('io_workload', 0))
self.failed_builds = int(self.stats.get('failed_builds', 0))

These values are used for both filtering and weighing compute hosts. However, the 'stats' property of the compute node is cleared during the periodic update_available_resources() and populated again. The clearing occurs in RT._copy_resources() and it preserves only the old value of 'failed_builds'. This creates a race condition between RT and scheduler which may result in populating wrong values for 'num_io_ops' and 'num_instances' into the HostState object, thus leading to incorrect scheduling decisions. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1798806/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
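A self-contained illustration of the window described above (plain Python, not Nova code): if the resource tracker clears compute.stats between two scheduler reads, the HostState-style derivation silently falls back to zeros.

```
compute_stats = {'num_instances': 12, 'io_workload': 3, 'failed_builds': 1}

def derive_host_state(stats):
    # Mirrors the HostState snippet quoted in the bug description.
    stats = stats or {}
    return {'num_instances': int(stats.get('num_instances', 0)),
            'num_io_ops': int(stats.get('io_workload', 0)),
            'failed_builds': int(stats.get('failed_builds', 0))}

print(derive_host_state(compute_stats))  # correct values

# RT._copy_resources() clears 'stats', preserving only failed_builds ...
preserved = compute_stats.get('failed_builds', 0)
compute_stats.clear()
compute_stats['failed_builds'] = preserved

# ... so a scheduler reading in this window sees zero instances/io ops:
print(derive_host_state(compute_stats))
```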
[Yahoo-eng-team] [Bug 1799153] [NEW] Inappropriate behaviour of limits when passing --region None in create and list.
Public bug reported: When creating a registered limit by passing --region None to the registered limit create CLI, it gives the error message "More than one resource exist for region", which is definitely a wrong message, as regions with the same name can neither be created nor exist. The correct behaviour should be - 1. In the case of --region None it should create a registered limit. 2. "No region exist with name xyz" if passed an invalid 'xyz' region while creating. The same applies to registered limit list: 1. In the case of --region None it should list all limits, ignoring None. 2. "No region exist with name xyz" if passed an invalid 'xyz' region while listing. Sane behaviors for limit create and list ** Affects: keystone Importance: Undecided Assignee: Vishakha Agarwal (vishakha.agarwal) Status: New ** Description changed: When creating registered limit by passing --region None in registered limit create cli, it is giving error message "More than one resource exist for region" which is definitely a wrong message as regions with name name cannot be created neither same exist. - The correct behaviour should be - + The correct behaviour should be - 1. In the case if --region None it should create a registered limit. 2. "No region exist with name xyz" if passed a invalid 'xyz' region while creating. Same in case of registerd limit list 1. In the case if --region None it should list all limits ignoring None. 2. "No region exist with name xyz" if passed a invalid 'xyz' region while listing. + + Sane behaviors for limit create and list ** Changed in: keystone Assignee: (unassigned) => Vishakha Agarwal (vishakha.agarwal) ** Description changed: When creating registered limit by passing --region None in registered limit create cli, it is giving error message "More than one resource exist for region" which is definitely a wrong message as regions with - name name cannot be created neither same exist. + same name cannot be created neither same exist. The correct behaviour should be - 1. In the case if --region None it should create a registered limit. 2. "No region exist with name xyz" if passed a invalid 'xyz' region while creating. Same in case of registerd limit list 1. In the case if --region None it should list all limits ignoring None. 2. "No region exist with name xyz" if passed a invalid 'xyz' region while listing. Sane behaviors for limit create and list -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1799153 Title: Inappropriate behaviour of limits when passing --region None in create and list. Status in OpenStack Identity (keystone): New Bug description: When creating a registered limit by passing --region None to the registered limit create CLI, it gives the error message "More than one resource exist for region", which is definitely a wrong message, as regions with the same name can neither be created nor exist. The correct behaviour should be - 1. In the case of --region None it should create a registered limit. 2. "No region exist with name xyz" if passed an invalid 'xyz' region while creating. The same applies to registered limit list: 1. In the case of --region None it should list all limits, ignoring None. 2. "No region exist with name xyz" if passed an invalid 'xyz' region while listing.
Sane behaviors for limit create and list To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1799153/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1799155] [NEW] [l3][port_forwarding] two different protocols can not have the same internal/external port number at the same time
Public bug reported: ENV: devstack master Floating IP port_forwardings with different protocols cannot have the same internal or external port number on the same vm_port. But we can have different application servers, for instance a TCP server and a UDP server, listening on the same port at the same time. For instance, if you create a port_forwarding to a floating IP with the following input: {"port_forwarding": { "internal_port_id": "3145b56c-949d-45d4-9e35-614117b5f69c", "internal_port": 22, "protocol": "tcp", "external_port": 22, "internal_ip_address": "192.168.188.3" } } And then add another port_forwarding with the protocol set to udp and internal port number 22 again: {"port_forwarding": { "internal_port_id": "3145b56c-949d-45d4-9e35-614117b5f69c", "internal_port": 22, "protocol": "udp", "external_port": , "internal_ip_address": "192.168.188.3" } } Neutron will return a 40x error. This is the key point: these unique constraints do not consider the protocol: https://github.com/openstack/neutron/blob/master/neutron/db/migration/alembic_migrations/versions/rocky/expand/867d39095bf4_port_forwarding.py#L53-L58 ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1799155 Title: [l3][port_forwarding] two different protocols can not have the same internal/external port number at the same time Status in neutron: New Bug description: ENV: devstack master Floating IP port_forwardings with different protocols cannot have the same internal or external port number on the same vm_port. But we can have different application servers, for instance a TCP server and a UDP server, listening on the same port at the same time. For instance, if you create a port_forwarding to a floating IP with the following input: {"port_forwarding": { "internal_port_id": "3145b56c-949d-45d4-9e35-614117b5f69c", "internal_port": 22, "protocol": "tcp", "external_port": 22, "internal_ip_address": "192.168.188.3" } } And then add another port_forwarding with the protocol set to udp and internal port number 22 again: {"port_forwarding": { "internal_port_id": "3145b56c-949d-45d4-9e35-614117b5f69c", "internal_port": 22, "protocol": "udp", "external_port": , "internal_ip_address": "192.168.188.3" } } Neutron will return a 40x error. This is the key point: these unique constraints do not consider the protocol: https://github.com/openstack/neutron/blob/master/neutron/db/migration/alembic_migrations/versions/rocky/expand/867d39095bf4_port_forwarding.py#L53-L58 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1799155/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
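A hedged alembic sketch of the fix direction implied by the report (column names approximated from the linked 867d39095bf4_port_forwarding.py migration; the constraint list is illustrative, not the shipped migration):

```
import sqlalchemy as sa
from alembic import op


def upgrade():
    op.create_table(
        'portforwardings',
        sa.Column('id', sa.String(36), primary_key=True),
        sa.Column('floatingip_id', sa.String(36), nullable=False),
        sa.Column('external_port', sa.Integer(), nullable=False),
        sa.Column('internal_neutron_port_id', sa.String(36), nullable=False),
        sa.Column('protocol', sa.String(40), nullable=False),
        sa.Column('socket', sa.String(36), nullable=False),
        # Adding 'protocol' to both constraints is what would allow a TCP
        # and a UDP forwarding to share the same port numbers.
        sa.UniqueConstraint('floatingip_id', 'protocol', 'external_port'),
        sa.UniqueConstraint('internal_neutron_port_id', 'protocol',
                            'socket'))
```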
[Yahoo-eng-team] [Bug 1799151] [NEW] table: the checkbox shows or hides unpredictably when updating a single row
Public bug reported: If there is more than one process running one http service, then when updating single row data via ajax, the checkbox of the table will be displayed or hidden unpredictably. ** Affects: horizon Importance: Undecided Status: New ** Description changed: If there are more than one processes running one http service, and when updating single row data by ajax, the checkbox of the table will be - display or hidden uncertainly. + displayed or hidden uncertainty. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1799151 Title: table: the checkbox shows or hides unpredictably when updating a single row Status in OpenStack Dashboard (Horizon): New Bug description: If there is more than one process running one http service, then when updating single row data via ajax, the checkbox of the table will be displayed or hidden unpredictably. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1799151/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1799152] [NEW] Retry after hitting libvirt error code VIR_ERR_OPERATION_INVALID in live migration.
Public bug reported: Description === When migration of a persistent guest completes, the guest merely shuts off, but libvirt unhelpfully raises a VIR_ERR_OPERATION_INVALID error code; in the nova code, we pretend this case means success. But if we are in the middle of a live migration, and sadly the qemu-kvm process is killed accidentally, such as by host OOM, which happens rarely in our environment but has happened a few times, the domain state is SHUTOFF and then we will get VIR_ERR_OPERATION_INVALID while trying to call `self._domain.jobStats()`. Under these circumstances, the migration should be considered failed, otherwise the post_live_migration() function starts to clean up instance files and we will lose customers' data forever. IMHO, we may need to `pretend` the migration job is still running after hitting VIR_ERR_OPERATION_INVALID and retry fetching job stats a few times, with the retry count configurable. Because if the migration finally succeeds, we won't keep getting VIR_ERR_OPERATION_INVALID across the retries, but the error code persists if the qemu-kvm process was killed accidentally. Steps to reproduce == * Do nova live-migration on the controller node. * Once the live migration monitor on the source compute node starts to get JobInfo, kill the qemu-kvm process on the source host. * Check if post_live_migration on the source host starts to execute. * Check if post_live_migration on the destination host starts to execute. * Check image files on both the source host and the destination host. Expected result === The migration should be considered failed. Actual result = post_live_migration on the source host starts to execute and cleans instance files. The instance disappears on both the source and destination hosts. Environment === 1. My environment is packstack, and the openstack nova release is Queens. 2. Libvirt + KVM Logs & Configs == Some logs after the qemu-kvm process is killed. ```
...
2018-09-21 14:08:34.180 11099 DEBUG nova.virt.libvirt.migration [req-d8e0cfab-ea85-4716-a2fe-1307a7004f12 bf015418722f437e9f031efabc7a98e6 ca68d7d736374dbfb38d4ef2f80b2a5c - default default] [instance: ba8feaea-eedc-4b7c-8ffa-01152fc9bde8] Downtime does not need to change update_downtime /usr/lib/python2.7/site-packages/nova/virt/libvirt/migration.py:410
2018-09-21 14:08:34.305 11099 DEBUG nova.virt.libvirt.driver [req-d8e0cfab-ea85-4716-a2fe-1307a7004f12 bf015418722f437e9f031efabc7a98e6 ca68d7d736374dbfb38d4ef2f80b2a5c - default default] [instance: ba8feaea-eedc-4b7c-8ffa-01152fc9bde8] Migration running for 10 secs, memory 100% remaining; (bytes processed=0, remaining=0, total=0) _live_migration_monitor /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:7394
2018-09-21 14:08:34.886 11099 DEBUG nova.virt.libvirt.guest [req-d8e0cfab-ea85-4716-a2fe-1307a7004f12 bf015418722f437e9f031efabc7a98e6 ca68d7d736374dbfb38d4ef2f80b2a5c - default default] Domain has shutdown/gone away: Requested operation is not valid: domain is not running get_job_info /usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py:720
2018-09-21 14:08:34.887 11099 INFO nova.virt.libvirt.driver [req-d8e0cfab-ea85-4716-a2fe-1307a7004f12 bf015418722f437e9f031efabc7a98e6 ca68d7d736374dbfb38d4ef2f80b2a5c - default default] [instance: ba8feaea-eedc-4b7c-8ffa-01152fc9bde8] Migration operation has completed
2018-09-21 14:08:34.887 11099 INFO nova.compute.manager [req-d8e0cfab-ea85-4716-a2fe-1307a7004f12 bf015418722f437e9f031efabc7a98e6 ca68d7d736374dbfb38d4ef2f80b2a5c - default default] [instance: ba8feaea-eedc-4b7c-8ffa-01152fc9bde8] _post_live_migration() is started..
...
```
** Affects: nova Importance: Undecided Assignee: Fan Zhang (fanzhang) Status: New ** Changed in: nova Assignee: (unassigned) => Fan Zhang (fanzhang) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1799152 Title: Retry after hitting libvirt error code VIR_ERR_OPERATION_INVALID in live migration. Status in OpenStack Compute (nova): New Bug description: Description === When migration of a persistent guest completes, the guest merely shuts off, but libvirt unhelpfully raises a VIR_ERR_OPERATION_INVALID error code; in the nova code, we pretend this case means success. But if we are in the middle of a live migration, and sadly the qemu-kvm process is killed accidentally, such as by host OOM, which happens rarely in our environment but has happened a few times, the domain state is SHUTOFF and then we will get VIR_ERR_OPERATION_INVALID while trying to call `self._domain.jobStats()`. Under these circumstances, the migration should be considered failed, otherwise the post_live_migration() function starts to clean up instance files and we will lose customers' data forever. IMHO, we may need to `pretend` the migration job is still running after hitting
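A hedged sketch of the retry the reporter proposes, using only the libvirt Python bindings (the retry parameters are illustrative, not an existing nova config option; the actual fix may differ):

```
import time

import libvirt


def get_job_stats_with_retries(domain, retries=3, interval=2):
    """Poll virDomain.jobStats(), tolerating a transient
    VIR_ERR_OPERATION_INVALID for a few attempts before giving up."""
    for attempt in range(retries + 1):
        try:
            return domain.jobStats()
        except libvirt.libvirtError as ex:
            if ex.get_error_code() != libvirt.VIR_ERR_OPERATION_INVALID:
                raise
            if attempt == retries:
                # Still invalid after all retries: the qemu process is
                # gone, so the caller should treat the migration as
                # failed rather than silently completed.
                raise
            time.sleep(interval)
```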
[Yahoo-eng-team] [Bug 1799150] [NEW] [l3][port_forwarding] internal/external port should not allow 0
Public bug reported: ENV: devstack master Floating IP port forwarding internal or external port numbers should not allow 0, otherwise you will get a ValueError exception in neutron server. Steps to reproduce: 1. create router with connected private subnet and public gateway. 2. create VM on the private subnet 3. create floating IP 4. create port forwarding with internal or external port number 0 ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1799150 Title: [l3][port_forwarding] internal/external port should not allow 0 Status in neutron: New Bug description: ENV: devstack master Floating IP port forwarding internal or external port numbers should not allow 0, otherwise you will get a ValueError exception in neutron server. Steps to reproduce: 1. create router with connected private subnet and public gateway. 2. create VM on the private subnet 3. create floating IP 4. create port forwarding with internal or external port number 0 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1799150/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
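A hedged illustration of the missing check (neutron's real API validators would raise an InvalidInput-style error rather than a bare ValueError; the function name here is invented for the example):

```
def validate_forwarding_port(port):
    """Reject 0 and anything else outside the valid TCP/UDP range."""
    if not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError(
            'port forwarding ports must be in range 1-65535, got %r' % port)

validate_forwarding_port(22)      # ok
try:
    validate_forwarding_port(0)   # rejected at the API layer instead of
except ValueError as exc:         # surfacing deep inside neutron-server
    print(exc)
```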
[Yahoo-eng-team] [Bug 1799140] [NEW] [l3][port_forwarding] subnet can not be removed from router if it has port_forwarding
Public bug reported: ENV: devstack master Steps to reproduce: 1. create router 2. add router public gateway 3. add router interface to subnet1, subnet2, subnet3 4. create a vm on subnet1 5. create floating IP with port forwarding to the vm port from subnet1 Then, you will not be able to remove the router interface from subnet2 and subnet3. Neutron server will raise a netaddr related error. ** Affects: neutron Importance: Undecided Status: New ** Description changed: ENV: devstack master - step to reproduce: 1. create router 2. add router public gateway 3. add router interface to subnet1, subnet2, subnet3 4. create a vm to subnet1 - 4. create floating IP with port forwarding to the vm port from subnet1 + 5. create floating IP with port forwarding to the vm port from subnet1 Then, you will not be able to remove router interface from subnet2 and subnet3. Neutron server will raise some netaddr related error. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1799140 Title: [l3][port_forwarding] subnet can not be removed from router if it has port_forwarding Status in neutron: New Bug description: ENV: devstack master Steps to reproduce: 1. create router 2. add router public gateway 3. add router interface to subnet1, subnet2, subnet3 4. create a vm on subnet1 5. create floating IP with port forwarding to the vm port from subnet1 Then, you will not be able to remove the router interface from subnet2 and subnet3. Neutron server will raise a netaddr related error. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1799140/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1799137] [NEW] [l3][port_forwarding] should not allow creating port_forwarding to a port which already has a binding floating IP
Public bug reported: Should not allow creating port_forwarding to a port which already has a binding floating IP, for dvr routers. ENV: devstack master Steps to reproduce: 1. create dvr router with connected private subnet and public gateway. 2. create VM on the private subnet 3. bind floating IP A to the VM port 4. create floating IP B with port forwarding to the VM port Then floating IP B with port forwarding will not work. This should be restricted by neutron. ** Affects: neutron Importance: Undecided Status: New ** Description changed: Should not allow creating port_forwarding to a port which already has a binding floating IP for dvr routers. ENV: devstack master - step to reproduce: 1. create dvr router with connected privated subnet and public gateway. 2. create VM to the private subnet - 3. create floating IP B with port forwarding to the VM port - 4. binding floating IP B to VM port + 3. binding floating IP B to VM port + 4. create floating IP B with port forwarding to the VM port Then floating IP B with port forwarding will not work. This should be restricted by neutron. ** Description changed: Should not allow creating port_forwarding to a port which already has a binding floating IP for dvr routers. ENV: devstack master step to reproduce: 1. create dvr router with connected privated subnet and public gateway. 2. create VM to the private subnet - 3. binding floating IP B to VM port + 3. binding floating IP A to VM port 4. create floating IP B with port forwarding to the VM port Then floating IP B with port forwarding will not work. This should be restricted by neutron. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1799137 Title: [l3][port_forwarding] should not allow creating port_forwarding to a port which already has a binding floating IP Status in neutron: New Bug description: Should not allow creating port_forwarding to a port which already has a binding floating IP, for dvr routers. ENV: devstack master Steps to reproduce: 1. create dvr router with connected private subnet and public gateway. 2. create VM on the private subnet 3. bind floating IP A to the VM port 4. create floating IP B with port forwarding to the VM port Then floating IP B with port forwarding will not work. This should be restricted by neutron. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1799137/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1799138] [NEW] [l3][port_forwarding] a port can have port_forwarding and then bind floating IP again
Public bug reported: ENV: devstack master Steps to reproduce: 1. create dvr router with connected private subnet and public gateway. 2. create VM on the private subnet 3. create floating IP A with port forwarding to the VM port 4. bind floating IP B to the VM port Then floating IP A with port forwarding will not work. This should be restricted by neutron. Something really similar to bug: https://bugs.launchpad.net/neutron/+bug/1799137 ** Affects: neutron Importance: Undecided Status: New ** Summary changed: - [l3][port_forwarding] a port can have port_forwarding and then binding floating IP again + [l3][port_forwarding] a port can have port_forwarding and then bind floating IP again -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1799138 Title: [l3][port_forwarding] a port can have port_forwarding and then bind floating IP again Status in neutron: New Bug description: ENV: devstack master Steps to reproduce: 1. create dvr router with connected private subnet and public gateway. 2. create VM on the private subnet 3. create floating IP A with port forwarding to the VM port 4. bind floating IP B to the VM port Then floating IP A with port forwarding will not work. This should be restricted by neutron. Something really similar to bug: https://bugs.launchpad.net/neutron/+bug/1799137 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1799138/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1799135] [NEW] [l3][port_forwarding] update floating IP (has binding port_forwarding) with empty {} input will lose router_id
Public bug reported: ENV: devstack master Steps to reproduce: 1. create floating IP 2. create port forwarding for that floating IP 3. update floating IP with empty dict:

curl -g -i -X PUT http://controller:9696/v2.0/floatingips/2bb4cc5d-7fae-4c1b-9482-ead60d67abea \
  -H "User-Agent: python-neutronclient" -H "Accept: application/json" \
  -H "X-Auth-Token: " \
  -d '{"floatingip": {}}'

Then this floating IP will be left in a bad state; it cannot be managed anymore. Every action on this floating IP will produce a neutron-server ERROR log. Furthermore, merely updating the floating IP's qos_policy_id can also result in such behavior.

curl -g -i -X PUT http://controller:9696/v2.0/floatingips/2bb4cc5d-7fae-4c1b-9482-ead60d67abea \
  -H "User-Agent: python-neutronclient" -H "Accept: application/json" \
  -H "X-Auth-Token: " \
  -d '{"floatingip": {"qos_policy_id": "d9d3639e-b616-4007-a8fe-52d6154f1eec"}}'

** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1799135 Title: [l3][port_forwarding] update floating IP (has binding port_forwarding) with empty {} input will lose router_id Status in neutron: New Bug description: ENV: devstack master Steps to reproduce: 1. create floating IP 2. create port forwarding for that floating IP 3. update floating IP with empty dict:

curl -g -i -X PUT http://controller:9696/v2.0/floatingips/2bb4cc5d-7fae-4c1b-9482-ead60d67abea \
  -H "User-Agent: python-neutronclient" -H "Accept: application/json" \
  -H "X-Auth-Token: " \
  -d '{"floatingip": {}}'

Then this floating IP will be left in a bad state; it cannot be managed anymore. Every action on this floating IP will produce a neutron-server ERROR log. Furthermore, merely updating the floating IP's qos_policy_id can also result in such behavior.

curl -g -i -X PUT http://controller:9696/v2.0/floatingips/2bb4cc5d-7fae-4c1b-9482-ead60d67abea \
  -H "User-Agent: python-neutronclient" -H "Accept: application/json" \
  -H "X-Auth-Token: " \
  -d '{"floatingip": {"qos_policy_id": "d9d3639e-b616-4007-a8fe-52d6154f1eec"}}'

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1799135/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp