[Yahoo-eng-team] [Bug 1827453] [NEW] Nova scheduler attempts to re-assign currently in-use SR-IOV VF to new VM
Public bug reported:

Running a small cluster with 16 compute nodes and 3 controller nodes on OpenStack Queens using SR-IOV VFs. From time to time, the Nova scheduler appears to lose track of some of the PCI devices (VFs) that are actively mapped into servers. We don't know exactly when this occurs and cannot trigger it on demand, but it develops on a number of the compute nodes over time. Restarting the given compute node resolves the issue.

The problem manifests with the following error:

/var/log/nova/nova-conductor.log:2019-05-03 01:35:27.309 13073 ERROR nova.scheduler.utils [req-8418eb3a-4118-4505-97e3-fffbaae7aae6 2469493ff8b546ff9a6f4e339cc50ac2 33bb32d9463340bca0bb72a8c36579a9 - default default] [instance: b2b4dbf2-d381-4416-95c9-b410aa6d8377] Error from last host: node05 (node {REDACTED}): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1828, in _do_build_and_run_instance\nfilter_properties, request_spec)\n', u' File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2108, in _build_and_run_instance\ninstance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance b2b4dbf2-d381-4416-95c9-b410aa6d8377 was re-scheduled: Requested operation is not valid: PCI device :04:01.3 is in use by driver QEMU, domain instance-1466\n']

The compute nodes in question are configured with the following PCI whitelist:

[pci]
passthrough_whitelist = [{"vendor_id": "15b3", "product_id": "1004"}]

Note that, unlike in similar bug reports, there have been no changes to the whitelist that would likely cause this. It just seems to develop over time.
= Versions =

Compute nodes:

ii  nova-common           2:17.0.6-0ubuntu1  all  OpenStack Compute - common files
ii  nova-compute          2:17.0.6-0ubuntu1  all  OpenStack Compute - compute node base
ii  nova-compute-kvm      2:17.0.6-0ubuntu1  all  OpenStack Compute - compute node (KVM)
ii  nova-compute-libvirt  2:17.0.6-0ubuntu1  all  OpenStack Compute - compute node libvirt support

Controller nodes:

ii  nova-api              2:17.0.9-0ubuntu1  all  OpenStack Compute - API frontend
ii  nova-common           2:17.0.9-0ubuntu1  all  OpenStack Compute - common files
ii  nova-compute          2:17.0.9-0ubuntu1  all  OpenStack Compute - compute node base
ii  nova-compute-kvm      2:17.0.9-0ubuntu1  all  OpenStack Compute - compute node (KVM)
ii  nova-compute-libvirt  2:17.0.9-0ubuntu1  all  OpenStack Compute - compute node libvirt support
ii  nova-conductor        2:17.0.9-0ubuntu1  all  OpenStack Compute - conductor service
ii  nova-consoleauth      2:17.0.9-0ubuntu1  all  OpenStack Compute - Console Authenticator
ii  nova-novncproxy       2:17.0.9-0ubuntu1  all  OpenStack Compute - NoVNC proxy
ii  nova-placement-api    2:17.0.9-0ubuntu1  all  OpenStack Compute - placement API frontend
ii  nova-scheduler        2:17.0.9-0ubuntu1  all  OpenStack Compute - virtual machine scheduler
ii  nova-serialproxy      2:17.0.9-0ubuntu1  all  OpenStack Compute - serial proxy
ii  nova-xvpvncproxy      2:17.0.9-0ubuntu1  all  OpenStack Compute - XVP VNC proxy

** Affects: nova
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1827453

Title:
  Nova scheduler attempts to re-assign currently in-use SR-IOV VF to new VM

Status in OpenStack Compute (nova):
  New
[Yahoo-eng-team] [Bug 1784342] Re: AttributeError: 'Subnet' object has no attribute '_obj_network_id'
Subscribed field-high and added the Ubuntu Neutron package, since this has occurred at multiple production sites.

** Also affects: neutron (Ubuntu)
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1784342

Title:
  AttributeError: 'Subnet' object has no attribute '_obj_network_id'

Status in neutron:
  Confirmed
Status in neutron package in Ubuntu:
  New

Bug description:
  Running rally caused subnets to be created without a network_id, causing this AttributeError.

  OpenStack Queens RDO packages

  [root@controller1 ~]# rpm -qa | grep -i neutron
  python-neutron-12.0.2-1.el7.noarch
  openstack-neutron-12.0.2-1.el7.noarch
  python2-neutron-dynamic-routing-12.0.1-1.el7.noarch
  python2-neutron-lib-1.13.0-1.el7.noarch
  openstack-neutron-dynamic-routing-common-12.0.1-1.el7.noarch
  python2-neutronclient-6.7.0-1.el7.noarch
  openstack-neutron-bgp-dragent-12.0.1-1.el7.noarch
  openstack-neutron-common-12.0.2-1.el7.noarch
  openstack-neutron-ml2-12.0.2-1.el7.noarch

  MariaDB [neutron]> select project_id, id, name, network_id, cidr from subnets where network_id is null;
  +----------------------------------+--------------------------------------+---------------------------+------------+-------------+
  | project_id                       | id                                   | name                      | network_id | cidr        |
  +----------------------------------+--------------------------------------+---------------------------+------------+-------------+
  | b80468629bc5410ca2c53a7cfbf002b3 | 7a23c72b-3df8-4641-a494-af7642563c8e | s_rally_1e4bebf1_1s3IN6mo | NULL       | 1.9.13.0/24 |
  | b80468629bc5410ca2c53a7cfbf002b3 | f7a57946-4814-477a-9649-cc475fb4e7b2 | s_rally_1e4bebf1_qWSFSMs9 | NULL       | 1.5.20.0/24 |
  +----------------------------------+--------------------------------------+---------------------------+------------+-------------+

  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation [req-c921b9fb-499b-41c1-9103-93e71a70820c b6b96932bbef41fdbf957c2dc01776aa 050c556faa5944a8953126c867313770 - default default] GET failed.: AttributeError: 'Subnet' object has no attribute '_obj_network_id'
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation Traceback (most recent call last):
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/pecan/core.py", line 678, in __call__
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     self.invoke_controller(controller, args, kwargs, state)
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/pecan/core.py", line 569, in invoke_controller
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     result = controller(*args, **kwargs)
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 91, in wrapped
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     setattr(e, '_RETRY_EXCEEDED', True)
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 87, in wrapped
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     return f(*args, **kwargs)
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     ectxt.value = e.inner_exc
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
  2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File
[Yahoo-eng-team] [Bug 1827435] [NEW] add user option to ignore password_regex
Public bug reported:

Heat's bug: https://storyboard.openstack.org/#!/story/2005210

Heat creates service users in its dedicated domain on the fly. These are crucial in situations that require deferred authentication, for example autoscaling.

Keystone has a password_regex option in the [security_compliance] section that requires passwords to match a given regex, thus enforcing their strength. However, Heat has no way to generate random passwords for its users that are guaranteed to pass any such regex.

In fact, generating a random string from an arbitrary regex is a nontrivial problem, and for now solutions/libraries exist only for regexes that use a certain subset of the full regex spec.

When generating passwords for its domain users, Heat creates quite a strong password (32 alphanum+special symbols), but it may still fail a custom regex set in Keystone.

It is proposed to add another user option (ignore_password_regex), similar to those already existing in Keystone, to override the regex enforcement of the password for a given user.

** Affects: keystone
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1827435

Title:
  add user option to ignore password_regex

Status in OpenStack Identity (keystone):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1827435/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
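To illustrate the failure mode described in this report, here is a minimal Python sketch. The alphabet, the example regex, and all function names are hypothetical and are not taken from Heat or Keystone. It only shows that a strong random password can still fail an operator-defined pattern, and that retrying is a workaround rather than a guarantee.

```python
import re
import secrets
import string

# Hypothetical alphabet: alphanumerics plus a few special symbols, mirroring
# the "32 alphanum+special symbols" wording in the report.
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*()"

# Hypothetical operator policy standing in for a password_regex value:
# at least one digit, at least one uppercase letter, minimum length 7.
EXAMPLE_REGEX = r"^(?=.*\d)(?=.*[A-Z]).{7,}$"


def generate_password(length=32):
    """Return a random password drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def passes_policy(password, pattern=EXAMPLE_REGEX):
    """Return True if the password matches the configured pattern."""
    return re.match(pattern, password) is not None


def generate_compliant_password(pattern=EXAMPLE_REGEX, max_attempts=100):
    """Retry generation until the pattern matches or attempts run out.

    This cannot succeed for patterns the alphabet is unable to satisfy,
    which is the gap the proposed user option would close.
    """
    for _ in range(max_attempts):
        candidate = generate_password()
        if passes_policy(candidate, pattern):
            return candidate
    raise RuntimeError("no generated password matched the configured regex")
```

With a pattern the generator's alphabet cannot satisfy (for example, one that requires whitespace), `generate_compliant_password` always raises, however strong the generated passwords are.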
[Yahoo-eng-team] [Bug 1827431] [NEW] add user option to ignore user inactivity period
Public bug reported:

Heat's bug: https://storyboard.openstack.org/#!/story/2005210

Heat creates service users in its dedicated domain on the fly. These are crucial in situations that require deferred authentication, for example autoscaling.

While it is currently possible to ignore some settings in the [security_compliance] section of Keystone for specific users, there is no way to ignore the "disable_user_account_days_inactive" setting.

It is proposed to add a user option, similar to the existing ones, to ignore this setting for a given user.

** Affects: keystone
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1827431

Title:
  add user option to ignore user inactivity period

Status in OpenStack Identity (keystone):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1827431/+subscriptions
[Yahoo-eng-team] [Bug 1771506] Re: Unit test failure with OpenSSL 1.1.1
This bug was fixed in the package nova - 2:19.0.0-0ubuntu4

---
nova (2:19.0.0-0ubuntu4) eoan; urgency=medium

  * d/p/xenapi-agent-change-openssl-error-handling.patch: Cherry-picked from upstream to ensure xenapi agent only raises a RuntimeError exception when openssl returns a non-zero exit code (LP: #1771506).

 -- Corey Bryant Wed, 01 May 2019 17:10:47 -0400

** Changed in: nova (Ubuntu)
       Status: Triaged => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1771506

Title:
  Unit test failure with OpenSSL 1.1.1

Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive queens series:
  Triaged
Status in Ubuntu Cloud Archive rocky series:
  Triaged
Status in Ubuntu Cloud Archive stein series:
  Triaged
Status in OpenStack Compute (nova):
  In Progress
Status in nova package in Ubuntu:
  Fix Released
Status in nova source package in Bionic:
  Triaged
Status in nova source package in Cosmic:
  Triaged
Status in nova source package in Disco:
  Triaged

Bug description:
  Hi,

  Building the Nova Queens package with OpenSSL 1.1.1 leads to unit test problems. This was reported to Debian at: https://bugs.debian.org/898807

  The new openssl 1.1.1 is currently in experimental [0]. This package failed to build against this new package [1] while it built fine against the openssl version currently in unstable [2]. Could you please have a look?

  FAIL: nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message
  |nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message
  |--
  |_StringException: pythonlogging:'': {{{2018-05-01 20:48:09,960 WARNING [oslo_config.cfg] Config option key_manager.api_class is deprecated. Use option key_manager.backend instead.}}}
  |
  |Traceback (most recent call last):
  | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1592, in test_encrypt_newlines_inside_message
  |self._test_encryption('Message\nwith\ninterior\nnewlines.')
  | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1577, in _test_encryption
  |enc = self.alice.encrypt(message)
  | File "/<>/nova/virt/xenapi/agent.py", line 432, in encrypt
  |return self._run_ssl(text).strip('\n')
  | File "/<>/nova/virt/xenapi/agent.py", line 428, in _run_ssl
  |raise RuntimeError(_('OpenSSL error: %s') % err)
  |RuntimeError: OpenSSL error: *** WARNING : deprecated key derivation used.
  |Using -iter or -pbkdf2 would be better.

  The failure appears to be caused by the additional message that openssl now prints on stderr.

  [0] https://lists.debian.org/msgid-search/20180501211400.ga21...@roeckx.be
  [1] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/attempted/nova_17.0.0-4_amd64-2018-05-01T20%3A39%3A38Z
  [2] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/successful/nova_17.0.0-4_amd64-2018-05-02T18%3A46%3A36Z

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1771506/+subscriptions
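The changelog entry above describes raising RuntimeError only when openssl returns a non-zero exit code, rather than whenever anything appears on stderr. A minimal Python sketch of that pattern follows. It is not nova's `_run_ssl`; the helper name `run_tool` is hypothetical, and the Python interpreter stands in for openssl so that the example is self-contained.

```python
import subprocess
import sys


def run_tool(argv, text):
    """Run a command, feed it text on stdin, and return its stdout.

    Warnings written to stderr are tolerated; only a non-zero exit
    code is treated as an error.
    """
    proc = subprocess.run(
        argv,
        input=text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        raise RuntimeError(
            "command failed (exit %d): %s" % (proc.returncode, proc.stderr)
        )
    return proc.stdout


# Stand-in for an openssl call that succeeds but prints a warning on stderr.
WARN_BUT_OK = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('*** WARNING\\n'); "
    "sys.stdout.write(sys.stdin.read().upper())",
]

# Stand-in for a call that genuinely fails.
FAILS = [sys.executable, "-c", "import sys; sys.exit(3)"]
```

Calling `run_tool(WARN_BUT_OK, "secret")` returns the transformed input despite the stderr warning, while `run_tool(FAILS, "x")` raises RuntimeError.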
[Yahoo-eng-team] [Bug 1827420] [NEW] Document issues with deep nesting of Quota/limits
Public bug reported:

I wrote up the ways the system can be gamed when quotas are deeply nested. This is what drove the decision to use a 2-level quota in unified limits.

https://adam.younglogic.com/2018/05/tracking-quota/

This should be merged into the documentation to explain why we limit things to 2 levels, and to figure out how to deal with deeper limits in the future.

** Affects: keystone
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1827420

Title:
  Document issues with deep nesting of Quota/limits

Status in OpenStack Identity (keystone):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1827420/+subscriptions
[Yahoo-eng-team] [Bug 1827418] [NEW] Routed provider networks in neutron - placement CLI example
Public bug reported:

- [x] This is a doc addition request.

The section of the doc that shows the curl request to placement says "As of the writing of this guide, there is not placement API CLI client, so the curl command is used for this example." That example could now be replaced with one using osc-placement:

https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-aggregate-list

---
Release: 13.0.4.dev30 on 2019-04-27 12:52
SHA: b4f3163dc4b1457755f151e1b30d89d31bec5e14
Source: https://git.openstack.org/cgit/openstack/neutron/tree/doc/source/admin/config-routed-networks.rst
URL: https://docs.openstack.org/neutron/rocky/admin/config-routed-networks.html

** Affects: neutron
     Importance: Low
         Status: Confirmed

** Tags: doc low-hanging-fruit

** Changed in: neutron
   Importance: Undecided => Low

** Changed in: neutron
       Status: New => Confirmed

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1827418

Title:
  Routed provider networks in neutron - placement CLI example

Status in neutron:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1827418/+subscriptions
[Yahoo-eng-team] [Bug 1822199] Re: neutron-vpn-netns-wrapper not invoked with --rootwrap_config parameter
Reviewed: https://review.opendev.org/648726
Committed: https://git.openstack.org/cgit/openstack/neutron-vpnaas/commit/?id=7e9922858fc36cb890b59232d72cf6e7bcb5957c
Submitter: Zuul
Branch: master

commit 7e9922858fc36cb890b59232d72cf6e7bcb5957c
Author: Stephen Ma
Date: Fri Mar 29 09:31:03 2019 -0700

    Execute neutron-vpn-netns-wrapper with rootwrap_config argument

    When neutron uses neutron-rootwrap as the root_helper, add the --rootwrap_config parameter to neutron-vpn-netns-wrapper execution to support environments where rootwrap.conf is not in the default location.

    Closes-Bug: #1822199
    Change-Id: I0a345d1b1815560dc4dd35fa5c9a34055fc9fb08

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1822199

Title:
  neutron-vpn-netns-wrapper not invoked with --rootwrap_config parameter

Status in neutron:
  Fix Released

Bug description:
  The neutron-vpn-netns-wrapper always assumes that rootwrap.conf lives in the default location, /etc/neutron/, because it is not executed with the --rootwrap_config parameter. If rootwrap.conf is not in the default location, execution fails with a message like:

  2019-03-27 18:06:49.176 13642 INFO neutron.common.config [-] /opt/stack/service/neutron/venv/bin/neutron-vpn-netns-wrapper version 13.0.3.dev77
  2019-03-27 18:06:49.177 13642 ERROR neutron_vpnaas.services.vpn.common.netns_wrapper [-] Incorrect configuration file: /etc/neutron/rootwrap.conf: NoOptionError: No option 'filters_path' in section: 'DEFAULT'
  ; Stderr:

  In this case, rootwrap.conf is actually in the non-default directory /opt/stack/service/neutron/etc/. So every neutron-vpn-netns-wrapper execution should include the --rootwrap_config= argument.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1822199/+subscriptions
[Yahoo-eng-team] [Bug 1792710] Re: glance backend is in crazy resize when an image is uploading
PTG May 2nd Cinder/Glance session: Let's not make the size user-settable. Instead, resize in bigger chunks at a time and shrink back after EOF.

** Changed in: glance
   Importance: Undecided => High

** Changed in: glance
    Milestone: None => next

** Also affects: glance-store
   Importance: Undecided
       Status: New

** Changed in: glance-store
   Importance: Undecided => High

** No longer affects: glance

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1792710

Title:
  glance backend is in crazy resize when an image is uploading

Status in Cinder:
  In Progress
Status in glance_store:
  New

Bug description:
  When uploading a volume to glance as an image, the glance server doesn't know the image size, so the backend storage server (such as ceph) needs to resize the image every time it receives a new chunk of data (64K by default). This results in a huge number of resize operations, which hurts performance.

  - Cinder does not calculate the image size and pass the correct size to glance.
  - Glance should allow the client to set the image size.

  This is a known issue that is noted in the driver files for all kinds of backend storage systems:

  In file: glance_store/_drivers/rbd.py, function: add
  In file: glance_store/_drivers/cinder.py, function: add
  In file: glance_store/_drivers/sheepdog.py, function: add

  All of these files contain comments like the one below:

  # If the image size provided is zero we need to do
  # a resize for the amount we are writing. This will
  # be slower so setting a higher chunk size may
  # speed things up a bit.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1792710/+subscriptions
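The PTG proposal (grow in bigger steps, then shrink back after EOF) can be sketched as follows. This is an illustration only, not glance_store code: `FakeImage`, `upload`, and the `resize_step` parameter are hypothetical names, and the in-memory image merely counts how often a resize would be issued to the backend.

```python
class FakeImage:
    """Hypothetical stand-in for a backend image that must be resized
    before data can be written past its current size."""

    def __init__(self):
        self.size = 0
        self.resize_calls = 0
        self.data = bytearray()

    def resize(self, new_size):
        self.resize_calls += 1
        self.size = new_size

    def write(self, chunk, offset):
        assert offset + len(chunk) <= self.size, "write past allocated size"
        self.data[offset:offset + len(chunk)] = chunk


def upload(image, chunks, resize_step):
    """Write chunks of unknown total length, growing the image by whole
    multiples of resize_step and shrinking to the exact length at EOF."""
    offset = 0
    for chunk in chunks:
        needed = offset + len(chunk)
        if needed > image.size:
            # Grow by whole steps so that many chunks share one resize.
            steps = -(-(needed - image.size) // resize_step)  # ceiling division
            image.resize(image.size + steps * resize_step)
        image.write(chunk, offset)
        offset = needed
    if image.size != offset:
        image.resize(offset)  # shrink back to the real size after EOF
    return offset
```

With 100 chunks of 64 KiB, a step equal to the chunk size issues one resize per chunk, whereas an 8 MiB step issues one grow plus one final shrink.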
[Yahoo-eng-team] [Bug 1822262] Re: Unnecessary _fill_provider_mapping call during reschedule when claim fails
Reviewed: https://review.opendev.org/648676
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a9324ad84c6ef9b645643e8087f0f200d0ab1e88
Submitter: Zuul
Branch: master

commit a9324ad84c6ef9b645643e8087f0f200d0ab1e88
Author: Balazs Gibizer
Date: Fri Mar 29 13:46:32 2019 +0100

    Only call _fill_provider_mapping if claim succeeds

    During re-schedule, the conductor takes the next Selection object from the host list and tries to allocate the requested resources on the host in that Selection object. So far, the conductor also tried to find the resource provider mapping for such an allocation even if the resource claim failed. This is unnecessary. This patch makes sure that the mapping is calculated only if the claim succeeds first.

    Change-Id: I9944398c38d11466d27c2a4b24035b26d264b000
    Closes-Bug: #1822262

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1822262

Title:
  Unnecessary _fill_provider_mapping call during reschedule when claim fails

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  During re-schedule, the RequestGroup - RP mapping needs to be recalculated based on the newly selected host. The current implementation unconditionally tries to calculate the mapping even if the resource claim on the new host was unsuccessful [1]. This is wasteful. The mapping can instead be done conditionally, after the claim has succeeded.

  [1] https://github.com/openstack/nova/blob/34a8e8ccf61865504b1a50c1d43a25a57d954119/nova/conductor/manager.py#L670-L680

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1822262/+subscriptions
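The control flow described in the commit message can be sketched roughly as below. This is not the conductor's actual code: `pick_host` and its two callable parameters are hypothetical stand-ins for the resource claim and the mapping step, used only to show the ordering.

```python
def pick_host(selections, claim, fill_provider_mapping):
    """Return the first selection whose claim succeeds, or None.

    `claim` and `fill_provider_mapping` are hypothetical callables that
    stand in for the resource claim and the provider mapping calculation.
    """
    for selection in selections:
        if not claim(selection):
            # Claim failed: skip the mapping work and try the next host.
            continue
        # Only now is it worth computing the mapping for this host.
        fill_provider_mapping(selection)
        return selection
    return None
```

The mapping callable is invoked at most once per call, and never for a host whose claim was rejected.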
[Yahoo-eng-team] [Bug 1827363] [NEW] Additional port list / get_ports() failures when filtering and limiting at the same time
Public bug reported:

When running an openstack port list that filters on a fixed-ip/subnet and at the same time limits the number of results, neutron returns a 500 Internal Server Error.

This was already addressed in https://bugs.launchpad.net/neutron/+bug/1826186, but the bug is also present in other places.

While running tempest against a Neutron Queens installation, I came across another _get_ports_query() in neutron/plugins/ml2/plugin.py where filter is again called on the result of an already limited query. See https://github.com/openstack/neutron/blob/6f4962dcf89aebf2552ee8ec0993c6389a953024/neutron/plugins/ml2/plugin.py#L2206

InvalidRequestError: Query.filter() being called on a Query which already has LIMIT or OFFSET applied. To modify the row-limited results of a Query, call from_self() first. Otherwise, call filter() before limit() or offset() are applied.
  File "pecan/core.py", line 683, in __call__
    self.invoke_controller(controller, args, kwargs, state)
[...]
  File "neutron/db/db_base_plugin_v2.py", line 1417, in get_ports
    page_reverse=page_reverse)
  File "neutron/plugins/ml2/plugin.py", line 1941, in _get_ports_query
    query = query.filter(substr_filter)
  File "", line 2, in filter
  File "sqlalchemy/orm/base.py", line 200, in generate
    assertion(self, fn.__name__)
  File "sqlalchemy/orm/query.py", line 435, in _no_limit_offset
    % (meth, meth)

I applied a patch similar to the one Gabriele Cerami proposed in https://review.opendev.org/#/c/656066/ on our production setup, and this seems to have fixed the bug there as well.

When grepping for _get_ports_query() in the neutron codebase, I also find a function with this name being called in neutron/db/dvr_mac_db.py in get_ports_on_host_by_subnet(). I do not have a stack trace or test for that, though. See https://github.com/openstack/neutron/blob/6f4962dcf89aebf2552ee8ec0993c6389a953024/neutron/db/dvr_mac_db.py#L162

** Affects: neutron
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1827363

Title:
  Additional port list / get_ports() failures when filtering and limiting at the same time

Status in neutron:
  New
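The SQLAlchemy error quoted in this report exists because "filter, then limit" and "limit, then filter" are different queries. The sketch below shows the difference in plain SQL using Python's built-in sqlite3 module. The `ports` table and its rows are made up for illustration and have nothing to do with neutron's actual schema or query code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ports (id INTEGER PRIMARY KEY, subnet TEXT)")
conn.executemany(
    "INSERT INTO ports (id, subnet) VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "b")],
)

# Filter first, then limit: up to two ports from subnet "b".
filter_then_limit = conn.execute(
    "SELECT id FROM ports WHERE subnet = ? ORDER BY id LIMIT 2", ("b",)
).fetchall()

# Limit first, then filter: the first two ports overall, none of which
# belongs to subnet "b".
limit_then_filter = conn.execute(
    "SELECT id FROM (SELECT id, subnet FROM ports ORDER BY id LIMIT 2) "
    "WHERE subnet = ? ORDER BY id",
    ("b",),
).fetchall()

print(filter_then_limit)  # [(3,), (4,)]
print(limit_then_filter)  # []
```

Applying every filter before the limit yields the first form, which is what a paginated port listing needs.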
[Yahoo-eng-team] [Bug 1827342] [NEW] Issue sharing an image with another project (something related to get_image_location)
Public bug reported:

I have a small Rocky installation where Glance is configured with 2 backends (old images use the 'file' backend, while new ones use the rbd backend, which is the default).

show_multiple_locations is true, but I have modified the _image_location policies. The policy.json file in use is attached.

If I try, as a regular non-admin user, to share a private image with another project, I get an error message:

[sgaravat@lxsgaravat ~]$ glance member-list --image-id 3a4763d0-aa49-4389-9b8b-163206a8d671
+----------+-----------+--------+
| Image ID | Member ID | Status |
+----------+-----------+--------+
+----------+-----------+--------+
[sgaravat@lxsgaravat ~]$ openstack image add project 3a4763d0-aa49-4389-9b8b-163206a8d671 e81df4c0b493439abb8b85bfd4cbe071
403 Forbidden: Not allowed to create members for image 3a4763d0-aa49-4389-9b8b-163206a8d671. (HTTP 403)

But the operation actually succeeded:

[sgaravat@lxsgaravat ~]$ glance member-list --image-id 3a4763d0-aa49-4389-9b8b-163206a8d671
+--------------------------------------+----------------------------------+---------+
| Image ID                             | Member ID                        | Status  |
+--------------------------------------+----------------------------------+---------+
| 3a4763d0-aa49-4389-9b8b-163206a8d671 | e81df4c0b493439abb8b85bfd4cbe071 | pending |
+--------------------------------------+----------------------------------+---------+
[sgaravat@lxsgaravat ~]$

This is what I see in the log file:

/var/log/glance/api.log:2019-05-02 10:01:57.069 8236 INFO eventlet.wsgi.server [req-7c7caee4-06cc-43f8-9716-a5e1a4a34d77 ab573ba3ea014b778193b6922e6d ee1865a76440481cbcff08544c7d580a - default \
default] 193.205.157.174,192.168.60.229 - - [02/May/2019 10:01:57] "GET /v2/images/3a4763d0-aa49-4389-9b8b-163206a8d671 HTTP/1.1" 200 991 0.628997
/var/log/glance/api.log:2019-05-02 10:01:57.199 8223 WARNING glance.api.v2.image_members [req-9aa61dda-012b-415c-b1c9-4ca2c90c8493 ab573ba3ea014b778193b6922e6d ee1865a76440481cbcff08544c7d580a \
- default default] Not allowed to create members for image 3a4763d0-aa49-4389-9b8b-163206a8d671.: Forbidden: You are not authorized to complete get_image_location action.
/var/log/glance/api.log:2019-05-02 10:01:57.202 8223 INFO eventlet.wsgi.server [req-9aa61dda-012b-415c-b1c9-4ca2c90c8493 ab573ba3ea014b778193b6922e6d ee1865a76440481cbcff08544c7d580a - default \
default] 193.205.157.174,192.168.60.229 - - [02/May/2019 10:01:57] "POST /v2/images/3a4763d0-aa49-4389-9b8b-163206a8d671/members HTTP/1.1" 403 408 0.084475
/var/log/glance/api.log:2019-05-02 10:02:03.599 8238 INFO eventlet.wsgi.server [req-c807bbd7-924c-4d75-aea2-12da525f50ff ab573ba3ea014b778193b6922e6d ee1865a76440481cbcff08544c7d580a - default \
default] 193.205.157.174,192.168.60.229 - - [02/May/2019 10:02:03] "GET /v2/images/3a4763d0-aa49-4389-9b8b-163206a8d671/members HTTP/1.1" 200 472 0.487064

I have also attached the output of "openstack image show 3a4763d0-aa49-4389-9b8b-163206a8d671" issued by this non-admin user.

** Affects: glance
     Importance: Undecided
         Status: New

** Attachment added: "image-show-regular.txt"
   https://bugs.launchpad.net/bugs/1827342/+attachment/5260796/+files/image-show-regular.txt

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1827342

Title:
  Issue sharing an image with another project (something related to get_image_location)

Status in Glance:
  New