[Yahoo-eng-team] [Bug 1770527] Re: add volume fails over 26vols and returns 500 API error with libvirt driver
Reviewed:  https://review.openstack.org/632904
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6489f2d2b44827d133dad9a3bb52436ee304a934
Submitter: Zuul
Branch:    master

commit 6489f2d2b44827d133dad9a3bb52436ee304a934
Author: melanie witt
Date:   Fri Jan 18 16:30:40 2019 +

    Raise 403 instead of 500 error from attach volume API

    Currently, the libvirt driver limits the maximum number of disk
    devices allowed to attach to a single instance to 26. If a user
    attempts to attach a volume which would make the total number of
    attached disk devices for the instance exceed 26, the user receives
    a 500 error from the API.

    This adds a new exception type TooManyDiskDevices, raises it for
    the "No free disk device names" condition instead of InternalError,
    and handles it in the attach volume API. We raise TooManyDiskDevices
    directly from the libvirt driver because InternalError is ambiguous
    and can be raised for different error reasons within the same method
    call.

    Closes-Bug: #1770527
    Change-Id: I1b08ed6826d7eb41ecdfc7102e5e8fcf3d1eb2e1

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1770527

Title:
  add volume fails over 26 vols and returns 500 API error with libvirt
  driver

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========
  "openstack server add volume" fails once an instance already has 26
  volumes attached.

  Steps to reproduce
  ==================
  * I executed these openstack commands:

    # create an instance with a single volume, named sles15rc
    # openstack server add volume sles15rc vol2
    # openstack server add volume sles15rc vol3
    :
    # openstack server add volume sles15rc vol26
    # openstack server add volume sles15rc vol27
    Unexpected API Error. Please report this at
    http://bugs.launchpad.net/nova/ and attach the Nova API log if
    possible.
    (HTTP 500) (Request-ID: req-d95fea94-31fe-4063-9262-a84088cbaf29)

  Expected result
  ===============
  This command should succeed:

    # openstack server add volume sles15rc vol27

  and the instance should get a '/dev/vdaa' device volume.

  Actual result
  =============
    # openstack server add volume sles15rc vol26

  The instance gets a volume named /dev/vdz. Next:

    # openstack server add volume sles15rc vol27
    Unexpected API Error. Please report this at
    http://bugs.launchpad.net/nova/ and attach the Nova API log if
    possible. (HTTP 500) (Request-ID: req-d95fea94-31fe-4063-9262-a84088cbaf29)

  Doing the same with KVM-based commands works fine:

    # virsh attach-device instance-001e ~/vol27.xml
    # virsh attach-device instance-001e ~/vol28.xml

  and the instance gets the volumes vdaa and vdab, so it is a nova
  limitation problem. I made a concept blueprint:
  https://blueprints.launchpad.net/nova/+spec/nova-improvement-of-maximum-attach-volumes-more-than-26-vols

  Environment
  ===========
  SOC7 (SUSE OpenStack Cloud version 7, openstack-newton)

  Logs & Configs
  ==============
  The Nova API returned this result:

  NovaException_Remote: No free disk device names for prefix 'vd'

  2018-05-11 08:51:11.602 3667 INFO nova.osapi_compute.wsgi.server [req-6cf5fdbf-9681-445b-8117-a61acd4057f4 2e2bd0e5b23c4665a5a78d710e2bbeb5 b516d75b77bf4dc2b705d680487f6f19 - default default] 10.19.3.70 "POST /v2.1/b516d75b77bf4dc2b705d680487f6f19/servers/6a324dc8-d2ea-4eba-9870-e2ada1cb2bf4/os-volume_attachments HTTP/1.1" status: 200 len: 522 time: 0.2834640
  2018-05-11 08:51:11.716 3668 INFO nova.osapi_compute.wsgi.server [-] 127.0.0.1 "GET / HTTP/1.1" status: 200 len: 491 time: 0.0007210
  2018-05-11 08:51:13.880 3668 INFO nova.api.openstack.wsgi [req-c7fc0e7a-13ec-48e7-947c-cd22232aa31f 2e2bd0e5b23c4665a5a78d710e2bbeb5 b516d75b77bf4dc2b705d680487f6f19 - default default] HTTP exception thrown: Instance sles15rc could not be found.
  2018-05-11 08:51:13.881 3668 INFO nova.osapi_compute.wsgi.server [req-c7fc0e7a-13ec-48e7-947c-cd22232aa31f 2e2bd0e5b23c4665a5a78d710e2bbeb5 b516d75b77bf4dc2b705d680487f6f19 - default default] 10.19.3.70 "GET /v2.1/b516d75b77bf4dc2b705d680487f6f19/servers/sles15rc HTTP/1.1" status: 404 len: 432 time: 0.0449822
  2018-05-11 08:51:13.887 3668 INFO nova.api.openstack.wsgi [req-7cbf4817-8810-4c30-8e09-8b106f66dee4 2e2bd0e5b23c4665a5a78d710e2bbeb5 b516d75b77bf4dc2b705d680487f6f19 - default default] HTTP exception thrown: Instance sles15rc could not be found.
  2018-05-11 08:51:13.889 3668 INFO nova.osapi_compute.wsgi.server [req-7cbf4817-8810-4c30-8e09-8b106f66dee4 2e2bd0e5b23c4665a5a78d710e2bbeb5 b516d75b77bf4dc2b705d680487f6f19 - default default] 10.19.3.70 "GET /v2.1/b516d75b77bf4dc2b705d680487f6f19/servers/sles15rc HTTP/1.1" status: 404 len: 432 time: 0.0053861
  2018-05-11 08:51:13.929 3668 INFO nova.osapi_compute.wsgi.server [re
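The 26-device cap in the message above comes from the libvirt driver generating only single-letter device name suffixes ('vda' through 'vdz'). A minimal sketch of that naming scheme and of the exception the fix introduces; illustrative only, this is not nova's actual code:

```python
import string

class TooManyDiskDevices(Exception):
    """Stand-in for the exception type the fix adds to nova."""

def next_dev_name(prefix, used):
    # The libvirt driver historically generates only one-letter device
    # suffixes, so a 'vd' prefix allows vda..vdz: at most 26 devices.
    for letter in string.ascii_lowercase:
        name = prefix + letter
        if name not in used:
            return name
    # Before the fix, this condition surfaced as InternalError and
    # therefore as an HTTP 500; a dedicated exception type lets the
    # attach volume API map it to a 403 instead.
    raise TooManyDiskDevices(
        "No free disk device names for prefix '%s'" % prefix)

used = set()
for _ in range(26):          # vda through vdz all succeed
    used.add(next_dev_name('vd', used))
```

Attaching a 27th device then raises TooManyDiskDevices instead of leaking an ambiguous InternalError up to the API layer.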
[Yahoo-eng-team] [Bug 1812110] Re: Detaching pre-existing port fails to remove user's dns record
Reviewed:  https://review.openstack.org/631684
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1b797f6f7e99fdef380340d6fe29e4004be48781
Submitter: Zuul
Branch:    master

commit 1b797f6f7e99fdef380340d6fe29e4004be48781
Author: Hang Yang
Date:   Thu Jan 17 16:57:56 2019 -0800

    Fix port dns_name reset

    When an external DNS service is enabled, use the user's context to
    request the dns_name reset instead of the admin context. The DNS
    record needs to be found in the user's zone and recordset.

    Change-Id: I35335b501f8961b9ac8e5f92e0686e402b78617b
    Closes-Bug: #1812110

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1812110

Title:
  Detaching pre-existing port fails to remove user's dns record

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Environment:
  Queens: Nova, Neutron, Designate (external DNS)
  Network has dns_domain set

  Steps to reproduce:
  1. Create a port from the network.
  2. Create an instance with the port attached; dns_name in the port
     will be updated and a record will be created in the user's
     recordset.
  3. Delete the instance.
  4. Check the port: the dns_name is reset to empty, but the DNS record
     under the user's zone is not removed.

  This also affects the usage of orchestration components like
  Senlin/Heat, since they usually create a port separately and then
  create an instance with the port.

  This bug was filed in Neutron before and has more debug information:
  https://bugs.launchpad.net/neutron/+bug/1741079

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1812110/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
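The essence of the fix above is which context is used for the external DNS lookup: the recordset lives in the user's project-scoped zone, so an admin-context query misses it and the record leaks. A rough sketch of the idea with hypothetical names (this is not the actual nova/neutron/Designate API):

```python
class ExternalDnsClient:
    """Hypothetical stand-in for an external DNS (Designate) client.
    Records are scoped per project, just as zones/recordsets are."""

    def __init__(self):
        self.records = {}  # {(project_id, name): record}

    def find_record(self, context, name):
        return self.records.get((context["project_id"], name))

    def delete_record(self, context, name):
        self.records.pop((context["project_id"], name), None)

def reset_dns_name(port, context, dns):
    # Before the fix an admin context was passed in here; the record
    # lives in the *user's* zone, so the lookup came back empty and the
    # record was never deleted. Using the user's context finds it.
    if dns.find_record(context, port["dns_name"]):
        dns.delete_record(context, port["dns_name"])
    port["dns_name"] = ""
```

With a user context the record is found and removed along with the dns_name reset; with an admin context the lookup silently misses it, which is exactly the leak described in the bug.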
[Yahoo-eng-team] [Bug 1813642] Re: Deprecated config on rocky's installation tutorial
Sorry, I was wrong and everything is OK with the documentation. I am
marking this bug report as invalid. Sorry again.

** Changed in: glance
   Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1813642

Title:
  Deprecated config on rocky's installation tutorial

Status in Glance:
  Invalid

Bug description:
  This bug tracker is for errors with the documentation; use the
  following as a template and remove or add fields as you see fit.
  Convert [ ] into [x] to check boxes:

  - [x] This doc is inaccurate in this way: deprecated options
  - [ ] This is a doc addition request.
  - [ ] I have a fix to the document that I can paste below, including
        example input and output.

  This tutorial
  (https://docs.openstack.org/glance/rocky/install/install-ubuntu.html#install-and-configure-components)
  for installing glance says to modify options which my configuration
  template reports as deprecated, for example in
  /etc/glance/glance-api.conf:

  # DEPRECATED: The URL to the keystone service. If "use_user_token" is not in
  # effect and using keystone auth, then URL of keystone can be specified.
  # (string value)
  # This option is deprecated for removal.
  # Its value may be silently ignored in the future.
  # Reason: This option was considered harmful and has been deprecated in M
  # release. It will be removed in O release. For more information read
  # OSSN-0060. Related functionality with uploading big images has been
  # implemented with Keystone trusts support.
  #auth_url =

  The same thing happens with the two parameters username and password,
  and other options do not even appear in the config file at all, such
  as project_domain_name, user_domain_name and project_name.

  Using:
  Linux controller 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:43:28
  UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
  Glance 2.9.1

  Should I ignore these warnings and proceed with the installation?
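For context on why the report was invalid: the options the install guide sets belong in the [keystone_authtoken] section, which is consumed by keystonemiddleware rather than glance itself, so they are distinct from the deprecated top-level auth_url and may not show up among glance's own sample options. A hedged example of the kind of section the Rocky guide describes (host names and the password are placeholders):

```ini
# /etc/glance/glance-api.conf -- excerpt, placeholder values
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = glance
password = GLANCE_PASS

[paste_deploy]
flavor = keystone
```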
  Best regards

  ---
  Release: on 2019-01-23 02:42
  SHA: 206deb9b8b6851babb30dc610dba28f4ae2880b3
  Source: https://git.openstack.org/cgit/openstack/glance/tree/doc/source/install/install-ubuntu.rst
  URL: https://docs.openstack.org/glance/rocky/install/install-ubuntu.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1813642/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1813812] [NEW] expecting metadata server is reachable on the lexical first interface
Public bug reported:

In OpenStack it is possible to configure a network setup such that the
metadata server is accessible via a "secondary" NIC, for example eth1.
In such a setup cloud-init fails to locate the proper NIC to access the
metadata server.

** Affects: cloud-init
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1813812

Title:
  expecting metadata server is reachable on the lexical first interface

Status in cloud-init:
  New

Bug description:
  In OpenStack it is possible to configure a network setup such that
  the metadata server is accessible via a "secondary" NIC, for example
  eth1. In such a setup cloud-init fails to locate the proper NIC to
  access the metadata server.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1813812/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
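One way to avoid guessing the "lexically first" interface is to ask the kernel which interface actually routes to the metadata address. A minimal sketch of that idea (not cloud-init's actual code; the metadata IP here is the conventional OpenStack/EC2 link-local address):

```python
import socket

METADATA_IP = "169.254.169.254"  # conventional metadata service address

def local_ip_for(dest_ip, port=80):
    """Return the local source IP the kernel would pick to reach dest_ip.

    connect() on a UDP socket sends no packets; it only performs route
    selection, so this is safe even when the destination is unreachable
    right now. The returned local IP identifies the NIC that holds the
    route, whatever its name sorts to lexically.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect((dest_ip, port))
        return s.getsockname()[0]
    finally:
        s.close()
```

Mapping the returned local IP back to an interface name (e.g. via the addresses configured on each NIC) then yields the correct device, whether that is eth0 or a "secondary" eth1.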
[Yahoo-eng-team] [Bug 1813817] [NEW] Max retries exceeded with StaleDataError in standardattributes during live migration
Public bug reported:

I am observing StaleDataError on stable/pike during live migration,
causing live migration to fail. It occurs when attempting to
live-migrate a handful of VMs (5-6 VMs is all it takes) in rapid
succession from the same source to the same target.

This quick and dirty script is able to make the issue appear reliably:

  for i in `openstack server list --all-projects --host -c ID -f value`; do openstack server migrate $i --live ; done

From the neutron server logs:

DB exceeded retry limit.: StaleDataError: UPDATE statement on table
'standardattributes' expected to update 1 row(s); 0 were matched.

2019-01-24 09:57:34.959 255478 ERROR oslo_db.api Traceback (most recent call last):
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/oslo_db/api.py", line 138, in wrapper
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     return f(*args, **kwargs)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/neutron/db/api.py", line 128, in wrapped
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     LOG.debug("Retry wrapper got retriable exception: %s", e)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     self.force_reraise()
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     six.reraise(self.type_, self.value, self.tb)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/neutron/db/api.py", line 124, in wrapped
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     return f(*dup_args, **dup_kwargs)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 1346, in update_port
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     mech_context, attrs)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 354, in _process_port_binding
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     db.clear_binding_levels(plugin_context, port_id, original_host)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 979, in wrapper
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     return fn(*args, **kwargs)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     self.gen.next()
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 1029, in _transaction_scope
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     yield resource
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     self.gen.next()
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 655, in _session
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     self.session.flush()
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2171, in flush
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     self._flush(objects)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2291, in _flush
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     transaction.rollback(_capture_exception=True)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     compat.reraise(exc_type, exc_value, exc_tb)
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2255, in _flush
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api     flush_context.execute()
2019-01-24 09:57:34.959 255478 ERROR oslo_db.api   File "/opt/stack/venv/neutron-20181030T130300Z/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 389, in execut
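The StaleDataError in the (truncated) traceback above is SQLAlchemy's optimistic-locking failure: neutron bumps standardattributes.revision_number with an UPDATE whose WHERE clause pins the old revision, and if a concurrent transaction bumped it first, the statement matches 0 rows. A pure-Python stand-in for that pattern (not neutron's actual code):

```python
class StaleDataError(Exception):
    """Stand-in for sqlalchemy.orm.exc.StaleDataError."""

def bump_revision(row, expected_revision):
    """Emulate
    'UPDATE ... SET revision = :expected + 1
     WHERE id = :id AND revision = :expected'.

    When another writer got there first, the WHERE clause matches no
    rows and SQLAlchemy raises StaleDataError, which is exactly the
    "expected to update 1 row(s); 0 were matched" message in the log.
    """
    if row["revision"] != expected_revision:
        raise StaleDataError(
            "UPDATE statement on table 'standardattributes' "
            "expected to update 1 row(s); 0 were matched.")
    row["revision"] = expected_revision + 1
    return row
```

With rapid parallel live migrations, several port updates race on the same row; the retry decorator absorbs a few of these conflicts, but once the retry limit is exceeded the error propagates and the migration fails.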
[Yahoo-eng-team] [Bug 1813667] Re: Update netplan dependency package
** Tags added: disco packaging

** Changed in: cloud-init (Ubuntu)
   Status: New => Confirmed

** Also affects: cloud-init
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1813667

Title:
  Update netplan dependency package

Status in cloud-init:
  New
Status in cloud-init package in Ubuntu:
  Confirmed

Bug description:
  Currently cloud-init has a dependency on nplan [1], which is the
  transitional package for netplan.io [2]. It should really depend on
  netplan.io instead.

  [1] https://packages.ubuntu.com/disco/cloud-init
  [2] https://packages.ubuntu.com/disco/nplan

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1813667/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1813789] [NEW] Evacuate test intermittently fails with network-vif-plugged timeout exception
Public bug reported:

The nova-live-migration job has 2 nodes, and in addition to running live
migration tests it also runs evacuate tests for image-backed and
volume-backed servers.

We are seeing intermittent failures. With debug in this change:
https://review.openstack.org/#/c/571325/ it looks like the
network-vif-plugged event is coming to the API before nova-compute has
registered a callback for that event, so nova-compute does not process
the event and then times out waiting for the event to complete by the
time it is registered.

The API processes the network-vif-plugged for that server here:

http://logs.openstack.org/25/571325/10/gate/nova-live-migration/52e1cd0/logs/screen-n-api.txt.gz#_Jan_29_01_11_49_707004

Jan 29 01:11:49.707004 ubuntu-xenial-rax-ord-0002201755 devstack@n-api.service[22319]: DEBUG nova.api.openstack.wsgi [req-3ed0ada1-7328-4d5a-a3ea-da34dcdb252d req-1f5aeede-83c6-44d5-afd4-b5435e71d61f service nova] Action: 'create', calling method: >, body: {"events": [{"status": "completed", "tag": "e241f79f-fb0d-4961-b0c8-aea9de2755bf", "name": "network-vif-plugged", "server_uuid": "2e82ddcd-75b8-4a41-8ecd-ce175adbdc67"}]} {{(pid=22322) _process_stack /opt/stack/new/nova/nova/api/openstack/wsgi.py:520}}
Jan 29 01:11:49.765687 ubuntu-xenial-rax-ord-0002201755 devstack@n-api.service[22319]: INFO nova.api.openstack.compute.server_external_events [req-3ed0ada1-7328-4d5a-a3ea-da34dcdb252d req-1f5aeede-83c6-44d5-afd4-b5435e71d61f service nova] Creating event network-vif-plugged:e241f79f-fb0d-4961-b0c8-aea9de2755bf for instance 2e82ddcd-75b8-4a41-8ecd-ce175adbdc67 on ubuntu-xenial-rax-ord-0002201758
Jan 29 01:11:49.776578 ubuntu-xenial-rax-ord-0002201755 devstack@n-api.service[22319]: DEBUG nova.compute.api [req-3ed0ada1-7328-4d5a-a3ea-da34dcdb252d req-1f5aeede-83c6-44d5-afd4-b5435e71d61f service nova] Instance 2e82ddcd-75b8-4a41-8ecd-ce175adbdc67 is migrating, copying events to all relevant hosts: set([u'ubuntu-xenial-rax-ord-0002201758', u'ubuntu-xenial-rax-ord-0002201755']) {{(pid=22322) _get_relevant_hosts /opt/stack/new/nova/nova/compute/api.py:4818}}
Jan 29 01:11:49.786947 ubuntu-xenial-rax-ord-0002201755 devstack@n-api.service[22319]: INFO nova.api.openstack.requestlog [req-3ed0ada1-7328-4d5a-a3ea-da34dcdb252d req-1f5aeede-83c6-44d5-afd4-b5435e71d61f service nova] 10.210.224.40 "POST /compute/v2.1/os-server-external-events" status: 200 len: 183 microversion: 2.1 time: 0.084393

Which is weird, because that's before the vif plugging timeout in the
compute logs. From the compute logs, we plugged the vif here:

Jan 29 01:11:55.571807 ubuntu-xenial-rax-ord-0002201755 nova-compute[15903]: INFO os_vif [None req-252304a1-6eff-4ff3-aa4d-b4e0ab87601c demo admin] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:66:03:76,bridge_name='br-int',has_traffic_filtering=True,id=e241f79f-fb0d-4961-b0c8-aea9de2755bf,network=Network(2812d9e2-bfa8-4397-9a81-5dab9fe5a03e),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tape241f79f-fb')

So it looks like maybe we're getting the network-vif-plugged event
before we're ready for it, and so we miss the event and then time out?

That's pretty weird though, because the libvirt driver's spawn method is
what should be registering and waiting for the vif plugged event:
https://github.com/openstack/nova/blob/c134feda3d9527dbc9735e4ae9cd35c4782f1fb4/nova/virt/libvirt/driver.py#L5673

Looks like the compute does register the event callback for
network-vif-plugged here:

Jan 29 01:11:55.567189 ubuntu-xenial-rax-ord-0002201755 nova-compute[15903]: DEBUG nova.compute.manager [None req-252304a1-6eff-4ff3-aa4d-b4e0ab87601c demo admin] [instance: 2e82ddcd-75b8-4a41-8ecd-ce175adbdc67] Preparing to wait for external event network-vif-plugged-e241f79f-fb0d-4961-b0c8-aea9de2755bf {{(pid=15903) prepare_for_instance_event /opt/stack/new/nova/nova/compute/manager.py:325}}

Which is too late:

Jan 29 01:11:49.707004 ubuntu-xenial-rax-ord-0002201755 devstack@n-api.service[22319]: DEBUG nova.api.openstack.wsgi [req-3ed0ada1-7328-4d5a-a3ea-da34dcdb252d req-1f5aeede-83c6-44d5-afd4-b5435e71d61f service nova] Action: 'create', calling method: >, body: {"events": [{"status": "completed", "tag": "e241f79f-fb0d-4961-b0c8-aea9de2755bf", "name": "network-vif-plugged", "server_uuid": "2e82ddcd-75b8-4a41-8ecd-ce175adbdc67"}]} {{(pid=22322) _process_stack /opt/stack/new/nova/nova/api/openstack/wsgi.py:520}}

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Timeout%20waiting%20for%20%5B('network-vif-plugged'%5C%22%20AND%20message%3A%5C%22for%20instance%20with%20vm_state%20error%20and%20task_state%20rebuild_spawning.%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

4 hits in the last 7 days, check and gate, all failures.

** Affects: nova
   Importance: Medium
   Status: Confirmed

** Tags: evacuate gate-failure networking

--
You received this bug notification becaus
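The race described above is the classic "register the waiter before triggering the event" problem: an event delivered before anyone has registered interest is simply lost. A minimal sketch of the safe ordering with plain threading (this is illustrative, not nova's actual external-event machinery):

```python
import threading

class EventWaiter:
    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def prepare(self, name):
        """Register interest in an event BEFORE starting the action
        that causes it; events arriving earlier are dropped."""
        with self._lock:
            return self._events.setdefault(name, threading.Event())

    def deliver(self, name):
        """Fire an incoming event; returns False if nobody had
        registered yet, i.e. the event is lost (the bug's scenario)."""
        with self._lock:
            ev = self._events.get(name)
        if ev is None:
            return False
        ev.set()
        return True
```

In the failing runs, the equivalent of deliver() (the os-server-external-events POST) lands at 01:11:49, while the equivalent of prepare() (prepare_for_instance_event) only runs at 01:11:55, so the waiter blocks until the vif-plug timeout.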
[Yahoo-eng-team] [Bug 1813787] [NEW] [L3] DVR router in compute node was not up but nova port needs its functionality
Public bug reported:

There is a race condition between nova-compute booting an instance and
the l3-agent processing the DVR (local) router on the compute node.
This issue can be seen when a large number of instances are booted on
the same host and the instances are under different DVR routers, so the
l3-agent will concurrently process all these DVR routers on that host
at the same time.

Although we have a green pool with 8 greenlets for the router
ResourceProcessingQueue,
https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py#L642
some of these routers can still be left waiting, and an even worse
problem is that there are time-consuming actions during the router
processing procedure, for instance installing ARP entries, iptables
rules, route rules, etc.

So when the VM is up, it will try to get metadata via the local proxy
hosted by the DVR router, but the router is not ready yet on that host,
and in the end those instances will not be able to set up some of the
configuration in the guest OS.

Some potential solutions:
(1) increase that green pool size
(2) (provisioning) block the VM port from being set to ACTIVE until the
    DVR router is up on that host

** Affects: neutron
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1813787

Title:
  [L3] DVR router in compute node was not up but nova port needs its
  functionality

Status in neutron:
  New

Bug description:
  There is a race condition between nova-compute booting an instance
  and the l3-agent processing the DVR (local) router on the compute
  node. This issue can be seen when a large number of instances are
  booted on the same host and the instances are under different DVR
  routers, so the l3-agent will concurrently process all these DVR
  routers on that host at the same time.

  Although we have a green pool with 8 greenlets for the router
  ResourceProcessingQueue,
  https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py#L642
  some of these routers can still be left waiting, and an even worse
  problem is that there are time-consuming actions during the router
  processing procedure, for instance installing ARP entries, iptables
  rules, route rules, etc.

  So when the VM is up, it will try to get metadata via the local proxy
  hosted by the DVR router, but the router is not ready yet on that
  host, and in the end those instances will not be able to set up some
  of the configuration in the guest OS.

  Some potential solutions:
  (1) increase that green pool size
  (2) (provisioning) block the VM port from being set to ACTIVE until
      the DVR router is up on that host

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1813787/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
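Solution (2) above gates the port's transition to ACTIVE on a set of provisioning entities, so nova only boots the VM once every component, including the DVR router and thus the metadata proxy, reports ready on that host. A toy sketch of that gating pattern with hypothetical names (neutron's real mechanism is its provisioning blocks framework; this is not its actual API):

```python
class Port:
    """Toy port whose status only flips to ACTIVE once every
    registered provisioning entity has reported completion."""

    def __init__(self):
        self.status = "DOWN"
        self._blocks = set()

    def add_provisioning_block(self, entity):
        # e.g. "L2" for the agent wiring and "L3-DVR" for the router.
        self._blocks.add(entity)

    def provisioning_complete(self, entity):
        self._blocks.discard(entity)
        if not self._blocks:
            self.status = "ACTIVE"
```

With an "L3-DVR" block in place, the port stays DOWN (and nova keeps waiting) while the l3-agent is still installing ARP entries and iptables/route rules, instead of the guest booting into a host whose metadata proxy is not ready.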
[Yahoo-eng-team] [Bug 1813198] Re: TestNetworkBasicOps:test_subnet_details intermittently fails with "cat: can't open '/var/run/udhcpc..pid': No such file or directory"
** Changed in: tempest
   Assignee: Matt Riedemann (mriedem) => Bence Romsics (bence-romsics)

** Changed in: tempest
   Importance: Undecided => High

** No longer affects: neutron

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1813198

Title:
  TestNetworkBasicOps:test_subnet_details intermittently fails with
  "cat: can't open '/var/run/udhcpc..pid': No such file or directory"

Status in tempest:
  In Progress

Bug description:
  Seen here:

  http://logs.openstack.org/78/570078/17/check/tempest-slow/161ea32/job-output.txt.gz#_2019-01-24_18_26_22_886987

  2019-01-24 18:26:22.886987 | controller | {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details [125.520364s] ... FAILED
  2019-01-24 18:26:22.887067 | controller |
  2019-01-24 18:26:22.887166 | controller | Captured traceback:
  2019-01-24 18:26:22.887251 | controller | ~~~
  2019-01-24 18:26:22.887370 | controller |     Traceback (most recent call last):
  2019-01-24 18:26:22.887545 | controller |       File "tempest/common/utils/__init__.py", line 89, in wrapper
  2019-01-24 18:26:22.887663 | controller |         return f(*func_args, **func_kwargs)
  2019-01-24 18:26:22.887868 | controller |       File "tempest/scenario/test_network_basic_ops.py", line 629, in test_subnet_details
  2019-01-24 18:26:22.887952 | controller |         renew_delay),
  2019-01-24 18:26:22.888141 | controller |       File "tempest/lib/common/utils/test_utils.py", line 107, in call_until_true
  2019-01-24 18:26:22.888237 | controller |         if func(*args, **kwargs):
  2019-01-24 18:26:22.888443 | controller |       File "tempest/scenario/test_network_basic_ops.py", line 621, in check_new_dns_server
  2019-01-24 18:26:22.888583 | controller |         dhcp_client=CONF.scenario.dhcp_client)
  2019-01-24 18:26:22.888776 | controller |       File "tempest/common/utils/linux/remote_client.py", line 140, in renew_lease
  2019-01-24 18:26:22.888957 | controller |         return getattr(self, '_renew_lease_' + dhcp_client)(fixed_ip=fixed_ip)
  2019-01-24 18:26:22.889161 | controller |       File "tempest/common/utils/linux/remote_client.py", line 116, in _renew_lease_udhcpc
  2019-01-24 18:26:22.889279 | controller |         format(path=file_path, nic=nic_name))
  2019-01-24 18:26:22.889474 | controller |       File "tempest/lib/common/utils/linux/remote_client.py", line 33, in wrapper
  2019-01-24 18:26:22.889595 | controller |         return function(self, *args, **kwargs)
  2019-01-24 18:26:22.889793 | controller |       File "tempest/lib/common/utils/linux/remote_client.py", line 108, in exec_command
  2019-01-24 18:26:22.890231 | controller |         return self.ssh_client.exec_command(cmd)
  2019-01-24 18:26:22.890402 | controller |       File "tempest/lib/common/ssh.py", line 202, in exec_command
  2019-01-24 18:26:22.890520 | controller |         stderr=err_data, stdout=out_data)
  2019-01-24 18:26:22.890848 | controller |     tempest.lib.exceptions.SSHExecCommandFailed: Command 'set -eu -o pipefail; PATH=$PATH:/sbin; cat /var/run/udhcpc..pid', exit status: 1, stderr:
  2019-01-24 18:26:22.891027 | controller |     cat: can't open '/var/run/udhcpc..pid': No such file or directory
  2019-01-24 18:26:22.891068 | controller |
  2019-01-24 18:26:22.891142 | controller |     stdout:

  Looks like the problem is in the file name "udhcpc..pid" -- too many
  extension-separator dots. Maybe something in this code:

  http://git.openstack.org/cgit/openstack/tempest/tree/tempest/common/utils/linux/remote_client.py#n111

    def _renew_lease_udhcpc(self, fixed_ip=None):
        """Renews DHCP lease via udhcpc client. """
        file_path = '/var/run/udhcpc.'
        nic_name = self.get_nic_name_by_ip(fixed_ip)
        pid = self.exec_command('cat {path}{nic}.pid'.
                                format(path=file_path, nic=nic_name))
        pid = pid.strip()
        cmd = 'sudo /bin/kill -{sig} {pid}'.format(pid=pid, sig='USR1')
        self.exec_command(cmd)

  The nic_name must be coming back empty, and that's how we get
  /var/run/udhcpc..pid.
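The double dot is simply an empty nic_name formatted into the path template. A minimal reproduction of the failure mode, plus the obvious guard one could add before building the path (a sketch, not tempest's actual fix):

```python
def udhcpc_pid_path(nic_name):
    """Build the udhcpc pid-file path, refusing an empty NIC name.

    format() happily interpolates an empty string, which is how the
    broken '/var/run/udhcpc..pid' path gets assembled and then fails
    later with a confusing 'No such file or directory' from cat.
    """
    if not nic_name:
        raise ValueError("could not resolve NIC name for fixed IP")
    return '/var/run/udhcpc.{nic}.pid'.format(nic=nic_name)
```

Failing fast here would turn the misleading SSHExecCommandFailed into an error that points straight at get_nic_name_by_ip returning nothing.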
To manage notifications about this bug go to: https://bugs.launchpad.net/tempest/+bug/1813198/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1715270] Re: Remove usage of kwarg retry_on_request in API
** Changed in: tripleo
   Status: In Progress => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1715270

Title:
  Remove usage of kwarg retry_on_request in API

Status in OpenStack Compute (nova):
  Fix Released
Status in tripleo:
  Invalid

Bug description:
  As retry on request is always enabled in oslo.db, the kwarg
  retry_on_request is deprecated for removal in the new release,
  Queens.

  https://bugs.launchpad.net/oslo.db/+bug/1714440
  http://git.openstack.org/cgit/openstack/oslo.db/tree/oslo_db/api.py#n109

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1715270/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1813224] Re: fedora28 standalone failing on tempest
There are lots of warnings showing up in the ovs-vswitchd log:

2019-01-24T18:40:59.070Z|00058|connmgr|INFO|br-ex<->tcp:127.0.0.1:6633: 2 flow_mods 10 s ago (2 adds)
2019-01-24T18:40:59.184Z|00059|connmgr|INFO|br-tun<->tcp:127.0.0.1:6633: 10 flow_mods 10 s ago (10 adds)
2019-01-24T18:46:19.496Z|00060|bridge|INFO|bridge br-int: added interface tapd0b4fd96-39 on port 3
2019-01-24T18:46:19.609Z|00061|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tapd0b4fd96-39 device failed: No such device
2019-01-24T18:46:25.160Z|00062|bridge|INFO|bridge br-int: added interface qr-32fda31e-3a on port 4
2019-01-24T18:46:25.272Z|00063|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on qr-32fda31e-3a device failed: No such device
2019-01-24T18:46:25.645Z|00064|bridge|INFO|bridge br-int: added interface qg-193495cd-3d on port 5
2019-01-24T18:46:25.701Z|00065|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on qg-193495cd-3d device failed: No such device

** Also affects: neutron
   Importance: Undecided
   Status: New

** Changed in: neutron
   Assignee: (unassigned) => Brian Haley (brian-haley)

** Changed in: neutron
   Milestone: None => stein-3

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1813224

Title:
  fedora28 standalone failing on tempest

Status in neutron:
  New
Status in tripleo:
  Triaged

Bug description:
  The fedora28 tempest jobs are failing in check (it's voting but not
  in the gate).
  tempest.scenario.test_network_basic_ops.TestNetworkBasicOps
  tempest.scenario.test_server_basic_ops.TestServerBasicOps
  tempest.scenario.test_minimum_basic.TestMinimumBasicScenario

  http://logs.openstack.org/31/626631/11/check/tripleo-ci-fedora-28-standalone/cf314a4/logs/undercloud/home/zuul/tempest/tempest.html.gz
  http://logs.openstack.org/53/623353/10/check/tripleo-ci-fedora-28-standalone/0841969/logs/tempest.html
  http://logs.openstack.org/56/593056/43/check/tripleo-ci-fedora-28-standalone/106db25/logs/tempest.html
  http://logs.openstack.org/97/631297/3/check/tripleo-ci-fedora-28-standalone/7fe7dc1/logs/tempest.html

  sova is reporting this job to be <80% passing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1813224/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1813754] [NEW] FeatureRequest: Add user name and group to action log
Public bug reported:

Hello,

our users need to know who created each OS instance, but unfortunately
there is no way to get this information from the instance action log,
as it contains the user ID only. Could you please consider adding some
additional information about the user here, such as the user name and
maybe the user group?

Please let me know, and thank you.

Best regards,
Lukas F.

** Affects: horizon
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1813754

Title:
  FeatureRequest: Add user name and group to action log

Status in OpenStack Dashboard (Horizon):
  New

Bug description:
  Hello,

  our users need to know who created each OS instance, but
  unfortunately there is no way to get this information from the
  instance action log, as it contains the user ID only. Could you
  please consider adding some additional information about the user
  here, such as the user name and maybe the user group?

  Please let me know, and thank you.

  Best regards,
  Lukas F.

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1813754/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1813737] [NEW] on edit image form, do not enter the min disk and min ram, no error message shown
Public bug reported:

On the edit image form, if the min disk and min ram fields are left
empty, no error message is shown.

** Affects: horizon
   Importance: Undecided
   Assignee: pengyuesheng (pengyuesheng)
   Status: In Progress

** Changed in: horizon
   Assignee: (unassigned) => pengyuesheng (pengyuesheng)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1813737

Title:
  on edit image form, do not enter the min disk and min ram, no error
  message shown

Status in OpenStack Dashboard (Horizon):
  In Progress

Bug description:
  On the edit image form, if the min disk and min ram fields are left
  empty, no error message is shown.

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1813737/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp