[Yahoo-eng-team] [Bug 2012530] Re: nova-scheduler will crash at startup if placement is not up
Reviewed:  https://review.opendev.org/c/openstack/nova/+/878238
Committed: https://opendev.org/openstack/nova/commit/d37cca361a4d575311318cb870da40079eb1617c
Submitter: "Zuul (22348)"
Branch:    master

commit d37cca361a4d575311318cb870da40079eb1617c
Author: Dan Smith
Date:   Wed Mar 22 08:20:58 2023 -0700

    Make scheduler lazy-load the placement client

    Like we did for conductor, this makes the scheduler lazy-load the
    placement client instead of only doing it during __init__. This
    avoids a startup crash if keystone or placement are not available,
    but retains startup failures for other problems and errors likely
    to be a result of misconfigurations.

    Closes-Bug: #2012530
    Change-Id: I42ed876b84d80536e83d9ae01696b0a64299c9f7

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2012530

Title:
  nova-scheduler will crash at startup if placement is not up

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  This is the same problem as
  https://bugs.launchpad.net/nova/+bug/1846820 but for scheduler.
  Because we initialize our placement client during manager init, we
  will crash (and loop) on startup if keystone or placement are down.

  Example trace:

  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova.scheduler.client.report [None req-edf5-6f86-4910-a458-72decae8e451 None None] Failed to initialize placement client (is keystone available?): openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions.
  Mar 22 15:54:39 jammy nova-scheduler[119746]: CRITICAL nova [None req-edf5-6f86-4910-a458-72decae8e451 None None] Unhandled error: openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions.
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova Traceback (most recent call last):
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/usr/local/bin/nova-scheduler", line 10, in <module>
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     sys.exit(main())
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/cmd/scheduler.py", line 47, in main
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     server = service.Service.create(
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/service.py", line 252, in create
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     service_obj = cls(host, binary, topic, manager,
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/service.py", line 116, in __init__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     self.manager = manager_class(host=self.host, *args, **kwargs)
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/manager.py", line 70, in __init__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     self.placement_client = report.report_client_singleton()
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/client/report.py", line 91, in report_client_singleton
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     PLACEMENTCLIENT = SchedulerReportClient()
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/client/report.py", line 234, in __init__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     self._client = self._create_client()
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/client/report.py", line 277, in _create_client
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     client = self._adapter or utils.get_sdk_adapter('placement')
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/utils.py", line 984, in get_sdk_adapter
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     return getattr(conn, service_type)
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", line 87, in __get__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     proxy = self._make_proxy(instance)
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", line 266, in _make_proxy
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova     raise exceptions.NotSupported(
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported
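The commit message above describes the fix: build the placement client on first use rather than in `__init__`. A minimal sketch of that lazy-load pattern, using hypothetical names (nova's real fix lives in nova/scheduler/manager.py and report.py; this only illustrates the pattern the commit message describes):

```python
class SchedulerManager:
    """Sketch of lazy-loading an external-service client (illustrative)."""

    def __init__(self):
        # Deliberately do NOT create the client here: if keystone or
        # placement were down, __init__ would raise and the service
        # would crash-loop at startup.
        self._placement_client = None

    @property
    def placement_client(self):
        # Created on first access, then cached; an outage now surfaces
        # when a request actually needs placement, not at startup.
        if self._placement_client is None:
            self._placement_client = self._create_client()
        return self._placement_client

    def _create_client(self):
        # Stand-in for report.report_client_singleton(), which may raise
        # (e.g. openstack.exceptions.NotSupported) when the placement
        # endpoint cannot be discovered.
        return object()
```

Startup can then succeed even while placement is unreachable, and misconfiguration errors still surface on the first scheduling request that touches the client.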
[Yahoo-eng-team] [Bug 2015090] Re: neutron.agent.dhcp.agent TypeError: 'bool' object is not subscriptable
** Description changed:

- [Impact]
-
- In a fresh jammy-antelope environment the metadata service is not
- available to the overcloud deployed instances, this is an environment
- using neutron-openvswitch environment (not OVN), when looking into the
- logs the stacktrace below is suspicious. The metadata service becomes
- available when restarting the services (e.g. juju config neutron-
- openvswitch debug=True)
-
- [Test Case]
-
- ```
- git clone https://opendev.org/openstack/charm-neutron-openvswitch
- cd charm-neutron-openvswitch
- git review -d https://review.opendev.org/c/openstack/charm-neutron-openvswitch/+/873819
- tox -e build  # make sure charmcraft-2.1 is installed before
- tox -e func-target -- jammy-antelope
- ```
+ In a fresh environment running Antelope (on top of Ubuntu 22.04), with a
+ DVR configuration (neutron-dhcp-agent and neutron-metadata-agent running
+ on compute nodes), the metadata service is not available to instances.
+ This is an environment using neutron-openvswitch (not OVN); when looking
+ into the logs, the stacktrace below indicates an unexpected data type.
[Stacktrace]

2023-03-31 19:35:06.093 58625 DEBUG neutron.agent.dhcp.agent [-] Calling driver for network: 6d246d86-11b5-4d5f-aa9c-c2bcbcc28b62/seg=None action: get_metadata_bind_interface _call_driver /usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py:242
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent [-] 'bool' object is not subscriptable: TypeError: 'bool' object is not subscriptable
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent Traceback (most recent call last):
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/common/utils.py", line 182, in call
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     return func(*args, **kwargs)
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 434, in safe_configure_dhcp_for_network
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     self.configure_dhcp_for_network(network)
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/osprofiler/profiler.py", line 159, in wrapper
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     result = f(*args, **kwargs)
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 447, in configure_dhcp_for_network
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     self.update_isolated_metadata_proxy(network)
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/osprofiler/profiler.py", line 159, in wrapper
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     result = f(*args, **kwargs)
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 763, in update_isolated_metadata_proxy
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     self.enable_isolated_metadata_proxy(network)
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/osprofiler/profiler.py", line 159, in wrapper
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     result = f(*args, **kwargs)
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 819, in enable_isolated_metadata_proxy
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     metadata_driver.MetadataDriver.spawn_monitored_metadata_proxy(
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/metadata/driver.py", line 244, in spawn_monitored_metadata_proxy
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     ip_lib.IpAddrCommand(
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/linux/ip_lib.py", line 609, in wait_until_address_ready
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     common_utils.wait_until_true(
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/common/utils.py", line 744, in wait_until_true
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     while not predicate():
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/linux/ip_lib.py", line 594, in is_address_ready
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent     addr_info = self.list(to=address)[0]
2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File
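The final frame above, `addr_info = self.list(to=address)[0]`, raises because the value being indexed is a bool rather than a list of address dicts. A minimal reproduction of the failure mode plus a defensive guard (the helper name is hypothetical and is not neutron's actual fix):

```python
def first_address(addr_list):
    # Hypothetical guard: fail with a clear message instead of the
    # opaque "'bool' object is not subscriptable".
    if not isinstance(addr_list, list):
        raise TypeError(
            f"expected a list of address dicts, got {addr_list!r}")
    return addr_list[0] if addr_list else None


# Reproduction: indexing a bool is exactly what the traceback shows.
try:
    False[0]
except TypeError as exc:
    assert "subscriptable" in str(exc)
```

This is why the agent log shows a TypeError rather than a timeout: the lower-level call returned `False` (a failure flag) where a list was expected, and the `[0]` subscript then blew up.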
[Yahoo-eng-team] [Bug 2015728] [NEW] ovs/ovn source(with master branch) deployments broken
Public bug reported:

With [1], ovn/ovs jobs running with OVS_BRANCH=master, OVN_BRANCH=main
are broken, failing as below:

utilities/ovn-dbctl.c: In function ‘server_loop’:
utilities/ovn-dbctl.c:1105:5: error: too few arguments to function ‘daemonize_start’
 1105 |     daemonize_start(false);
      |     ^~~
In file included from utilities/ovn-dbctl.c:22:
/opt/stack/ovs/lib/daemon.h:170:6: note: declared here
  170 | void daemonize_start(bool access_datapath, bool access_hardware_ports);
      |      ^~~
make[1]: *** [Makefile:2374: utilities/ovn-dbctl.o] Error 1
make[1]: *** Waiting for unfinished jobs

Example failure:
https://zuul.openstack.org/build/b7b1700e2e5941f7a52b57ca411db722

Builds:
- https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ipv6-only-ovs-master
- https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-full-multinode-ovs-master
- https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-master
- https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-master-centos-9-stream
- https://zuul.openstack.org/builds?job_name=ovn-octavia-provider-functional-master
- https://zuul.openstack.org/builds?job_name=ovn-octavia-provider-tempest-master

Until the ovn main branch is adapted to this change, we need to pin
OVS_BRANCH to a working commit or, better, a stable branch (as done
with [2]). I also noticed some of these jobs running on neutron and
ovn-octavia stable branches, where they likely need not run, so that
should be checked and cleaned up.

[1] https://github.com/openvswitch/ovs/commit/07cf5810de
[2] https://github.com/ovn-org/ovn/commit/b61e819bf9673

** Affects: neutron
   Importance: High
   Assignee: yatin (yatinkarel)
   Status: Confirmed

** Tags: gat

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => Critical

** Changed in: neutron
   Assignee: (unassigned) => yatin (yatinkarel)

** Tags added: gat

** Changed in: neutron
   Importance: Critical => High

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2015728

Title:
  ovs/ovn source(with master branch) deployments broken

Status in neutron:
  Confirmed
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2015728/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
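As a footnote on the pinning workaround suggested in bug 2015728 above, a hedged sketch of a devstack-style override: OVS_BRANCH and OVN_BRANCH are the variables the bug report itself names, but the branch value shown is only an illustrative choice of a stable OVS branch, not a verified known-good pin.

```
# Illustrative pin: build OVS from a stable branch (or a known-good
# commit SHA) instead of master until ovn's main branch adapts to the
# daemonize_start() signature change.
OVS_BRANCH=branch-3.1
OVN_BRANCH=main
```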