[Yahoo-eng-team] [Bug 2012530] Re: nova-scheduler will crash at startup if placement is not up

2023-04-10 Thread OpenStack Infra
Reviewed: https://review.opendev.org/c/openstack/nova/+/878238
Committed: https://opendev.org/openstack/nova/commit/d37cca361a4d575311318cb870da40079eb1617c
Submitter: "Zuul (22348)"
Branch: master

commit d37cca361a4d575311318cb870da40079eb1617c
Author: Dan Smith 
Date:   Wed Mar 22 08:20:58 2023 -0700

Make scheduler lazy-load the placement client

Like we did for conductor, this makes the scheduler lazy-load the
placement client instead of only doing it during __init__. This avoids
a startup crash if keystone or placement are not available, but
retains startup failures for other problems and errors likely to be
a result of misconfigurations.

Closes-Bug: #2012530
Change-Id: I42ed876b84d80536e83d9ae01696b0a64299c9f7
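
For illustration, a minimal sketch of the lazy-load pattern the commit
describes, with a simplified manager and a client factory standing in for
report.report_client_singleton(); the names below are illustrative
assumptions, not nova's actual code:

```
# Hypothetical sketch of lazy-loading a placement client; the real change
# lives in nova/scheduler/manager.py and uses report.report_client_singleton().

class SchedulerManagerSketch:
    def __init__(self, client_factory):
        # Pre-fix shape: calling client_factory() here would turn any
        # keystone/placement outage into a fatal startup error.
        self._client_factory = client_factory
        self._placement_client = None

    @property
    def placement_client(self):
        # Post-fix shape: the client is created on first use, so a
        # temporary outage fails an individual request rather than
        # crash-looping the whole nova-scheduler service at startup.
        if self._placement_client is None:
            self._placement_client = self._client_factory()
        return self._placement_client


# Startup succeeds even if the factory would currently fail; any error
# only surfaces when the client is first needed.
manager = SchedulerManagerSketch(client_factory=object)
client = manager.placement_client
```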


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2012530

Title:
  nova-scheduler will crash at startup if placement is not up

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  This is the same problem as
  https://bugs.launchpad.net/nova/+bug/1846820 but for scheduler.
  Because we initialize our placement client during manager init, we
  will crash (and loop) on startup if keystone or placement are down.
  Example trace:

  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova.scheduler.client.report [None req-edf5-6f86-4910-a458-72decae8e451 None None] Failed to initialize placement client (is keystone available?): openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions.
  Mar 22 15:54:39 jammy nova-scheduler[119746]: CRITICAL nova [None req-edf5-6f86-4910-a458-72decae8e451 None None] Unhandled error: openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions.
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova Traceback (most recent call last):
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/usr/local/bin/nova-scheduler", line 10, in <module>
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova sys.exit(main())
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/cmd/scheduler.py", line 47, in main
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova server = service.Service.create(
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/service.py", line 252, in create
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova service_obj = cls(host, binary, topic, manager,
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/service.py", line 116, in __init__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self.manager = manager_class(host=self.host, *args, **kwargs)
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/manager.py", line 70, in __init__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self.placement_client = report.report_client_singleton()
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/client/report.py", line 91, in report_client_singleton
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova PLACEMENTCLIENT = SchedulerReportClient()
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/client/report.py", line 234, in __init__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self._client = self._create_client()
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/scheduler/client/report.py", line 277, in _create_client
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova client = self._adapter or utils.get_sdk_adapter('placement')
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/opt/stack/nova/nova/utils.py", line 984, in get_sdk_adapter
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova return getattr(conn, service_type)
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", line 87, in __get__
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova proxy = self._make_proxy(instance)
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File "/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", line 266, in _make_proxy
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova raise exceptions.NotSupported(
  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions.
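
  As context for the report_client_singleton frame above, a hedged sketch of
  a module-level singleton factory of the kind the trace suggests (simplified
  and illustrative, not nova's actual implementation):

```
# Hypothetical sketch of a module-level client singleton, mirroring the
# report_client_singleton frame in the trace above; not nova's actual code.

PLACEMENT_CLIENT = None


def report_client_singleton(client_factory):
    # Create the client once and cache it at module scope. If
    # client_factory() raises (e.g. placement/keystone is unreachable),
    # nothing is cached and the exception propagates to the caller --
    # which is why calling this from the manager's __init__ made the
    # whole service exit at startup.
    global PLACEMENT_CLIENT
    if PLACEMENT_CLIENT is None:
        PLACEMENT_CLIENT = client_factory()
    return PLACEMENT_CLIENT
```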

[Yahoo-eng-team] [Bug 2015090] Re: neutron.agent.dhcp.agent TypeError: 'bool' object is not subscriptable

2023-04-10 Thread Felipe Reyes
** Description changed:

- [Impact]
- 
- In a fresh jammy-antelope environment the metadata service is not
- available to the overcloud deployed instances, this is an environment
- using neutron-openvswitch environment (not OVN), when looking into the
- logs the stacktrace below is suspicious. The metadata service becomes
- available when restarting the services (e.g. juju config neutron-
- openvswitch debug=True)
- 
- [Test Case]
- 
- ```
- git clone https://opendev.org/openstack/charm-neutron-openvswitch
- cd charm-neutron-openvswitch
- git review -d 
https://review.opendev.org/c/openstack/charm-neutron-openvswitch/+/873819
- tox -e build  # make sure charmcraft-2.1 is installed before
- tox -e func-target -- jammy-antelope
- ```
+ In a fresh environment running Antelope (on top of Ubuntu 22.04), with a
+ DVR configuration (neutron-dhcp-agent and neutron-metadata-agent running
+ on compute nodes), the metadata service is not available to instances.
+ This environment uses neutron-openvswitch (not OVN). When looking into
+ the logs, the stacktrace below indicates an unexpected data type.
  
  [Stacktrace]
  
  2023-03-31 19:35:06.093 58625 DEBUG neutron.agent.dhcp.agent [-] Calling driver for network: 6d246d86-11b5-4d5f-aa9c-c2bcbcc28b62/seg=None action: get_metadata_bind_interface _call_driver /usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py:242
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent [-] 'bool' object is not subscriptable: TypeError: 'bool' object is not subscriptable
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent Traceback (most recent call last):
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/common/utils.py", line 182, in call
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent return func(*args, **kwargs)
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 434, in safe_configure_dhcp_for_network
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent self.configure_dhcp_for_network(network)
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/osprofiler/profiler.py", line 159, in wrapper
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent result = f(*args, **kwargs)
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 447, in configure_dhcp_for_network
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent self.update_isolated_metadata_proxy(network)
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/osprofiler/profiler.py", line 159, in wrapper
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent result = f(*args, **kwargs)
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 763, in update_isolated_metadata_proxy
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent self.enable_isolated_metadata_proxy(network)
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/osprofiler/profiler.py", line 159, in wrapper
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent result = f(*args, **kwargs)
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/dhcp/agent.py", line 819, in enable_isolated_metadata_proxy
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent metadata_driver.MetadataDriver.spawn_monitored_metadata_proxy(
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/metadata/driver.py", line 244, in spawn_monitored_metadata_proxy
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent ip_lib.IpAddrCommand(
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/linux/ip_lib.py", line 609, in wait_until_address_ready
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent common_utils.wait_until_true(
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/common/utils.py", line 744, in wait_until_true
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent while not predicate():
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File "/usr/lib/python3/dist-packages/neutron/agent/linux/ip_lib.py", line 594, in is_address_ready
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent addr_info = self.list(to=address)[0]
  2023-03-31 19:35:06.095 58625 ERROR neutron.agent.dhcp.agent   File
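
  The failing frame indexes the return value of self.list(to=address); when
  that call unexpectedly returns a bool (e.g. False) instead of a list of
  address dicts, the [0] subscript raises exactly this TypeError. Below is a
  hedged, simplified sketch of that pattern and one possible defensive guard;
  the names are illustrative assumptions, not neutron's actual code or fix:

```
# Hypothetical reproduction of the failure pattern above and a defensive
# guard; simplified, not neutron's actual implementation.

def list_addresses(to=None):
    # Stand-in for ip_lib.IpAddrCommand.list(); imagine it returning
    # False instead of a list of address dicts when the query fails.
    return False


def is_address_ready_buggy(address):
    # Mirrors the failing frame: subscripting a bool raises
    # "TypeError: 'bool' object is not subscriptable".
    addr_info = list_addresses(to=address)[0]
    return not addr_info.get("tentative", False)


def is_address_ready_guarded(address):
    # Defensive variant: treat a non-list or empty result as "not ready
    # yet" so the surrounding wait_until_true() polling loop simply retries.
    addr_info = list_addresses(to=address)
    if not isinstance(addr_info, list) or not addr_info:
        return False
    return not addr_info[0].get("tentative", False)
```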

[Yahoo-eng-team] [Bug 2015728] [NEW] ovs/ovn source(with master branch) deployments broken

2023-04-10 Thread yatin
Public bug reported:

With [1], ovn/ovs jobs running with OVS_BRANCH=master and OVN_BRANCH=main
are broken; they fail as below:

utilities/ovn-dbctl.c: In function ‘server_loop’:
utilities/ovn-dbctl.c:1105:5: error: too few arguments to function ‘daemonize_start’
 1105 | daemonize_start(false);
  | ^~~
In file included from utilities/ovn-dbctl.c:22:
/opt/stack/ovs/lib/daemon.h:170:6: note: declared here
  170 | void daemonize_start(bool access_datapath, bool access_hardware_ports);
  |  ^~~
make[1]: *** [Makefile:2374: utilities/ovn-dbctl.o] Error 1
make[1]: *** Waiting for unfinished jobs


Example failure:- 
https://zuul.openstack.org/build/b7b1700e2e5941f7a52b57ca411db722

Builds:-
- 
https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ipv6-only-ovs-master
- 
https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-full-multinode-ovs-master
- https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-master
- 
https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-master-centos-9-stream
- 
https://zuul.openstack.org/builds?job_name=ovn-octavia-provider-functional-master
- https://zuul.openstack.org/builds?job_name=ovn-octavia-provider-tempest-master


Until the OVN main branch is adapted to this change, we need to pin
OVS_BRANCH to a working commit or, better, to a stable branch (as was done
with [2]).
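
For deployments that build OVS/OVN from source with devstack, one hedged
example of such a pin (OVS_BRANCH/OVN_BRANCH/OVN_BUILD_FROM_SOURCE are
devstack variables; the exact pin value shown is only an assumption):

```
[[local|localrc]]
# Build OVN against a known-good OVS instead of OVS master, until OVN's
# main branch adapts to the daemonize_start() signature change.
OVN_BUILD_FROM_SOURCE=True
OVN_BRANCH=main
OVS_BRANCH=branch-3.1   # or a specific known-good commit instead of master
```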

Also, I noticed some of these jobs running on neutron/ovn-octavia stable
branches; they likely do not need to run there, so this should be checked
and cleaned up.

[1] https://github.com/openvswitch/ovs/commit/07cf5810de 
[2] https://github.com/ovn-org/ovn/commit/b61e819bf9673

** Affects: neutron
 Importance: High
 Assignee: yatin (yatinkarel)
 Status: Confirmed


** Tags: gat

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => Critical

** Changed in: neutron
 Assignee: (unassigned) => yatin (yatinkarel)

** Tags added: gat

** Changed in: neutron
   Importance: Critical => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2015728

Title:
  ovs/ovn source(with master branch) deployments broken

Status in neutron:
  Confirmed

Bug description:
  With [1], ovn/ovs jobs running with OVS_BRANCH=master and OVN_BRANCH=main
  are broken; they fail as below:

  utilities/ovn-dbctl.c: In function ‘server_loop’:
  utilities/ovn-dbctl.c:1105:5: error: too few arguments to function ‘daemonize_start’
   1105 | daemonize_start(false);
| ^~~
  In file included from utilities/ovn-dbctl.c:22:
  /opt/stack/ovs/lib/daemon.h:170:6: note: declared here
  170 | void daemonize_start(bool access_datapath, bool access_hardware_ports);
  |  ^~~
  make[1]: *** [Makefile:2374: utilities/ovn-dbctl.o] Error 1
  make[1]: *** Waiting for unfinished jobs

  
  Example failure:- 
https://zuul.openstack.org/build/b7b1700e2e5941f7a52b57ca411db722

  Builds:-
  - 
https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ipv6-only-ovs-master
  - 
https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-full-multinode-ovs-master
  - https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-master
  - 
https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-master-centos-9-stream
  - 
https://zuul.openstack.org/builds?job_name=ovn-octavia-provider-functional-master
  - 
https://zuul.openstack.org/builds?job_name=ovn-octavia-provider-tempest-master

  
  Until the OVN main branch is adapted to this change, we need to pin
  OVS_BRANCH to a working commit or, better, to a stable branch (as was done
  with [2]).

  Also, I noticed some of these jobs running on neutron/ovn-octavia stable
  branches; they likely do not need to run there, so this should be checked
  and cleaned up.

  [1] https://github.com/openvswitch/ovs/commit/07cf5810de 
  [2] https://github.com/ovn-org/ovn/commit/b61e819bf9673

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2015728/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp