[Yahoo-eng-team] [Bug 1936983] [NEW] tempest-slow-py3 is failing while creating initial network in neutron
Public bug reported:

Example of failure:
https://128d50eaaf9c22786068-bb0b8d002b29cd153f6a742d68988dd1.ssl.cf5.rackcdn.com/792299/6/check/tempest-slow-py3/fabc438/controller/logs/screen-q-svc.txt

** Affects: neutron
   Importance: Critical
   Assignee: Slawek Kaplonski (slaweq)
   Status: Confirmed

** Tags: gate-failure ovn

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1936983

Title: tempest-slow-py3 is failing while creating initial network in neutron
Status in neutron: Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1936983/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1936980] [NEW] [DVR] ARP entries for allowed address pairs with IPv4 addresses are added using qr- interface from IPv6 subnets
Public bug reported:

In DVR routers, ARP entries for allowed address pairs with IPv4
addresses are also added on qr- interfaces belonging to IPv6 subnets,
even though the pair's IP is really an IPv4 address.

** Affects: neutron
   Importance: Medium
   Assignee: Slawek Kaplonski (slaweq)
   Status: New

** Tags: l3-dvr-backlog

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1936980

Title: [DVR] ARP entries for allowed address pairs with IPv4 addresses are
added using qr- interface from IPv6 subnets
Status in neutron: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1936980/+subscriptions
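A fix along the lines the bug title suggests would filter allowed-address-pair entries by IP version before programming ARP on a subnet's qr- interface. A minimal sketch of that check using the stdlib `ipaddress` module (function and argument names are invented for illustration, not Neutron's actual code):

```python
import ipaddress

def pairs_matching_subnet(allowed_address_pairs, subnet_cidr):
    """Keep only the pairs whose IP version matches the subnet the
    qr- interface belongs to, so an IPv4 pair is never programmed
    on an IPv6 subnet's interface (hypothetical helper)."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return [pair for pair in allowed_address_pairs
            if ipaddress.ip_interface(pair).version == subnet.version]

# The IPv4 pair is dropped for an IPv6 subnet:
print(pairs_matching_subnet(["10.0.0.5/32", "2001:db8::5/128"],
                            "2001:db8::/64"))  # ['2001:db8::5/128']
```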
[Yahoo-eng-team] [Bug 1936972] [NEW] MAAS deploys fail if host has NIC w/ random MAC
Public bug reported:

The Nvidia DGX A100 server includes a USB Redfish Host Interface NIC.
This NIC apparently provides no MAC address of its own, so the driver
generates a random MAC for it (./drivers/net/usb/cdc_ether.c):

    static int usbnet_cdc_zte_bind(struct usbnet *dev, struct usb_interface *intf)
    {
        int status = usbnet_cdc_bind(dev, intf);

        if (!status && (dev->net->dev_addr[0] & 0x02))
            eth_hw_addr_random(dev->net);

        return status;
    }

This causes a problem with MAAS because, during deployment, MAAS sees
this as a normal NIC and records the MAC. The post-install reboot then
fails:

    [   43.652573] cloud-init[3761]: init.apply_network_config(bring_up=not args.local)
    [   43.700516] cloud-init[3761]: File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 735, in apply_network_config
    [   43.724496] cloud-init[3761]: self.distro.networking.wait_for_physdevs(netcfg)
    [   43.740509] cloud-init[3761]: File "/usr/lib/python3/dist-packages/cloudinit/distros/networking.py", line 177, in wait_for_physdevs
    [   43.764523] cloud-init[3761]: raise RuntimeError(msg)
    [   43.780511] cloud-init[3761]: RuntimeError: Not all expected physical devices present: {'fe:b8:63:69:9f:71'}

I'm not sure what the best answer for MAAS is here, but here are some
thoughts:

1) Ignore all Redfish system interfaces. These are a connection between
   the host and the BMC, so they don't really have a use case in the
   MAAS model AFAICT. These devices can be identified using the SMBIOS,
   as described in the Redfish Host Interface Specification, section 8:
   https://www.dmtf.org/sites/default/files/standards/documents/DSP0270_1.3.0.pdf
   This can be read from within Linux using dmidecode.

2) Ignore (or specially handle) all NICs with randomly generated MAC
   addresses. While this is the only time I've seen a random MAC on
   production server hardware, it is something I've seen on e.g. ARM
   development boards. The problem is, I don't know how to detect a
   generated MAC. I'd hoped the permanent MAC (ethtool -P) would be
   NULL, but it seems to also be set to the generated MAC :(

fyi, two workarounds for this that seem to work:

1) Delete the NIC from the MAAS model in the MAAS UI after every
   commissioning.
2) Use a tag's kernel_opts field to modprobe.blacklist the driver used
   for the Redfish NIC.

** Affects: cloud-init
   Importance: Undecided
   Status: New

** Affects: curtin
   Importance: Undecided
   Status: New

** Affects: maas
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1936972

Title: MAAS deploys fail if host has NIC w/ random MAC
Status in cloud-init: New
Status in curtin: New
Status in MAAS: New
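For option 2, one detectable hint is the locally-administered bit in the first octet — the same `0x02` bit the driver snippet above tests, and the bit that `eth_hw_addr_random()` sets on the generated address. A minimal sketch of that check (a heuristic only, since some vendors legitimately ship MACs with this bit set):

```python
def is_locally_administered(mac):
    """Return True if the MAC's locally-administered bit (0x02 in the
    first octet) is set, suggesting it was generated rather than
    burned in. Heuristic: not proof of a random MAC."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)

print(is_locally_administered("fe:b8:63:69:9f:71"))  # True (0xfe & 0x02)
print(is_locally_administered("00:1a:2b:3c:4d:5e"))  # False
```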
[Yahoo-eng-team] [Bug 1936959] [NEW] [OVN Octavia Provider] Unable to delete Load Balancer with PENDING_DELETE
Public bug reported:

While attempting to delete a Load Balancer, the provisioning status is
moved to PENDING_DELETE and remains that way, preventing the deletion
process from finalizing. The following tracebacks were found in the
logs for that specific lb:

    2021-07-17 13:49:26.131 19 INFO octavia.api.v2.controllers.load_balancer [req-b8b3cbd8-3014-4c45-9680-d4c67346ed1c - 1e38d4dfbfb7427787725df69fabc22b - default default] Sending delete Load Balancer 19d8e465-c704-40a9-b1fd-5b0824408e5d to provider ovn
    2021-07-17 13:49:26.139 19 DEBUG ovn_octavia_provider.helper [-] Handling request lb_delete with info {'id': '19d8e465-c704-40a9-b1fd-5b0824408e5d', 'cascade': True} request_handler /usr/lib/python3.6/site-packages/ovn_octavia_provider/helper.py:303
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper [-] Exception occurred during deletion of loadbalancer: RuntimeError: dictionary changed size during iteration
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper Traceback (most recent call last):
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.6/site-packages/ovn_octavia_provider/helper.py", line 907, in lb_delete
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper     status = self._lb_delete(loadbalancer, ovn_lb, status)
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.6/site-packages/ovn_octavia_provider/helper.py", line 960, in _lb_delete
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper     for ls in self._find_lb_in_table(ovn_lb, 'Logical_Switch'):
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.6/site-packages/ovn_octavia_provider/helper.py", line 289, in _find_lb_in_table
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper     return [item for item in self.ovn_nbdb_api.tables[table].rows.values()
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper   File "/usr/lib/python3.6/site-packages/ovn_octavia_provider/helper.py", line 289, in <listcomp>
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper     return [item for item in self.ovn_nbdb_api.tables[table].rows.values()
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper   File "/usr/lib64/python3.6/_collections_abc.py", line 761, in __iter__
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper     for key in self._mapping:
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper RuntimeError: dictionary changed size during iteration
    2021-07-17 13:49:26.196 19 ERROR ovn_octavia_provider.helper
    2021-07-17 13:49:26.446 13 DEBUG octavia.common.keystone [req-267feb7e-2235-43d9-bec8-88ff532b9019 - 1e38d4dfbfb7427787725df69fabc22b - default default] Request path is / and it does not require keystone authentication process_request /usr/lib/python3.6/site-packages/octavia/common/keystone.py:77
    2021-07-17 13:49:26.554 19 DEBUG ovn_octavia_provider.helper [-] Updating status to octavia: {'loadbalancers': [{'id': '19d8e465-c704-40a9-b1fd-5b0824408e5d', 'provisioning_status': 'ERROR', 'operating_status': 'ERROR'}], 'listeners': [{'id': '0806594a-4ed7-4889-81fa-6fd8d02b0d80', 'provisioning_status': 'DELETED', 'operating_status': 'OFFLINE'}], 'pools': [{'id': 'b8a98db0-6d2e-4745-b533-d2eb3548d1b9', 'provisioning_status': 'DELETED'}], 'members': [{'id': '08464181-728b-425a-b690-d3eb656f7e0a', 'provisioning_status': 'DELETED'}]} _update_status_to_octavia /usr/lib/python3.6/site-packages/ovn_octavia_provider/helper.py:32

The problem here is that iterating over rows.values() directly is
inherently racy: with multiple threads running, another thread can
mutate the table mid-iteration, and this error will eventually happen.

** Affects: neutron
   Importance: High
   Assignee: Brian Haley (brian-haley)
   Status: In Progress

** Tags: ovn-octavia-provider

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1936959

Title: [OVN Octavia Provider] Unable to delete Load Balancer with
PENDING_DELETE
Status in neutron: In Progress
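The failure mode and a common mitigation reduce to a small sketch: snapshot the mapping's values with `list(...)` before iterating, instead of iterating `rows.values()` directly as helper.py line 289 does. (The function and table below are simplified stand-ins, not the provider's actual code.)

```python
def find_lb_in_table(rows, needle):
    """Simplified _find_lb_in_table: list(...) snapshots the values up
    front, so a concurrent insert/delete in another thread cannot raise
    "RuntimeError: dictionary changed size during iteration"."""
    return [item for item in list(rows.values()) if needle in item]

# Stand-in for an OVN NB table's rows mapping:
rows = {1: "neutron-ls-a", 2: "neutron-ls-b", 3: "other"}
print(find_lb_in_table(rows, "neutron-ls"))  # ['neutron-ls-a', 'neutron-ls-b']
```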
[Yahoo-eng-team] [Bug 1934957] Re: [sriov] Unable to change the VF state for i350 interface
We discussed this issue in our team meeting today:
https://meetings.opendev.org/meetings/networking/2021/networking.2021-07-20-14.00.log.html

Our conclusion is that this is a bug in Intel's driver and we shouldn't
try to fix or work around it in Neutron. It should be fixed in the
driver's code. So I'm going to close this bug.

** Changed in: neutron
   Status: Triaged => Won't Fix

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934957

Title: [sriov] Unable to change the VF state for i350 interface
Status in neutron: Won't Fix

Bug description:
When sriov-nic-agent configures the VF state, the exception is as
follows:

    2021-07-08 06:15:47.773 34 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[139820149013392]: Operation not supported on interface eno4, namespace None. _process_cmd /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:490
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 263, in _run_iproute_link
        return ip.link(command, index=idx, **kwargs)
      File "/usr/local/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1360, in link
        msg_flags=msg_flags)
      File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in nlm_request
        return tuple(self._genlm_request(*argv, **kwarg))
      File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 869, in nlm_request
        callback=callback):
      File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 379, in get
        return tuple(self._genlm_get(*argv, **kwarg))
      File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 704, in get
        raise msg['header']['error']
    pyroute2.netlink.exceptions.NetlinkError: (95, 'Operation not supported')

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 485, in _process_cmd
        ret = func(*f_args, **f_kwargs)
      File "/usr/local/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 249, in _wrap
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 403, in set_link_vf_feature
        return _run_iproute_link("set", device, namespace=namespace, vf=vf_config)
      File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 265, in _run_iproute_link
        _translate_ip_device_exception(e, device, namespace)
      File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 237, in _translate_ip_device_exception
        namespace=namespace)
    neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported: Operation not supported on interface eno4, namespace None.

    2021-07-08 06:15:47.773 34 DEBUG oslo.privsep.daemon [-] privsep: reply[139820149013392]: (5, 'neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported', ('Operation not supported on interface eno4, namespace None.',)) _call_back /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
    2021-07-08 06:15:47.774 24 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-661d08fb-983f-4632-9eb4-91585a557753 - - - - -] Device fa:16:3e:66:e4:91 does not support state change: neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported: Operation not supported on interface eno4, namespace None.

However, VM network traffic is unaffected. We use an i350 interface,
and I found these discussions about the i350 [1][2]. Since this
exception has no impact on VM traffic, maybe we can ignore it when the
interface is an i350.

[1] https://sourceforge.net/p/e1000/bugs/653/
[2] https://community.intel.com/t5/Ethernet-Products/On-SRIOV-interface-I350-unable-to-change-the-VF-state-from-auto/td-p/704769

version: neutron-sriov-nic-agent 17.1.3

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1934957/+subscriptions
[Yahoo-eng-team] [Bug 1936675] Re: Do not coerce a subquery into a select, create a proper select condition
Reviewed: https://review.opendev.org/c/openstack/neutron/+/801076
Committed: https://opendev.org/openstack/neutron/commit/923284fc3791949d93d7ded9820a44f98fc734b8
Submitter: "Zuul (22348)"
Branch: master

commit 923284fc3791949d93d7ded9820a44f98fc734b8
Author: Rodolfo Alonso Hernandez
Date: Fri Jul 16 14:55:55 2021 +

    Use explicit select condition in SQL query in "_port_filter_hook"

    Instead of executing a subquery inside a select, use a proper
    filter condition.

    Closes-Bug: #1936675
    Change-Id: I97e9ca244c0716510fcd4ec81d54046be9c5f8f8

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1936675

Title: Do not coerce a subquery into a select, create a proper select
condition
Status in neutron: Fix Released

Bug description:
When executing the UTs, we can see the following error:

    /opt/stack/neutron/neutron/db/db_base_plugin_v2.py:114: SAWarning: Coercing Subquery object into a select() for use in IN(); please pass a select() construct explicitly
      conditions |= (models_v2.Port.network_id.in_(

This is happening in method [1].

[1] https://github.com/openstack/neutron/blob/d20b8708bc03fa0dfe45656b48408b3f74a67709/neutron/db/db_base_plugin_v2.py#L111-L118

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1936675/+subscriptions
[Yahoo-eng-team] [Bug 1936667] Re: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3
Reviewed: https://review.opendev.org/c/openstack/neutron/+/801068
Committed: https://opendev.org/openstack/neutron/commit/e961c6d4734ec2336ba807e4c7aa77bdc354e2d3
Submitter: "Zuul (22348)"
Branch: master

commit e961c6d4734ec2336ba807e4c7aa77bdc354e2d3
Author: Rodolfo Alonso Hernandez
Date: Fri Jul 16 14:27:15 2021 +

    Import ABC classes from collections.abc

    ABC classes should be imported from "collections.abc", not
    "collections".

    Closes-Bug: #1936667
    Change-Id: I863f21b310fdf39030b13e2926e947b16043851a

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1936667

Title: Using or importing the ABCs from 'collections' instead of from
'collections.abc' is deprecated since Python 3.3
Status in OpenStack Identity (keystone): In Progress
Status in OpenStack Shared File Systems Service (Manila): In Progress
Status in Mistral: In Progress
Status in neutron: Fix Released
Status in OpenStack Object Storage (swift): Fix Released
Status in taskflow: Fix Released
Status in tempest: In Progress
Status in zaqar: In Progress

Bug description:
Using or importing the ABCs from 'collections' instead of from
'collections.abc' is deprecated since Python 3.3. For example:

    >>> import collections
    >>> collections.Iterable
    <stdin>:1: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
    <class 'collections.abc.Iterable'>
    >>> from collections import abc
    >>> abc.Iterable
    <class 'collections.abc.Iterable'>

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1936667/+subscriptions
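The fix is mechanical: import the ABCs from `collections.abc`, which works on every Python 3 release and remains valid on 3.10+ where the old aliases were removed:

```python
# Correct import style -- the deprecated form was `collections.Iterable`.
from collections.abc import Iterable, Mapping

print(issubclass(list, Iterable))  # True
print(issubclass(dict, Mapping))   # True
```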
[Yahoo-eng-team] [Bug 1936911] [NEW] Scenario test test_established_tcp_session_after_re_attachinging_sg is failing on the linuxbridge backend
Public bug reported:

This test is failing on that backend from time to time. For example:

https://4d0dc33a1771f7b089e2-b79c57b376466cab8e443243a2295837.ssl.cf1.rackcdn.com/601336/95/check/neutron-tempest-plugin-scenario-linuxbridge/f5be5f7/testr_results.html
https://2c312e10b9f362ff0be0-ac198ee519f662a1d471c5eebfdff2e7.ssl.cf5.rackcdn.com/798009/3/check/neutron-tempest-plugin-scenario-linuxbridge/534618f/testr_results.html

Every time I saw it, it was failing in the linuxbridge job, but maybe
that's just coincidence.

** Affects: neutron
   Importance: High
   Status: Confirmed

** Tags: gate-failure linuxbridge

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1936911

Title: Scenario test test_established_tcp_session_after_re_attachinging_sg
is failing on the linuxbridge backend
Status in neutron: Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1936911/+subscriptions
[Yahoo-eng-team] [Bug 1936906] Re: SAWarning: TypeDecorator SoftDeleteInteger() will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object
https://opendev.org/openstack/oslo.db/commit/1dc20f646b558354e4ba434f4132a1b7979d563e

** Changed in: nova
   Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1936906

Title: SAWarning: TypeDecorator SoftDeleteInteger() will not produce a cache
key because the ``cache_ok`` flag is not set to True. Set this flag to
True if this type object's state is safe to use in a cache key, or False
to disable this warning.
Status in OpenStack Compute (nova): Invalid

Bug description:
Seeing the following dumped while running various nova-manage commands:

    /opt/stack/nova/nova/db/sqlalchemy/api.py:3019: SAWarning: TypeDecorator SoftDeleteInteger() will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object's state is safe to use in a cache key, or False to disable this warning.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1936906/+subscriptions
[Yahoo-eng-team] [Bug 1936906] [NEW] SAWarning: TypeDecorator SoftDeleteInteger() will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type obje
Public bug reported:

Seeing the following dumped while running various nova-manage commands:

    /opt/stack/nova/nova/db/sqlalchemy/api.py:3019: SAWarning: TypeDecorator SoftDeleteInteger() will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object's state is safe to use in a cache key, or False to disable this warning.

** Affects: nova
   Importance: Undecided
   Status: Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1936906

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1936906/+subscriptions
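The oslo.db commit referenced in the reply silences this by declaring the flag on the type. A hedged sketch of the pattern, not necessarily oslo.db's exact code (requires SQLAlchemy >= 1.4.14, where `cache_ok` was introduced):

```python
from sqlalchemy.types import Integer, TypeDecorator

class SoftDeleteInteger(TypeDecorator):
    """Illustrative TypeDecorator: cache_ok = True tells SQLAlchemy the
    type holds no per-instance state, so its compiled statements are
    safe to reuse from the cache -- which silences the SAWarning."""
    impl = Integer
    cache_ok = True

    def process_bind_param(self, value, dialect):
        # Coerce bound values to int (assumed behavior, for illustration).
        if value is None:
            return None
        return int(value)

print(SoftDeleteInteger.cache_ok)  # True
```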
[Yahoo-eng-team] [Bug 1936574] Re: nova-compute SSL connections make rabbitmq pods OOM
** Changed in: nova
   Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1936574

Title: nova-compute SSL connections make rabbitmq pods OOM
Status in OpenStack Compute (nova): Invalid
Status in oslo.messaging: New
Status in RabbitMQ: New

Bug description:
We have a Rocky OpenStack deployment that includes 3 controllers and
500 computes. At 15:58, nova-compute detected that its rabbitmq
connection was broken, then reconnected:

    2021-07-05 15:58:28.633 8 ERROR oslo.messaging._drivers.impl_rabbit [req-a09d4a8b-c24b-4b30-b433-64fe4f6bace5 - - - - -] [8ed1f425-ad67-4b98-874c-e4516aaf3134] AMQP server on 145.247.103.16:5671 is unreachable: . Trying again in 1 seconds.: timeout
    2021-07-05 15:58:29.656 8 INFO oslo.messaging._drivers.impl_rabbit [req-a09d4a8b-c24b-4b30-b433-64fe4f6bace5 - - - - -] [8ed1f425-ad67-4b98-874c-e4516aaf3134] Reconnected to AMQP server on 145.247.103.16:5671 via [amqp] client with port 28205.

rabbitmq then reported a huge number of connections closed by the
client:

    =WARNING REPORT 5-Jul-2021::15:57:59 ===
    closing AMQP connection <0.6345.754> (20.16.36.44:2451 -> 145.247.103.14:5671 - nova-compute:8:b4ce7b09-b9b5-4db1-983b-a071dc031c64, vhost: '/', user: 'openstack'):
    client unexpectedly closed TCP connection

After 10 minutes, the cluster was blocked at the 0.4 memory watermark:

    =INFO REPORT 5-Jul-2021::16:19:29 ===
    vm_memory_high_watermark set. Memory used:111358541824 allowed:107949065830
    *** Publishers will be blocked until this alarm clears ***

However, even after publishers were blocked, the rabbitmq pod kept
leaking memory; in the end the node hit OOM and the system forced the
pod to restart.

amqp release: 2.5.2
oslo-messaging release: 8.1.4
openstack: Rocky

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1936574/+subscriptions