[Yahoo-eng-team] [Bug 1917409] [NEW] neutron-l3-agents won't become active
Public bug reported:

We have an Ubuntu Ussuri cloud deployed on Ubuntu 20.04 using the juju charms from the 20.08 bundle (planning to upgrade soon). The problem that is occurring is that all l3 agents for routers using a particular external network show up with their ha_state in standby. I've tried removing and re-adding, and we never see the state go to active.

$ neutron l3-agent-list-hosting-router bradm-router
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+-------------+----------------+-------+----------+
| id                                   | host        | admin_state_up | alive | ha_state |
+--------------------------------------+-------------+----------------+-------+----------+
| 09ae92c9-ae8f-4209-b1a8-d593cc6d6602 | oschv1.maas | True           | :-)   | standby  |
| 4d9fe934-b1f8-4c2b-83ea-04971f827209 | oschv2.maas | True           | :-)   | standby  |
| 70b8b60e-7fbd-4b3a-80a3-90875ca72ce6 | oschv4.maas | True           | :-)   | standby  |
+--------------------------------------+-------------+----------------+-------+----------+

This generates a stack trace:

2021-03-01 02:59:47.344 3675486 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'get'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 196, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped
    setattr(e, '_RETRY_EXCEEDED', True)
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper
    ectxt.value = e.inner_exc
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped
    LOG.debug("Retry wrapper got retriable exception: %s", e)
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped
    return f(*dup_args, **dup_kwargs)
  File "/usr/lib/python3/dist-packages/neutron/api/rpc/handlers/l3_rpc.py", line 306, in get_agent_gateway_port
    agent_port = self.l3plugin.create_fip_agent_gw_port_if_not_exists(
  File "/usr/lib/python3/dist-packages/neutron/db/l3_dvr_db.py", line 1101, in create_fip_agent_gw_port_if_not_exists
    self._populate_mtu_and_subnets_for_ports(context, [agent_port])
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in _populate_mtu_and_subnets_for_ports
    network_ids = [p['network_id']
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in <listcomp>
    network_ids = [p['network_id']
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1720, in _each_port_having_fixed_ips
    fixed_ips = port.get('fixed_ips', [])

This system was running successfully after deployment, and had been left running for a while; when it was revisited it was in this state.
I've been unable to successfully debug what has caused it to be in this state.

Versions:
Ubuntu 20.04
Juju charms 20.08
OpenStack Ussuri

Environment: Clustered services using containers on converged hypervisors

$ dpkg-query -W neutron-common
neutron-common 2:16.2.0-0ubuntu2

Please let me know if there is any further information that could be used to see what is happening here.

** Affects: neutron
   Importance: Undecided
   Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
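The traceback ends in _each_port_having_fixed_ips, where port.get() is called on a value that is None (the FIP agent gateway port lookup returned nothing). Below is a minimal sketch of a defensive guard; this is an assumption for illustration, not the upstream fix, and the helper name is simplified from neutron's private one.

```python
# Sketch of a defensive guard for the failure above. Assumption: not the
# upstream fix; helper and variable names are simplified for illustration.

def each_port_having_fixed_ips(ports):
    """Yield only real port dicts that carry fixed IPs."""
    for port in ports or []:
        if port is None:  # a failed port lookup/creation returns None
            continue
        if port.get('fixed_ips'):
            yield port

ports = [
    {'network_id': 'net-1', 'fixed_ips': [{'ip_address': '10.0.0.5'}]},
    None,  # e.g. an agent gateway port that was never created
    {'network_id': 'net-2', 'fixed_ips': []},
]
network_ids = [p['network_id'] for p in each_port_having_fixed_ips(ports)]
print(network_ids)  # -> ['net-1']
```

With such a guard the list comprehension in _populate_mtu_and_subnets_for_ports would simply skip the missing port instead of raising AttributeError.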
[Yahoo-eng-team] [Bug 1917393] [NEW] [L3][Port forwarding] admin state DOWN/UP router will lose all pf-floating-ips and nat rules
Public bug reported: Need to clean the cache when a router goes down; otherwise the port forwarding extension will skip processing all objects because the cache still hits. ** Affects: neutron Importance: High Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1917393 Title: [L3][Port forwarding] admin state DOWN/UP router will lose all pf-floating-ips and nat rules Status in neutron: Confirmed Bug description: Need to clean the cache when a router goes down; otherwise the port forwarding extension will skip processing all objects because the cache still hits. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1917393/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
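The caching problem described can be modelled with a toy cache (class and method names here are assumptions for illustration, not neutron's actual L3 port-forwarding extension code):

```python
# Toy model of the stale-cache problem. Assumption: names are
# illustrative, not neutron's actual port-forwarding extension.

class PortForwardingCache:
    def __init__(self):
        self._managed_fips = {}  # router_id -> set of floating IPs

    def needs_processing(self, router_id, fips):
        # Cache hit: the FIP set is unchanged, so processing is skipped.
        return self._managed_fips.get(router_id) != set(fips)

    def record(self, router_id, fips):
        self._managed_fips[router_id] = set(fips)

    def clear_by_router_id(self, router_id):
        # The fix: drop cached state when the router goes admin DOWN so
        # the next UP event reprograms every floating IP and NAT rule.
        self._managed_fips.pop(router_id, None)

cache = PortForwardingCache()
cache.record('r1', ['172.24.4.10'])
# Router goes DOWN (NAT rules torn down), then UP with the same FIPs.
# Without clearing, the UP event looks like a no-op:
print(cache.needs_processing('r1', ['172.24.4.10']))  # -> False
cache.clear_by_router_id('r1')
print(cache.needs_processing('r1', ['172.24.4.10']))  # -> True
```

Because the cached FIP set survives the DOWN event while the actual NAT rules do not, the UP event is treated as a no-op and the rules are lost until something else changes the FIP set.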
[Yahoo-eng-team] [Bug 1917370] [NEW] [functional] ovn maintenance worker isn't mocked in functional tests
Public bug reported: In most of the functional tests there is no need to run MaintenanceThread from the ovn mech driver. It causes a lot of error logs in the job's output and may also cause intermittent failures. ** Affects: neutron Importance: High Assignee: Slawek Kaplonski (slaweq) Status: Confirmed ** Tags: functional-tests ovn -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1917370 Title: [functional] ovn maintenance worker isn't mocked in functional tests Status in neutron: Confirmed Bug description: In most of the functional tests there is no need to run MaintenanceThread from the ovn mech driver. It causes a lot of error logs in the job's output and may also cause intermittent failures. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1917370/+subscriptions
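A minimal sketch of what mocking the worker out looks like, using a stand-in class (the real patch target in neutron's functional test base class and the ovn maintenance module path are assumptions here, not verified):

```python
# Sketch of mocking the maintenance worker in a functional test.
# Assumption: a stand-in Thread subclass; the real patch target lives in
# neutron's ovn mech driver maintenance module.
import threading
import time
from unittest import mock

class MaintenanceThread(threading.Thread):
    """Stand-in for the periodic OVN DB maintenance worker."""
    def run(self):
        while True:  # periodic checks that spam logs in test jobs
            time.sleep(1)

# In a functional-test setUp, patch start() so the worker never runs:
with mock.patch.object(MaintenanceThread, 'start') as mocked_start:
    MaintenanceThread(daemon=True).start()

print(mocked_start.called)  # -> True, and no background thread is running
```

Patching start() rather than run() means the thread object can still be constructed by the driver under test while the periodic loop never executes.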
[Yahoo-eng-team] [Bug 1832021] Re: Checksum drop of metadata traffic on isolated networks with DPDK
** Description changed:

+ [Impact]
+
  When an isolated network uses provider networks for tenants (meaning
  without virtual routers: no DVR or network node), metadata access
  occurs in the qdhcp ip netns rather than the qrouter netns. The
  following options are set in the dhcp_agent.ini file:

  force_metadata = True
  enable_isolated_metadata = True

  VMs on the provider tenant network are unable to access metadata as
  packets are dropped due to checksum errors.

- When we added the following in the qdhcp netns, VMs regained access to
- metadata:
+ [Test Plan]

- iptables -t mangle -A OUTPUT -o ns-+ -p tcp --sport 80 -j CHECKSUM
- --checksum-fill
+ 1. Create an OpenStack deployment with DPDK options enabled and
+ 'enable-local-dhcp-and-metadata: true' in neutron-openvswitch. A
+ sample, simple 3-node bundle can be found here [1].

- It seems this setting was recently removed from the qrouter netns [0]
- but it never existed in the qdhcp to begin with.
+ 2. Create an external flat network and subnet:

- [0] https://review.opendev.org/#/c/654645/
+ openstack network show dpdk_net || \
+ openstack network create --provider-network-type flat \
+   --provider-physical-network physnet1 dpdk_net \
+   --external

- Related LP Bug #1831935
- See https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1831935/comments/10
+ openstack subnet show dpdk_net || \
+ openstack subnet create --allocation-pool start=10.230.58.100,end=10.230.58.200 \
+   --subnet-range 10.230.56.0/21 --dhcp --gateway 10.230.56.1 \
+   --dns-nameserver 10.230.56.2 \
+   --ip-version 4 --network dpdk_net dpdk_subnet
+
+ 3. Create an instance attached to that network. The instance must have
+ a flavor that uses huge pages.
+
+ openstack flavor create --ram 8192 --disk 50 --vcpus 4 m1.dpdk
+ openstack flavor set m1.dpdk --property hw:mem_page_size=large
+
+ openstack server create --wait --image xenial --flavor m1.dpdk \
+   --key-name testkey --network dpdk_net i1
+
+ 4. Log into the instance host and check the instance console. The
+ instance will hang during boot and show the following message:
+
+ 2020-11-20 09:43:26,790 - openstack.py[DEBUG]: Failed reading optional
+ path http://169.254.169.254/openstack/2015-10-15/user_data due to:
+ HTTPConnectionPool(host='169.254.169.254', port=80): Read timed out.
+ (read timeout=10.0)
+
+ 5. Apply the fix on all computes, restart the DHCP agents on all
+ computes and create the instance again.
+
+ 6. No errors should be shown and the instance boots quickly.
+
+ [Where problems could occur]
+
+ * This change is only exercised if datapath_type and ovs_use_veth are
+ set; those settings are mostly used for DPDK environments. The core of
+ the fix is to toggle off checksum offload on the DHCP namespace
+ interfaces. This adds some overhead to packet processing for DHCP
+ traffic, but since DHCP does not move much data, this should be a
+ minor problem.
+
+ * Future changes to the syntax of the ethtool command could cause
+ regressions.
+
+ [Other Info]
+
+ * None
+
+ [1] https://gist.github.com/sombrafam/e0741138773e444960eb4aeace6e3e79

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1832021

Title: Checksum drop of metadata traffic on isolated networks with DPDK

Status in OpenStack neutron-openvswitch charm: Fix Released
Status in Ubuntu Cloud Archive: New
Status in neutron: Fix Released

Bug description: [Impact] When an isolated network uses provider networks for tenants (meaning without virtual routers: no DVR or network node), metadata access occurs in the qdhcp ip netns rather than the qrouter netns. The following options are set in the dhcp_agent.ini file: force_metadata = True enable_isolated_metadata = True VMs on the provider tenant network are unable to access metadata as packets are dropped due to checksum errors. [Test Plan] 1. Create an OpenStack deployment with DPDK options enabled and 'enable-local-dhcp-and-metadata: true' in neutron-openvswitch. A sample, simple 3-node bundle can be found here [1]. 2. Create an external flat network and subnet: openstack network show dpdk_net || \ openstack network create --provider-network-type flat \ --provider-physical-network physnet1 dpdk_net \ --external openstack subnet show dpdk_net || \ openstack subnet create --allocation-pool start=10.230.58.100,end=10.230.58.200 \ --subnet-range 10.230.56.0/21 --dhcp --gateway 10.230.56.1 \ --dns-nameserver 10.230.56.2 \
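The workaround quoted in the description (the CHECKSUM mangle rule in the qdhcp netns) can be wrapped as a small helper; this is an illustrative sketch, not charm or neutron code, and the helper name, namespace name, and dry_run behaviour are my assumptions:

```python
# Illustrative wrapper around the CHECKSUM workaround from the report.
# Assumption: helper name and dry_run behaviour are mine; this is not
# the charm's or neutron's actual code.
import shlex
import subprocess

RULE = ('ip netns exec {ns} iptables -t mangle -A OUTPUT -o ns-+ '
        '-p tcp --sport 80 -j CHECKSUM --checksum-fill')

def add_checksum_fill(namespace, dry_run=True):
    """Build (and optionally run) the iptables CHECKSUM rule command."""
    argv = shlex.split(RULE.format(ns=namespace))
    if dry_run:  # build only: no root or network namespace required
        return argv
    subprocess.check_call(argv)
    return argv

argv = add_checksum_fill('qdhcp-3f8a')  # namespace name is illustrative
print(' '.join(argv))
```

Run with dry_run=False (as root, with the namespace present) the rule makes the kernel fill in TCP checksums on metadata replies leaving the qdhcp namespace, which is what the eventual fix achieves by disabling checksum offload on the namespace interfaces.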
[Yahoo-eng-team] [Bug 1735724] Re: Metadata iptables rules never inserted upon exception on router creation
Thanks for digging into the report. Based on your analysis, the VMT has no plans to issue an advisory, since none of our supported releases is considered vulnerable to this any longer. If new information is brought to light which indicates there is still a means to exploit this flaw in more recent releases, we're happy to reconsider the decision at that time. ** Changed in: ossa Status: Incomplete => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1735724 Title: Metadata iptables rules never inserted upon exception on router creation Status in neutron: Fix Released Status in OpenStack Security Advisory: Won't Fix Bug description: We've been debugging some issues being seen lately [0] and found out that there's a bug in l3 agent when creating routers (or during initial sync). Jakub Libosvar and I spent some time recreating the issue and this is what we got: Especially since we bumped to ovsdbapp 0.8.0, we've seen some jobs failing due to errors when authenticating using PK to a VM. The TCP connection to the SSH port was successfully established but the authentication failed. After debugging further, we found out that metadata rules in the qrouter namespace which redirect traffic to haproxy (which replaced the old neutron-ns-metadata-proxy) were missing, so VMs weren't fetching metadata (hence, the public key). These rules are installed by the metadata driver after a router is created [1] on the AFTER_CREATE notification. Also, they will get created during the initial sync of the l3 agent (since it's still unknown to the agent) [2]. Here, if we don't know the router yet, we'll call _process_added_router() and if it's a known router we'll call _process_updated_router(). After our tests, we've seen that iptables rules are never restored if we simulate an Exception inside ri.process() at [3] even though the router is scheduled for resync [4].
The reason this happens is that we've already added it to our router info [5], so even though ri.process() fails at L481 and it's scheduled for resync, next time _process_updated_router() will get called instead of _process_added_router(), thus not pushing the notification into the metadata driver to install iptables rules, and they never get installed. In conclusion, if an error occurs during _process_added_router() we might end up losing metadata forever until we restart the agent and this call succeeds. Worse, we will be forwarding metadata requests via br-ex, which could lead to security issues (i.e. we could be injecting wrong metadata from the outside, or the metadata server running in the underlying cloud may respond). With ovsdbapp 0.9.0 we're minimizing this because if a port fails to be added to br-int, ovsdbapp will enqueue the transaction instead of throwing an Exception, but there could still be other exceptions that reproduce this scenario outside of ovsdbapp, so we need to fix it in Neutron.
Thanks Daniel Alvarez --- [0] https://bugs.launchpad.net/tripleo/+bug/1731063 [1] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/metadata/driver.py#L288 [2] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L472 [3] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L481 [4] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L565 [5] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L478 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1735724/+subscriptions
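The added-vs-updated control flow described in the report can be modelled with a toy agent (a deliberate simplification with assumed names, not neutron's L3 agent code):

```python
# Toy model of the control flow described above. Assumption: this is a
# simplification for illustration, not neutron's L3 agent code.

class ToyL3Agent:
    def __init__(self):
        self.router_info = {}       # routers the agent already knows
        self.metadata_rules = set() # routers with metadata iptables rules
        self._fail_once = True

    def sync(self, router_id):
        if router_id in self.router_info:
            self._process_updated_router(router_id)
        else:
            self._process_added_router(router_id)

    def _process_added_router(self, router_id):
        self.router_info[router_id] = {}    # registered BEFORE processing
        self._process(router_id)            # raises on the first attempt
        self.metadata_rules.add(router_id)  # AFTER_CREATE: never reached

    def _process_updated_router(self, router_id):
        self._process(router_id)            # no AFTER_CREATE notification

    def _process(self, router_id):
        if self._fail_once:
            self._fail_once = False
            raise RuntimeError('ri.process() failed')

agent = ToyL3Agent()
try:
    agent.sync('r1')   # first pass fails inside process()
except RuntimeError:
    pass
agent.sync('r1')       # resync takes the "updated" path and succeeds
print('r1' in agent.metadata_rules)  # -> False: rules never installed
```

Because the router is registered in router_info before processing completes, the resync can never take the "added" path again, which is exactly why the metadata rules are lost until the agent restarts.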
[Yahoo-eng-team] [Bug 1892361] Re: SRIOV instance gets type-PF interface, libvirt kvm fails
This bug was fixed in the package nova - 2:20.5.0-0ubuntu1~cloud0

---
nova (2:20.5.0-0ubuntu1~cloud0) bionic-train; urgency=medium
  * New stable point release for OpenStack Train (LP: #1915787).
  * d/p/lp1892361.patch: Removed after change landed upstream.

nova (2:20.4.1-0ubuntu1~cloud1) bionic-train; urgency=medium
  * d/p/lp1892361.patch: Update pci stat pools based on PCI device changes (LP: #1892361).

** Changed in: cloud-archive/train Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1892361 Title: SRIOV instance gets type-PF interface, libvirt kvm fails Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in Ubuntu Cloud Archive train series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: In Progress Status in OpenStack Compute (nova) rocky series: In Progress Status in OpenStack Compute (nova) stein series: Fix Committed Status in OpenStack Compute (nova) train series: Fix Released Status in OpenStack Compute (nova) ussuri series: Fix Released Status in OpenStack Compute (nova) victoria series: Fix Released Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: New Status in nova source package in Focal: Fix Released Status in nova source package in Groovy: Fix Released Status in nova source package in Hirsute: Fix Released Bug description: When spawning an SR-IOV enabled instance on a newly deployed host, nova attempts to spawn it with a type-PF PCI device. This fails with the below stack trace.
After restarting neutron-sriov-agent and nova-compute services on the compute node and spawning an SR-IOV instance again, a type-VF pci device is selected, and instance spawning succeeds.

Stack trace:
2020-08-20 08:29:09.558 7624 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 6db8011e6ecd4fd0aaa53c8f89f08b1b __call__ /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:400
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [req-e3e49d07-24c6-4c62-916e-f830f70983a2 ddcfb3640535428798aa3c8545362bd4 dd99e7950a5b46b5b924ccd1720b6257 - 015e4fd7db304665ab5378caa691bb8b 015e4fd7db304665ab5378caa691bb8b] [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Instance failed to spawn: libvirtError: unsupported configuration: Interface type hostdev is currently supported on SR-IOV Virtual Functions only
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Traceback (most recent call last):
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2274, in _build_resources
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     yield resources
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2054, in _build_and_run_instance
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     block_device_info=block_device_info)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3147, in spawn
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     destroy_disks_on_failure=True)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5651, in _create_domain_and_network
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     destroy_disks_on_failure)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     self.force_reraise()
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
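The fix referenced in the changelog ("update pci stat pools based on PCI device changes") can be sketched as a pool rebuild; the names and structures below are illustrative assumptions, not nova's PciDeviceStats implementation:

```python
# Toy sketch of the fix idea: rebuild the schedulable pools from the
# current device list. Assumption: names/structures are illustrative,
# not nova's PciDeviceStats.
from collections import Counter

def build_pools(devices):
    """Rebuild the schedulable pools from the current device list."""
    return Counter(d['dev_type'] for d in devices if d['available'])

# Freshly deployed host: only the PF has been discovered so far.
before = build_pools([{'dev_type': 'type-PF', 'available': True}])

# After the VFs are configured, a stale pool must be rebuilt or the
# scheduler keeps offering the (now unusable) type-PF device:
after = build_pools([
    {'dev_type': 'type-PF', 'available': False},  # consumed by its VFs
    {'dev_type': 'type-VF', 'available': True},
    {'dev_type': 'type-VF', 'available': True},
])
print(dict(before), dict(after))  # -> {'type-PF': 1} {'type-VF': 2}
```

The symptom in the report matches the "before" state persisting: the scheduler hands out the type-PF device, and libvirt then rejects the hostdev interface because only VFs are supported.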
[Yahoo-eng-team] [Bug 1735724] Re: Metadata iptables rules never inserted upon exception on router creation
I was trying to reproduce that issue today and I couldn't. Looking at the code it seems to me that after Brian's change [1] those rules are now added to the iptables_manager during creation of the router_info instance, so it happens well before ri.process() is actually called. If there is any issue in that constructor, no namespace will be created at all for the router. [1] https://review.openstack.org/524406 ** Changed in: neutron Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1735724 Title: Metadata iptables rules never inserted upon exception on router creation Status in neutron: Fix Released Status in OpenStack Security Advisory: Incomplete Bug description: We've been debugging some issues being seen lately [0] and found out that there's a bug in l3 agent when creating routers (or during initial sync). Jakub Libosvar and I spent some time recreating the issue and this is what we got: Especially since we bumped to ovsdbapp 0.8.0, we've seen some jobs failing due to errors when authenticating using PK to a VM. The TCP connection to the SSH port was successfully established but the authentication failed. After debugging further, we found out that metadata rules in the qrouter namespace which redirect traffic to haproxy (which replaced the old neutron-ns-metadata-proxy) were missing, so VMs weren't fetching metadata (hence, the public key). These rules are installed by the metadata driver after a router is created [1] on the AFTER_CREATE notification. Also, they will get created during the initial sync of the l3 agent (since it's still unknown to the agent) [2]. Here, if we don't know the router yet, we'll call _process_added_router() and if it's a known router we'll call _process_updated_router().
After our tests, we've seen that iptables rules are never restored if we simulate an Exception inside ri.process() at [3] even though the router is scheduled for resync [4]. The reason this happens is that we've already added it to our router info [5], so even though ri.process() fails at L481 and it's scheduled for resync, next time _process_updated_router() will get called instead of _process_added_router(), thus not pushing the notification into the metadata driver to install iptables rules, and they never get installed. In conclusion, if an error occurs during _process_added_router() we might end up losing metadata forever until we restart the agent and this call succeeds. Worse, we will be forwarding metadata requests via br-ex, which could lead to security issues (i.e. we could be injecting wrong metadata from the outside, or the metadata server running in the underlying cloud may respond). With ovsdbapp 0.9.0 we're minimizing this because if a port fails to be added to br-int, ovsdbapp will enqueue the transaction instead of throwing an Exception, but there could still be other exceptions that reproduce this scenario outside of ovsdbapp, so we need to fix it in Neutron.
Thanks Daniel Alvarez --- [0] https://bugs.launchpad.net/tripleo/+bug/1731063 [1] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/metadata/driver.py#L288 [2] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L472 [3] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L481 [4] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L565 [5] https://github.com/openstack/neutron/blob/02fa049c5f5a38a276bec6e55c68ac19cd08c59f/neutron/agent/l3/agent.py#L478 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1735724/+subscriptions
[Yahoo-eng-team] [Bug 1904399] Re: [OVN] Inconsistent "flooding to unregistered" IGMP configuration
This bug was fixed in the package neutron - 2:16.2.0-0ubuntu3~cloud0

---
neutron (2:16.2.0-0ubuntu3~cloud0) bionic-ussuri; urgency=medium
  * New update for the Ubuntu Cloud Archive.

neutron (2:16.2.0-0ubuntu3) focal; urgency=medium
  * d/p/ovn-fix-inconsistent-igmp-configuration.patch: Cherry-picked from upstream stable/ussuri to ensure flooding of unregistered multicast packets to all ports is disabled (LP: #1904399).

** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1904399 Title: [OVN] Inconsistent "flooding to unregistered" IGMP configuration Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Groovy: Fix Released Status in neutron source package in Hirsute: Fix Released Bug description: ML2/OVN reuses the same "[ovs]/igmp_snooping_enable" configuration option from ML2/OVS, which says [0]: "Setting this option to True will also enable Open vSwitch mcast-snooping-disable-flood-unregistered flag. This option will disable flooding of unregistered multicast packets to all ports." But that's not true for ML2/OVN; in fact, the opposite holds, because ML2/OVN has flooding to unregistered VMs enabled by default. In order to keep consistency between both drivers and this configuration option, ML2/OVN needs to disable the "mcast_flood_unregistered" configuration in the other_config column from the Logical Switch table when igmp_snooping_enable is True. [0] https://opendev.org/openstack/neutron/src/branch/master/neutron/conf/agent/ovs_conf.py#L36-L47 [Impact] See above.
[Test Case] Run the following and expect success: root@f1:~# sudo apt install python3-neutron root@f1:/usr/lib/python3/dist-packages# python3 -m unittest neutron.tests.unit.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_maintenance.TestDBInconsistenciesPeriodics.test_check_for_igmp_snoop_support I would also like to get test feedback from Canonical bootstack as they are hitting this issue. [Regression Potential] This is a very minimal and targeted change that always hard codes MCAST_FLOOD_UNREGISTERED to 'false'. In assessing regression potential for changes like this, one that comes to mind is potential of a type error when setting MCAST_FLOOD_UNREGISTERED. Upon visual inspection of this code fix, a type error would be impossible, as what was once set to a 'true' or 'false' value is now set to 'false'. Another thought is whether MCAST_FLOOD_UNREGISTERED has any use if MCAST_SNOOP is set to false, but that is not the case according to upstream OVN documentation which states: mcast_flood_unregistered: optional string, either true or false Determines whether unregistered multicast traffic should be flooded or not. Only applicable if other_config:mcast_snoop is enabled. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1904399/+subscriptions
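What the fix enforces on each Logical_Switch row's other_config column can be sketched as below; a plain dict stands in for the OVSDB row (an assumption for illustration), while the key names follow the OVN documentation quoted above:

```python
# Sketch of the configuration the fix enforces on Logical_Switch
# other_config. Assumption: a plain dict stands in for the OVSDB row;
# key names follow the OVN documentation quoted in the bug.

def apply_igmp_config(other_config, igmp_snooping_enable):
    other_config['mcast_snoop'] = 'true' if igmp_snooping_enable else 'false'
    # The fix hard-codes this to 'false' so unregistered multicast is
    # never flooded to all ports, matching the documented ML2/OVS option.
    other_config['mcast_flood_unregistered'] = 'false'
    return other_config

cfg_on = apply_igmp_config({}, True)
cfg_off = apply_igmp_config({}, False)
print(cfg_on)
print(cfg_off)
```

Note that mcast_flood_unregistered ends up 'false' in both branches, which is exactly the "always hard codes to 'false'" behaviour the regression analysis above discusses.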
[Yahoo-eng-team] [Bug 1904399] Re: [OVN] Inconsistent "flooding to unregistered" IGMP configuration
This bug was fixed in the package neutron - 2:17.0.0-0ubuntu3~cloud0

---
neutron (2:17.0.0-0ubuntu3~cloud0) focal-victoria; urgency=medium
  * New update for the Ubuntu Cloud Archive.

neutron (2:17.0.0-0ubuntu3) groovy; urgency=medium
  * d/p/ovn-fix-inconsistent-igmp-configuration.patch: Cherry-picked from upstream stable/victoria to ensure flooding of unregistered multicast packets to all ports is disabled (LP: #1904399).

** Changed in: cloud-archive/victoria Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1904399 Title: [OVN] Inconsistent "flooding to unregistered" IGMP configuration Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Groovy: Fix Released Status in neutron source package in Hirsute: Fix Released Bug description: ML2/OVN reuses the same "[ovs]/igmp_snooping_enable" configuration option from ML2/OVS, which says [0]: "Setting this option to True will also enable Open vSwitch mcast-snooping-disable-flood-unregistered flag. This option will disable flooding of unregistered multicast packets to all ports." But that's not true for ML2/OVN; in fact, the opposite holds, because ML2/OVN has flooding to unregistered VMs enabled by default. In order to keep consistency between both drivers and this configuration option, ML2/OVN needs to disable the "mcast_flood_unregistered" configuration in the other_config column from the Logical Switch table when igmp_snooping_enable is True. [0] https://opendev.org/openstack/neutron/src/branch/master/neutron/conf/agent/ovs_conf.py#L36-L47 [Impact] See above.
[Test Case] Run the following and expect success: root@f1:~# sudo apt install python3-neutron root@f1:/usr/lib/python3/dist-packages# python3 -m unittest neutron.tests.unit.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_maintenance.TestDBInconsistenciesPeriodics.test_check_for_igmp_snoop_support I would also like to get test feedback from Canonical bootstack as they are hitting this issue. [Regression Potential] This is a very minimal and targeted change that always hard codes MCAST_FLOOD_UNREGISTERED to 'false'. In assessing regression potential for changes like this, one that comes to mind is potential of a type error when setting MCAST_FLOOD_UNREGISTERED. Upon visual inspection of this code fix, a type error would be impossible, as what was once set to a 'true' or 'false' value is now set to 'false'. Another thought is whether MCAST_FLOOD_UNREGISTERED has any use if MCAST_SNOOP is set to false, but that is not the case according to upstream OVN documentation which states: mcast_flood_unregistered: optional string, either true or false Determines whether unregistered multicast traffic should be flooded or not. Only applicable if other_config:mcast_snoop is enabled. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1904399/+subscriptions
[Yahoo-eng-team] [Bug 1917322] [NEW] cloudinit.net.get_devicelist includes "bonding_masters" if present
Public bug reported:

$ ls -l /sys/class/net/
total 0
lrwxrwxrwx 1 root root    0 Feb 26 21:51 bond0 -> ../../devices/virtual/net/bond0
-rw-r--r-- 1 root root 4096 Feb 26 21:51 bonding_masters
lrwxrwxrwx 1 root root    0 Feb 26 21:51 enp5s0 -> ../../devices/pci0000:00/0000:00:01.4/0000:05:00.0/virtio12/net/enp5s0
lrwxrwxrwx 1 root root    0 Feb 26 21:51 lo -> ../../devices/virtual/net/lo
lrwxrwxrwx 1 root root    0 Feb 26 21:51 ovs-br -> ../../devices/virtual/net/ovs-br
lrwxrwxrwx 1 root root    0 Feb 26 21:51 ovs-br.100 -> ../../devices/virtual/net/ovs-br.100
lrwxrwxrwx 1 root root    0 Feb 26 21:51 ovs-system -> ../../devices/virtual/net/ovs-system

$ python3 -c "from cloudinit.net import get_devicelist; print(get_devicelist())"
['bonding_masters', 'enp5s0', 'bond0', 'ovs-system', 'ovs-br.100', 'lo', 'ovs-br']

** Affects: cloud-init
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1917322

Title:
  cloudinit.net.get_devicelist includes "bonding_masters" if present

Status in cloud-init: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1917322/+subscriptions
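The listing above shows why the bug occurs: `/sys/class/net/` contains the regular file `bonding_masters` alongside the per-device symlinks, and a plain directory listing picks it up. A minimal sketch of a filter that keeps only real device entries follows (illustrative only, not the cloud-init implementation — real devices appear as symlinks into the kernel device tree, while `bonding_masters` is a regular file):

```python
import os

SYS_CLASS_NET = '/sys/class/net/'


def list_net_devices(path: str = SYS_CLASS_NET) -> list:
    """Return network device names under *path*.

    Keeps only entries that are symlinks, which excludes the
    'bonding_masters' control file created by the bonding module.
    Sketch of one possible fix, not the actual cloud-init code.
    """
    return sorted(
        name for name in os.listdir(path)
        # Each real interface is a symlink like
        # bond0 -> ../../devices/virtual/net/bond0.
        if os.path.islink(os.path.join(path, name))
    )
```

On the system shown in the report, this would drop `bonding_masters` while keeping `bond0`, `enp5s0`, `lo`, and the OVS interfaces.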
[Yahoo-eng-team] [Bug 1904399] Re: [OVN] Inconsistent "flooding to unregistered" IGMP configuration
This bug was fixed in the package neutron - 2:17.0.0-0ubuntu3

---
neutron (2:17.0.0-0ubuntu3) groovy; urgency=medium

  * d/p/ovn-fix-inconsistent-igmp-configuration.patch: Cherry-picked from
    upstream stable/victoria to ensure flooding of unregistered multicast
    packets to all ports is disabled (LP: #1904399).

 -- Corey Bryant  Mon, 08 Feb 2021 12:25:46 -0500

** Changed in: neutron (Ubuntu Groovy)
   Status: Fix Committed => Fix Released

** Changed in: neutron (Ubuntu Focal)
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1904399

Title:
  [OVN] Inconsistent "flooding to unregistered" IGMP configuration

Status in Ubuntu Cloud Archive: Fix Committed
Status in Ubuntu Cloud Archive ussuri series: Fix Committed
Status in Ubuntu Cloud Archive victoria series: Fix Committed
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released
Status in neutron source package in Groovy: Fix Released
Status in neutron source package in Hirsute: Fix Released

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1904399/+subscriptions