Adding Noble debdiff

** Description changed:

+ [ Impact ]
+ 
+ In environments using OVN, the neutron-api service fails when it tries
+ to delete an agent from the agent cache, raising "TypeError: unhashable
+ type: 'list'". As a result, decommissioned OVN agents cannot be
+ removed: they remain as zombie entries in the output of "openstack
+ network agent list" and cannot be cleared.
+ 
+ The fix resolves the issue by passing the agent's ID directly instead of
+ wrapping it in a list. The AgentCache().delete() method performs a
+ dictionary lookup; because lists are mutable and unhashable in Python,
+ they cannot be used as dictionary keys. This fix has already been
+ accepted upstream [0].
+ 
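The failure mode can be reproduced outside Neutron. The following is a minimal sketch, not Neutron's actual code: FakeAgentCache is a hypothetical stand-in that keeps agents in a plain dict and mirrors only the "del self.agents[agent_id]" line from the traceback, and "chassis-1" is a made-up agent ID.

```python
class FakeAgentCache:
    """Hypothetical stand-in for the agent cache; not Neutron's implementation."""

    def __init__(self):
        self.agents = {}

    def delete(self, agent_id):
        # Same operation as the line shown in the traceback.
        del self.agents[agent_id]


def try_delete(cache, key):
    """Return the name of the exception raised by delete(), or None on success."""
    try:
        cache.delete(key)
    except (TypeError, KeyError) as exc:
        return type(exc).__name__
    return None


cache = FakeAgentCache()
cache.agents["chassis-1"] = object()

# Buggy call shape: the ID is wrapped in a list, which cannot be hashed.
buggy_result = try_delete(cache, ["chassis-1"])
still_cached = "chassis-1" in cache.agents

# Fixed call shape: the ID is passed directly.
fixed_result = try_delete(cache, "chassis-1")
removed = "chassis-1" not in cache.agents
```

With the list argument the entry stays in the dict, which matches the symptom of the agent remaining listed after the delete request.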
+ [ Test Plan ]
+ 
+ To trigger the OVN agent cache deletion logic, we will decommission a
+ compute node.
+ 
+ ** Setup OpenStack Environment **
+ 
+ 1.  Deploy a test cloud in your reproducer bastion using stsstack-bundles:
+     $ cd stsstack-bundles/openstack
+     $ ./generate-bundle.sh --series <UBUNTU_RELEASE> --release <OPENSTACK_RELEASE> --name neutron-test --run
+     # wait until all units are active/idle (some will be blocked/waiting)
+     $ ./tools/vault-unseal-and-authorise.sh
+     # wait until all units are active/idle
+     $ ./configure
+     $ source novarc
+ 
+ 2.  Add extra compute capacity to allow for decommissioning tests:
+     $ juju add-unit nova-compute -n 2
+     # wait until all units are active/idle
+ 
+ ** Verify the Bug **
+ 
+ 3.  Decommission a compute unit:
+     $ juju run nova-compute/2 disable 
+     $ juju run nova-compute/2 remove-from-cloud 
+     $ juju remove-unit nova-compute/2
+     $ juju status nova-compute --watch 2s
+     # wait until the unit has been removed
+ 
+ 4.  Attempt to delete the agent from Neutron:
+     $ openstack network agent list
+     $ openstack network agent delete {AGENT_ID}
+ 
+ 5.  Confirm the failure:
+     The agent will still appear in the output of the following command despite the deletion attempt:
+     $ openstack network agent list
+ 
+ 6.  Verify the traceback in the logs:
+     On the neutron-api unit, check the neutron-server log:
+     $ juju ssh neutron-api/0 "sudo cat /var/log/neutron/neutron-server.log | grep -B 15 TypeError | tail -n 9"
+     Observed results:
+     - Log shows: TypeError: unhashable type: 'list'
+ 
+ ** Verify the Fix **
+ 
+ 7.  Install the patched package on the neutron-api unit:
+     $ juju ssh neutron-api/0 
+     # upgrade to patched neutron
+     $ sudo apt update 
+     $ sudo apt install neutron-common
+ 
+ 8.  Repeat the decommissioning process with a different unit:
+     $ juju run nova-compute/1 disable
+     $ juju run nova-compute/1 remove-from-cloud
+     $ juju remove-unit nova-compute/1
+     $ juju status nova-compute --watch 2s
+     # wait until the unit has been removed
+ 
+ 9.  Delete the agent:
+     $ openstack network agent list
+     $ openstack network agent delete {AGENT_ID}
+ 
+ 10. Confirm the fix:
+     Verify the agent is successfully removed:
+     $ openstack network agent list
+     Verify no new TypeError was observed during the removal process:
+     $ juju ssh neutron-api/0 "sudo cat /var/log/neutron/neutron-server.log | grep -B 15 TypeError | tail -n 9"
+     Observed results:
+     - Agent is no longer present in the agent list
+     - No new TypeErrors are generated in the log
+ 
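The grep in step 10 also matches tracebacks logged before the upgrade, so checking for "no new TypeError" comes down to comparing timestamps. A rough sketch of that comparison, assuming log lines start with the "YYYY-MM-DD HH:MM:SS.mmm" prefix seen in the traceback from the original description; the function name, the cut-off time, and the second sample line are hypothetical.

```python
from datetime import datetime


def new_type_errors(log_lines, since):
    """Return lines mentioning TypeError that are timestamped at or after `since`."""
    hits = []
    for line in log_lines:
        if "TypeError" not in line:
            continue
        try:
            # The first 23 characters hold the timestamp, e.g. 2024-12-04 09:21:15.303
            stamp = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue  # line does not start with a timestamp
        if stamp >= since:
            hits.append(line)
    return hits


sample_log = [
    "2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event TypeError: unhashable type: 'list'",
    "2024-12-04 10:00:00.000 838 INFO example.module unrelated message",  # made-up line
]

# Hypothetical moment at which the patched package was installed.
upgraded_at = datetime(2024, 12, 4, 9, 30)

after_upgrade = new_type_errors(sample_log, upgraded_at)
whole_log = new_type_errors(sample_log, datetime(2024, 12, 4, 0, 0))
```

An empty result for the period after the upgrade corresponds to the expected outcome of step 10; the full-log result still contains the pre-upgrade traceback.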
+ [ Where problems could occur ]
+ 
+ The change alters the argument passed to AgentCache.delete from a
+ single-element list to a plain agent ID. If other parts of the code
+ (outside this specific OVN logic) call this method and expect it to
+ handle a list, those calls could raise an error or miss the intended
+ key in the cache.
+ 
+ If the external_ids['delete_agent'] value is somehow missing or
+ malformed, the method would attempt to delete a non-existent key. This
+ could lead to stale entries remaining in the agent cache.
+ 
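The two cases described above can be illustrated with plain dictionaries. This is only a sketch under assumptions: the agent names and the empty external_ids mapping are made up, and whether Neutron guards either path is not shown in this report.

```python
# Hypothetical data: a cache holding two agents, and an external_ids
# mapping that lacks the 'delete_agent' key.
agents = {"agent-a": "alive", "agent-b": "alive"}
external_ids = {}


def outcome(action):
    """Run action() and return the raised exception's class name, or None."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return None


def delete_missing_agent():
    # Deleting a key that is not in the dict.
    del agents["no-such-agent"]


missing_lookup = outcome(lambda: external_ids["delete_agent"])
missing_delete = outcome(delete_missing_agent)
remaining = sorted(agents)
```

In both cases the dictionary is left untouched, so an entry that should have been removed would stay in place.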
+ [ Other Info ]
+ 
+ This bug was introduced in Bobcat with [1]. Therefore, releases before
+ that (Yoga for the UCA and Jammy for Ubuntu in this case) do not need to
+ be fixed.
+ 
+ Packages in Questing and Resolute already have the fix. Similarly, in
+ the UCA, Epoxy and Flamingo have the fix too.
+ 
+ [0] - https://review.opendev.org/c/openstack/neutron/+/937092
+ [1] - https://review.opendev.org/c/openstack/neutron/+/883607
+ 
+ [ Original Description ]
+ 
  Calling the Neutron API to delete an OVN agent returns 204, but the
  agent isn't deleted. neutron-server.log shows an exception raised while
  deleting the agent:
  
  ```
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event [-] Unexpected exception in notify_loop: TypeError: unhashable type: 'list'
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event Traceback (most recent call last):
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/ovsdbapp/event.py", line 177, in notify_loop
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event     match.run(event, row, updates)
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py", line 330, in run
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event     n_agent.AgentCache().delete([row.external_ids['delete_agent']])
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", line 282, in delete
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event     del self.agents[agent_id]
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event TypeError: unhashable type: 'list'
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event [-] Unexpected exception in notify_loop: TypeError: unhashable type: 'list'
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event Traceback (most recent call last):
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/ovsdbapp/event.py", line 177, in notify_loop
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event     match.run(event, row, updates)
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py", line 330, in run
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event     n_agent.AgentCache().delete([row.external_ids['delete_agent']])
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", line 282, in delete
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event     del self.agents[agent_id]
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event TypeError: unhashable type: 'list'
  ```

** Also affects: neutron (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/dalmatian
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/caracal
   Importance: Undecided
       Status: New

** Patch added: "lp2091071-neutron-noble.debdiff"
   
https://bugs.launchpad.net/cloud-archive/+bug/2091071/+attachment/5953442/+files/lp2091071-neutron-noble.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2091071

Title:
  [OVN] Exception when deleting agent

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2091071/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
