** Description changed:

  [ Impact ]
  
  In environments using OVN, the neutron-api service fails when attempting
  to delete an agent from the cache, raising "TypeError: unhashable type:
  'list'". This prevents the removal of decommissioned OVN agents: users
  see zombie agents in the output of "openstack network agent list" that
  cannot be cleared.
  
  The fix resolves the issue by passing the agent's ID directly instead of
  wrapping it in a list. The AgentCache().delete() method performs a
  dictionary lookup; because lists are mutable and unhashable in Python,
  they cannot be used as dictionary keys. This fix has already been
  accepted upstream [0].
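
  As a minimal sketch of the failure mode (not the Neutron code itself):
  the FakeAgentCache class and the "agent-1" ID below are hypothetical
  stand-ins. The only detail carried over from the traceback is that
  delete() runs "del self.agents[agent_id]", assumed here to act on a
  plain dictionary.

```python
# Hypothetical stand-in for the agent cache; the real AgentCache is more involved.
class FakeAgentCache:
    def __init__(self):
        # Assumption: agents are stored in a plain dict keyed by agent ID.
        self.agents = {}

    def delete(self, agent_id):
        # Same statement as the one shown in the traceback.
        del self.agents[agent_id]


cache = FakeAgentCache()
cache.agents["agent-1"] = object()

# Buggy call shape: the ID is wrapped in a list, which cannot be hashed,
# so the dictionary lookup raises TypeError before any key is compared.
try:
    cache.delete(["agent-1"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Fixed call shape: the ID string is passed directly and the entry is removed.
cache.delete("agent-1")
remaining = list(cache.agents)

print(error_message)
print(remaining)
```

  The first call fails and leaves the entry in place, which matches the
  zombie agents described above; the second call removes it.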
  
  [ Test Plan ]
  
  To trigger the OVN agent cache deletion logic, we will decommission a
  compute node.
  
  ** Setup OpenStack Environment **
  
  1.  Deploy a test cloud in your reproducer bastion using stsstack-bundles:
      $ cd stsstack-bundles/openstack
      $ ./generate-bundle.sh --series <UBUNTU_RELEASE> --release <OPENSTACK_RELEASE> --name neutron-test --num-compute 3 --run
      # wait until all units are active/idle (some will be blocked/waiting)
      $ ./tools/vault-unseal-and-authorise.sh
      # wait until all units are active/idle
      $ ./configure
      $ source novarc
  
  ** Verify the Bug **
  
  2.  Decommission a compute unit:
      $ juju run nova-compute/2 disable
      $ juju run nova-compute/2 remove-from-cloud
      $ juju remove-unit nova-compute/2
      $ juju status nova-compute --watch 2s
      # wait until the unit has been removed
  
  3.  Attempt to delete the agent from Neutron:
      $ openstack network agent list
      $ openstack network agent delete {AGENT_ID}
  
  4.  Confirm the failure:
-     Verify the agent still appears after attempted removal:
+     Check if agent still appears after attempted removal:
      $ openstack network agent list
  
      Verify the exception in the logs:
      $ juju ssh neutron-api/0 "sudo cat /var/log/neutron/neutron-server.log | grep -B 15 TypeError | tail -n 9"
  
      Observed results:
      - Agent is still present in the agent list
      - Log shows:
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event Traceback (most recent call last):
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event   File "/usr/lib/python3/dist-packages/ovsdbapp/event.py", line 177, in notify_loop
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event     match.run(event, row, updates)
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event   File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py", line 332, in run
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event     n_agent.AgentCache().delete([row.external_ids['delete_agent']])
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event   File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", line 282, in delete
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event     del self.agents[agent_id]
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event         ~~~~~~~~~~~^^^^^^^^^^
  2026-03-18 07:15:14.697 75565 ERROR ovsdbapp.event TypeError: unhashable type: 'list'
  
  ** Verify the Fix **
  
  5.  Upgrade to patched neutron
  
  6.  Repeat the decommissioning process with a different unit:
      $ juju run nova-compute/1 disable
      $ juju run nova-compute/1 remove-from-cloud
      $ juju remove-unit nova-compute/1
      $ juju status nova-compute --watch 2s
      # wait until the unit has been removed
  
  7.  Delete the agent:
      $ openstack network agent list
      $ openstack network agent delete {AGENT_ID}
  
  8.  Confirm the fix:
      Verify the agent is successfully removed:
      $ openstack network agent list
  
      Verify no new TypeError was observed during the removal process:
      $ juju ssh neutron-api/0 "sudo cat /var/log/neutron/neutron-server.log | grep -B 15 TypeError | tail -n 9"
  
      Observed results:
      - Agent is no longer present in the agent list
      - No new TypeErrors are generated in the log
  
  [ Where problems could occur ]
  
  The change modifies the input type of AgentCache.delete from a list to a
  single ID. If other parts of the code (outside this specific OVN logic)
  call this method and expect it to handle a list, those calls could raise
  an exception or fail to find the correct key in the cache.
  
  If the external_ids['delete_agent'] value is somehow missing or
  malformed, the method would attempt to delete a non-existent key. This
  could lead to stale entries remaining in the agent cache.
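
  The missing-key case can be sketched with a plain dictionary. This is
  an illustration of standard dict behaviour only: delete_strict,
  delete_tolerant and the agent IDs are hypothetical names, and whether
  the real cache behaves like either variant is an assumption.

```python
# Hypothetical cache contents; the real agent cache is not a bare dict literal.
agents = {"agent-1": "alive"}


def delete_strict(cache, agent_id):
    # Mirrors "del self.agents[agent_id]": raises KeyError if the key is absent.
    del cache[agent_id]


def delete_tolerant(cache, agent_id):
    # Alternative shape: pop() with a default never raises for a missing key.
    return cache.pop(agent_id, None)


try:
    delete_strict(agents, "no-such-agent")
    strict_raised = False
except KeyError:
    strict_raised = True

removed = delete_tolerant(agents, "no-such-agent")

print(strict_raised)
print(removed)
print(agents)
```

  In both variants the existing entry for "agent-1" is untouched, which
  is how a malformed ID could leave a stale agent in the cache.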
  
  [ Other Info ]
  
  This bug was introduced in Bobcat with [1]. Therefore, releases before
  that (Yoga for the UCA and Jammy for Ubuntu in this case) do not need to
  be fixed.
  
  Packages in Questing and Resolute already have the fix. Similarly, in
  the UCA, Epoxy and Flamingo have the fix too.
  
  [0] - https://review.opendev.org/c/openstack/neutron/+/937092
  [1] - https://review.opendev.org/c/openstack/neutron/+/883607
  
  Original description of the bug:
  
  Calling the Neutron API to delete an OVN agent returns 204, but the
  agent isn't deleted. neutron-server.log shows an exception raised while
  deleting the agent:
  
  ```
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event [-] Unexpected exception in notify_loop: TypeError: unhashable type: 'list'
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event Traceback (most recent call last):
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/ovsdbapp/event.py", line 177, in notify_loop
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event     match.run(event, row, updates)
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py", line 330, in run
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event     n_agent.AgentCache().delete([row.external_ids['delete_agent']])
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", line 282, in delete
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event     del self.agents[agent_id]
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event TypeError: unhashable type: 'list'
  2024-12-04 09:21:15.303 838 ERROR ovsdbapp.event
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event [-] Unexpected exception in notify_loop: TypeError: unhashable type: 'list'
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event Traceback (most recent call last):
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/ovsdbapp/event.py", line 177, in notify_loop
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event     match.run(event, row, updates)
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py", line 330, in run
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event     n_agent.AgentCache().delete([row.external_ids['delete_agent']])
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event   File "/var/lib/kolla/venv/lib/python3.10/site-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", line 282, in delete
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event     del self.agents[agent_id]
  2024-12-04 09:21:15.303 799 ERROR ovsdbapp.event TypeError: unhashable type: 'list'
  ```

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2091071

Title:
  [OVN] Exception when deleting agent

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2091071/+subscriptions

