Hi,
I’ve faced with a next issue using latest OVN (master branch) with OVN-IC
enabled:
ovn-northd CPU utilisation was at ~70-80% and ovsdb-server serving
OVN_Southbound DB was also under heavy load.
In ovn-northd.log file there were warnings:
2021-10-01T11:14:43.548Z|18845|northd|INFO|deleting Datapath_Binding
dd6af7f7-ea46-4496-988a-e7f9828924d0 with duplicate
external-ids:logical-switch/router ec35e3e0-2674-47e7-b645-2c9b8b31865b
2021-10-01T11:14:44.148Z|18846|poll_loop|INFO|Dropped 32 log messages in last 6
seconds (most recently, 0 seconds ago) due to excessive rate
2021-10-01T11:14:44.148Z|18847|poll_loop|INFO|wakeup due to [POLLIN] on fd 19
(172.20.33.110:55202<->172.20.33.102:16642) at lib/stream-ssl.c:832 (69% CPU
usage)
2021-10-01T11:14:48.336Z|18848|ovsdb_idl|WARN|Dropped 307 log messages in last
60 seconds (most recently, 0 seconds ago) due to excessive rate
2021-10-01T11:14:48.336Z|18849|ovsdb_idl|WARN|transaction error:
{"details":"Deletion of 1 weak reference(s) to deleted (or never-existing) rows
from column \"datapath\" in \"IP_Multicast\" row
72bac803-e484-4358-9e48-11911c8aa16f caused this column to become empty, but
constraints on this column disallow an empty column.","error":"constraint
violation"}
I checked datapath by reported logical-switch id:
~]# ovn-sbctl find datap
external_ids:logical-switch=ec35e3e0-2674-47e7-b645-2c9b8b31865b
_uuid : 7e045551-7700-4a50-b0aa-02cb4e1be59d
external_ids : {interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global,
logical-switch="ec35e3e0-2674-47e7-b645-2c9b8b31865b",
name=vpc-CCF01DF6-rtb-216BABB1-global}
load_balancers : []
tunnel_key : 16712519
_uuid : dd6af7f7-ea46-4496-988a-e7f9828924d0
external_ids : {interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global,
logical-switch="ec35e3e0-2674-47e7-b645-2c9b8b31865b",
name=vpc-CCF01DF6-rtb-216BABB1-global}
load_balancers : []
tunnel_key : 5
It refers to ovn-ic transit switch:
~]# ovn-ic-sbctl list datapath vpc-CCF01DF6-rtb-216BABB1-global
_uuid : fc159fa4-d5ba-46ed-a54c-d00745091021
external_ids : {}
transit_switch : vpc-CCF01DF6-rtb-216BABB1-global
tunnel_key : 16712519
~]# ovn-ic-nbctl list transit vpc-CCF01DF6-rtb-216BABB1-global
_uuid : b5312889-92f9-40fe-98f9-2ea7ce3debcc
external_ids : {}
name : vpc-CCF01DF6-rtb-216BABB1-global
other_config : {}
The problem ip-multicast document:
~]# ovn-sbctl find ip-mul datapath=dd6af7f7-ea46-4496-988a-e7f9828924d0
_uuid : 72bac803-e484-4358-9e48-11911c8aa16f
datapath : dd6af7f7-ea46-4496-988a-e7f9828924d0
enabled : false
eth_src : ""
idle_timeout : 300
ip4_src : ""
ip6_src : ""
querier : true
query_interval : 150
query_max_resp : 1
seq_no : 0
table_size : 2048
I tried manually destroy this ip_multicast document, which blocked deletion of
datapath document:
~]# ovn-sbctl destroy ip-mul 72bac803-e484-4358-9e48-11911c8aa16f
~]# ovn-sbctl list ip-mul 72bac803-e484-4358-9e48-11911c8aa16f
ovn-sbctl: no row "72bac803-e484-4358-9e48-11911c8aa16f" in table IP_Multicast
And problem was resolved. ovn-northd daemon stopped consuming CPU and excess
datapath was deleted automatically.
~]# ovn-sbctl find datapath
external_ids:interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global
_uuid : 7e045551-7700-4a50-b0aa-02cb4e1be59d
external_ids : {interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global,
logical-switch="ec35e3e0-2674-47e7-b645-2c9b8b31865b",
name=vpc-CCF01DF6-rtb-216BABB1-global}
load_balancers : []
tunnel_key : 16712519
The main problem here is that northd was entirely blocked. No processing to SB
was done. Even ovn-controllers couldn’t claim new ports. No changes were
propagated, so this is critial issue.
Can anybody help understand the reason for this?
Regards,
Vladislav Odintsov
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev