Hi,

This patch seems to fix the DHCP issues, but there are cases where an instance boots and receives its configuration from the metadata service, yet has no public connectivity (which is done through L2 networking).

Which, if the logical switch has a reasonably high number of ports (maybe around 200) will probably cause the resubmit limit to be hit

This is the case here: a public L2 network with around 2000 running instances (i.e., ports, in LSP terms).

Are these OVN router port IPs?  Or are they OVN workload IPs?  Or are they just IPs owned by some fabric hosts, outside of OVN?


The .1 IP of each subnet is served by the border gateway. So an instance ARPs for .1 to learn the GW MAC address, but because the resubmit limit is hit, the instance receives no response: the ARP flow is dropped.

Also, aside from the logs, do you actually see any traffic being impacted?  I.e., are your workloads able to come up and properly communicate?

Nope, they can't communicate properly: there is connectivity loss, since some instances have no public connectivity due to the ARP issues.
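For reference, this is how we confirm it from inside an affected guest (a sketch: it assumes arping from iputils is available in the guest image; the interface name and GW IP are examples taken from the traces above):

```shell
# From inside an affected instance: ARP for the gateway directly.
# Healthy ports get a reply immediately; affected ones time out,
# consistent with the dropped ARP flows seen in the ofproto/trace output.
arping -c 3 -I eth0 138.124.72.1
```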


Regards,

Ilia Baikov
[email protected]

On 26.02.2026 14:31, Dumitru Ceara wrote:
Hi Ilia,

On 2/24/26 3:29 PM, Ilia Baikov wrote:
Just checked the openvswitch logs. The 4096-resubmit limit is actually hit even on
25.09.2.
v25.09.2 includes:
https://github.com/ovn-org/ovn/commit/0bb60da

Which should fix the "self-DoS" issues introduced by:
https://github.com/ovn-org/ovn/commit/325c7b2

But that means that in some cases, e.g., for real BUM traffic or for
GARPs originated by OVN router ports we will try to "flood" the packet
in the L2 broadcast domain.

Which, if the logical switch has a reasonably high number of ports
(maybe around 200) will probably cause the resubmit limit to be hit.

In the examples below, I see the packets that cause this are ARP
requests requesting the MAC address of:
- 138.124.72.1
- 83.219.248.109
- 138.124.72.1
- 91.92.46.1

Are these OVN router port IPs?  Or are they OVN workload IPs?  Or are
they just IPs owned by some fabric hosts, outside of OVN?

Also, aside from the logs, do you actually see any traffic being
impacted?  I.e., are your workloads able to come up and properly
communicate?

Thanks,
Dumitru

Final flow: unchanged
Megaflow: recirc_id=0,eth,arp,in_port=346,dl_src=fa:16:3e:63:aa:d0
Datapath actions: drop
2026-02-24T14:23:17.457Z|04071|connmgr|INFO|br-int<->unix#4346: 1
flow_mods in the last 0 s (1 adds)
2026-02-24T14:23:34.821Z|00076|ofproto_dpif_xlate(handler24)|WARN|
Dropped 854 log messages in last 60 seconds (most recently, 0 seconds
ago) due to excessive rate
2026-02-24T14:23:34.821Z|00077|ofproto_dpif_xlate(handler24)|WARN|over
4096 resubmit actions on bridge br-int while processing
arp,in_port=4715,vlan_tci=0x0000,dl_src=fa:16:3e:97:65:15,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.142,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:97:65:15,arp_tha=00:00:00:00:00:00
2026-02-24T14:23:45.464Z|00091|dpif(handler28)|WARN|system@ovs-system:
execute
ct(commit,zone=163,mark=0/0x41,label=0/0xffff00000000000000000000,nat(src)),154 
failed (Invalid argument) on packet 
tcp,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=fa:16:3e:69:22:89,nw_src=31.44.82.94,nw_dst=31.169.126.149,nw_tos=32,nw_ecn=0,nw_ttl=57,nw_frag=no,tp_src=51064,tp_dst=443,tcp_flags=syn
 tcp_csum:d7b0
  with metadata
skb_priority(0),skb_mark(0),ct_state(0x21),ct_zone(0xa3),ct_tuple4(src=31.44.82.94,dst=31.169.126.149,proto=6,tp_src=51064,tp_dst=443),in_port(2)
 mtu 0
2026-02-24T14:23:56.702Z|00072|ofproto_dpif_upcall(handler30)|WARN|
Dropped 697 log messages in last 60 seconds (most recently, 0 seconds
ago) due to excessive rate
2026-02-24T14:23:56.702Z|00073|ofproto_dpif_upcall(handler30)|WARN|Flow:
arp,in_port=409,vlan_tci=0x0000,dl_src=fa:16:3e:22:f2:f7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.145.28.207,arp_tpa=192.145.28.1,arp_op=1,arp_sha=fa:16:3e:22:f2:f7,arp_tha=00:00:00:00:00:00

bridge("br-int")
----------------
  0. priority 0
     drop

Final flow: unchanged
Megaflow: recirc_id=0,eth,arp,in_port=409,dl_src=fa:16:3e:22:f2:f7
Datapath actions: drop
2026-02-24T14:24:34.891Z|02715|ofproto_dpif_xlate(handler2)|WARN|Dropped
1059 log messages in last 60 seconds (most recently, 1 seconds ago) due
to excessive rate
2026-02-24T14:24:34.891Z|02716|ofproto_dpif_xlate(handler2)|WARN|over
4096 resubmit actions on bridge br-int while processing
arp,in_port=1,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=83.219.248.1,arp_tpa=83.219.248.109,arp_op=1,arp_sha=0c:86:10:b7:9e:e0,arp_tha=00:00:00:00:00:00
2026-02-24T14:24:46.042Z|04072|connmgr|INFO|br-int<->unix#4353: 1
flow_mods in the last 0 s (1 adds)
2026-02-24T14:24:59.041Z|00066|ofproto_dpif_upcall(handler78)|WARN|
Dropped 662 log messages in last 63 seconds (most recently, 3 seconds
ago) due to excessive rate
2026-02-24T14:24:59.041Z|00067|ofproto_dpif_upcall(handler78)|WARN|Flow:
arp,in_port=339,vlan_tci=0x0000,dl_src=fa:16:3e:39:60:bb,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=91.92.46.85,arp_tpa=91.92.46.1,arp_op=1,arp_sha=fa:16:3e:39:60:bb,arp_tha=00:00:00:00:00:00

bridge("br-int")
----------------
  0. priority 0
     drop

Final flow: unchanged
Megaflow: recirc_id=0,eth,arp,in_port=339,dl_src=fa:16:3e:39:60:bb
Datapath actions: drop
2026-02-24T14:25:34.783Z|00067|ofproto_dpif_xlate(handler7)|WARN|Dropped
952 log messages in last 60 seconds (most recently, 0 seconds ago) due
to excessive rate
2026-02-24T14:25:34.783Z|00068|ofproto_dpif_xlate(handler7)|WARN|over
4096 resubmit actions on bridge br-int while processing
arp,in_port=4812,vlan_tci=0x0000,dl_src=fa:16:3e:68:f7:1b,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.245,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:68:f7:1b,arp_tha=00:00:00:00:00:00
2026-02-24T14:25:59.094Z|00067|ofproto_dpif_upcall(handler11)|WARN|
Dropped 720 log messages in last 60 seconds (most recently, 0 seconds
ago) due to excessive rate
2026-02-24T14:25:59.095Z|00068|ofproto_dpif_upcall(handler11)|WARN|Flow:
arp,in_port=305,vlan_tci=0x0000,dl_src=fa:16:3e:d9:8d:f3,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=91.92.46.188,arp_tpa=91.92.46.1,arp_op=1,arp_sha=fa:16:3e:d9:8d:f3,arp_tha=00:00:00:00:00:00

bridge("br-int")
----------------
  0. priority 0
     drop

Final flow: unchanged
Megaflow: recirc_id=0,eth,arp,in_port=305,dl_src=fa:16:3e:d9:8d:f3
Datapath actions: drop
2026-02-24T14:26:35.024Z|02717|ofproto_dpif_xlate(handler2)|WARN|Dropped
937 log messages in last 61 seconds (most recently, 1 seconds ago) due
to excessive rate
2026-02-24T14:26:35.024Z|02718|ofproto_dpif_xlate(handler2)|WARN|over
4096 resubmit actions on bridge br-int while processing
arp,in_port=1,vlan_tci=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=104.165.244.1,arp_tpa=104.165.244.146,arp_op=1,arp_sha=0c:86:10:b7:9e:e0,arp_tha=00:00:00:00:00:00
2026-02-24T14:26:59.151Z|00067|ofproto_dpif_upcall(handler67)|WARN|
Dropped 884 log messages in last 60 seconds (most recently, 0 seconds
ago) due to excessive rate
2026-02-24T14:26:59.151Z|00068|ofproto_dpif_upcall(handler67)|WARN|Flow:
arp,in_port=380,vlan_tci=0x0000,dl_src=fa:16:3e:f1:5b:e7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.215,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:f1:5b:e7,arp_tha=00:00:00:00:00:00

bridge("br-int")
----------------
  0. in_port=380, priority 100, cookie 0x2cfc9def
     set_field:0x90/0xffff->reg13
     set_field:0x3->reg11
     set_field:0x1->reg12
     set_field:0x1->metadata
     set_field:0x1d2->reg14
     set_field:0/0xffff0000->reg13
     resubmit(,8)
  8. metadata=0x1, priority 50, cookie 0x43f4e129
     set_field:0/0x1000->reg10
     resubmit(,73)
     73. arp,reg14=0x1d2,metadata=0x1, priority 95, cookie 0x2cfc9def
             resubmit(,74)
         74. arp,reg14=0x1d2,metadata=0x1, priority 80, cookie 0x2cfc9def
             set_field:0x1000/0x1000->reg10
     move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
      -> NXM_NX_XXREG0[111] is now 0x1
     resubmit(,9)
  9. reg0=0x8000/0x8000,metadata=0x1, priority 50, cookie 0xf4bfe3b3
     drop

Final flow:
arp,reg0=0x8000,reg10=0x1000,reg11=0x3,reg12=0x1,reg13=0x90,reg14=0x1d2,metadata=0x1,in_port=380,vlan_tci=0x0000,dl_src=fa:16:3e:f1:5b:e7,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=138.124.72.215,arp_tpa=138.124.72.1,arp_op=1,arp_sha=fa:16:3e:f1:5b:e7,arp_tha=00:00:00:00:00:00
Megaflow: recirc_id=0,eth,arp,in_port=380,dl_src=fa:16:3e:f1:5b:e7
Datapath actions: drop



broadcast-arps-to-all-routers is set to "false":
_uuid               : 1841d88f-3fbf-427f-8d6c-c3edaba47a0a
acls                : []
copp                : []
dns_records         : []
external_ids        : {"neutron:availability_zone_hints"="",
"neutron:mtu"="1500", "neutron:network_name"=poland-public,
"neutron:provnet-network-type"=vlan, "neutron:revision_number"="12"}
forwarding_groups   : []
load_balancer       : []
load_balancer_group : []
name                : neutron-da85395e-c326-489d-b4e6-dfb62aad360d
other_config        : {broadcast-arps-to-all-routers="false",
fdb_age_threshold="0", mcast_flood_unregistered="false",
mcast_snoop="false", vlan-passthru="false"}
ports               : [00288a04-90a4-4e8e-bada-8213747c92e4, 0047d609-
ebff-4c43-8f1d-32d83d70c9e6, 00b6c585-ae29-4e88-a52a-3a16e1d91112
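For completeness, this is how the option can be verified and set with ovn-nbctl (a sketch; the Logical_Switch name is the one from the record above):

```shell
# Check the current value on the logical switch
ovn-nbctl get Logical_Switch neutron-da85395e-c326-489d-b4e6-dfb62aad360d \
    other_config:broadcast-arps-to-all-routers

# Set it explicitly; ovn-northd then recomputes the ARP flooding flows
ovn-nbctl set Logical_Switch neutron-da85395e-c326-489d-b4e6-dfb62aad360d \
    other_config:broadcast-arps-to-all-routers=false
```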


Regards,

Ilia Baikov
[email protected]

On 24.02.2026 17:16, Ilia Baikov wrote:
Hello,
After upgrading to OpenStack 2025.2 with OVN 25.09.2 (which contains the
split buf fix), there seem to be no DHCP issues, but I see a lot of
missed ARPs: VMs are unable to reach the GW, and ARP broadcasts do not
reach some of the VMs. Debugging shows that OVN installs drop flows for
ARP for some reason.

ovs-appctl ofproto/trace br-int \
"in_port=2,dl_vlan=1000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,dl_type=0x0806,arp_op=1,arp_spa=192.145.28.1,arp_tpa=192.145.28.113"
 2>&1 | tail -80
Flow:
arp,in_port=2,dl_vlan=1000,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=0c:86:10:b7:9e:e0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.145.28.1,arp_tpa=192.145.28.113,arp_op=1,arp_sha=00:00:00:00:00:00,arp_tha=00:00:00:00:00:00

bridge("br-int")
----------------
  0. in_port=2, priority 100
     move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23]
      -> OXM_OF_METADATA[0..23] is now 0
     move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14]
      -> NXM_NX_REG14[0..14] is now 0
     move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15]
      -> NXM_NX_REG15[0..15] is now 0
     resubmit(,45)
45. priority 0
     drop

Final flow: unchanged
Megaflow: recirc_id=0,eth,arp,in_port=2,dl_src=0c:86:10:b7:9e:e0
Datapath actions: drop

docker exec ovn_controller ovn-controller --version
ovn-controller 25.09.2
Open vSwitch Library 3.6.2
OpenFlow versions 0x6:0x6
SB DB Schema 21.5.0

ovn-controller logs clearly show no errors:
2026-02-24T14:06:39.403Z|00001|vlog|INFO|opened log file /var/log/
kolla/openvswitch/ovn-controller.log
2026-02-24T14:06:39.406Z|00002|reconnect|INFO|tcp:127.0.0.1:6640:
connecting...
2026-02-24T14:06:39.406Z|00003|reconnect|INFO|tcp:127.0.0.1:6640:
connected
2026-02-24T14:06:39.463Z|00004|main|INFO|OVN internal version is :
[25.09.2-21.5.0-81.10]
2026-02-24T14:06:39.463Z|00005|main|INFO|OVS IDL reconnected, force
recompute.
2026-02-24T14:06:39.464Z|00006|reconnect|INFO|tcp:10.11.0.4:16641:
connecting...
2026-02-24T14:06:39.464Z|00007|main|INFO|OVNSB IDL reconnected, force
recompute.
2026-02-24T14:06:39.464Z|00008|reconnect|INFO|tcp:10.11.0.4:16641:
connected
2026-02-24T14:06:39.464Z|00001|rconn(ovn_statctrl3)|INFO|unix:/var/
run/openvswitch/br-int.mgmt: connected
2026-02-24T14:06:39.464Z|00001|rconn(ovn_pinctrl0)|INFO|unix:/var/run/
openvswitch/br-int.mgmt: connected
2026-02-24T14:06:39.529Z|00009|main|INFO|OVS feature set changed,
force recompute.
2026-02-24T14:06:39.532Z|00010|rconn|INFO|unix:/var/run/openvswitch/
br-int.mgmt: connected
2026-02-24T14:06:39.532Z|00011|main|INFO|OVS OpenFlow connection
reconnected,force recompute.
2026-02-24T14:06:39.536Z|00012|main|INFO|OVS feature set changed,
force recompute.
2026-02-24T14:06:40.564Z|00013|main|INFO|OVS feature set changed,
force recompute.
2026-02-24T14:06:45.920Z|00014|binding|INFO|Releasing lport bcd3ecfa-
f43c-4e72-8978-73bbad07ed75 from this chassis (sb_readonly=1)
2026-02-24T14:06:45.924Z|00015|binding|INFO|Releasing lport
4f1f45b0-726c-4fea-b462-06dcbf559c25 from this chassis (sb_readonly=1)
2026-02-24T14:06:46.927Z|00016|timeval|WARN|Unreasonably long 1413ms
poll interval (1294ms user, 117ms system)
2026-02-24T14:06:46.927Z|00017|timeval|WARN|faults: 38131 minor, 0 major
2026-02-24T14:06:46.927Z|00018|timeval|WARN|disk: 0 reads, 8 writes
2026-02-24T14:06:46.927Z|00019|timeval|WARN|context switches: 0
voluntary, 65 involuntary
2026-02-24T14:06:46.936Z|00020|coverage|INFO|Event coverage, avg rate
over last: 5 seconds, last minute, last hour,  hash=1a815819:
2026-02-24T14:06:46.936Z|00021|coverage|INFO|physical_run  0.2/sec
  0.017/sec        0.0003/sec   total: 1
2026-02-24T14:06:46.936Z|00022|coverage|INFO|lflow_conj_alloc  0.0/
sec     0.000/sec        0.0000/sec   total: 407
2026-02-24T14:06:46.936Z|00023|coverage|INFO|lflow_cache_miss  0.0/
sec     0.000/sec        0.0000/sec   total: 13470
2026-02-24T14:06:46.936Z|00024|coverage|INFO|lflow_cache_hit 0.0/sec
    0.000/sec        0.0000/sec   total: 394
2026-02-24T14:06:46.936Z|00025|coverage|INFO|lflow_cache_add 0.0/sec
    0.000/sec        0.0000/sec   total: 12956
2026-02-24T14:06:46.936Z|00026|coverage|INFO|lflow_cache_add_matches
0.0/sec     0.000/sec        0.0000/sec   total: 2412
2026-02-24T14:06:46.936Z|00027|coverage|INFO|lflow_cache_add_expr
  0.0/sec     0.000/sec        0.0000/sec   total: 10544
2026-02-24T14:06:46.936Z|00028|coverage|INFO|consider_logical_flow
0.0/sec     0.000/sec        0.0000/sec   total: 20680
2026-02-24T14:06:46.936Z|00029|coverage|INFO|lflow_run 0.2/sec
  0.017/sec        0.0003/sec   total: 1
2026-02-24T14:06:46.936Z|00030|coverage|INFO|miniflow_malloc  16.6/
sec     1.383/sec        0.0231/sec   total: 28561
2026-02-24T14:06:46.936Z|00031|coverage|INFO|hmap_pathological  11.2/
sec     0.933/sec        0.0156/sec   total: 257
2026-02-24T14:06:46.936Z|00032|coverage|INFO|hmap_expand 837.2/sec
69.767/sec        1.1628/sec   total: 30358
2026-02-24T14:06:46.936Z|00033|coverage|INFO|hmap_reserve  0.4/sec
  0.033/sec        0.0006/sec   total: 21733
2026-02-24T14:06:46.936Z|00034|coverage|INFO|txn_unchanged 2.4/sec
  0.200/sec        0.0033/sec   total: 65
2026-02-24T14:06:46.936Z|00035|coverage|INFO|txn_incomplete  1.4/sec
    0.117/sec        0.0019/sec   total: 60
2026-02-24T14:06:46.936Z|00036|coverage|INFO|txn_success 0.6/sec
  0.050/sec        0.0008/sec   total: 3
2026-02-24T14:06:46.936Z|00037|coverage|INFO|poll_create_node 24.0/
sec     2.000/sec        0.0333/sec   total: 1304
2026-02-24T14:06:46.937Z|00038|coverage|INFO|poll_zero_timeout   0.0/
sec     0.000/sec        0.0000/sec   total: 1
2026-02-24T14:06:46.937Z|00039|coverage|INFO|rconn_queued  0.8/sec
  0.067/sec        0.0011/sec   total: 4
2026-02-24T14:06:46.937Z|00040|coverage|INFO|rconn_sent  0.8/sec
  0.067/sec        0.0011/sec   total: 4
2026-02-24T14:06:46.937Z|00041|coverage|INFO|seq_change  9.2/sec
  0.767/sec        0.0128/sec   total: 532
2026-02-24T14:06:46.937Z|00042|coverage|INFO|pstream_open  0.2/sec
  0.017/sec        0.0003/sec   total: 1
2026-02-24T14:06:46.937Z|00043|coverage|INFO|stream_open 1.2/sec
  0.100/sec        0.0017/sec   total: 6
2026-02-24T14:06:46.937Z|00044|coverage|INFO|util_xalloc 29035.4/sec
2419.617/sec       40.3269/sec   total: 2277081
2026-02-24T14:06:46.937Z|00045|coverage|INFO|vconn_received  0.8/sec
    0.067/sec        0.0011/sec   total: 4
2026-02-24T14:06:46.937Z|00046|coverage|INFO|vconn_sent  1.2/sec
  0.100/sec        0.0017/sec   total: 6
2026-02-24T14:06:46.937Z|00047|coverage|INFO|jsonrpc_recv_incomplete
0.6/sec     0.050/sec        0.0008/sec   total: 52
2026-02-24T14:06:46.937Z|00048|coverage|INFO|138 events never hit
2026-02-24T14:06:46.976Z|00049|binding|INFO|Releasing lport
4f1f45b0-726c-4fea-b462-06dcbf559c25 from this chassis (sb_readonly=0)
2026-02-24T14:06:46.977Z|00050|binding|INFO|Releasing lport bcd3ecfa-
f43c-4e72-8978-73bbad07ed75 from this chassis (sb_readonly=0)
2026-02-24T14:06:48.054Z|00051|timeval|WARN|Unreasonably long 1117ms
poll interval (1108ms user, 8ms system)
2026-02-24T14:06:48.054Z|00052|timeval|WARN|faults: 2581 minor, 0 major
2026-02-24T14:06:48.054Z|00053|timeval|WARN|context switches: 0
voluntary, 8 involuntary
2026-02-24T14:06:48.055Z|00054|coverage|INFO|Event coverage, avg rate
over last: 5 seconds, last minute, last hour,  hash=0878340f:
2026-02-24T14:06:48.055Z|00055|coverage|INFO|physical_run  0.2/sec
  0.017/sec        0.0003/sec   total: 2
2026-02-24T14:06:48.055Z|00056|coverage|INFO|lflow_conj_alloc  0.0/
sec     0.000/sec        0.0000/sec   total: 814
2026-02-24T14:06:48.055Z|00057|coverage|INFO|lflow_cache_miss  0.0/
sec     0.000/sec        0.0000/sec   total: 13979
2026-02-24T14:06:48.055Z|00058|coverage|INFO|lflow_cache_hit 0.0/sec
    0.000/sec        0.0000/sec   total: 13671
2026-02-24T14:06:48.055Z|00059|coverage|INFO|lflow_cache_add 0.0/sec
    0.000/sec        0.0000/sec   total: 12956
2026-02-24T14:06:48.055Z|00060|coverage|INFO|lflow_cache_add_matches
0.0/sec     0.000/sec        0.0000/sec   total: 2412
2026-02-24T14:06:48.055Z|00061|coverage|INFO|lflow_cache_add_expr
  0.0/sec     0.000/sec        0.0000/sec   total: 10544
2026-02-24T14:06:48.055Z|00062|coverage|INFO|consider_logical_flow
0.0/sec     0.000/sec        0.0000/sec   total: 41360
2026-02-24T14:06:48.055Z|00063|coverage|INFO|lflow_run 0.2/sec
  0.017/sec        0.0003/sec   total: 2
2026-02-24T14:06:48.055Z|00064|coverage|INFO|cmap_expand 0.0/sec
  0.000/sec        0.0000/sec   total: 7
2026-02-24T14:06:48.055Z|00065|coverage|INFO|miniflow_malloc  16.6/
sec     1.383/sec        0.0231/sec   total: 63156
2026-02-24T14:06:48.055Z|00066|coverage|INFO|hmap_pathological  11.2/
sec     0.933/sec        0.0156/sec   total: 311
2026-02-24T14:06:48.056Z|00067|coverage|INFO|hmap_expand 837.2/sec
69.767/sec        1.1628/sec   total: 30539
2026-02-24T14:06:48.056Z|00068|coverage|INFO|hmap_reserve  0.4/sec
  0.033/sec        0.0006/sec   total: 22553
2026-02-24T14:06:48.056Z|00069|coverage|INFO|txn_unchanged 2.4/sec
  0.200/sec        0.0033/sec   total: 67
2026-02-24T14:06:48.056Z|00070|coverage|INFO|txn_incomplete  1.4/sec
    0.117/sec        0.0019/sec   total: 60
2026-02-24T14:06:48.056Z|00071|coverage|INFO|txn_success 0.6/sec
  0.050/sec        0.0008/sec   total: 4
2026-02-24T14:06:48.056Z|00072|coverage|INFO|poll_create_node 24.0/
sec     2.000/sec        0.0333/sec   total: 1335
2026-02-24T14:06:48.056Z|00073|coverage|INFO|poll_zero_timeout   0.0/
sec     0.000/sec        0.0000/sec   total: 1
2026-02-24T14:06:48.056Z|00074|coverage|INFO|rconn_queued  0.8/sec
  0.067/sec        0.0011/sec   total: 4
2026-02-24T14:06:48.056Z|00075|coverage|INFO|rconn_sent  0.8/sec
  0.067/sec        0.0011/sec   total: 4
2026-02-24T14:06:48.056Z|00076|coverage|INFO|seq_change  9.2/sec
  0.767/sec        0.0128/sec   total: 546
2026-02-24T14:06:48.056Z|00077|coverage|INFO|pstream_open  0.2/sec
  0.017/sec        0.0003/sec   total: 1
2026-02-24T14:06:48.056Z|00078|coverage|INFO|stream_open 1.2/sec
  0.100/sec        0.0017/sec   total: 6
2026-02-24T14:06:48.056Z|00079|coverage|INFO|long_poll_interval
  0.0/sec     0.000/sec        0.0000/sec   total: 1
2026-02-24T14:06:48.056Z|00080|coverage|INFO|util_xalloc 29035.4/sec
2419.617/sec       40.3269/sec   total: 2477649
2026-02-24T14:06:48.056Z|00081|coverage|INFO|vconn_received  0.8/sec
    0.067/sec        0.0011/sec   total: 4
2026-02-24T14:06:48.056Z|00082|coverage|INFO|vconn_sent  1.2/sec
  0.100/sec        0.0017/sec   total: 6
2026-02-24T14:06:48.056Z|00083|coverage|INFO|jsonrpc_recv_incomplete
0.6/sec     0.050/sec        0.0008/sec   total: 52
2026-02-24T14:06:48.056Z|00084|coverage|INFO|136 events never hit
2026-02-24T14:06:48.056Z|00085|poll_loop|INFO|wakeup due to [POLLIN]
on fd 29 (10.11.0.2:40496<->10.11.0.4:16641) at lib/stream-fd.c:157
(82% CPU usage)
2026-02-24T14:06:48.097Z|00086|poll_loop|INFO|wakeup due to [POLLIN]
on fd 29 (10.11.0.2:40496<->10.11.0.4:16641) at lib/stream-fd.c:157
(82% CPU usage)
2026-02-24T14:06:48.104Z|00087|poll_loop|INFO|wakeup due to 0-ms
timeout at controller/ovn-controller.c:7558 (82% CPU usage)
2026-02-24T14:06:48.283Z|00088|poll_loop|INFO|wakeup due to 0-ms
timeout at controller/ofctrl.c:692 (82% CPU usage)
2026-02-24T14:06:48.870Z|00089|poll_loop|INFO|wakeup due to [POLLIN]
on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153
(82% CPU usage)
2026-02-24T14:06:48.877Z|00090|poll_loop|INFO|wakeup due to [POLLOUT]
on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153
(82% CPU usage)
2026-02-24T14:06:48.884Z|00091|poll_loop|INFO|wakeup due to [POLLOUT]
on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153
(82% CPU usage)
2026-02-24T14:06:48.892Z|00092|poll_loop|INFO|wakeup due to [POLLOUT]
on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153
(82% CPU usage)
2026-02-24T14:06:48.900Z|00093|poll_loop|INFO|wakeup due to [POLLOUT]
on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153
(82% CPU usage)
2026-02-24T14:06:48.907Z|00094|poll_loop|INFO|wakeup due to [POLLOUT]
on fd 33 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:153
(82% CPU usage)
2026-02-24T14:06:49.875Z|00095|memory|INFO|143124 kB peak resident set
size after 10.5 seconds
2026-02-24T14:06:49.875Z|00096|memory|INFO|idl-cells-
OVN_Southbound:301305 idl-cells-Open_vSwitch:25815 lflow-cache-
entries-cache-expr:10548 lflow-cache-entries-cache-matches:2413 lflow-
cache-size-KB:32447 local_datapath_usage-KB:2
ofctrl_desired_flow_usage-KB:8528 ofctrl_installed_flow_usage-KB:6365
ofctrl_rconn_packet_counter-KB:5161 ofctrl_sb_flow_ref_usage-KB:3196
oflow_update_usage-KB:1

Regards,

Ilia Baikov
[email protected]

On 12.02.2026 21:22, Ilia Baikov wrote:
Hi,
Returning to this issue after a while, as I'm migrating from ml2/ovs
to ml2/ovn.
It seems the same issue from 2025 still persists.

refs:
[0]https://mail.openvswitch.org/pipermail/ovs-discuss/2025-
February/053456.html
[1]https://mail.openvswitch.org/pipermail/ovs-discuss/2025-
March/053484.html

case:
A big L2 domain with a border device learning IPs by flooding ARP. For
some reason, when an L2 device (a NIC with VLANs attached) is attached
to the br-ex bridge, after a while OVN stops sending DHCP
packets (OFFER/ACK/etc.).

Has anybody else observed the same issue? The only way to stabilize the
region is to switch to L3 networking using ovn-bgp-agent (eth0 is
detached from br-ex, so no more ARPs are delivered to ovn-controller),
but kernel routing has monstrous overhead: IRQ load is 5-6x higher,
around 10-12%, while with L2 networking it is just 2%, which is fine.

Meanwhile, there are no errors, warnings, or resubmit messages in the logs.
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
