Hi,

This patch was previously discussed in the mail  [1]. Please review this 
code again.
If there are any issues, I can test and update it as needed.


[1] https://mail.openvswitch.org/pipermail/ovs-dev/2025-October/427165.html


LIU Yulong
2025.01.12
 
------------------ Original ------------------
From: &nbsp;"LIU&nbsp;Yulong"<[email protected]&gt;;
Date: &nbsp;Tue, Dec 30, 2025 06:49 PM
To: &nbsp;"dev"<[email protected]&gt;; 
Cc: &nbsp;"LIU Yulong"<[email protected]&gt;; 
Subject: &nbsp;[PATCH] ofproto-dpif-xlate: Use try-ref for xlate_table_action 
rule.

&nbsp;
Our running ovs-vswitchd was cored with stack [1]. It shows a
concurrent race condition.

When the revalidator thread is performing rule revalidation/translation,
another thread has already deleted the rule tag and reduced the reference
count to 0. Using ofproto_rule_ref() at this point will trigger an assertion,
as it assumes the object is still alive (refcount &gt; 0).

Replace the mandatory reference to the rule in xlate_table_action()
with a non-fatal try-ref: ofproto_rule_try_ref().
If the reference fails (the rule is being/has been released),
treat it as a table miss path, avoiding referencing "dead" rules
and causing assertion errors.

[1] call stack:
*0&nbsp; raise () from /lib64/libc.so.6
*1&nbsp; abort () from /lib64/libc.so.6
*2&nbsp; ovs_abort_valist () at lib/util.c:499
*3&nbsp; vlog_abort_valist () at lib/vlog.c:1249
*4&nbsp; vlog_abort () at lib/vlog.c:1263
*5&nbsp; ovs_assert_failure () at lib/util.c:86
*6&nbsp; ovs_refcount_ref () at lib/ovs-atomic.h:547
*7&nbsp; ovs_refcount_ref () at ofproto/ofproto.c:3010
*8&nbsp; ofproto_rule_ref () at ofproto/ofproto.c:3012
*9&nbsp; xlate_table_action () at ofproto/ofproto-dpif-xlate.c:4506
*10 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:7183
*11 xlate_recursively () at ofproto/ofproto-dpif-xlate.c:4382
*12 xlate_table_action () at ofproto/ofproto-dpif-xlate.c:4511
*13 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:7183
*14 clone_xlate_actions () at ofproto/ofproto-dpif-xlate.c:5809
*15 xlate_recursively () at ofproto/ofproto-dpif-xlate.c:4382
*16 xlate_table_action () at ofproto/ofproto-dpif-xlate.c:4511
*17 patch_port_output () at ofproto/ofproto-dpif-xlate.c:3889
*18 compose_output_action__ () at ofproto/ofproto-dpif-xlate.c:4204
*19 compose_output_action () at ofproto/ofproto-dpif-xlate.c:4359
*20 xlate_output_action () at ofproto/ofproto-dpif-xlate.c:5304
*21 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:7033
*22 xlate_actions () at ofproto/ofproto-dpif-xlate.c:8001
*23 xlate_key () at ofproto/ofproto-dpif-upcall.c:2321
*24 xlate_ukey () at ofproto/ofproto-dpif-upcall.c:2336
*25 revalidate_ukey__ () at ofproto/ofproto-dpif-upcall.c:2382
*26 revalidate_ukey () at ofproto/ofproto-dpif-upcall.c:2491
*27 revalidate () at ofproto/ofproto-dpif-upcall.c:2945
*28 udpif_revalidator () at ofproto/ofproto-dpif-upcall.c:1089
*29 ovsthread_wrapper () at lib/ovs-thread.c:422
*30 start_thread () from /lib64/libpthread.so.0
*31 clone () from /lib64/libc.so.6

Signed-off-by: LIU Yulong <[email protected]&gt;
---
ofproto/ofproto-dpif-xlate.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 920d998e6..b0d42d29d 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4793,7 +4793,7 @@ xlate_table_action(struct xlate_ctx *ctx, ofp_port_t 
in_port, uint8_t table_id,
 tuple_swap(&amp;ctx-&gt;xin-&gt;flow, ctx-&gt;wc);
 }

-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (rule) {
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (rule &amp;&amp; 
ofproto_rule_try_ref(&amp;rule-&gt;up)) {
 /* Fill in the cache entry here instead of xlate_recursively
 * to make the reference counting more explicit.&nbsp; We take a
 * reference in the lookups above if we are going to cache the
@@ -4804,6 +4804,8 @@ xlate_table_action(struct xlate_ctx *ctx, ofp_port_t 
in_port, uint8_t table_id,
 entry = xlate_cache_add_entry(ctx-&gt;xin-&gt;xcache, XC_RULE);
 entry-&gt;rule = rule;
 ofproto_rule_ref(&amp;rule-&gt;up);
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } else {
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ofproto_rule_unref(&amp;rule-&gt;up);
 }

 struct ovs_list *old_trace = ctx-&gt;xin-&gt;trace;
-- 
2.50.1 (Apple Git-155)
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to