On 1/14/26 11:39 AM, Jun Gu wrote:
> A bond, BM_TCP with dp_hash, is configured in the select_dst_port
> column of the mirror table. When a packet is forwarded for output
> through such a bond, there will be two recircs generated. Both
> recircs currently forward the packet to the mirror's output_port.
>
> However, the second recirc only performs a rule lookup in table
> 254 matching on dp_hash to determine the output and does not need
> to mirror the packet. This patch avoids forwarding packet to the
> mirror’s output_port during the second recirc to prevent
> duplicate mirroring.
>
> Signed-off-by: Jun Gu <[email protected]>
> ---
> ofproto/ofproto-dpif-rid.c | 2 ++
> tests/ofproto-dpif.at | 54 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 56 insertions(+)
Hi. Thanks for the patch! And sorry for delay.
See some comments below.
>
> diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c
> index f01468025..73be23037 100644
> --- a/ofproto/ofproto-dpif-rid.c
> +++ b/ofproto/ofproto-dpif-rid.c
> @@ -292,6 +292,8 @@ recirc_alloc_id(struct ofproto_dpif *ofproto)
> struct frozen_state state = {
> .table_id = TBL_INTERNAL,
> .ofproto_uuid = ofproto->uuid,
> + /* Avoid duplicate mirrors on bond using BM_TCP with dp_hash. */
> + .mirrors = UINT32_MAX,
While this accomplishes the goal of preventing bonds from mirroring
after recirculation, it doesn't seem like the right place to do so.
A few reasons:
1. This function is more or less generic. It just happened to be used
only by the bonding code, but it shouldn't assume that. It may be
a problem in the future if we want to use this function for something
other than bonds. It may also have consequences if we ever want to
mirror from one of the post-recirculation rules, e.g. if the bond
port is not a simple port, but something that could recirculate
again or re-enter the bridge by some other means. I don't think this
is possible today, but may become an unexpected problem in the future.
2. It seems like the root cause of the problem is that we're mirroring
on recirculation, while we should only be mirroring on the actual
output. If for some reason the post-recurculation rules are not
available, we should not be mirroring the packet. For example, if
the bond is removed while we have the first datapath flow installed,
but the packet after recirculation goes to userspace via MISS upcall
and gets dropped, then we should not mirror that packet. But with
this patch we'll do, as mirroring will be part of the first datapath
flow.
The root of the issue seems to be in the compose_output_action__()
function that should not call the mirror_packet() in case we're not
actually performing the output, but doing a recirculation instead.
I also wonder if the same problem exists for the native tunnel termination
as it's not really an output action as well and we probably shouldn't
mirror in that case. It seems like we would be mirroring traffic before
it is decapsulated as if it was sent into the tunnel (or the LOCAL port?).
> .metadata = {
> .tunnel = {
> .ip_dst = htonl(0),
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index 7ebbee56d..53896804a 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -6441,6 +6441,60 @@ AT_CHECK([tail -1 stdout | grep -E
> "trunc\(200\),2,trunc\(300\),3,100|trunc\(300
> OVS_VSWITCHD_STOP
> AT_CLEANUP
>
> +AT_SETUP([ofproto-dpif - mirroring, avoid duplicate mirrors on balance-tcp
> bonding with dp_hash])
nit: I'd remove the ", avoid duplicate mirrors" part from the name.
> +AT_KEYWORDS([mirror mirrors mirroring])
> +OVS_VSWITCHD_START(
> + [add-bond br0 bond0 p1 p2 bond_mode=balance-tcp lacp=active \
> + other-config:lacp-time=fast other-config:bond-rebalance-interval=0 --
> \
> + set interface p1 type=dummy options:pstream=punix:$OVS_RUNDIR/p1.sock
> ofport_request=1 -- \
> + set interface p2 type=dummy options:pstream=punix:$OVS_RUNDIR/p2.sock
> ofport_request=2 -- \
> + add-br br1 -- \
> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 \
> + fail-mode=standalone -- \
> + add-bond br1 bond1 p3 p4 bond_mode=balance-tcp lacp=active \
> + other-config:lacp-time=fast other-config:bond-rebalance-interval=0 --
> \
> + set interface p3 type=dummy options:stream=unix:$OVS_RUNDIR/p1.sock
> ofport_request=3 -- \
> + set interface p4 type=dummy options:stream=unix:$OVS_RUNDIR/p2.sock
> ofport_request=4 -- \
> + add-port br1 br1- -- set interface br1- type=patch options:peer=br1+
> ofport_request=100 -- \
> + add-br br-int -- \
> + set bridge br-int other-config:hwaddr=aa:77:aa:77:00:00 -- \
> + set bridge br-int datapath-type=dummy other-config:datapath-id=1235 \
> + fail-mode=secure -- \
> + add-port br-int br1+ -- set interface br1+ type=patch options:peer=br1-
> ofport_request=101 -- \
> + add-port br-int p5 -- set interface p5 ofport_request=5 type=dummy
> +])
> +AT_CHECK([ovs-appctl netdev-dummy/set-admin-state up], 0, [OK
> +])
> +
> +# Waits for all ifaces enabled.
> +OVS_WAIT_UNTIL([test `ovs-appctl bond/show | grep -- "may_enable: true" |
> wc -l` -ge 4])
nit: May be better to use 'grep -c' instead of 'wc -l'.
> +
> +# The dl_vlan flow should not be ever matched,
> +# since recirculation should not change the flow handling.
> +AT_DATA([flows.txt], [dnl
> +table=0 priority=1 in_port=5 actions=mod_vlan_vid:1,output(101)
> +table=0 priority=2 in_port=5 dl_vlan=1 actions=drop
> +])
> +AT_CHECK([ovs-ofctl add-flows br-int flows.txt])
This vlan part seems unnecessary in this test. Looks like it was copied
from the other bonding test here. But, I think, we can just remove the
br-int and all the port from it, as they are not needed for this test.
p5 can be directly added to br1.
> +
> +add_of_ports br1 6
> +AT_CHECK([ovs-vsctl \
> + set Bridge br1 mirrors=@m --\
> + --id=@bond1 get Port bond1 -- --id=@p6 get Port p6 --\
> + --id=@m create Mirror name=mymirror select_dst_port=@bond1
> select_src_port=@bond1 output_port=@p6],
> + [0], [ignore])
> +
> +# Sends a packet to trigger recirculation.
> +AT_CHECK([ovs-appctl netdev-dummy/receive p5
> "in_port(5),eth(src=50:54:00:00:00:05,dst=50:54:00:00:01:00),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1)"])
> +
> +AT_CHECK([ovs-appctl dpctl/dump-flows --names | grep 'in_port(p5)' | dnl
nit: Since you're using the \ in other parts, may use it here as well.
It also looks better in the logs then dnl.
> + sed 's/.*actions://' | grep -o 'p6' | wc -l | tr -d
> '[[:space:]]'], [0], [1])
nit: maybe tr ',' '\n' | grep -c p6 ?
> +
> +OVS_VSWITCHD_STOP()
> +AT_CLEANUP
> +
> +
> # This test verifies that the table ID is preserved across recirculation
> # when a resubmit action requires it (because the action is relative to
> # the current table rather than specifying a table).
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev