On 06/08/2025 14:26, Ilya Maximets wrote:
> External email: Use caution opening links or attachments
>
>
> On 8/4/25 4:29 PM, Reuven Plevinsky via dev wrote:
>> For example, consider the following OpenFlow rule:
>> - IPv6 -> IPv4 tunnel translation:
>> add-flow br-int 'in_port=vxlan_br-int_0, \
>>      actions=set_field:8.8.8.7->tun_src,  \
>>      set_field:8.8.8.8->tun_dst,set_field:43->tun_id,geneve_br-int_1'
>>
>> As any set_field action, flow wildcards are set to the
>> tun_src/dst. The offload layer fails if not all matches are
>> handled ("consumed"). Clear the irrelevant wildcards.
>>
>> Signed-off-by: Reuven Plevinsky <rplevin...@nvidia.com>
>> ---
>>   ofproto/ofproto-dpif-xlate.c | 17 +++++++++++++
>>   tests/tunnel-push-pop.at     | 48 ++++++++++++++++++++++++++++++++++++
>>   2 files changed, 65 insertions(+)
>>
>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>> index 2c8197fb7307..e2ba7e6b0250 100644
>> --- a/ofproto/ofproto-dpif-xlate.c
>> +++ b/ofproto/ofproto-dpif-xlate.c
>> @@ -8078,6 +8078,20 @@ xlate_wc_init(struct xlate_ctx *ctx)
>>       tnl_wc_init(&ctx->xin->flow, ctx->wc);
>>   }
>>
>> +static void
>> +clear_tunnel_wc(const struct flow_tnl *spec, struct flow_tnl *mask)
>> +{
>> +    if (!spec->ip_src && !spec->ip_dst) {
>> +        mask->ip_src = 0;
>> +        mask->ip_dst = 0;
>> +    }
>> +    if (!ipv6_addr_is_set(&spec->ipv6_src) &&
>> +        !ipv6_addr_is_set(&spec->ipv6_dst)) {
>> +        mask->ipv6_src = in6addr_any;
>> +        mask->ipv6_dst = in6addr_any;
>> +    }
>> +}
>> +
>>   static void
>>   xlate_wc_finish(struct xlate_ctx *ctx)
>>   {
>> @@ -8152,6 +8166,9 @@ xlate_wc_finish(struct xlate_ctx *ctx)
>>       if (!flow_tnl_dst_is_set(&ctx->xin->upcall_flow->tunnel)) {
>>           memset(&ctx->wc->masks.tunnel, 0, sizeof ctx->wc->masks.tunnel);
>>       }
>> +    /* Both IPv4/IPv6 tunnel wc may be set because of set_fields action.
>> +     * Clear the irrelevant one. */
>> +    clear_tunnel_wc(&ctx->xin->upcall_flow->tunnel, &ctx->wc->masks.tunnel);
>>   }
>>
>>   /* This will tweak the odp actions generated. For now, it will:
>> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
>> index 64c41460bbb2..9b46d249fe9f 100644
>> --- a/tests/tunnel-push-pop.at
>> +++ b/tests/tunnel-push-pop.at
>> @@ -1440,3 +1440,51 @@ AT_CHECK([tail -1 stdout], [0],
>>
>>   OVS_VSWITCHD_STOP
>>   AT_CLEANUP
>> +
>> +AT_SETUP([tunnel - decap and re-encap with outer IP version conversion])
>> +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy])
>> +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br 
>> datapath_type=dummy], [0])
>> +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=vxlan \
>> +                       options:key=flow options:remote_ip=flow \
>> +                    -- add-port int-br t2 -- set Interface t2 type=geneve \
>> +                       options:key=flow options:remote_ip=flow \
>> +                       ], [0])
>> +
>> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
>> +dnl Setup dummy interface IP addresses.
>> +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.1.1/24], [0], [OK
>> +])
>> +AT_CHECK([ovs-appctl netdev-dummy/ip6addr br0 2001:cafe::1/24], [0], [OK
>> +])
>> +dnl Checking that a local routes for added IPs were successfully installed.
>> +AT_CHECK([ovs-appctl ovs/route/show | grep Cached | sort], [0], [dnl
>> +Cached: 1.1.1.0/24 dev br0 SRC 1.1.1.1 local
>> +Cached: 2001:ca00::/24 dev br0 SRC 2001:cafe::1 local
>> +])
>> +dnl Add entries
>> +AT_CHECK([ovs-appctl tnl/neigh/set br0 1.1.1.2 aa:bb:cc:00:00:01], [0], [OK
>> +])
>> +AT_CHECK([ovs-appctl tnl/neigh/set br0 2001:cafe::2 aa:bb:cc:00:00:01], 
>> [0], [OK
>> +])
>> +AT_CHECK([ovs-appctl tnl/neigh/show | grep br0 | sort], [0], [dnl
>> +1.1.1.2                                       aa:bb:cc:00:00:01   br0
>> +2001:cafe::2                                  aa:bb:cc:00:00:01   br0
>> +])
>> +
>> +AT_CHECK([ovs-ofctl add-flow int-br 
>> 'in_port=t1,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun_id=0x7a,actions=set_field:1.1.1.1->tun_src,set_field:1.1.1.2->tun_dst,set_field:0x7b->tun_id,t2'])
>> +AT_CHECK([ovs-ofctl add-flow int-br 
>> 'in_port=t2,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_id=0x7b,actions=set_field:2001:cafe::1->tun_ipv6_src,set_field:2001:cafe::2->tun_ipv6_dst,set_field:0x7a->tun_id,set_field:0.0.0.0->tun_src,set_field:0.0.0.0->tun_dst,t1'])
>> +
>> +AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy 
>> 'tunnel(tun_id=0x7a,ipv6_src=2001:cafe::1,ipv6_dst=2001:cafe::2,ttl=64),in_port(4789)'],
>>  [0], [stdout])
>> +AT_CHECK([tail -2 stdout], [0],
>> +  [Megaflow: 
>> recirc_id=0,eth,tun_id=0x7a,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun_tos=0,tun_flags=-df-csum+key,in_port=t1,dl_type=0x05ff
>> +Datapath actions: 
>> tnl_push(tnl_port(genev_sys_6081),header(size=50,type=5,eth(dst=aa:bb:cc:00:00:01,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.1.1,dst=1.1.1.2,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=6081,csum=0x0),geneve(vni=0x7b)),out_port(br0)),p0
>> +])
>> +
>> +AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy 
>> 'tunnel(tun_id=0x7b,src=1.1.1.1,dst=1.1.1.2,ttl=64),in_port(6081)'], [0], 
>> [stdout])
>> +AT_CHECK([tail -2 stdout], [0],
>> +  [Megaflow: 
>> recirc_id=0,eth,tun_id=0x7b,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_flags=-df-csum+key,in_port=t2,dl_type=0x05ff
>> +Datapath actions: 
>> tnl_push(tnl_port(vxlan_sys_4789),header(size=70,type=4,eth(dst=aa:bb:cc:00:00:01,src=aa:55:aa:55:00:00,dl_type=0x86dd),ipv6(src=2001:cafe::1,dst=2001:cafe::2,label=0,proto=17,tclass=0x0,hlimit=64),udp(src=0,dst=4789,csum=0xffff),vxlan(flags=0x8000000,vni=0x7a)),out_port(br0)),p0
>> +])
>> +
>> +OVS_VSWITCHD_STOP
>> +AT_CLEANUP
> Hi, Reuven.  Thanks for the update and the test.  I see what you're
> trying to achive now.
>
> However, I don't think this is a right approach as we can't actually
> clear tunnel metadata at xlate_wc_finish() stage.  This is because
> we must have prerequisites unwildcarded in order to clear the masks
> that we think are unnecessary.  And tunnel metadata doesn't have
> any prerequisites, so there is no way to tell if it is necessary to
> match on or not.  The main problem is a match on empty metadata that
> the user can add into OpenFlow rules.  We must never clear those as
> it may be necessary to distinguish tunneled and non-tunneled packets.
> E.g. match on all-zero tun_dst signifies that the packet is not coming
> from an ipv4 tunnel.  Note that input port match is not enough as a
> prerequisite as controller can inject packets from any port.
>
> For example, if you add the following change to your test:
>
> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
> index 9b46d249f..1d19e7e5b 100644
> --- a/tests/tunnel-push-pop.at
> +++ b/tests/tunnel-push-pop.at
> @@ -1473,6 +1473,13 @@ AT_CHECK([ovs-appctl tnl/neigh/show | grep br0 | 
> sort], [0], [dnl
>
>   AT_CHECK([ovs-ofctl add-flow int-br 
> 'in_port=t1,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun_id=0x7a,actions=set_field:1.1.1.1->tun_src,set_field:1.1.1.2->tun_dst,set_field:0x7b->tun_id,t2'])
>   AT_CHECK([ovs-ofctl add-flow int-br 
> 'in_port=t2,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_id=0x7b,actions=set_field:2001:cafe::1->tun_ipv6_src,set_field:2001:cafe::2->tun_ipv6_dst,set_field:0x7a->tun_id,set_field:0.0.0.0->tun_src,set_field:0.0.0.0->tun_dst,t1'])
> +AT_CHECK([ovs-ofctl add-flow int-br 
> 'tun_dst=0.0.0.0,tun_ipv6_src=::,actions=drop'])
> +
> +AT_CHECK([ovs-appctl ofproto/trace --names int-br in_port=t1], [0], [stdout])
> +AT_CHECK([tail -2 stdout], [0],
> +  [Megaflow: 
> recirc_id=0,eth,tun_dst=0.0.0.0,tun_ipv6_src=::,in_port=t1,dl_type=0x0000
> +Datapath actions: drop
> +])
>
>   AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy 
> 'tunnel(tun_id=0x7a,ipv6_src=2001:cafe::1,ipv6_dst=2001:cafe::2,ttl=64),in_port(4789)'],
>  [0], [stdout])
>   AT_CHECK([tail -2 stdout], [0],
> ---
>
> The test will fail and the actual datapath flow will look like this:
>
>    Megaflow: recirc_id=0,eth,in_port=t1,dl_type=0x0000
>
> Such a flow will drop all the traffic and not just non-tunneled one
> as it was intended by the OpenFlow rules.
>
> This issue actually already exist for a while and need to also revert
> the previous commit:
>    d046453b56a9 ("ofproto-dpif-xlate: Clear tunnel wc bits if original packet 
> is non-tunnel.")
> Though, I think, we need a better solution for this before reverting
> as conditions where it would be a problem are kind of unlikely.
>
> A proper solution would be to avoid unwildcarding these fields while
> setting the metadata through varois OpenFlow actions (set_field, load,
> stack pop, etc.) but preserve the masks coming from the classifier.
> The reason why we can actually avoid unwildcarding while setting the
> fileds is that unlike other fileds, tunnel metadata is not a subject
> for the optimization where we skip datapath actions if the value is
> already the same.  So, there is no real reason to unwildcard them.
>
> Do you want to try and work on this kind of change?
>
> BTW, I didn't test, but it seems like IPV4/IPv6 mix is a problem not
> only for the hardware offloading code, but also for the kernel datapath
> in general as it will reject to install the flow with mixed metadata
> types.  Userspace works fine in that case though.  So, technicaly,
> we also need to probe the datapath implementation for support and
> slowpath such traffic, if it doesn't, i.e. set the SLOW_MATCH flag.
>
> Best regards, Ilya Maximets.

Hi Ilya,

Thanks a lot for your review and comments. Since this issue is specific 
to offload, and is not supported by the current upstream DPDK 
implementation (while it is supported by our DOCA implementation), we’ve 
decided to take a different approach and handle it within the DOCA 
offload provider instead. For now, we will abandon this change.

Best regards, Reuven Plevinsky

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to