On 21 Jun 2022, at 10:22, Jianbo Liu wrote:

> On Mon, 2022-06-20 at 12:18 +0200, Eelco Chaudron wrote:
>> On 27 May 2022, at 11:00, Jianbo Liu wrote:
>>
>>> OVS meters are created in advance and openflow rules refer to them
>>> by
>>> their unique ID. New tc_police API is used to offload them. By
>>> calling
>>> the API, police actions are created and meters are mapped to them.
>>> These actions then can be used in tc filter rules by the index.
>>>
>>> Signed-off-by: Jianbo Liu <[email protected]>
>>> ---
>>>  NEWS                             |  2 ++
>>>  lib/dpif-netlink.c               | 31 +++++++++++++++++----
>>>  tests/system-offloads-traffic.at | 48
>>> ++++++++++++++++++++++++++++++++
>>>  3 files changed, 76 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/NEWS b/NEWS
>>> index eece0d0b2..dfd659d4e 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -29,6 +29,8 @@ Post-v2.17.0
>>>     - Windows:
>>>       * Conntrack support for TCPv6, UDPv6, ICMPv6, FTPv6.
>>>       * IPv6 Geneve tunnel support.
>>> +   - Linux datapath:
>>> +     * Add offloading meter tc police.
>>>
>>>
>>>  v2.17.0 - 17 Feb 2022
>>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>>> index 06e1e8ca0..0af9ee77e 100644
>>> --- a/lib/dpif-netlink.c
>>> +++ b/lib/dpif-netlink.c
>>> @@ -4163,11 +4163,18 @@ static int
>>>  dpif_netlink_meter_set(struct dpif *dpif_, ofproto_meter_id
>>> meter_id,
>>>                         struct ofputil_meter_config *config)
>>>  {
>>> +    int err;
>>> +
>>>      if (probe_broken_meters(dpif_)) {
>>>          return ENOMEM;
>>>      }
>>>
>>> -    return dpif_netlink_meter_set__(dpif_, meter_id, config);
>>> +    err = dpif_netlink_meter_set__(dpif_, meter_id, config);
>>> +    if (!err && netdev_is_flow_api_enabled()) {
>>> +        meter_offload_set(meter_id, config);
>>
>> Although currently we always return 0, we should still account for it
>> to change in the future, so we should set err to the return value.
>>
>
> If there is err from meter_offload_set, it will be passed to the caller
> of dpif_netlink_meter_set(). I don't agree with that, because the
> caller thinks meter_set operation fail, but actually not. Besides, we
> allow the case that dp meter_set success, but offloading fails, so the
> return the error of meter_offload_set seems unnecessary.

If this is the case, we should change the dpif_netlink_meter_set() API to 
return void.
And add a comment to the function why it would not return an error:

--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -196,7 +196,7 @@ netdev_assign_flow_api(struct netdev *netdev)
     return -1;
 }

-int
+void
 meter_offload_set(ofproto_meter_id meter_id,
                   struct ofputil_meter_config *config)
 {
@@ -212,8 +212,8 @@ meter_offload_set(ofproto_meter_id meter_id,
            }
         }
     }
-
-    return 0;
+    /* Offload APIs could fail, for example, because the offload is not
+     * supported. This is fine, as the offload API should take care of this. */
 }



>>  +        err = meter_offload_set(meter_id, config);
>>
>>> +    }
>>> +

<SNIP>

>>> diff --git a/tests/system-offloads-traffic.at b/tests/system-
>>> offloads-traffic.at
>>> index 80bc1dd5c..7ec75340f 100644
>>> --- a/tests/system-offloads-traffic.at
>>> +++ b/tests/system-offloads-traffic.at
>>> @@ -168,3 +168,51 @@ matchall
>>>  ])
>>>  OVS_TRAFFIC_VSWITCHD_STOP
>>>  AT_CLEANUP
>>> +
>>> +AT_SETUP([offloads - check if meter offloading ])
>>
>> Can we replace if with interface, as I keep on reading it as "if".
>>
>
> Sure.
>
>>> +AT_KEYWORDS([meter])
>>> +AT_SKIP_IF([test $SUPPORT_TC_INGRESS_PPS = "no"])
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +
>>> +AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-
>>> offload=true])
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps
>>> bands=type=drop rate=1'])
>>> +
>>> +ADD_NAMESPACES(at_ns0, at_ns1)
>>> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "f0:00:00:01:01:01")
>>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24", "f0:00:00:01:01:02")
>>> +
>>> +NS_CHECK_EXEC([at_ns0], [ip neigh add 10.1.1.2 lladdr
>>> f0:00:00:01:01:02 dev p0])
>>> +NS_CHECK_EXEC([at_ns1], [ip neigh add 10.1.1.1 lladdr
>>> f0:00:00:01:01:01 dev p1])
>>> +
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 "actions=normal"])
>>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2 |
>>> FORMAT_PING], [0], [dnl
>>> +10 packets transmitted, 10 received, 0% packet loss, time 0ms
>>> +])
>>> +
>>> +NETNS_DAEMONIZE([at_ns1], [nc -u -l 5678 > /dev/null ], [nc0.pid])
>>> +
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 del-flows br0])
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
>>> "priority=10,in_port=ovs-p0,udp actions=meter:1,normal"])
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 "priority=1
>>> actions=normal"])
>>> +
>>> +NS_CHECK_EXEC([at_ns0], [sleep 0.5; echo "mark" | nc -u 10.1.1.2
>>> 5678 -p 6789])
>>
>> Any specific reason why you need the sleep 0.5 here? Is it to be sure
>> the flow is programmed?
>
> I remember I added this because there are failures sometimes. I don't
> know why, but obviously they are related to this patchset. So I added
> the sleep to avoid them. It's only 0.5s, should be no problem, right?

I did a lot of runs, but could not get it to fail without it. So if it fails in 
your case it would be good to investigate.

>> If so, you might just want to do a ovs-vsctl dump-flows and checke
>> the output?
>>
>
> I don't understand your qustion. I checked ovs-vsctl dump-flows below.

I mean in the failure case without he 0.5 seconds, what do you get.

>>> +AT_CHECK([ovs-appctl dpctl/dump-flows | grep "meter" |
>>> DUMP_CLEAN_SORTED], [0], [dnl
>>> +in_port(2),eth(macs),eth_type(0x0800),ipv4(proto=17,frag=no),
>>> packets:0, bytes:0, used:0.001s, actions:outputmeter(0),3
>>> +])
>>
>> Here you verify the DP rule is inserted, but should you not also wait
>> a second to make sure the meter is reset?
>
> OK. I will add.
>
>> Because in theory your could/should sent 11 packets in 1 second, so
>> 10 should be dropped!?
>> This is the case in the kernel environment, but with TC the learning
>> packet is not passing trough the TC meter (this might also be a
>> corner case we need to document).
>>
>
> Yes, it's the reason. But where to document?

Not sure where either, I guess in NEWS and maybe it’s time to add a 
howto/tc-offload.rst file?


>>> +for i in `seq 10`; do
>>> +NS_CHECK_EXEC([at_ns0], [echo "mark" | nc -u 10.1.1.2 5678 -p
>>> 6789])
>>> +done
>>> +
>>> +AT_CHECK([ovs-appctl dpctl/dump-flows | grep "meter" |
>>> DUMP_CLEAN_SORTED], [0], [dnl
>>> +in_port(2),eth(macs),eth_type(0x0800),ipv4(proto=17,frag=no),
>>> packets:10, bytes:330, used:0.001s, actions:outputmeter(0),3
>>> +])
>>> +
>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | sed -e
>>> 's/duration:[[0-9]].[[0-9]]*s/duration:0.001s/'], [0], [dnl
>>> +OFPST_METER reply (OF1.3) (xid=0x2):
>>> +meter:1 flow_count:1 packet_in_count:11 byte_in_count:377
>>> duration:0.001s bands:
>>
>> byte_in_count:517 is a lot larger than with kernel code, do we know
>> why? We should document it.
>>
>
> I think DP also count the bytes of mac layer.

Any idea why?

>>> +0: packet_count:9 byte_count:0
>>
>> Here in theory we should report byte_count, like this:
>>
>> 0: packet_count:10 byte_count:470
>>
>> However, TC does not support dropped byte count, only packet_count.
>> So we should be ok for now, but we must add this limitation to the
>> documentation somewhere that it's clear we will not report byte count
>> with TC offload.
>>
>> If you run this test without HW offload enabled you can see the
>> difference in behavior, and I think there should be none (or at least
>> the corner cases should be documented).
>> You could also add a "- offloads disabled" variant of this test, like
>> other features do and add some additional reasoning why it's
>> different there.
>>
>
> Sure, I will add.

Guess this could go to the same howto/tc-offload.rst in the limitations section?

>
>>> +])
>>> +
>>> +OVS_TRAFFIC_VSWITCHD_STOP
>>> +AT_CLEANUP
>>> -- 
>>> 2.26.2
>>
>> This concludes my review of v5.
>>
>> //Eelco
>>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to