On 21 Jun 2022, at 10:22, Jianbo Liu wrote:
> On Mon, 2022-06-20 at 12:18 +0200, Eelco Chaudron wrote: >> On 27 May 2022, at 11:00, Jianbo Liu wrote: >> >>> OVS meters are created in advance and openflow rules refer to them >>> by >>> their unique ID. New tc_police API is used to offload them. By >>> calling >>> the API, police actions are created and meters are mapped to them. >>> These actions then can be used in tc filter rules by the index. >>> >>> Signed-off-by: Jianbo Liu <[email protected]> >>> --- >>> NEWS | 2 ++ >>> lib/dpif-netlink.c | 31 +++++++++++++++++---- >>> tests/system-offloads-traffic.at | 48 >>> ++++++++++++++++++++++++++++++++ >>> 3 files changed, 76 insertions(+), 5 deletions(-) >>> >>> diff --git a/NEWS b/NEWS >>> index eece0d0b2..dfd659d4e 100644 >>> --- a/NEWS >>> +++ b/NEWS >>> @@ -29,6 +29,8 @@ Post-v2.17.0 >>> - Windows: >>> * Conntrack support for TCPv6, UDPv6, ICMPv6, FTPv6. >>> * IPv6 Geneve tunnel support. >>> + - Linux datapath: >>> + * Add offloading meter tc police. >>> >>> >>> v2.17.0 - 17 Feb 2022 >>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c >>> index 06e1e8ca0..0af9ee77e 100644 >>> --- a/lib/dpif-netlink.c >>> +++ b/lib/dpif-netlink.c >>> @@ -4163,11 +4163,18 @@ static int >>> dpif_netlink_meter_set(struct dpif *dpif_, ofproto_meter_id >>> meter_id, >>> struct ofputil_meter_config *config) >>> { >>> + int err; >>> + >>> if (probe_broken_meters(dpif_)) { >>> return ENOMEM; >>> } >>> >>> - return dpif_netlink_meter_set__(dpif_, meter_id, config); >>> + err = dpif_netlink_meter_set__(dpif_, meter_id, config); >>> + if (!err && netdev_is_flow_api_enabled()) { >>> + meter_offload_set(meter_id, config); >> >> Although currently we always return 0, we should still account for it >> to change in the future, so we should set err to the return value. >> > > If there is err from meter_offload_set, it will be passed to the caller > of dpif_netlink_meter_set(). I don't agree with that, because the > caller thinks meter_set operation fail, but actually not. Besides, we > allow the case that dp meter_set success, but offloading fails, so the > return the error of meter_offload_set seems unnecessary. If this is the case, we should change the dpif_netlink_meter_set() API to return void. And add a comment to the function why it would not return an error: --- a/lib/netdev-offload.c +++ b/lib/netdev-offload.c @@ -196,7 +196,7 @@ netdev_assign_flow_api(struct netdev *netdev) return -1; } -int +void meter_offload_set(ofproto_meter_id meter_id, struct ofputil_meter_config *config) { @@ -212,8 +212,8 @@ meter_offload_set(ofproto_meter_id meter_id, } } } - - return 0; + /* Offload APIs could fail, for example, because the offload is not + * supported. This is fine, as the offload API should take care of this. */ } >> + err = meter_offload_set(meter_id, config); >> >>> + } >>> + <SNIP> >>> diff --git a/tests/system-offloads-traffic.at b/tests/system- >>> offloads-traffic.at >>> index 80bc1dd5c..7ec75340f 100644 >>> --- a/tests/system-offloads-traffic.at >>> +++ b/tests/system-offloads-traffic.at >>> @@ -168,3 +168,51 @@ matchall >>> ]) >>> OVS_TRAFFIC_VSWITCHD_STOP >>> AT_CLEANUP >>> + >>> +AT_SETUP([offloads - check if meter offloading ]) >> >> Can we replace if with interface, as I keep on reading it as "if". >> > > Sure. > >>> +AT_KEYWORDS([meter]) >>> +AT_SKIP_IF([test $SUPPORT_TC_INGRESS_PPS = "no"]) >>> +OVS_TRAFFIC_VSWITCHD_START() >>> + >>> +AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw- >>> offload=true]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=1 pktps >>> bands=type=drop rate=1']) >>> + >>> +ADD_NAMESPACES(at_ns0, at_ns1) >>> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "f0:00:00:01:01:01") >>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24", "f0:00:00:01:01:02") >>> + >>> +NS_CHECK_EXEC([at_ns0], [ip neigh add 10.1.1.2 lladdr >>> f0:00:00:01:01:02 dev p0]) >>> +NS_CHECK_EXEC([at_ns1], [ip neigh add 10.1.1.1 lladdr >>> f0:00:00:01:01:01 dev p1]) >>> + >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 "actions=normal"]) >>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2 | >>> FORMAT_PING], [0], [dnl >>> +10 packets transmitted, 10 received, 0% packet loss, time 0ms >>> +]) >>> + >>> +NETNS_DAEMONIZE([at_ns1], [nc -u -l 5678 > /dev/null ], [nc0.pid]) >>> + >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 del-flows br0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >>> "priority=10,in_port=ovs-p0,udp actions=meter:1,normal"]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 "priority=1 >>> actions=normal"]) >>> + >>> +NS_CHECK_EXEC([at_ns0], [sleep 0.5; echo "mark" | nc -u 10.1.1.2 >>> 5678 -p 6789]) >> >> Any specific reason why you need the sleep 0.5 here? Is it to be sure >> the flow is programmed? > > I remember I added this because there are failures sometimes. I don't > know why, but obviously they are related to this patchset. So I added > the sleep to avoid them. It's only 0.5s, should be no problem, right? I did a lot of runs, but could not get it to fail without it. So if it fails in your case it would be good to investigate. >> If so, you might just want to do a ovs-vsctl dump-flows and checke >> the output? >> > > I don't understand your qustion. I checked ovs-vsctl dump-flows below. I mean in the failure case without he 0.5 seconds, what do you get. >>> +AT_CHECK([ovs-appctl dpctl/dump-flows | grep "meter" | >>> DUMP_CLEAN_SORTED], [0], [dnl >>> +in_port(2),eth(macs),eth_type(0x0800),ipv4(proto=17,frag=no), >>> packets:0, bytes:0, used:0.001s, actions:outputmeter(0),3 >>> +]) >> >> Here you verify the DP rule is inserted, but should you not also wait >> a second to make sure the meter is reset? > > OK. I will add. > >> Because in theory your could/should sent 11 packets in 1 second, so >> 10 should be dropped!? >> This is the case in the kernel environment, but with TC the learning >> packet is not passing trough the TC meter (this might also be a >> corner case we need to document). >> > > Yes, it's the reason. But where to document? Not sure where either, I guess in NEWS and maybe it’s time to add a howto/tc-offload.rst file? >>> +for i in `seq 10`; do >>> +NS_CHECK_EXEC([at_ns0], [echo "mark" | nc -u 10.1.1.2 5678 -p >>> 6789]) >>> +done >>> + >>> +AT_CHECK([ovs-appctl dpctl/dump-flows | grep "meter" | >>> DUMP_CLEAN_SORTED], [0], [dnl >>> +in_port(2),eth(macs),eth_type(0x0800),ipv4(proto=17,frag=no), >>> packets:10, bytes:330, used:0.001s, actions:outputmeter(0),3 >>> +]) >>> + >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 | sed -e >>> 's/duration:[[0-9]].[[0-9]]*s/duration:0.001s/'], [0], [dnl >>> +OFPST_METER reply (OF1.3) (xid=0x2): >>> +meter:1 flow_count:1 packet_in_count:11 byte_in_count:377 >>> duration:0.001s bands: >> >> byte_in_count:517 is a lot larger than with kernel code, do we know >> why? We should document it. >> > > I think DP also count the bytes of mac layer. Any idea why? >>> +0: packet_count:9 byte_count:0 >> >> Here in theory we should report byte_count, like this: >> >> 0: packet_count:10 byte_count:470 >> >> However, TC does not support dropped byte count, only packet_count. >> So we should be ok for now, but we must add this limitation to the >> documentation somewhere that it's clear we will not report byte count >> with TC offload. >> >> If you run this test without HW offload enabled you can see the >> difference in behavior, and I think there should be none (or at least >> the corner cases should be documented). >> You could also add a "- offloads disabled" variant of this test, like >> other features do and add some additional reasoning why it's >> different there. >> > > Sure, I will add. Guess this could go to the same howto/tc-offload.rst in the limitations section? > >>> +]) >>> + >>> +OVS_TRAFFIC_VSWITCHD_STOP >>> +AT_CLEANUP >>> -- >>> 2.26.2 >> >> This concludes my review of v5. >> >> //Eelco >> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
