On Mon, May 01, 2023 at 07:33:38PM +0800, [email protected] wrote: > From: Lin Huang <[email protected]> > > OvS has supported packet-per-second policer which can be set at ingress > and egress side in kernel datapath. But the userspace datapath dosen't
s/dosen't/doesn't/ > support for ingress and egress packet-per-second policing now. > > So, this patch add support for userspace egress pps policing by using > native ovs token bucket library. Token bucket is accumulated by 'rate' > tokens per millisecond and store maxiumim tokens at 'burst' bucket size. s/maxiumim/maximum/ > One token in the bucket means one packet (1 kpkts * millisecond) which > will drop or pass by policer. > > This patch add new configuration option 'kpkts_rate' and 'kpkts_burst' > for egress-policer QoS type which now supports setting packet-per-second > limits in addition to the previously configurable byte rate settings. > > Examples: > $ovs-vsctl set port vhost-user0 qos=@newqos -- > --id=@newqos create qos type=egress-policer \ > other-config:cir=123000 other-config:cbs=123000 > other-config:kpkts_rate=123 other-config:kpkts_burst=123 If I understand things correctly, for BPS rate and burst are set by 'cir' and 'cbs'. While for PPS they are set by 'kpkts_rate' and 'kpkts_burst'. I wonder if we could make the BPS and PPS configuration more symmetric. > Add some unit tests for egress packet-per-second policing. > > Signed-off-by: Lin Huang <[email protected]> > --- > NEWS | 7 ++ > lib/netdev-dpdk.c | 96 +++++++++++++++++++--- > tests/system-dpdk.at | 186 ++++++++++++++++++++++++++++++++++++++++++- > vswitchd/vswitch.xml | 10 +++ > 4 files changed, 285 insertions(+), 14 deletions(-) > > diff --git a/NEWS b/NEWS > index b6418c36e..c41c6adf6 100644 > --- a/NEWS > +++ b/NEWS > @@ -23,6 +23,13 @@ Post-v3.1.0 > process extra privileges when mapping physical interconnect memory. > - SRv6 Tunnel Protocol > * Added support for userspace datapath (only). > + - Userspace datapath: > + * Added new configuration options 'kpkts_rate' and 'kpkts_burst' for > + 'egress-policer' to support packet-per-second policing. > + Examples: > + ovs-vsctl set port vhost-user0 qos=@newqos -- \ > + --id=@newqos create qos type=egress-policer \ > + other-config:kpkts_rate=123 other-config:kpkts_burst=123 > > > v3.1.0 - 16 Feb 2023 > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index fb0dd43f7..bcd13c116 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -19,6 +19,7 @@ > > #include <errno.h> > #include <signal.h> > +#include <stdint.h> > #include <stdlib.h> > #include <string.h> > #include <unistd.h> > @@ -59,6 +60,7 @@ > #include "openvswitch/ofp-parse.h" > #include "openvswitch/ofp-print.h" > #include "openvswitch/shash.h" > +#include "openvswitch/token-bucket.h" > #include "openvswitch/vlog.h" > #include "ovs-numa.h" > #include "ovs-rcu.h" > @@ -91,6 +93,8 @@ static bool per_port_memory = false; /* Status of per port > memory support */ > #define OVS_CACHE_LINE_SIZE CACHE_LINE_SIZE > #define OVS_VPORT_DPDK "ovs_dpdk" > > +#define MAX_KPKTS_PARAMETER 4294967U /* UINT32_MAX / 1000 */ > + > /* > * need to reserve tons of extra space in the mbufs so we can align the > * DMA addresses to 4KB. > @@ -400,6 +404,11 @@ struct dpdk_tx_queue { > ); > }; > > +enum policer_type { > + POLICER_BPS = 1 << 0, /* Rate value in bytes/sec. */ > + POLICER_PKTPS = 1 << 1, /* Rate value in packet/sec. */ > +}; > + > struct ingress_policer { > struct rte_meter_srtcm_params app_srtcm_params; > struct rte_meter_srtcm in_policer; > @@ -2335,6 +2344,22 @@ srtcm_policer_run_single_packet(struct rte_meter_srtcm > *meter, > return cnt; > } > > +static int > +pkts_policer_run_single_packet(struct token_bucket *tb, struct rte_mbuf > **pkts, > + int pkt_cnt, bool should_steal) Could the return type, and type pf pkt_cnt and cnt be unsigned int? That would match the type of the n (2nd) parameter of token_bucket_withdraw() and the count (2nd) parameter of rte_pktmbuf_free_bulk(). > +{ > + int cnt = 0; > + > + if (token_bucket_withdraw(tb, pkt_cnt)) { > + /* Handle packet by batch. */ > + cnt = pkt_cnt; > + } else if (should_steal) { > + rte_pktmbuf_free_bulk(pkts, pkt_cnt); > + } > + > + return cnt; > +} > + > static int > ingress_policer_run(struct ingress_policer *policer, struct rte_mbuf **pkts, > int pkt_cnt, bool should_steal) > @@ -4757,6 +4782,10 @@ netdev_dpdk_queue_dump_done(const struct netdev > *netdev OVS_UNUSED, > > struct egress_policer { > struct qos_conf qos_conf; > + enum policer_type type; > + uint32_t kpkts_rate; > + uint32_t kpkts_burst; > + struct token_bucket egress_tb; > struct rte_meter_srtcm_params app_srtcm_params; > struct rte_meter_srtcm egress_meter; > struct rte_meter_srtcm_profile egress_prof; > @@ -4776,26 +4805,53 @@ static int > egress_policer_qos_construct(const struct smap *details, > struct qos_conf **conf) > { > + uint32_t kpkts_burst, kpkts_rate; > struct egress_policer *policer; > int err = 0; > > - policer = xmalloc(sizeof *policer); > + policer = xzalloc(sizeof *policer); > + if (!policer) { xzmalloc never returns NULL - it abort()s on error. > + return ENOMEM; > + } > + > qos_conf_init(&policer->qos_conf, &egress_policer_ops); > egress_policer_details_to_param(details, &policer->app_srtcm_params); > err = rte_meter_srtcm_profile_config(&policer->egress_prof, > - &policer->app_srtcm_params); > + &policer->app_srtcm_params); The line above was correctly indented, now it is not. > if (!err) { > err = rte_meter_srtcm_config(&policer->egress_meter, > - &policer->egress_prof); > + &policer->egress_prof); Ditto. > + } > + if (err) { > + VLOG_ERR("Could not create rte meter for egress policer"); This function may or may not return an error for this condition. So I don't think it is right to log an error (always). Perhaps a debug message is appropriate. Or an error only if the function returns an error. > + err = -err; > + } else { > + policer->type |= POLICER_BPS; > } > > - if (!err) { > - *conf = &policer->qos_conf; > + kpkts_rate = smap_get_uint(details, "kpkts_rate", 0); > + kpkts_burst = smap_get_uint(details, "kpkts_burst", 0); I think this configuration should be read in trtcm_policer_details_to_param(), which is where the configuration for BPS rate limiting is read. > + if (kpkts_rate > MAX_KPKTS_PARAMETER || kpkts_burst > MAX_KPKTS_PARAMETER > + || kpkts_rate == 0 || kpkts_burst == 0) { > + /* Paramters between (1 ~ MAX_KPKTS_PARAMETER). */ s/Paramters/Parameters/ > + err = EINVAL; > + VLOG_ERR("Could not create tocken bucket for egress policer"); s/tocken/token/ 1. I think out of range values should be rejected elsewhere, perhaps in trtcm_policer_details_to_param(). 2. I think 0 probably means not configured, which is not n error condition. > } else { > - VLOG_ERR("Could not create rte meter for egress policer"); > + /* Rate in kilo-packets/second, bucket 1000 packets. */ > + /* msec * kilo-packets/sec = 1 packets. */ > + token_bucket_set(&policer->egress_tb, kpkts_rate, kpkts_burst * > 1000); > + policer->kpkts_burst = kpkts_burst; > + policer->kpkts_rate = kpkts_rate; > + policer->type |= POLICER_PKTPS; > + } > + > + if (!policer->type) { > + /* both bps and kpkts contrsruct failed.*/ s/contrsruct/construction/ > free(policer); > *conf = NULL; > - err = -err; > + } else { > + err = 0; > + *conf = &policer->qos_conf; > } > > return err; > @@ -4817,6 +4873,8 @@ egress_policer_qos_get(const struct qos_conf *conf, > struct smap *details) > > smap_add_format(details, "cir", "%"PRIu64, > policer->app_srtcm_params.cir); > smap_add_format(details, "cbs", "%"PRIu64, > policer->app_srtcm_params.cbs); > + smap_add_format(details, "kpkts_rate", "%"PRIu32, policer->kpkts_rate); > + smap_add_format(details, "kpkts_burst", "%"PRIu32, policer->kpkts_burst); > > return 0; > } > @@ -4828,25 +4886,37 @@ egress_policer_qos_is_equal(const struct qos_conf > *conf, > struct egress_policer *policer = > CONTAINER_OF(conf, struct egress_policer, qos_conf); > struct rte_meter_srtcm_params params; > + uint32_t kpkts_burst, kpkts_rate; > > egress_policer_details_to_param(details, ¶ms); > > - return !memcmp(¶ms, &policer->app_srtcm_params, sizeof params); > + kpkts_rate = smap_get_uint(details, "kpkts_rate", 0); > + kpkts_burst = smap_get_uint(details, "kpkts_burst", 0); As per my comment above, I think the for PPS parameters should be read in the same place as for BPS. Possibly trtcm_policer_details_to_param(). And that params should probably be expanded accordingly. Which may well lead to no need to update egress_policer_qos_is_equal() at all. > + > + return (!memcmp(¶ms, &policer->app_srtcm_params, sizeof params)) > + && (policer->kpkts_rate == kpkts_rate > + && policer->kpkts_burst == kpkts_burst); > } > > static int > egress_policer_run(struct qos_conf *conf, struct rte_mbuf **pkts, int > pkt_cnt, > bool should_steal) > { > - int cnt = 0; Using pkt_cnt in place of cnt doesn't seem strictly related to this patch. > struct egress_policer *policer = > CONTAINER_OF(conf, struct egress_policer, qos_conf); > > - cnt = srtcm_policer_run_single_packet(&policer->egress_meter, > - &policer->egress_prof, pkts, > - pkt_cnt, should_steal); > + if (policer->type & POLICER_BPS) { > + pkt_cnt = srtcm_policer_run_single_packet(&policer->egress_meter, > + &policer->egress_prof, > pkts, > + pkt_cnt, should_steal); > + } > > - return cnt; > + if (policer->type & POLICER_PKTPS) { > + pkt_cnt = pkts_policer_run_single_packet(&policer->egress_tb, pkts, > + pkt_cnt, should_steal); > + } If both BPS and PPS are configured then both srtcm_policer_run_single_packet() and pkts_policer_run_single_packet() are run. But: 1. The pkt_cnt input to the latter is the output of the former. 2. Only the pkt_cnt of the latter is returned. Is this correct? > + > + return pkt_cnt; > } > > static const struct dpdk_qos_ops egress_policer_ops = { ... _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
