Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer
On 1/15/2020 4:49 PM, Kevin Traynor wrote: On 15/01/2020 13:19, Stokes, Ian wrote: On 1/15/2020 12:19 PM, Ilya Maximets wrote: On 15.01.2020 12:28, Stokes, Ian wrote: On 1/14/2020 4:12 PM, Eelco Chaudron wrote: This patch adds a new policer to the DPDK datapath based on RFC 4115's Two-Rate, Three-Color marker. It's a two-level hierarchical policer which first does a color-blind marking of the traffic at the queue level, followed by a color-aware marking at the port level. At the end traffic marked as Green or Yellow is forwarded, Red is dropped. For details on how traffic is marked, see RFC 4115. This egress policer can be used to limit traffic at different rated based on the queues the traffic is in. In addition, it can also be used to prioritize certain traffic over others at a port level. For example, the following configuration will limit the traffic rate at a port level to a maximum of 2000 packets a second (64 bytes IPv4 packets). 100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess Information Rate). High priority traffic is routed to queue 10, which marks all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is marked as EIR, i.e. Yellow. ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \ --id=@myqos create qos type=trtcm-policer \ other-config:cir=52000 other-config:cbs=2048 \ other-config:eir=52000 other-config:ebs=2048 \ queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \ --id=@dpdk1Q10 create queue \ other-config:cir=4160 other-config:cbs=2048 \ other-config:eir=0 other-config:ebs=0 -- \ --id=@dpdk1Q20 create queue \ other-config:cir=0 other-config:cbs=0 \ other-config:eir=4160 other-config:ebs=2048 \ This configuration accomplishes that the high priority traffic has a guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also use the EIR, so a total of 2000pps at max. These additional 1000pps is shared with the low priority traffic. The low priority traffic can use at maximum 1000pps. Thanks for the patch Eelco, minor comment below. Ok, with the below discussion in mind I've pushed the series master. The understanding is that a better approach will be sought so that we can remove the experimental API exception in the future. I can raise this with the DPDK community call tomorrow and see what possibilities we have for removing the tag. If in the meantime, a new feature is submitted for OVS with DPDK that has an experimental tag for an API call it will be pushed back on until the experimental tag is removed. Thanks Ian Rate Limiting (Ingress Policing) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 128963f..1ed4a47 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -26,6 +26,12 @@ #include #include +/* Include rte_compat.h first to allow experimental API's needed for the + * rte_meter.h rfc4115 functions. Once they are no longer marked as + * experimental the #define and rte_compat.h include can be removed. + */ +#define ALLOW_EXPERIMENTAL_API +#include I guess the risk here, from what I understand, this approach is all or nothing in terms of experimental APIs from the included headers, other experimental APIs could be used from DPDK going forward without causing warning? If so, there would have to be extra dilligence taken when reviewing future patches and a discussion if an API is expereimental, should it wait until it is marked as non experimental. (In this case The TRTCM looks stable and is highly unlikey to be removed so it's not an issue IMO). @Ilya/Kevin: Would you agree with above? Thoughts? I think it make sense to wait for API to become non-experimental if we have no easy way to enable it only on functions we need. I agree that having widly enabled experimental api might produce additional issues and will require more careful review of all the DPDK related patches. So I'm ok with making an exception in this case for the feature, I think it was an oversight but the API is stable. But what I'm hearing is in general the rule should be for new features the API should not be experimental and if it is then it would have to have it's experimental tag removed by the next DPDK upgrade (in this case 20.11 for arguments sake) and then can be upstreamed to OVS master. In this case, it sounds ok to me. We know it will be non-experimental in the next hours/days and we will not be left with some zombie feature on upgrade. Plus, there might be an acceptable way of adding a non-experimental symbol for this to the v19.11 stable branch too (not guaranteed though), and if so we could turn this off on upgrade to v19.11.X. OTOH, I don't know how important it is to have this feature in OVS 2.13. > BTW, another thought I have in mind about all the release management is: Shouldn't we hold OVS updates to new DPDK LTS until the first correction release is out? I mean, for example, Ubuntu triggers updates from one
Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer
On 15/01/2020 13:19, Stokes, Ian wrote: > > > On 1/15/2020 12:19 PM, Ilya Maximets wrote: >> On 15.01.2020 12:28, Stokes, Ian wrote: >>> >>> >>> On 1/14/2020 4:12 PM, Eelco Chaudron wrote: This patch adds a new policer to the DPDK datapath based on RFC 4115's Two-Rate, Three-Color marker. It's a two-level hierarchical policer which first does a color-blind marking of the traffic at the queue level, followed by a color-aware marking at the port level. At the end traffic marked as Green or Yellow is forwarded, Red is dropped. For details on how traffic is marked, see RFC 4115. This egress policer can be used to limit traffic at different rated based on the queues the traffic is in. In addition, it can also be used to prioritize certain traffic over others at a port level. For example, the following configuration will limit the traffic rate at a port level to a maximum of 2000 packets a second (64 bytes IPv4 packets). 100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess Information Rate). High priority traffic is routed to queue 10, which marks all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is marked as EIR, i.e. Yellow. ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \ --id=@myqos create qos type=trtcm-policer \ other-config:cir=52000 other-config:cbs=2048 \ other-config:eir=52000 other-config:ebs=2048 \ queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \ --id=@dpdk1Q10 create queue \ other-config:cir=4160 other-config:cbs=2048 \ other-config:eir=0 other-config:ebs=0 -- \ --id=@dpdk1Q20 create queue \ other-config:cir=0 other-config:cbs=0 \ other-config:eir=4160 other-config:ebs=2048 \ This configuration accomplishes that the high priority traffic has a guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also use the EIR, so a total of 2000pps at max. These additional 1000pps is shared with the low priority traffic. The low priority traffic can use at maximum 1000pps. >>> Thanks for the patch Eelco, minor comment below. >>> >>> >>> Rate Limiting (Ingress Policing) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 128963f..1ed4a47 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -26,6 +26,12 @@ #include #include +/* Include rte_compat.h first to allow experimental API's needed for the + * rte_meter.h rfc4115 functions. Once they are no longer marked as + * experimental the #define and rte_compat.h include can be removed. + */ +#define ALLOW_EXPERIMENTAL_API +#include >>> >>> I guess the risk here, from what I understand, this approach is all or >>> nothing in terms of experimental APIs from the included headers, other >>> experimental APIs could be used from DPDK going forward without causing >>> warning? >>> >>> If so, there would have to be extra dilligence taken when reviewing future >>> patches and a discussion if an API is expereimental, should it wait until >>> it is marked as non experimental. (In this case The TRTCM looks stable and >>> is highly unlikey to be removed so it's not an issue IMO). >>> >>> @Ilya/Kevin: Would you agree with above? Thoughts? >> >> I think it make sense to wait for API to become non-experimental >> if we have no easy way to enable it only on functions we need. >> I agree that having widly enabled experimental api might produce >> additional issues and will require more careful review of all the >> DPDK related patches. > > So I'm ok with making an exception in this case for the feature, I think > it was an oversight but the API is stable. But what I'm hearing is in > general the rule should be for new features the API should not be > experimental and if it is then it would have to have it's experimental > tag removed by the next DPDK upgrade (in this case 20.11 for arguments > sake) and then can be upstreamed to OVS master. > In this case, it sounds ok to me. We know it will be non-experimental in the next hours/days and we will not be left with some zombie feature on upgrade. Plus, there might be an acceptable way of adding a non-experimental symbol for this to the v19.11 stable branch too (not guaranteed though), and if so we could turn this off on upgrade to v19.11.X. OTOH, I don't know how important it is to have this feature in OVS 2.13. >> >> BTW, another thought I have in mind about all the release management is: >> Shouldn't we hold OVS updates to new DPDK LTS until the first correction >> release is out? I mean, for example, Ubuntu triggers updates from one >> LTS release to another only after .1 correcting relese is out (users >> of Ubuntu 18.04 will receive upgrade notifications only after 20.04.1 >> is released). Shouldn't we do the same thing?
Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer
On 1/15/2020 12:19 PM, Ilya Maximets wrote: On 15.01.2020 12:28, Stokes, Ian wrote: On 1/14/2020 4:12 PM, Eelco Chaudron wrote: This patch adds a new policer to the DPDK datapath based on RFC 4115's Two-Rate, Three-Color marker. It's a two-level hierarchical policer which first does a color-blind marking of the traffic at the queue level, followed by a color-aware marking at the port level. At the end traffic marked as Green or Yellow is forwarded, Red is dropped. For details on how traffic is marked, see RFC 4115. This egress policer can be used to limit traffic at different rated based on the queues the traffic is in. In addition, it can also be used to prioritize certain traffic over others at a port level. For example, the following configuration will limit the traffic rate at a port level to a maximum of 2000 packets a second (64 bytes IPv4 packets). 100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess Information Rate). High priority traffic is routed to queue 10, which marks all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is marked as EIR, i.e. Yellow. ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \ --id=@myqos create qos type=trtcm-policer \ other-config:cir=52000 other-config:cbs=2048 \ other-config:eir=52000 other-config:ebs=2048 \ queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \ --id=@dpdk1Q10 create queue \ other-config:cir=4160 other-config:cbs=2048 \ other-config:eir=0 other-config:ebs=0 -- \ --id=@dpdk1Q20 create queue \ other-config:cir=0 other-config:cbs=0 \ other-config:eir=4160 other-config:ebs=2048 \ This configuration accomplishes that the high priority traffic has a guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also use the EIR, so a total of 2000pps at max. These additional 1000pps is shared with the low priority traffic. The low priority traffic can use at maximum 1000pps. Thanks for the patch Eelco, minor comment below. Rate Limiting (Ingress Policing) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 128963f..1ed4a47 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -26,6 +26,12 @@ #include #include +/* Include rte_compat.h first to allow experimental API's needed for the + * rte_meter.h rfc4115 functions. Once they are no longer marked as + * experimental the #define and rte_compat.h include can be removed. + */ +#define ALLOW_EXPERIMENTAL_API +#include I guess the risk here, from what I understand, this approach is all or nothing in terms of experimental APIs from the included headers, other experimental APIs could be used from DPDK going forward without causing warning? If so, there would have to be extra dilligence taken when reviewing future patches and a discussion if an API is expereimental, should it wait until it is marked as non experimental. (In this case The TRTCM looks stable and is highly unlikey to be removed so it's not an issue IMO). @Ilya/Kevin: Would you agree with above? Thoughts? I think it make sense to wait for API to become non-experimental if we have no easy way to enable it only on functions we need. I agree that having widly enabled experimental api might produce additional issues and will require more careful review of all the DPDK related patches. So I'm ok with making an exception in this case for the feature, I think it was an oversight but the API is stable. But what I'm hearing is in general the rule should be for new features the API should not be experimental and if it is then it would have to have it's experimental tag removed by the next DPDK upgrade (in this case 20.11 for arguments sake) and then can be upstreamed to OVS master. BTW, another thought I have in mind about all the release management is: Shouldn't we hold OVS updates to new DPDK LTS until the first correction release is out? I mean, for example, Ubuntu triggers updates from one LTS release to another only after .1 correcting relese is out (users of Ubuntu 18.04 will receive upgrade notifications only after 20.04.1 is released). Shouldn't we do the same thing? Shouldn't we upgrade to the next DPDK LTS only after XX.11.1 is ready? This might make sense in order to not have obviously broken functionality in OVS releases but at the same time might just defer actual revealing of DPDK issues, so I'm not fully sure about this. Since OVS is not the only user of DPDK, this still might make sense anyway. Would like to hear some thoughts. Yes, I've thought about this as well. There certainly is advantage to moving to a .1 release in terms of stability instead of .0. However when I thought about the release two things came to mind. (i) By moving to the .0 release, is OVS in a position to better contribute feedback for the .1 release and ensure relevant patches for OVS with DPDK fixes are upstreamed in the .1 (rather than a .2). Feedback coming not from just the developers but also from
Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer
On 15.01.2020 12:28, Stokes, Ian wrote: > > > On 1/14/2020 4:12 PM, Eelco Chaudron wrote: >> This patch adds a new policer to the DPDK datapath based on RFC 4115's >> Two-Rate, Three-Color marker. It's a two-level hierarchical policer >> which first does a color-blind marking of the traffic at the queue >> level, followed by a color-aware marking at the port level. At the end >> traffic marked as Green or Yellow is forwarded, Red is dropped. For >> details on how traffic is marked, see RFC 4115. >> >> This egress policer can be used to limit traffic at different rated >> based on the queues the traffic is in. In addition, it can also be used >> to prioritize certain traffic over others at a port level. >> >> For example, the following configuration will limit the traffic rate at a >> port level to a maximum of 2000 packets a second (64 bytes IPv4 packets). >> 100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess >> Information Rate). High priority traffic is routed to queue 10, which marks >> all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is >> marked as EIR, i.e. Yellow. >> >> ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \ >> --id=@myqos create qos type=trtcm-policer \ >> other-config:cir=52000 other-config:cbs=2048 \ >> other-config:eir=52000 other-config:ebs=2048 \ >> queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \ >> --id=@dpdk1Q10 create queue \ >> other-config:cir=4160 other-config:cbs=2048 \ >> other-config:eir=0 other-config:ebs=0 -- \ >> --id=@dpdk1Q20 create queue \ >> other-config:cir=0 other-config:cbs=0 \ >> other-config:eir=4160 other-config:ebs=2048 \ >> >> This configuration accomplishes that the high priority traffic has a >> guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also >> use the EIR, so a total of 2000pps at max. These additional 1000pps is >> shared with the low priority traffic. The low priority traffic can use at >> maximum 1000pps. >> > Thanks for the patch Eelco, minor comment below. > > > >> Rate Limiting (Ingress Policing) >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >> index 128963f..1ed4a47 100644 >> --- a/lib/netdev-dpdk.c >> +++ b/lib/netdev-dpdk.c >> @@ -26,6 +26,12 @@ >> #include >> #include >> +/* Include rte_compat.h first to allow experimental API's needed for the >> + * rte_meter.h rfc4115 functions. Once they are no longer marked as >> + * experimental the #define and rte_compat.h include can be removed. >> + */ >> +#define ALLOW_EXPERIMENTAL_API >> +#include > > I guess the risk here, from what I understand, this approach is all or > nothing in terms of experimental APIs from the included headers, other > experimental APIs could be used from DPDK going forward without causing > warning? > > If so, there would have to be extra dilligence taken when reviewing future > patches and a discussion if an API is expereimental, should it wait until it > is marked as non experimental. (In this case The TRTCM looks stable and is > highly unlikey to be removed so it's not an issue IMO). > > @Ilya/Kevin: Would you agree with above? Thoughts? I think it make sense to wait for API to become non-experimental if we have no easy way to enable it only on functions we need. I agree that having widly enabled experimental api might produce additional issues and will require more careful review of all the DPDK related patches. BTW, another thought I have in mind about all the release management is: Shouldn't we hold OVS updates to new DPDK LTS until the first correction release is out? I mean, for example, Ubuntu triggers updates from one LTS release to another only after .1 correcting relese is out (users of Ubuntu 18.04 will receive upgrade notifications only after 20.04.1 is released). Shouldn't we do the same thing? Shouldn't we upgrade to the next DPDK LTS only after XX.11.1 is ready? This might make sense in order to not have obviously broken functionality in OVS releases but at the same time might just defer actual revealing of DPDK issues, so I'm not fully sure about this. Since OVS is not the only user of DPDK, this still might make sense anyway. Would like to hear some thoughts. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer
On 1/14/2020 4:12 PM, Eelco Chaudron wrote: This patch adds a new policer to the DPDK datapath based on RFC 4115's Two-Rate, Three-Color marker. It's a two-level hierarchical policer which first does a color-blind marking of the traffic at the queue level, followed by a color-aware marking at the port level. At the end traffic marked as Green or Yellow is forwarded, Red is dropped. For details on how traffic is marked, see RFC 4115. This egress policer can be used to limit traffic at different rated based on the queues the traffic is in. In addition, it can also be used to prioritize certain traffic over others at a port level. For example, the following configuration will limit the traffic rate at a port level to a maximum of 2000 packets a second (64 bytes IPv4 packets). 100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess Information Rate). High priority traffic is routed to queue 10, which marks all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is marked as EIR, i.e. Yellow. ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \ --id=@myqos create qos type=trtcm-policer \ other-config:cir=52000 other-config:cbs=2048 \ other-config:eir=52000 other-config:ebs=2048 \ queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \ --id=@dpdk1Q10 create queue \ other-config:cir=4160 other-config:cbs=2048 \ other-config:eir=0 other-config:ebs=0 -- \ --id=@dpdk1Q20 create queue \ other-config:cir=0 other-config:cbs=0 \ other-config:eir=4160 other-config:ebs=2048 \ This configuration accomplishes that the high priority traffic has a guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also use the EIR, so a total of 2000pps at max. These additional 1000pps is shared with the low priority traffic. The low priority traffic can use at maximum 1000pps. Thanks for the patch Eelco, minor comment below. Rate Limiting (Ingress Policing) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 128963f..1ed4a47 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -26,6 +26,12 @@ #include #include +/* Include rte_compat.h first to allow experimental API's needed for the + * rte_meter.h rfc4115 functions. Once they are no longer marked as + * experimental the #define and rte_compat.h include can be removed. + */ +#define ALLOW_EXPERIMENTAL_API +#include I guess the risk here, from what I understand, this approach is all or nothing in terms of experimental APIs from the included headers, other experimental APIs could be used from DPDK going forward without causing warning? If so, there would have to be extra dilligence taken when reviewing future patches and a discussion if an API is expereimental, should it wait until it is marked as non experimental. (In this case The TRTCM looks stable and is highly unlikey to be removed so it's not an issue IMO). @Ilya/Kevin: Would you agree with above? Thoughts? Ian #include #include #include @@ -329,8 +335,9 @@ struct dpdk_qos_ops { struct netdev_dpdk_queue_state *state); }; -/* dpdk_qos_ops for each type of user space QoS implementation */ +/* dpdk_qos_ops for each type of user space QoS implementation. */ static const struct dpdk_qos_ops egress_policer_ops; +static const struct dpdk_qos_ops trtcm_policer_ops; /* * Array of dpdk_qos_ops, contains pointer to all supported QoS @@ -338,6 +345,7 @@ static const struct dpdk_qos_ops egress_policer_ops; */ static const struct dpdk_qos_ops *const qos_confs[] = { _policer_ops, +_policer_ops, NULL }; @@ -2162,9 +2170,9 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid, } static inline bool -netdev_dpdk_policer_pkt_handle(struct rte_meter_srtcm *meter, - struct rte_meter_srtcm_profile *profile, - struct rte_mbuf *pkt, uint64_t time) +netdev_dpdk_srtcm_policer_pkt_handle(struct rte_meter_srtcm *meter, + struct rte_meter_srtcm_profile *profile, + struct rte_mbuf *pkt, uint64_t time) { uint32_t pkt_len = rte_pktmbuf_pkt_len(pkt) - sizeof(struct rte_ether_hdr); @@ -2173,10 +2181,10 @@ netdev_dpdk_policer_pkt_handle(struct rte_meter_srtcm *meter, } static int -netdev_dpdk_policer_run(struct rte_meter_srtcm *meter, -struct rte_meter_srtcm_profile *profile, -struct rte_mbuf **pkts, int pkt_cnt, -bool should_steal) +srtcm_policer_run_single_packet(struct rte_meter_srtcm *meter, +struct rte_meter_srtcm_profile *profile, +struct rte_mbuf **pkts, int pkt_cnt, +bool should_steal) { int i = 0; int cnt = 0; @@ -2186,8 +2194,8 @@ netdev_dpdk_policer_run(struct rte_meter_srtcm *meter, for