Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer

2020-01-15 Thread Stokes, Ian



On 1/15/2020 4:49 PM, Kevin Traynor wrote:

On 15/01/2020 13:19, Stokes, Ian wrote:



On 1/15/2020 12:19 PM, Ilya Maximets wrote:

On 15.01.2020 12:28, Stokes, Ian wrote:



On 1/14/2020 4:12 PM, Eelco Chaudron wrote:

This patch adds a new policer to the DPDK datapath based on RFC 4115's
Two-Rate, Three-Color marker. It's a two-level hierarchical policer
which first does a color-blind marking of the traffic at the queue
level, followed by a color-aware marking at the port level. At the end
traffic marked as Green or Yellow is forwarded, Red is dropped. For
details on how traffic is marked, see RFC 4115.

This egress policer can be used to limit traffic at different rated
based on the queues the traffic is in. In addition, it can also be used
to prioritize certain traffic over others at a port level.

For example, the following configuration will limit the traffic rate at a
port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
Information Rate). High priority traffic is routed to queue 10, which marks
all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
marked as EIR, i.e. Yellow.

ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
     --id=@myqos create qos type=trtcm-policer \
     other-config:cir=52000 other-config:cbs=2048 \
     other-config:eir=52000 other-config:ebs=2048  \
     queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
     --id=@dpdk1Q10 create queue \
   other-config:cir=4160 other-config:cbs=2048 \
   other-config:eir=0 other-config:ebs=0 -- \
     --id=@dpdk1Q20 create queue \
   other-config:cir=0 other-config:cbs=0 \
   other-config:eir=4160 other-config:ebs=2048 \

This configuration accomplishes that the high priority traffic has a
guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
use the EIR, so a total of 2000pps at max. These additional 1000pps is
shared with the low priority traffic. The low priority traffic can use at
maximum 1000pps.


Thanks for the patch Eelco, minor comment below.


Ok, with the below discussion in mind I've pushed the series master. The 
understanding is that a better approach will be sought so that we can 
remove the experimental API exception in the future. I can raise this 
with the DPDK community call tomorrow and see what possibilities we have 
for removing the tag. If in the meantime, a new feature is submitted for 
OVS with DPDK that has an experimental tag for an API call it will be 
pushed back on until the experimental tag is removed.


Thanks
Ian






      Rate Limiting (Ingress Policing)
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 128963f..1ed4a47 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -26,6 +26,12 @@
    #include 
    #include 
    +/* Include rte_compat.h first to allow experimental API's needed for the
+ * rte_meter.h rfc4115 functions. Once they are no longer marked as
+ * experimental the #define and rte_compat.h include can be removed.
+ */
+#define ALLOW_EXPERIMENTAL_API
+#include 


I guess the risk here, from what I understand, this approach is all or nothing 
in terms of experimental APIs from the included headers, other experimental 
APIs could be used from DPDK going forward without causing warning?

If so, there would have to be extra dilligence taken when reviewing future 
patches and a discussion if an API is expereimental, should it wait until it is 
marked as non experimental. (In this case The TRTCM looks stable and is highly 
unlikey to be removed so it's not an issue IMO).

@Ilya/Kevin: Would you agree with above? Thoughts?


I think it make sense to wait for API to become non-experimental
if we have no easy way to enable it only on functions we need.
I agree that having widly enabled experimental api might produce
additional issues and will require more careful review of all the
DPDK related patches.


So I'm ok with making an exception in this case for the feature, I think
it was an oversight but the API is stable. But what I'm hearing is in
general the rule should be for new features the API should not be
experimental and if it is then it would have to have it's experimental
tag removed by the next DPDK upgrade (in this case 20.11 for arguments
sake) and then can be upstreamed to OVS master.



In this case, it sounds ok to me. We know it will be non-experimental in
the next hours/days and we will not be left with some zombie feature on
upgrade. Plus, there might be an acceptable way of adding a
non-experimental symbol for this to the v19.11 stable branch too (not
guaranteed though), and if so we could turn this off on upgrade to
v19.11.X. OTOH, I don't know how important it is to have this feature in
OVS 2.13. >


BTW, another thought I have in mind about all the release management is:
Shouldn't we hold OVS updates to new DPDK LTS until the first correction
release is out?  I mean, for example, Ubuntu triggers updates from one

Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer

2020-01-15 Thread Kevin Traynor
On 15/01/2020 13:19, Stokes, Ian wrote:
> 
> 
> On 1/15/2020 12:19 PM, Ilya Maximets wrote:
>> On 15.01.2020 12:28, Stokes, Ian wrote:
>>>
>>>
>>> On 1/14/2020 4:12 PM, Eelco Chaudron wrote:
 This patch adds a new policer to the DPDK datapath based on RFC 4115's
 Two-Rate, Three-Color marker. It's a two-level hierarchical policer
 which first does a color-blind marking of the traffic at the queue
 level, followed by a color-aware marking at the port level. At the end
 traffic marked as Green or Yellow is forwarded, Red is dropped. For
 details on how traffic is marked, see RFC 4115.

 This egress policer can be used to limit traffic at different rated
 based on the queues the traffic is in. In addition, it can also be used
 to prioritize certain traffic over others at a port level.

 For example, the following configuration will limit the traffic rate at a
 port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
 100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
 Information Rate). High priority traffic is routed to queue 10, which marks
 all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
 marked as EIR, i.e. Yellow.

 ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
     --id=@myqos create qos type=trtcm-policer \
     other-config:cir=52000 other-config:cbs=2048 \
     other-config:eir=52000 other-config:ebs=2048  \
     queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
     --id=@dpdk1Q10 create queue \
   other-config:cir=4160 other-config:cbs=2048 \
   other-config:eir=0 other-config:ebs=0 -- \
     --id=@dpdk1Q20 create queue \
   other-config:cir=0 other-config:cbs=0 \
   other-config:eir=4160 other-config:ebs=2048 \

 This configuration accomplishes that the high priority traffic has a
 guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
 use the EIR, so a total of 2000pps at max. These additional 1000pps is
 shared with the low priority traffic. The low priority traffic can use at
 maximum 1000pps.

>>> Thanks for the patch Eelco, minor comment below.
>>>
>>> 
>>>
      Rate Limiting (Ingress Policing)
 diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
 index 128963f..1ed4a47 100644
 --- a/lib/netdev-dpdk.c
 +++ b/lib/netdev-dpdk.c
 @@ -26,6 +26,12 @@
    #include 
    #include 
    +/* Include rte_compat.h first to allow experimental API's needed for 
 the
 + * rte_meter.h rfc4115 functions. Once they are no longer marked as
 + * experimental the #define and rte_compat.h include can be removed.
 + */
 +#define ALLOW_EXPERIMENTAL_API
 +#include 
>>>
>>> I guess the risk here, from what I understand, this approach is all or 
>>> nothing in terms of experimental APIs from the included headers, other 
>>> experimental APIs could be used from DPDK going forward without causing 
>>> warning?
>>>
>>> If so, there would have to be extra dilligence taken when reviewing future 
>>> patches and a discussion if an API is expereimental, should it wait until 
>>> it is marked as non experimental. (In this case The TRTCM looks stable and 
>>> is highly unlikey to be removed so it's not an issue IMO).
>>>
>>> @Ilya/Kevin: Would you agree with above? Thoughts?
>>
>> I think it make sense to wait for API to become non-experimental
>> if we have no easy way to enable it only on functions we need.
>> I agree that having widly enabled experimental api might produce
>> additional issues and will require more careful review of all the
>> DPDK related patches.
> 
> So I'm ok with making an exception in this case for the feature, I think 
> it was an oversight but the API is stable. But what I'm hearing is in 
> general the rule should be for new features the API should not be 
> experimental and if it is then it would have to have it's experimental 
> tag removed by the next DPDK upgrade (in this case 20.11 for arguments 
> sake) and then can be upstreamed to OVS master.
> 

In this case, it sounds ok to me. We know it will be non-experimental in
the next hours/days and we will not be left with some zombie feature on
upgrade. Plus, there might be an acceptable way of adding a
non-experimental symbol for this to the v19.11 stable branch too (not
guaranteed though), and if so we could turn this off on upgrade to
v19.11.X. OTOH, I don't know how important it is to have this feature in
OVS 2.13.

>>
>> BTW, another thought I have in mind about all the release management is:
>> Shouldn't we hold OVS updates to new DPDK LTS until the first correction
>> release is out?  I mean, for example, Ubuntu triggers updates from one
>> LTS release to another only after .1 correcting relese is out (users
>> of Ubuntu 18.04 will receive upgrade notifications only after 20.04.1
>> is released).  Shouldn't we do the same thing?  

Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer

2020-01-15 Thread Stokes, Ian



On 1/15/2020 12:19 PM, Ilya Maximets wrote:

On 15.01.2020 12:28, Stokes, Ian wrote:



On 1/14/2020 4:12 PM, Eelco Chaudron wrote:

This patch adds a new policer to the DPDK datapath based on RFC 4115's
Two-Rate, Three-Color marker. It's a two-level hierarchical policer
which first does a color-blind marking of the traffic at the queue
level, followed by a color-aware marking at the port level. At the end
traffic marked as Green or Yellow is forwarded, Red is dropped. For
details on how traffic is marked, see RFC 4115.

This egress policer can be used to limit traffic at different rated
based on the queues the traffic is in. In addition, it can also be used
to prioritize certain traffic over others at a port level.

For example, the following configuration will limit the traffic rate at a
port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
Information Rate). High priority traffic is routed to queue 10, which marks
all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
marked as EIR, i.e. Yellow.

ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
    --id=@myqos create qos type=trtcm-policer \
    other-config:cir=52000 other-config:cbs=2048 \
    other-config:eir=52000 other-config:ebs=2048  \
    queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
    --id=@dpdk1Q10 create queue \
  other-config:cir=4160 other-config:cbs=2048 \
  other-config:eir=0 other-config:ebs=0 -- \
    --id=@dpdk1Q20 create queue \
  other-config:cir=0 other-config:cbs=0 \
  other-config:eir=4160 other-config:ebs=2048 \

This configuration accomplishes that the high priority traffic has a
guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
use the EIR, so a total of 2000pps at max. These additional 1000pps is
shared with the low priority traffic. The low priority traffic can use at
maximum 1000pps.


Thanks for the patch Eelco, minor comment below.




     Rate Limiting (Ingress Policing)
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 128963f..1ed4a47 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -26,6 +26,12 @@
   #include 
   #include 
   +/* Include rte_compat.h first to allow experimental API's needed for the
+ * rte_meter.h rfc4115 functions. Once they are no longer marked as
+ * experimental the #define and rte_compat.h include can be removed.
+ */
+#define ALLOW_EXPERIMENTAL_API
+#include 


I guess the risk here, from what I understand, this approach is all or nothing 
in terms of experimental APIs from the included headers, other experimental 
APIs could be used from DPDK going forward without causing warning?

If so, there would have to be extra dilligence taken when reviewing future 
patches and a discussion if an API is expereimental, should it wait until it is 
marked as non experimental. (In this case The TRTCM looks stable and is highly 
unlikey to be removed so it's not an issue IMO).

@Ilya/Kevin: Would you agree with above? Thoughts?


I think it make sense to wait for API to become non-experimental
if we have no easy way to enable it only on functions we need.
I agree that having widly enabled experimental api might produce
additional issues and will require more careful review of all the
DPDK related patches.


So I'm ok with making an exception in this case for the feature, I think 
it was an oversight but the API is stable. But what I'm hearing is in 
general the rule should be for new features the API should not be 
experimental and if it is then it would have to have it's experimental 
tag removed by the next DPDK upgrade (in this case 20.11 for arguments 
sake) and then can be upstreamed to OVS master.




BTW, another thought I have in mind about all the release management is:
Shouldn't we hold OVS updates to new DPDK LTS until the first correction
release is out?  I mean, for example, Ubuntu triggers updates from one
LTS release to another only after .1 correcting relese is out (users
of Ubuntu 18.04 will receive upgrade notifications only after 20.04.1
is released).  Shouldn't we do the same thing?  Shouldn't we upgrade
to the next DPDK LTS only after XX.11.1 is ready?  This might make sense
in order to not have obviously broken functionality in OVS releases but
at the same time might just defer actual revealing of DPDK issues, so
I'm not fully sure about this.  Since OVS is not the only user of DPDK,
this still might make sense anyway.  Would like to hear some thoughts.



Yes, I've thought about this as well. There certainly is advantage to 
moving to a .1 release in terms of stability instead of .0. However when 
I thought about the release two things came to mind.


(i) By moving to the .0 release, is OVS in a position to better 
contribute feedback for the .1 release and ensure relevant patches for 
OVS with DPDK fixes are upstreamed in the .1 (rather than a .2). 
Feedback coming not from just the developers but also from 

Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer

2020-01-15 Thread Ilya Maximets
On 15.01.2020 12:28, Stokes, Ian wrote:
> 
> 
> On 1/14/2020 4:12 PM, Eelco Chaudron wrote:
>> This patch adds a new policer to the DPDK datapath based on RFC 4115's
>> Two-Rate, Three-Color marker. It's a two-level hierarchical policer
>> which first does a color-blind marking of the traffic at the queue
>> level, followed by a color-aware marking at the port level. At the end
>> traffic marked as Green or Yellow is forwarded, Red is dropped. For
>> details on how traffic is marked, see RFC 4115.
>>
>> This egress policer can be used to limit traffic at different rated
>> based on the queues the traffic is in. In addition, it can also be used
>> to prioritize certain traffic over others at a port level.
>>
>> For example, the following configuration will limit the traffic rate at a
>> port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
>> 100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
>> Information Rate). High priority traffic is routed to queue 10, which marks
>> all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
>> marked as EIR, i.e. Yellow.
>>
>> ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
>>    --id=@myqos create qos type=trtcm-policer \
>>    other-config:cir=52000 other-config:cbs=2048 \
>>    other-config:eir=52000 other-config:ebs=2048  \
>>    queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
>>    --id=@dpdk1Q10 create queue \
>>  other-config:cir=4160 other-config:cbs=2048 \
>>  other-config:eir=0 other-config:ebs=0 -- \
>>    --id=@dpdk1Q20 create queue \
>>  other-config:cir=0 other-config:cbs=0 \
>>  other-config:eir=4160 other-config:ebs=2048 \
>>
>> This configuration accomplishes that the high priority traffic has a
>> guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
>> use the EIR, so a total of 2000pps at max. These additional 1000pps is
>> shared with the low priority traffic. The low priority traffic can use at
>> maximum 1000pps.
>>
> Thanks for the patch Eelco, minor comment below.
> 
> 
> 
>>     Rate Limiting (Ingress Policing)
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index 128963f..1ed4a47 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -26,6 +26,12 @@
>>   #include 
>>   #include 
>>   +/* Include rte_compat.h first to allow experimental API's needed for the
>> + * rte_meter.h rfc4115 functions. Once they are no longer marked as
>> + * experimental the #define and rte_compat.h include can be removed.
>> + */
>> +#define ALLOW_EXPERIMENTAL_API
>> +#include 
> 
> I guess the risk here, from what I understand, this approach is all or 
> nothing in terms of experimental APIs from the included headers, other 
> experimental APIs could be used from DPDK going forward without causing 
> warning?
> 
> If so, there would have to be extra dilligence taken when reviewing future 
> patches and a discussion if an API is expereimental, should it wait until it 
> is marked as non experimental. (In this case The TRTCM looks stable and is 
> highly unlikey to be removed so it's not an issue IMO).
> 
> @Ilya/Kevin: Would you agree with above? Thoughts?

I think it make sense to wait for API to become non-experimental
if we have no easy way to enable it only on functions we need.
I agree that having widly enabled experimental api might produce
additional issues and will require more careful review of all the
DPDK related patches.

BTW, another thought I have in mind about all the release management is:
Shouldn't we hold OVS updates to new DPDK LTS until the first correction
release is out?  I mean, for example, Ubuntu triggers updates from one
LTS release to another only after .1 correcting relese is out (users
of Ubuntu 18.04 will receive upgrade notifications only after 20.04.1
is released).  Shouldn't we do the same thing?  Shouldn't we upgrade
to the next DPDK LTS only after XX.11.1 is ready?  This might make sense
in order to not have obviously broken functionality in OVS releases but
at the same time might just defer actual revealing of DPDK issues, so
I'm not fully sure about this.  Since OVS is not the only user of DPDK,
this still might make sense anyway.  Would like to hear some thoughts.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5 2/2] netdev-dpdk: Add new DPDK RFC 4115 egress policer

2020-01-15 Thread Stokes, Ian




On 1/14/2020 4:12 PM, Eelco Chaudron wrote:

This patch adds a new policer to the DPDK datapath based on RFC 4115's
Two-Rate, Three-Color marker. It's a two-level hierarchical policer
which first does a color-blind marking of the traffic at the queue
level, followed by a color-aware marking at the port level. At the end
traffic marked as Green or Yellow is forwarded, Red is dropped. For
details on how traffic is marked, see RFC 4115.

This egress policer can be used to limit traffic at different rated
based on the queues the traffic is in. In addition, it can also be used
to prioritize certain traffic over others at a port level.

For example, the following configuration will limit the traffic rate at a
port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
Information Rate). High priority traffic is routed to queue 10, which marks
all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
marked as EIR, i.e. Yellow.

ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
   --id=@myqos create qos type=trtcm-policer \
   other-config:cir=52000 other-config:cbs=2048 \
   other-config:eir=52000 other-config:ebs=2048  \
   queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
   --id=@dpdk1Q10 create queue \
 other-config:cir=4160 other-config:cbs=2048 \
 other-config:eir=0 other-config:ebs=0 -- \
   --id=@dpdk1Q20 create queue \
 other-config:cir=0 other-config:cbs=0 \
 other-config:eir=4160 other-config:ebs=2048 \

This configuration accomplishes that the high priority traffic has a
guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
use the EIR, so a total of 2000pps at max. These additional 1000pps is
shared with the low priority traffic. The low priority traffic can use at
maximum 1000pps.


Thanks for the patch Eelco, minor comment below.



  
  Rate Limiting (Ingress Policing)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 128963f..1ed4a47 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -26,6 +26,12 @@
  #include 
  #include 
  
+/* Include rte_compat.h first to allow experimental API's needed for the

+ * rte_meter.h rfc4115 functions. Once they are no longer marked as
+ * experimental the #define and rte_compat.h include can be removed.
+ */
+#define ALLOW_EXPERIMENTAL_API
+#include 


I guess the risk here, from what I understand, this approach is all or 
nothing in terms of experimental APIs from the included headers, other 
experimental APIs could be used from DPDK going forward without causing 
warning?


If so, there would have to be extra dilligence taken when reviewing 
future patches and a discussion if an API is expereimental, should it 
wait until it is marked as non experimental. (In this case The TRTCM 
looks stable and is highly unlikey to be removed so it's not an issue IMO).


@Ilya/Kevin: Would you agree with above? Thoughts?

Ian

  #include 
  #include 
  #include 
@@ -329,8 +335,9 @@ struct dpdk_qos_ops {
   struct netdev_dpdk_queue_state *state);
  };
  
-/* dpdk_qos_ops for each type of user space QoS implementation */

+/* dpdk_qos_ops for each type of user space QoS implementation. */
  static const struct dpdk_qos_ops egress_policer_ops;
+static const struct dpdk_qos_ops trtcm_policer_ops;
  
  /*

   * Array of dpdk_qos_ops, contains pointer to all supported QoS
@@ -338,6 +345,7 @@ static const struct dpdk_qos_ops egress_policer_ops;
   */
  static const struct dpdk_qos_ops *const qos_confs[] = {
  _policer_ops,
+_policer_ops,
  NULL
  };
  
@@ -2162,9 +2170,9 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,

  }
  
  static inline bool

-netdev_dpdk_policer_pkt_handle(struct rte_meter_srtcm *meter,
-   struct rte_meter_srtcm_profile *profile,
-   struct rte_mbuf *pkt, uint64_t time)
+netdev_dpdk_srtcm_policer_pkt_handle(struct rte_meter_srtcm *meter,
+ struct rte_meter_srtcm_profile *profile,
+ struct rte_mbuf *pkt, uint64_t time)
  {
  uint32_t pkt_len = rte_pktmbuf_pkt_len(pkt) - sizeof(struct 
rte_ether_hdr);
  
@@ -2173,10 +2181,10 @@ netdev_dpdk_policer_pkt_handle(struct rte_meter_srtcm *meter,

  }
  
  static int

-netdev_dpdk_policer_run(struct rte_meter_srtcm *meter,
-struct rte_meter_srtcm_profile *profile,
-struct rte_mbuf **pkts, int pkt_cnt,
-bool should_steal)
+srtcm_policer_run_single_packet(struct rte_meter_srtcm *meter,
+struct rte_meter_srtcm_profile *profile,
+struct rte_mbuf **pkts, int pkt_cnt,
+bool should_steal)
  {
  int i = 0;
  int cnt = 0;
@@ -2186,8 +2194,8 @@ netdev_dpdk_policer_run(struct rte_meter_srtcm *meter,
  for