[dpdk-dev] how to design high performance QoS support for a large amount of subscribers

2016-08-04 Thread Yuyong Zhang
Thank you very much Cristian for the insightful response. 

Very much appreciated.

Regards,

Yuyong




[dpdk-dev] how to design high performance QoS support for a large amount of subscribers

2016-08-04 Thread Dumitrescu, Cristian
Hi Yuyong,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yuyong Zhang
> Sent: Tuesday, August 2, 2016 4:26 PM
> To: dev at dpdk.org; users at dpdk.org
> Subject: [dpdk-dev] how to design high performance QoS support for a large
> amount of subscribers
> 
> Hi,
> 
> I am trying to add QoS support for a high-performance VNF with a large
> number of subscribers (millions).

Welcome to the world of DPDK QoS users!

> It needs to support guaranteed bit rates for different subscriber service
> levels, i.e. four service levels need to be supported:
> 
> * Diamond, 500 Mbps
> 
> * Gold, 100 Mbps
> 
> * Silver, 50 Mbps
> 
> * Bronze, 10 Mbps

Service levels translate to pipe profiles in our DPDK implementation. The set 
of pipe profiles is defined per port.
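
As a rough illustration, the four service levels above could be captured as four pipe profiles. A minimal sketch, assuming the DPDK 16.x rte_sched structures (check rte_sched.h for your release; later versions changed the traffic-class/queue layout, and the token-bucket size and tc_period below are placeholder values, not tuned ones):

#include <stdint.h>
#include <rte_sched.h>

/* rte_sched rates are expressed in bytes per second */
#define MBPS(x) ((uint32_t)((x) * 1000ULL * 1000 / 8))

/* One profile per service level: all four traffic classes capped at the
 * pipe rate, equal WRR weights for the 16 queues. tb_size and tc_period
 * are placeholder values. */
#define SERVICE_PROFILE(mbps)                                          \
    {                                                                  \
        .tb_rate = MBPS(mbps), .tb_size = 1000000,                     \
        .tc_rate = { MBPS(mbps), MBPS(mbps), MBPS(mbps), MBPS(mbps) }, \
        .tc_period = 40,                                               \
        .wrr_weights = { 1, 1, 1, 1, 1, 1, 1, 1,                       \
                         1, 1, 1, 1, 1, 1, 1, 1 },                     \
    }

struct rte_sched_pipe_params pipe_profiles[] = {
    SERVICE_PROFILE(500),   /* 0: Diamond, 500 Mbps */
    SERVICE_PROFILE(100),   /* 1: Gold,    100 Mbps */
    SERVICE_PROFILE(50),    /* 2: Silver,   50 Mbps */
    SERVICE_PROFILE(10),    /* 3: Bronze,   10 Mbps */
};

The profile index (0-3) is then what gets passed to rte_sched_pipe_config() when a subscriber's pipe is mapped to its service level.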

> 
> Here is the current pipeline design using DPDK:
> 
> 
> * 4 RX threads, which do packet classification and load balancing
> 
> * 10-20 worker threads, which do application subscriber management
> 
> * 4 TX threads, which send packets to the TX NICs
> 
> * Ring buffers used between the RX, worker, and TX threads
> 
> I read the DPDK Programmer's Guide section on the QoS framework's
> hierarchical scheduler (port, sub-port, pipe, TC and queues). I am looking
> for advice on how to design a QoS scheduler that supports millions of
> subscribers (pipes) whose traffic is processed in the tens of worker
> threads where the subscriber management processing is handled.

Having millions of pipes per port poses some challenges:
1. Does it actually make sense? Assuming the port rate is 10GbE and looking at 
the smallest user rate you mention above (Bronze, 10 Mbps/user), fully 
provisioning all users (i.e. making sure you can fully handle each user in the 
worst-case scenario) results in a maximum of 1000 users per port. Assuming an 
overprovisioning ratio of 50:1, this means a maximum of 50K users per port.
2. Memory challenge. The number of pipes per port is configurable -- hey, this 
is SW! :) -- but each of these pipes has 16 queues. For 4K pipes per port, this 
is 64K queues per port; for a typical value of 64 packets per queue, this is 4M 
packets per port, so in the worst case we need to provision 4M packets in the 
buffer pool for each output port that has the hierarchical scheduler enabled; 
for a buffer size of ~2KB each, this means ~8GB of memory for each output port. 
If you go from 4K pipes per port to 4M pipes per port, this means 8TB of memory 
per port (the arithmetic is reproduced in the short sketch below). Do you have 
enough memory in your system? :)
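
A minimal worked version of that buffer-provisioning arithmetic in plain C (the 16 queues per pipe are fixed by the scheduler; the 64-packet queue size and ~2KB buffer size are the assumed values from point 2, not measured ones):

#include <stdio.h>

int main(void)
{
    const unsigned long long queues_per_pipe = 16;   /* fixed by rte_sched  */
    const unsigned long long pkts_per_queue  = 64;   /* typical queue size  */
    const unsigned long long buf_size        = 2048; /* ~2KB per buffer     */
    const unsigned long long pipes[] = { 4ULL << 10, 4ULL << 20 }; /* 4K, 4M */

    for (int i = 0; i < 2; i++) {
        unsigned long long pkts = pipes[i] * queues_per_pipe * pkts_per_queue;
        printf("%llu pipes -> %llu packets -> %llu GB of packet buffers\n",
               pipes[i], pkts, (pkts * buf_size) >> 30);
    }
    return 0;   /* prints ~8 GB for 4K pipes, ~8192 GB (8TB) for 4M pipes */
}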

One thing to realize is that even for millions of users in your system, not all 
of them are active at the same time. So maybe have a smaller number of pipes 
and only map the active users (those that have any packets to send now) to them 
(a fraction of the total set of users), with the set of active users changing 
over time.

You can also consider mapping several users to the same pipe.
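
A minimal sketch of that active-user-to-pipe mapping, assuming an rte_hash table keyed by a hypothetical 64-bit subscriber id and a bounded pool of pipes; it is assumed to be called from a single worker thread (no locking shown), and recycling pipes of users that have gone idle (timeouts, LRU) is deliberately omitted:

#include <stdint.h>
#include <rte_hash.h>
#include <rte_jhash.h>

#define N_PIPES 4096    /* pipes actually configured in the scheduler */

static struct rte_hash *sub2pipe;
static uint32_t free_pipes[N_PIPES];
static uint32_t n_free;

static int
active_map_init(int socket_id)
{
    struct rte_hash_parameters p = {
        .name = "sub2pipe",
        .entries = N_PIPES,
        .key_len = sizeof(uint64_t),    /* hypothetical subscriber id */
        .hash_func = rte_jhash,
        .socket_id = socket_id,
    };

    sub2pipe = rte_hash_create(&p);
    if (sub2pipe == NULL)
        return -1;

    for (uint32_t i = 0; i < N_PIPES; i++)
        free_pipes[i] = i;
    n_free = N_PIPES;
    return 0;
}

/* Return the pipe index for this subscriber, grabbing a free pipe the
 * first time the subscriber shows up with traffic. */
static int32_t
pipe_for_subscriber(uint64_t subscriber_id)
{
    void *data;

    if (rte_hash_lookup_data(sub2pipe, &subscriber_id, &data) >= 0)
        return (int32_t)(uintptr_t)data;    /* already mapped */

    if (n_free == 0)
        return -1;      /* no free pipe: share an existing one or drop */

    uint32_t pipe = free_pipes[--n_free];
    if (rte_hash_add_key_data(sub2pipe, &subscriber_id,
                              (void *)(uintptr_t)pipe) < 0) {
        free_pipes[n_free++] = pipe;
        return -1;
    }
    return (int32_t)pipe;
}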

> 
> One design thought is as the following:
> 
> 8 ports (each associated with one physical port) and 16-20 sub-ports (each
> used by one worker thread), where each sub-port supports 250K pipes for
> subscribers. Each worker thread manages one sub-port and meters the
> sub-port traffic to get the color; after identifying the subscriber flow it
> picks an unused pipe, performs the scheduler enqueue/dequeue, and then puts
> the packets into TX rings towards the TX threads, which send them to the
> TX NICs.
> 

In the current implementation, each port scheduler object has to be owned by a 
single thread, i.e. you cannot split a port across multiple threads, therefore 
it is not straightforward to have different sub-ports handled by different 
threads. The workaround is to split the physical NIC port yourself into 
multiple port scheduler objects: for example, create 8 port scheduler objects, 
set the rate of each to 1/8 of 10GbE, and have each of them feed a different 
NIC TX queue of the same physical NIC port.
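
A minimal sketch of that workaround, assuming the DPDK 16.x rte_sched API (field names as in rte_sched.h of that era; later releases changed them). The pipe count, queue sizes, and the pipe_profiles[] array from the earlier sketch are illustrative, and the sub-port/pipe setup (rte_sched_subport_config()/rte_sched_pipe_config()) plus the per-packet metadata write (rte_sched_port_pkt_write()) are omitted for brevity:

#include <stdio.h>
#include <rte_sched.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define N_SCHED_PORTS  8
#define NIC_RATE_BPS   (10000000000ULL / 8)   /* 10 Gbps in bytes/s */

extern struct rte_sched_pipe_params pipe_profiles[]; /* earlier sketch */

static struct rte_sched_port *sched[N_SCHED_PORTS];

static int
sched_ports_init(int socket_id)
{
    for (uint32_t i = 0; i < N_SCHED_PORTS; i++) {
        char name[32];
        snprintf(name, sizeof(name), "sched_%u", i);

        struct rte_sched_port_params p = {
            .name = name,
            .socket = socket_id,
            .rate = (uint32_t)(NIC_RATE_BPS / N_SCHED_PORTS), /* 1/8 of 10GbE */
            .mtu = 1522,
            .frame_overhead = 24,
            .n_subports_per_port = 1,
            .n_pipes_per_subport = 4096,
            .qsize = { 64, 64, 64, 64 },
            .pipe_profiles = pipe_profiles,
            .n_pipe_profiles = 4,
        };

        sched[i] = rte_sched_port_config(&p);
        if (sched[i] == NULL)
            return -1;
    }
    return 0;
}

/* Worker i enqueues its classified mbufs into sched[i] and periodically
 * drains it into NIC TX queue i of the same physical port. */
static void
sched_drain(uint8_t nic_port, uint32_t i)
{
    struct rte_mbuf *burst[32];
    int n = rte_sched_port_dequeue(sched[i], burst, 32);

    if (n > 0) {
        uint16_t sent = rte_eth_tx_burst(nic_port, (uint16_t)i, burst,
                                         (uint16_t)n);
        while (sent < (uint16_t)n)      /* free what the NIC didn't take */
            rte_pktmbuf_free(burst[sent++]);
    }
}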

You can probably get this scenario (or a very similar one) up and running 
pretty quickly just by handcrafting a configuration file for the 
examples/ip_pipeline application.

> Are there functional and performance issues with the above approach?
> 
> Any advice and input are appreciated.
> 
> Regards,
> 
> Yuyong
> 
> 
> 

Regards,
Cristian



[dpdk-dev] how to design high performance QoS support for a large amount of subscribers

2016-08-02 Thread Yuyong Zhang
Hi,

I am trying to add QoS support for a high-performance VNF with a large number 
of subscribers (millions). It needs to support guaranteed bit rates for 
different subscriber service levels, i.e. four service levels need to be 
supported:

* Diamond, 500 Mbps

* Gold, 100 Mbps

* Silver, 50 Mbps

* Bronze, 10 Mbps

Here is the current pipeline design using DPDK:


* 4 RX threads, which do packet classification and load balancing

* 10-20 worker threads, which do application subscriber management

* 4 TX threads, which send packets to the TX NICs

* Ring buffers used between the RX, worker, and TX threads (see the sketch below)
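
A minimal sketch of the ring hand-off between the stages listed above (names, counts, and sizes are illustrative; production code would use the burst enqueue/dequeue variants, whose exact signatures have changed across DPDK releases, so only the stable single-object calls are shown):

#include <stdio.h>
#include <rte_ring.h>
#include <rte_mbuf.h>

#define N_RX_THREADS 4
#define N_TX_THREADS 4
#define RING_SIZE    4096

static struct rte_ring *rx_to_worker[N_RX_THREADS];  /* one ring per RX thread */
static struct rte_ring *worker_to_tx[N_TX_THREADS];  /* one ring per TX thread */

static int
rings_init(int socket_id)
{
    char name[32];

    for (unsigned i = 0; i < N_RX_THREADS; i++) {
        snprintf(name, sizeof(name), "rx2wrk_%u", i);
        rx_to_worker[i] = rte_ring_create(name, RING_SIZE, socket_id,
                                          RING_F_SP_ENQ);
        if (rx_to_worker[i] == NULL)
            return -1;
    }
    for (unsigned i = 0; i < N_TX_THREADS; i++) {
        snprintf(name, sizeof(name), "wrk2tx_%u", i);
        worker_to_tx[i] = rte_ring_create(name, RING_SIZE, socket_id,
                                          RING_F_SC_DEQ);
        if (worker_to_tx[i] == NULL)
            return -1;
    }
    return 0;
}

/* RX thread i hands a classified packet to its worker ring; on a full
 * ring the packet is dropped. */
static inline void
rx_handoff(unsigned i, struct rte_mbuf *m)
{
    if (rte_ring_enqueue(rx_to_worker[i], m) != 0)
        rte_pktmbuf_free(m);
}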

I read the DPDK Programmer's Guide section on the QoS framework's hierarchical 
scheduler (port, sub-port, pipe, TC and queues). I am looking for advice on how 
to design a QoS scheduler that supports millions of subscribers (pipes) whose 
traffic is processed in the tens of worker threads where the subscriber 
management processing is handled.

One design thought is as the following:

8 ports (each associated with one physical port) and 16-20 sub-ports (each used 
by one worker thread), where each sub-port supports 250K pipes for subscribers. 
Each worker thread manages one sub-port and meters the sub-port traffic to get 
the color; after identifying the subscriber flow it picks an unused pipe, 
performs the scheduler enqueue/dequeue, and then puts the packets into TX rings 
towards the TX threads, which send them to the TX NICs.

Are there functional and performance issues with the above approach?

Any advice and input are appreciated.

Regards,

Yuyong