From: Nikhil Jagtap [mailto:nikhil.jag...@gmail.com]
Sent: Wednesday, October 5, 2016 8:10 AM
To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>
Cc: dev at dpdk.org; users at dpdk.org
Subject: Re: qos: traffic shaping at queue level

Hi Cristian,

Thanks for the info. A few more comments/questions inline.

On 3 October 2016 at 23:42, Dumitrescu, Cristian <cristian.dumitrescu at 
intel.com> wrote:


From: Nikhil Jagtap [mailto:nikhil.jagtap at gmail.com]
Sent: Friday, September 30, 2016 7:12 AM
To: dev at dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; 
users at dpdk.org
Subject: Re: qos: traffic shaping at queue level

Hi,
Can someone please answer my queries?
I tried using queue weights to distribute traffic-class bandwidth among the 
child queues, but did not get the desired results.
[Cristian] Can you please describe what issues you see?
[Nikhil] At the end of a 20-minute test, the total number of packets dequeued 
from the respective queues was not in the 1:5 ratio.
In another test, where 4 equal-rate traffic streams were hitting 4 different 
queues of the same TC configured with weights 1:2:4:8, I observed that the 
queue with the highest weight had the fewest dequeued packets, when in theory 
it should have been the one with the highest packet count.

[Cristian] No idea why you are hitting this issue. Please keep me posted once 
you find the root cause, maybe there is something that we can improve here.

Regards,
Nikhil

On 27 September 2016 at 15:34, Nikhil Jagtap <nikhil.jagtap at gmail.com> 
wrote:
Hi,

I have a few questions about the hierarchical scheduler. I am taking a simple 
example here to get a better understanding.

Reference example:
  pipe rate = 30 mbps
  tc 0 rate = 30 mbps
  traffic-type 0 being queued to queue 0, tc 0.
  traffic-type 1 being queued to queue 1, tc 0.
  Assume traffic-type 0 is being received at the rate of 25 mbps.
  Assume traffic-type 1 is also being received at the rate of 25 mbps.

Requirement:
  To limit traffic-type 0 to (CIR =  5 mbps, PIR = 30 mbps), AND
      limit traffic-type 1 to (CIR = 25 mbps, PIR = 30 mbps).
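
For reference, a minimal sketch of how the pipe and TC rates above could be 
expressed with the librte_sched pipe parameters. Rates are converted from mbps 
to the bytes-per-second units the API expects; the tb_size, tc_period and WRR 
weights below are illustrative assumptions, not values from this thread.

  #include <rte_sched.h>

  /* librte_sched rates are expressed in bytes per second. */
  #define MBPS_TO_BYTES_PER_SEC(r)  ((r) * 1000 * 1000 / 8)

  static struct rte_sched_pipe_params pipe_params = {
      .tb_rate = MBPS_TO_BYTES_PER_SEC(30),   /* pipe rate = 30 mbps */
      .tb_size = 1000000,                     /* assumed bucket size */
      .tc_rate = {
          MBPS_TO_BYTES_PER_SEC(30),          /* tc 0 rate = 30 mbps */
          MBPS_TO_BYTES_PER_SEC(30),
          MBPS_TO_BYTES_PER_SEC(30),
          MBPS_TO_BYTES_PER_SEC(30),
      },
      .tc_period = 40,                        /* ms, assumed */
      /* One WRR weight per queue (4 TCs x 4 queues); all equal here. */
      .wrr_weights = { 1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1 },
  };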

The questions:
1) I understand that with the scheduler, it is possible to do rate limiting 
only at the sub-port and pipe levels and not at the individual queue level.
[Cristian] Yes, correct, only subports and pipes own token buckets, with all 
the pipe traffic classes and queues sharing their pipe token bucket.

Is it possible to achieve rate limiting using the notion of queue weights? For 
the above example, will assigning weights in a 1:5 ratio to the two queues 
shape the two traffic-types at the two different rates?
[Cristian] Yes. However, having the weights observed accurately relies on all 
the queues being backlogged (always having packets to dequeue). When a pipe 
and a certain TC are examined for dequeuing, the relative weights are enforced 
only between the queues that have packets at that precise moment in time, with 
the empty queues being ignored. The fully backlogged scenario does not occur 
in practice, and the set of non-empty queues changes over time. As said in the 
past, having large relative weight ratios between queues helps (1:5 should be 
good).
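
As an illustration, a minimal sketch of how such a 1:5 weight ratio between 
queue 0 and queue 1 of TC 0 might be expressed; the field layout follows the 
2016-era rte_sched.h, and the remaining pipe parameters are assumed to be 
configured elsewhere.

  #include <stdint.h>

  /* One WRR weight per queue of the pipe, indexed as (tc * 4 + queue).
   * Giving queue 1 five times the weight of queue 0 makes the scheduler
   * split the TC 0 bandwidth roughly 1:5 while both queues are backlogged. */
  static const uint8_t wrr_weights_1_to_5[16] = {
      1, 5, 1, 1,   /* TC 0: queue 0 (traffic-type 0), queue 1 (traffic-type 1) */
      1, 1, 1, 1,   /* TC 1 */
      1, 1, 1, 1,   /* TC 2 */
      1, 1, 1, 1,   /* TC 3 */
  };
  /* Used as the .wrr_weights field of the pipe profile
   * (struct rte_sched_pipe_params) that the port is configured with. */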
[Nikhil] I see. So I guess not having fully backlogged queues could be one of 
the reasons for the observations I mentioned above, where the weight ratio 
does not directly translate into a rate ratio. I should also mention that 
there was no pipelining, i.e. packet processing, enqueuing and dequeuing were 
all done inline in a run-to-completion model.
a) Would some kind of pipelining help achieve a better rate ratio? Maybe at 
least splitting the enqueue and dequeue operations?
b) If pipelining is not an option, what would be the recommended values for 
the enqueue and dequeue packet counts in the run-to-completion model? You have 
mentioned in one of your presentations to use different values for these two. 
If I go with (enqueue# > dequeue#), don't I run the risk of filling up the 
scheduler queues and failing enqueues even at rates lower than the scheduler 
pipe rates? In the other case, where (dequeue# > enqueue#), we would end up 
dequeuing all packets that were enqueued every time.

[Cristian]
a) In order to provide determinism for the hierarchical scheduler (e.g. 
frequent-enough calls of the enqueue and dequeue operations), I recommend 
dedicating a separate CPU core to run it, as opposed to running a lot of other 
stuff on the same core, which might result in the scheduler not being called 
regularly. This requires a pipeline of at least 2x CPU cores, i.e. one running 
your worker (run-to-completion) which feeds the second core running the 
scheduler.
b) As documented, for performance reasons the API is not thread safe, so you 
need to run enqueue and dequeue of a given port on the same CPU core. Any 
(enqueue, dequeue) pair with enqueue > dequeue works. For DPDK apps using the 
vector PMD, the burst size is usually 32, so we typically use e.g. (32, 28) or 
(32, 24); for apps not using the vector PMD, we used (64, 48) and (64, 32) in 
the past; recently, in Cisco VPP, we used (256, 240), as the typical VPP burst 
size is 256 packets.
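
To illustrate the pipelined setup in a) with the (enqueue, dequeue) pair from 
b), here is a minimal sketch of a scheduler loop on a dedicated core using 
(32, 28); sched_port, the worker-to-scheduler ring and the TX port are assumed 
to be set up elsewhere, and the exact ring API signature may differ slightly 
between DPDK versions.

  #include <rte_ring.h>
  #include <rte_sched.h>
  #include <rte_ethdev.h>
  #include <rte_mbuf.h>

  #define SCHED_ENQ_BURST 32   /* matches the vector PMD burst size above */
  #define SCHED_DEQ_BURST 28   /* slightly smaller, per the (32, 28) pair */

  /* Assumed to be created elsewhere in the application. */
  extern struct rte_sched_port *sched_port;
  extern struct rte_ring *worker_to_sched_ring;
  extern uint8_t tx_port_id;

  static void
  sched_core_loop(void)
  {
      struct rte_mbuf *enq_pkts[SCHED_ENQ_BURST];
      struct rte_mbuf *deq_pkts[SCHED_DEQ_BURST];

      for (;;) {
          /* Packets classified by the worker core arrive on a ring and
           * are pushed into the hierarchical scheduler. */
          unsigned n_enq = rte_ring_sc_dequeue_burst(worker_to_sched_ring,
                  (void **)enq_pkts, SCHED_ENQ_BURST);
          if (n_enq > 0)
              rte_sched_port_enqueue(sched_port, enq_pkts, n_enq);

          /* Dequeue fewer packets than were enqueued so the queues keep
           * a small backlog, which helps the WRR weights be observed. */
          int n_deq = rte_sched_port_dequeue(sched_port, deq_pkts,
                  SCHED_DEQ_BURST);
          if (n_deq > 0)
              rte_eth_tx_burst(tx_port_id, 0, deq_pkts, (uint16_t)n_deq);
      }
  }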

2) In continuation of the previous question: if queue weights don't help, 
would it be possible to use metering to achieve rate limiting? Assume we meter 
the individual traffic-types (using the CIR/PIR config mentioned above) before 
enqueuing them to the scheduler queues. To achieve the respective queue rates, 
the dequeuer would then be expected to prioritise green packets over yellow.
Looking into the code, the packet color is used as an input to the dropper 
block, but it does not seem to be used anywhere in the scheduler. So I guess 
it is not possible to prioritise green packets when dequeuing?
[Cristian] Packet color is used by Weighted RED (WRED) congestion management 
scheme on the enqueue side, not on the dequeue side. Once the packet has been 
enqueued, it cannot be dropped (i.e. every enqueued packet will eventually be 
dequeued), so rate limiting cannot be enforced on the dequeue side.
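
A minimal sketch of the enqueue-side flow implied above: meter the packet, 
write its color into the mbuf, then enqueue, so that WRED (if enabled) can act 
on the color. The helper name, the tt_meter[] array and the subport/pipe/tc 
mapping are assumptions for illustration, and the function signatures follow 
the 2016-era librte_meter/librte_sched APIs.

  #include <rte_meter.h>
  #include <rte_sched.h>
  #include <rte_mbuf.h>
  #include <rte_cycles.h>

  /* Assumed: one trTCM meter per traffic-type, configured elsewhere with the
   * CIR/PIR values from the requirement above (5/30 mbps and 25/30 mbps). */
  extern struct rte_meter_trtcm tt_meter[2];

  static int
  color_and_enqueue(struct rte_sched_port *port, struct rte_mbuf *pkt,
          uint32_t traffic_type)
  {
      enum rte_meter_color color;

      /* Meter the packet and record the result, together with its
       * (subport, pipe, tc, queue) mapping, in the mbuf sched field. */
      color = rte_meter_trtcm_color_blind_check(&tt_meter[traffic_type],
              rte_rdtsc(), rte_pktmbuf_pkt_len(pkt));
      rte_sched_port_pkt_write(pkt, /* subport */ 0, /* pipe */ 0,
              /* tc */ 0, /* queue */ traffic_type, color);

      /* The color only matters on the enqueue side: WRED (if enabled) may
       * drop yellow/red packets here as the queue fills up. Once a packet
       * is enqueued, it cannot be dropped on the dequeue side. */
      return rte_sched_port_enqueue(port, &pkt, 1);
  }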

Regards,
Nikhil


Thanks.
Nikhil
