Hi Alan,

Thanks for your comments!


> Hi Cristian,

> Looking at points 10 and 11 it's good to hear nodes can be dynamically added.

Yes, many implementations allow remapping a node on the fly from one parent to 
another, or simply adding more nodes post-initialization, so it is natural for 
the API to provide this.
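
Just to illustrate the kind of operations involved (the names and signatures 
below are hypothetical, not the actual calls in the proposal), dynamic node 
management boils down to something like this:

/* Hypothetical sketch of dynamic node management in a hierarchical
 * scheduler API; names and signatures are assumptions, not the proposal.
 */
#include <stdint.h>

int sched_node_add(uint8_t port_id, uint32_t node_id,
                   uint32_t parent_node_id, uint32_t priority,
                   uint32_t weight);

int sched_node_reparent(uint8_t port_id, uint32_t node_id,
                        uint32_t new_parent_node_id);

int sched_node_delete(uint8_t port_id, uint32_t node_id);

/* Example: after the hierarchy is already running, move queue node 100
 * under a different parent node (node 42) without tearing anything down.
 */
static inline int move_queue(uint8_t port_id)
{
    return sched_node_reparent(port_id, 100, 42);
}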


> We've been trying to decide the best way to do this for support of qos on 
> tunnels for
> some time now and the existing implementation doesn't allow this so 
> effectively ruled
> out hierarchical queueing for tunnel targets on the output interface.

> Having said that, has thought been given to separating the queueing from 
> being so closely
> tied to the Ethernet transmit process ?   When queueing on a tunnel for 
> example we may
> be working with encryption.   When running with an anti-replay window it is 
> really much
> better to do the QOS (packet reordering) before the encryption.  To support 
> this would
> it be possible to have a separate scheduler structure which can be passed 
> into the
> scheduling API ?  This means the calling code can hang the structure of 
> whatever entity
> it wishes to perform qos on, and we get dynamic target support 
> (sessions/tunnels etc).

Yes, this is one point where we need to look for a better solution. The current 
proposal attaches the hierarchical scheduler function to an ethdev, so 
scheduling traffic for tunnels that have a pre-defined bandwidth is not 
supported nicely. This question was also raised in VPP, but there tunnels are 
implemented as a type of output interface, so attaching scheduling to an output 
interface also covers the tunnel case.

It looks to me that nice tunnel abstractions are a gap in DPDK as well. Any 
thoughts about how tunnels should be supported in DPDK? What do other people 
think about this?
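
To make your suggestion concrete (again, all names below are hypothetical and 
not part of the current proposal), a scheduler context detached from the ethdev 
could be created stand-alone and passed into the enqueue/dequeue calls, so the 
caller can hang it off a tunnel, a session or any other entity:

/* Hypothetical sketch of a scheduler instance decoupled from the ethdev;
 * names and signatures are assumptions only.
 */
#include <stdint.h>

struct rte_mbuf;                 /* packet buffer (DPDK) */
struct sched_params;             /* hierarchy configuration */
struct sched_ctx;                /* opaque scheduler instance */

struct sched_ctx *sched_ctx_create(const struct sched_params *params);
void sched_ctx_free(struct sched_ctx *ctx);

/* Enqueue/dequeue operate on the context instead of a port_id, so the
 * same hierarchy can be attached to a tunnel, a session or a port.
 */
uint32_t sched_ctx_enqueue(struct sched_ctx *ctx, struct rte_mbuf **pkts,
                           uint32_t n_pkts);
uint32_t sched_ctx_dequeue(struct sched_ctx *ctx, struct rte_mbuf **pkts,
                           uint32_t n_pkts);

On the tunnel TX path the scheduler would then run before the crypto stage, 
keeping packets in order inside the anti-replay window, and only afterwards 
would the encrypted packets be handed to the output interface.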


> Regarding the structure allocation, would it be possible to make the number 
> of queues
> associated with a TC a compile time option which the scheduler would 
> accommodate ?
> We frequently only use one queue per tc which means 75% of the space 
> allocated at
> the queueing layer for that tc is never used.  This may be specific to our 
> implementation
> but if other implementations do the same if folks could say we may get a 
> better idea
> if this is a common case.

> Whilst touching on the scheduler, the token replenishment works using a 
> division and
> multiplication obviously to cater for the fact that it may be run after 
> several tc windows
> have passed.  The most commonly used industrial scheduler simply does a 
> lapsed on the tc
> and then adds the bc.   This relies on the scheduler being called within the 
> tc window
> though.  It would be nice to have this as a configurable option since it's 
> much more efficient
> assuming the infra code from which it's called can guarantee the calling 
> frequency.

This is probably feedback for librte_sched as opposed to the current API 
proposal, as the latter is intended to be generic/implementation-agnostic and 
therefore its scope far exceeds the existing set of librte_sched features.
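
On the queues-per-TC sizing: librte_sched fixes the number of queues per 
traffic class at compile time (RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS is 4), so a 
deployment that only ever uses one queue per TC indeed leaves 3 of the 4 queue 
slots, i.e. 75% of that storage, unused. A guarded override along these lines 
(a sketch only, not existing code) is one way a build-time option could look:

/* Sketch only: make the existing constant overridable at build time,
 * e.g. CFLAGS += -DRTE_SCHED_QUEUES_PER_TRAFFIC_CLASS=1
 */
#ifndef RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS
#define RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS 4
#endif

/* Arrays sized by this constant then shrink accordingly, so a
 * single-queue-per-TC build allocates 1/4 of the default footprint.
 */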

Btw, we do plan to use librte_sched as the default fall-back when the HW ethdev 
is not scheduler-enabled, as well as the implementation of choice for a lot of 
use-cases where it fits really well, so we do have to continue evolving and 
improving librte_sched feature-wise and performance-wise.
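
On the replenishment point, a rough sketch of the two strategies (field and 
function names are illustrative and do not match the actual librte_sched code) 
may help frame the trade-off: the divide/multiply variant tolerates the 
scheduler being invoked after several tc windows have elapsed, while the 
"lapsed plus bc" variant is cheaper but only correct if the caller guarantees 
it runs at least once per tc window:

/* Illustrative sketch of the two TC credit replenishment strategies;
 * names do not match the actual librte_sched implementation.
 */
#include <stdint.h>

struct tc_state {
    uint64_t tc_time;               /* start of current tc window */
    uint64_t tc_period;             /* tc window length */
    uint64_t tc_credits;            /* credits left in current window */
    uint64_t tc_credits_per_period; /* the "bc" */
};

/* Variant 1: tolerate being called after several windows have elapsed,
 * at the cost of a division (and multiplication) per update.
 */
static void tc_replenish_div(struct tc_state *tc, uint64_t now)
{
    if (now - tc->tc_time >= tc->tc_period) {
        uint64_t n_periods = (now - tc->tc_time) / tc->tc_period;

        tc->tc_time += n_periods * tc->tc_period;
        tc->tc_credits = tc->tc_credits_per_period;
    }
}

/* Variant 2 ("lapsed + bc"): assume the caller runs at least once per
 * tc window, so a compare and an add are enough; a real implementation
 * would also cap tc_credits at the committed burst.
 */
static void tc_replenish_add_bc(struct tc_state *tc, uint64_t now)
{
    if (now - tc->tc_time >= tc->tc_period) {
        tc->tc_time += tc->tc_period;
        tc->tc_credits += tc->tc_credits_per_period;
    }
}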


> I hope you'll consider these points for inclusion into a future road map.  
> Hopefully in the
> future my employer will increase the priority of some of the tasks and a PR 
> may appear
> on the mailing list.

> Thanks,
> Alan.
