Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc

2018-01-23 Thread Jesus Sanchez-Palencia
Hi,


On 01/18/2018 05:35 AM, Jamal Hadi Salim wrote:
> On 18-01-17 06:06 PM, Jesus Sanchez-Palencia wrote:
>> From: Vinicius Costa Gomes 
>>
>> TBS (Time Based Scheduler) uses the information added earlier in this
>> series (the socket option SO_TXTIME and the new role of
>> sk_buff->tstamp) to schedule traffic transmission based on absolute
>> time.
>>
>> For some workloads, just bandwidth enforcement is not enough, and
>> precise control of the transmission of packets is necessary.
>>
>> Example:
>>
>> $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \
> 
> handle 100:0 ?
> 
>>     map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0
>>
>> $ tc qdisc add dev enp2s0 parent 100:1 tbs delta 6 clockid 11 offload 1
>>
>>
>> In this example, the Qdisc will try to enable offloading (offload 1)
>> the control of the transmission time to the network adapter, the
>> time stamp in socket are in reference to the clockid '11' (CLOCK_TAI)
>> and packets leave the Qdisc "delta" (6) nanoseconds before its
>> transmission time.
>>
> 
> 
>> When offloading is disabled, the network adapter will ignore the
>> sk_buff time stamp, and so, the transmission time will be only "best
>> effort" from the Qdisc.
>>
> 
> General comments:
> 1) iproute2: Avoid magic numbers like 1 or 11 please; "offload"
> (without 1) and "TAI" will be more human friendly.


Sure, we'll change both parameters.


> 
> 2) Experience shows that adding padding fields in the control structs
> implies they will _never ever_ be used. That was not design intent
> for netlink but over years shit like that has happened.
> Maybe look at using a 32 bitmap? It is more "future proof".
> You seem to only have 2-3 flags but it gives you opportunity
> to add more changes later. If you are 100% sure youll never need
> it - then maybe just move the tc_tbs_qopt::offload to the end of
> of the struct.


Ok, we are looking into it.


> 
> 3)It would be helpful for debugging to increment some stats other
> than drop counters on enqueu/dequeue obsolete packet drop. Maybe use
> overlimits for the dequeu drops (in addition)?


Yes, sure, that's a good idea.


> 
> 4) I may be misreading things - but did you need to reset the
> watchdog on dequeue? It is already being kicked for every incoming packet.


The watchdog timer must always be set to the next deadline. Since both enqueue()
and dequeue() modify the tree structure and may have it rebalanced, we need to
make sure the watchdog has been 're-set' to the next deadline. i.e. After a
dequeue, we need to re-arm the timer for the next head of the queue, so that is
why. Does that make sense?

I'm now thinking that naming that helper as reset_watchdog() can be misleading,
so we are considering naming it as rearm_watchdog() instead.


Thanks,
Jesus


> 
> cheers,
> jamal


Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc

2018-01-23 Thread Jesus Sanchez-Palencia
Hi,


On 01/18/2018 05:44 AM, Jamal Hadi Salim wrote:
> One more comment:
> Probably try to run a test with a very small delta with
> no offload (probably using something like prio as the root qdisc)
> and dump the stats.
> My gut feeling is your accounting of the backlog in particular is off.


You were right, thanks. It'll be fixed on our next version.

Regards,
Jesus


> 
> cheers,
> jamal
> 
> On 18-01-18 08:35 AM, Jamal Hadi Salim wrote:
>> On 18-01-17 06:06 PM, Jesus Sanchez-Palencia wrote:
>>> From: Vinicius Costa Gomes 
>>>
>>> TBS (Time Based Scheduler) uses the information added earlier in this
>>> series (the socket option SO_TXTIME and the new role of
>>> sk_buff->tstamp) to schedule traffic transmission based on absolute
>>> time.
>>>
>>> For some workloads, just bandwidth enforcement is not enough, and
>>> precise control of the transmission of packets is necessary.
>>>
>>> Example:
>>>
>>> $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \
>>
>> handle 100:0 ?
>>
>>>     map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0
>>>
>>> $ tc qdisc add dev enp2s0 parent 100:1 tbs delta 6 clockid 11 offload 1
>>>
>>>
>>> In this example, the Qdisc will try to enable offloading (offload 1)
>>> the control of the transmission time to the network adapter, the
>>> time stamp in socket are in reference to the clockid '11' (CLOCK_TAI)
>>> and packets leave the Qdisc "delta" (6) nanoseconds before its
>>> transmission time.
>>>
>>
>>
>>> When offloading is disabled, the network adapter will ignore the
>>> sk_buff time stamp, and so, the transmission time will be only "best
>>> effort" from the Qdisc.
>>>
>>
>> General comments:
>> 1) iproute2: Avoid magic numbers like 1 or 11 please; "offload"
>> (without 1) and "TAI" will be more human friendly.
>>
>> 2) Experience shows that adding padding fields in the control structs
>> implies they will _never ever_ be used. That was not design intent
>> for netlink but over years shit like that has happened.
>> Maybe look at using a 32 bitmap? It is more "future proof".
>> You seem to only have 2-3 flags but it gives you opportunity
>> to add more changes later. If you are 100% sure youll never need
>> it - then maybe just move the tc_tbs_qopt::offload to the end of
>> of the struct.
>>
>> 3)It would be helpful for debugging to increment some stats other
>> than drop counters on enqueu/dequeue obsolete packet drop. Maybe use
>> overlimits for the dequeu drops (in addition)?
>>
>> 4) I may be misreading things - but did you need to reset the
>> watchdog on dequeue? It is already being kicked for every incoming packet.
>>
>> cheers,
>> jamal
> 


Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc

2018-01-19 Thread Willem de Bruijn
On Wed, Jan 17, 2018 at 6:06 PM, Jesus Sanchez-Palencia
 wrote:
> From: Vinicius Costa Gomes 
>
> TBS (Time Based Scheduler) uses the information added earlier in this
> series (the socket option SO_TXTIME and the new role of
> sk_buff->tstamp) to schedule traffic transmission based on absolute
> time.
>
> For some workloads, just bandwidth enforcement is not enough, and
> precise control of the transmission of packets is necessary.
>
> Example:
>
> $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \
>map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0
>
> $ tc qdisc add dev enp2s0 parent 100:1 tbs delta 6 clockid 11 offload 1
>
> In this example, the Qdisc will try to enable offloading (offload 1)
> the control of the transmission time to the network adapter, the
> time stamp in socket are in reference to the clockid '11' (CLOCK_TAI)
> and packets leave the Qdisc "delta" (6) nanoseconds before its
> transmission time.
>
> When offloading is disabled, the network adapter will ignore the
> sk_buff time stamp, and so, the transmission time will be only "best
> effort" from the Qdisc.
>
> Signed-off-by: Vinicius Costa Gomes 
> Signed-off-by: Jesus Sanchez-Palencia 
> ---

> +static struct sk_buff *timerqueue_erase(struct Qdisc *sch,
> +   struct sk_buff *skb, bool drop)
> +{
> +   struct tbs_sched_data *q = qdisc_priv(sch);
> +
> +   rb_erase(>rbnode, >head);
> +
> +   if (drop) {
> +   struct sk_buff *to_free = NULL;
> +
> +   qdisc_drop(skb, sch, _free);
> +   kfree_skb_list(to_free);
> +   } else {
> +   qdisc_qstats_backlog_dec(sch, skb);
> +   qdisc_bstats_update(sch, skb);
> +
> +   q->last = skb->tstamp;
> +   }
> +
> +   sch->q.qlen--;
> +
> +   /* The rbnode field in the skb re-uses these fields, now that
> +* we are done with the rbnode, reset them.
> +*/
> +   skb->next = NULL;
> +   skb->prev = NULL;
> +   skb->dev = qdisc_dev(sch);
> +
> +   return skb;
> +}

Return value is not used in either caller.


Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc

2018-01-18 Thread Richard Cochran
On Thu, Jan 18, 2018 at 08:35:18AM -0500, Jamal Hadi Salim wrote:
> 1) iproute2: Avoid magic numbers like 1 or 11 please; "offload"
> (without 1) and "TAI" will be more human friendly.

Yes, and for the clockid, the program should accept CLOCK_REALTIME or
CLOCK_TAI for the hard coded SYS-V IDs or /dev/ptp0 for dynamic IDs.

Thanks,
Richard


Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc

2018-01-18 Thread Jamal Hadi Salim

One more comment:
Probably try to run a test with a very small delta with
no offload (probably using something like prio as the root qdisc)
and dump the stats.
My gut feeling is your accounting of the backlog in particular is off.

cheers,
jamal

On 18-01-18 08:35 AM, Jamal Hadi Salim wrote:

On 18-01-17 06:06 PM, Jesus Sanchez-Palencia wrote:

From: Vinicius Costa Gomes 

TBS (Time Based Scheduler) uses the information added earlier in this
series (the socket option SO_TXTIME and the new role of
sk_buff->tstamp) to schedule traffic transmission based on absolute
time.

For some workloads, just bandwidth enforcement is not enough, and
precise control of the transmission of packets is necessary.

Example:

$ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \


handle 100:0 ?


    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0

$ tc qdisc add dev enp2s0 parent 100:1 tbs delta 6 clockid 11 
offload 1



In this example, the Qdisc will try to enable offloading (offload 1)
the control of the transmission time to the network adapter, the
time stamp in socket are in reference to the clockid '11' (CLOCK_TAI)
and packets leave the Qdisc "delta" (6) nanoseconds before its
transmission time.





When offloading is disabled, the network adapter will ignore the
sk_buff time stamp, and so, the transmission time will be only "best
effort" from the Qdisc.



General comments:
1) iproute2: Avoid magic numbers like 1 or 11 please; "offload"
(without 1) and "TAI" will be more human friendly.

2) Experience shows that adding padding fields in the control structs
implies they will _never ever_ be used. That was not design intent
for netlink but over years shit like that has happened.
Maybe look at using a 32 bitmap? It is more "future proof".
You seem to only have 2-3 flags but it gives you opportunity
to add more changes later. If you are 100% sure youll never need
it - then maybe just move the tc_tbs_qopt::offload to the end of
of the struct.

3)It would be helpful for debugging to increment some stats other
than drop counters on enqueu/dequeue obsolete packet drop. Maybe use
overlimits for the dequeu drops (in addition)?

4) I may be misreading things - but did you need to reset the
watchdog on dequeue? It is already being kicked for every incoming packet.

cheers,
jamal




Re: [RFC v2 net-next 06/10] net/sched: Introduce the TBS Qdisc

2018-01-18 Thread Jamal Hadi Salim

On 18-01-17 06:06 PM, Jesus Sanchez-Palencia wrote:

From: Vinicius Costa Gomes 

TBS (Time Based Scheduler) uses the information added earlier in this
series (the socket option SO_TXTIME and the new role of
sk_buff->tstamp) to schedule traffic transmission based on absolute
time.

For some workloads, just bandwidth enforcement is not enough, and
precise control of the transmission of packets is necessary.

Example:

$ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \


handle 100:0 ?


map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0

$ tc qdisc add dev enp2s0 parent 100:1 tbs delta 6 clockid 11 offload 1


In this example, the Qdisc will try to enable offloading (offload 1)
the control of the transmission time to the network adapter, the
time stamp in socket are in reference to the clockid '11' (CLOCK_TAI)
and packets leave the Qdisc "delta" (6) nanoseconds before its
transmission time.





When offloading is disabled, the network adapter will ignore the
sk_buff time stamp, and so, the transmission time will be only "best
effort" from the Qdisc.



General comments:
1) iproute2: Avoid magic numbers like 1 or 11 please; "offload"
(without 1) and "TAI" will be more human friendly.

2) Experience shows that adding padding fields in the control structs
implies they will _never ever_ be used. That was not design intent
for netlink but over years shit like that has happened.
Maybe look at using a 32 bitmap? It is more "future proof".
You seem to only have 2-3 flags but it gives you opportunity
to add more changes later. If you are 100% sure youll never need
it - then maybe just move the tc_tbs_qopt::offload to the end of
of the struct.

3)It would be helpful for debugging to increment some stats other
than drop counters on enqueu/dequeue obsolete packet drop. Maybe use
overlimits for the dequeu drops (in addition)?

4) I may be misreading things - but did you need to reset the
watchdog on dequeue? It is already being kicked for every incoming packet.

cheers,
jamal