Re: [Cake] Cake paper accepted at IEEE LANMAN

2018-05-25 Thread Marcelo Ricardo Leitner
On Fri, May 25, 2018 at 04:02:47PM +0200, Toke Høiland-Jørgensen wrote:
> Hey everyone
> 
> Just a note to let you all know that the paper describing Cake has been
> accepted and will be presented at the IEEE LANMAN conference next month.

Congrats! \o/

> I'm attaching the final version of the paper. The arxiv.org listing
> should be updated once it's gone through the submission process (i.e.,
> on Monday). So if you want to link to it, use this:
> https://arxiv.org/abs/1804.07617
> 
> -Toke
> 


> ___
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake

___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] [PATCH net-next v14 6/7] sch_cake: Add overhead compensation support to the rate shaper

2018-05-22 Thread Marcelo Ricardo Leitner
On Tue, May 22, 2018 at 10:44:53AM +0200, Toke Høiland-Jørgensen wrote:
> 
> 
> On 22 May 2018 01:45:13 CEST, Marcelo Ricardo Leitner 
> <marcelo.leit...@gmail.com> wrote:
> >On Mon, May 21, 2018 at 10:35:58PM +0200, Toke Høiland-Jørgensen wrote:
> >> +static u32 cake_overhead(struct cake_sched_data *q, const struct
> >sk_buff *skb)
> >> +{
> >> +  const struct skb_shared_info *shinfo = skb_shinfo(skb);
> >> +  unsigned int hdr_len, last_len = 0;
> >> +  u32 off = skb_network_offset(skb);
> >> +  u32 len = qdisc_pkt_len(skb);
> >> +  u16 segs = 1;
> >> +
> >> +  q->avg_netoff = cake_ewma(q->avg_netoff, off << 16, 8);
> >> +
> >> +  if (!shinfo->gso_size)
> >> +  return cake_calc_overhead(q, len, off);
> >> +
> >> +  /* borrowed from qdisc_pkt_len_init() */
> >> +  hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
> >> +
> >> +  /* + transport layer */
> >> +  if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 |
> >> +  SKB_GSO_TCPV6))) {
> >> +  const struct tcphdr *th;
> >> +  struct tcphdr _tcphdr;
> >> +
> >> +  th = skb_header_pointer(skb, skb_transport_offset(skb),
> >> +  sizeof(_tcphdr), &_tcphdr);
> >> +  if (likely(th))
> >> +  hdr_len += __tcp_hdrlen(th);
> >> +  } else {
> >
> >I didn't see some code limiting GSO packets to just TCP or UDP. Is it
> >safe to assume that this packet is an UDP one, and not SCTP or ESP,
> >for example?
> 
> As the comment says, I nicked this from the qdisc init code.
> So I assume it's safe? :)

As long as it doesn't go further than this, it is. As in, it is just
validating if it can contain an UDP header, and if so, account for its
size, without actually reading the header.

Considering everything !TCP as UDP work as an approximation, which is
quite accurate. SCTP header is just 4 bytes bigger than UDP header and
is equal to ESP header size.

> 
> >> +  struct udphdr _udphdr;
> >> +
> >> +  if (skb_header_pointer(skb, skb_transport_offset(skb),
> >> + sizeof(_udphdr), &_udphdr))
> >> +  hdr_len += sizeof(struct udphdr);
> >> +  }
> >> +
> >> +  if (unlikely(shinfo->gso_type & SKB_GSO_DODGY))
> >> +  segs = DIV_ROUND_UP(skb->len - hdr_len,
> >> +  shinfo->gso_size);
> >> +  else
> >> +  segs = shinfo->gso_segs;
> >> +
> >> +  len = shinfo->gso_size + hdr_len;
> >> +  last_len = skb->len - shinfo->gso_size * (segs - 1);
> >> +
> >> +  return (cake_calc_overhead(q, len, off) * (segs - 1) +
> >> +  cake_calc_overhead(q, last_len, off));
> >> +}
> >> +
> 
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] [PATCH net-next v14 6/7] sch_cake: Add overhead compensation support to the rate shaper

2018-05-21 Thread Marcelo Ricardo Leitner
On Mon, May 21, 2018 at 10:35:58PM +0200, Toke Høiland-Jørgensen wrote:
> +static u32 cake_overhead(struct cake_sched_data *q, const struct sk_buff 
> *skb)
> +{
> + const struct skb_shared_info *shinfo = skb_shinfo(skb);
> + unsigned int hdr_len, last_len = 0;
> + u32 off = skb_network_offset(skb);
> + u32 len = qdisc_pkt_len(skb);
> + u16 segs = 1;
> +
> + q->avg_netoff = cake_ewma(q->avg_netoff, off << 16, 8);
> +
> + if (!shinfo->gso_size)
> + return cake_calc_overhead(q, len, off);
> +
> + /* borrowed from qdisc_pkt_len_init() */
> + hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
> +
> + /* + transport layer */
> + if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 |
> + SKB_GSO_TCPV6))) {
> + const struct tcphdr *th;
> + struct tcphdr _tcphdr;
> +
> + th = skb_header_pointer(skb, skb_transport_offset(skb),
> + sizeof(_tcphdr), &_tcphdr);
> + if (likely(th))
> + hdr_len += __tcp_hdrlen(th);
> + } else {

I didn't see some code limiting GSO packets to just TCP or UDP. Is it
safe to assume that this packet is an UDP one, and not SCTP or ESP,
for example?

> + struct udphdr _udphdr;
> +
> + if (skb_header_pointer(skb, skb_transport_offset(skb),
> +sizeof(_udphdr), &_udphdr))
> + hdr_len += sizeof(struct udphdr);
> + }
> +
> + if (unlikely(shinfo->gso_type & SKB_GSO_DODGY))
> + segs = DIV_ROUND_UP(skb->len - hdr_len,
> + shinfo->gso_size);
> + else
> + segs = shinfo->gso_segs;
> +
> + len = shinfo->gso_size + hdr_len;
> + last_len = skb->len - shinfo->gso_size * (segs - 1);
> +
> + return (cake_calc_overhead(q, len, off) * (segs - 1) +
> + cake_calc_overhead(q, last_len, off));
> +}
> +
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] [PATCH net-next v14 4/7] sch_cake: Add NAT awareness to packet classifier

2018-05-21 Thread Marcelo Ricardo Leitner
[Cc'ing netfilter-devel@ for awareness]

On Mon, May 21, 2018 at 10:35:58PM +0200, Toke Høiland-Jørgensen wrote:
> When CAKE is deployed on a gateway that also performs NAT (which is a
> common deployment mode), the host fairness mechanism cannot distinguish
> internal hosts from each other, and so fails to work correctly.
> 
> To fix this, we add an optional NAT awareness mode, which will query the
> kernel conntrack mechanism to obtain the pre-NAT addresses for each packet
> and use that in the flow and host hashing.
> 
> When the shaper is enabled and the host is already performing NAT, the cost
> of this lookup is negligible. However, in unlimited mode with no NAT being
> performed, there is a significant CPU cost at higher bandwidths. For this
> reason, the feature is turned off by default.
> 
> Signed-off-by: Toke Høiland-Jørgensen 
> ---
>  net/sched/sch_cake.c |   79 
> ++
>  1 file changed, 79 insertions(+)
> 
> diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
> index 92623160d43e..04364993ce19 100644
> --- a/net/sched/sch_cake.c
> +++ b/net/sched/sch_cake.c
> @@ -71,6 +71,12 @@
>  #include 
>  #include 
>  
> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
> +#include 
> +#include 
> +#include 
> +#endif
> +
>  #define CAKE_SET_WAYS (8)
>  #define CAKE_MAX_TINS (8)
>  #define CAKE_QUEUES (1024)
> @@ -516,6 +522,60 @@ static bool cobalt_should_drop(struct cobalt_vars *vars,
>   return drop;
>  }
>  
> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
> +
> +static void cake_update_flowkeys(struct flow_keys *keys,
> +  const struct sk_buff *skb)
> +{
> + const struct nf_conntrack_tuple *tuple;
> + enum ip_conntrack_info ctinfo;
> + struct nf_conn *ct;
> + bool rev = false;
> +
> + if (tc_skb_protocol(skb) != htons(ETH_P_IP))
> + return;
> +
> + ct = nf_ct_get(skb, );
> + if (ct) {
> + tuple = nf_ct_tuple(ct, CTINFO2DIR(ctinfo));
> + } else {
> + const struct nf_conntrack_tuple_hash *hash;
> + struct nf_conntrack_tuple srctuple;
> +
> + if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb),
> +NFPROTO_IPV4, dev_net(skb->dev),
> +))
> + return;
> +
> + hash = nf_conntrack_find_get(dev_net(skb->dev),
> +  _ct_zone_dflt,
> +  );
> + if (!hash)
> + return;
> +
> + rev = true;
> + ct = nf_ct_tuplehash_to_ctrack(hash);
> + tuple = nf_ct_tuple(ct, !hash->tuple.dst.dir);
> + }
> +
> + keys->addrs.v4addrs.src = rev ? tuple->dst.u3.ip : tuple->src.u3.ip;
> + keys->addrs.v4addrs.dst = rev ? tuple->src.u3.ip : tuple->dst.u3.ip;
> +
> + if (keys->ports.ports) {
> + keys->ports.src = rev ? tuple->dst.u.all : tuple->src.u.all;
> + keys->ports.dst = rev ? tuple->src.u.all : tuple->dst.u.all;
> + }
> + if (rev)
> + nf_ct_put(ct);
> +}
> +#else
> +static void cake_update_flowkeys(struct flow_keys *keys,
> +  const struct sk_buff *skb)
> +{
> + /* There is nothing we can do here without CONNTRACK */
> +}
> +#endif
> +
>  /* Cake has several subtle multiple bit settings. In these cases you
>   *  would be matching triple isolate mode as well.
>   */
> @@ -543,6 +603,9 @@ static u32 cake_hash(struct cake_tin_data *q, const 
> struct sk_buff *skb,
>   skb_flow_dissect_flow_keys(skb, ,
>  FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
>  
> + if (flow_mode & CAKE_FLOW_NAT_FLAG)
> + cake_update_flowkeys(, skb);
> +
>   /* flow_hash_from_keys() sorts the addresses by value, so we have
>* to preserve their order in a separate data structure to treat
>* src and dst host addresses as independently selectable.
> @@ -1894,6 +1957,18 @@ static int cake_change(struct Qdisc *sch, struct 
> nlattr *opt,
>   if (err < 0)
>   return err;
>  
> + if (tb[TCA_CAKE_NAT]) {
> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
> + q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
> + q->flow_mode |= CAKE_FLOW_NAT_FLAG *
> + !!nla_get_u32(tb[TCA_CAKE_NAT]);
> +#else
> + NL_SET_ERR_MSG_ATTR(extack, "No conntrack support in kernel",
> + tb[TCA_CAKE_NAT]);
> + return -EOPNOTSUPP;
> +#endif
> + }
> +
>   if (tb[TCA_CAKE_BASE_RATE64])
>   q->rate_bps = nla_get_u64(tb[TCA_CAKE_BASE_RATE64]);
>  
> @@ -2066,6 +2141,10 @@ static int cake_dump(struct Qdisc *sch, struct sk_buff 
> *skb)
>   if (nla_put_u32(skb, TCA_CAKE_ACK_FILTER, q->ack_filter))
>   goto nla_put_failure;
>  
> + if (nla_put_u32(skb, TCA_CAKE_NAT,
> + 

Re: [Cake] Testing wanted: Statistics API rework

2018-04-24 Thread Marcelo Ricardo Leitner
On Tue, Apr 24, 2018 at 11:30:35PM +0200, Toke Høiland-Jørgensen wrote:
> Hey everyone
>
> As you have probably seen, the posting of CAKE to netdev resulted in
> feedback saying that versioned structs are a no-go. Well, I've just
> pushed a change to both the cake and tc repos that changes the
> statistics reporting to used nested netlink attributes.
>
> If someone has time to test it (especially on mips!), that would be
> great; I think we still have a few days left of net-next being open, so

We should still have ~4 weeks of net-next open. It will close when
Linus releases the v4.17, which is when v4.18 merge window starts (and
thus why net-next gets closed).  Considering 1 -rc per week and at
least 4 more -rc's, ~4 weeks.

To be more precise, if that is possible, we can use the crystal ball
:-)
http://phb-crystal-ball.org/
Jun 10th.

> if it works for you, I'll resubmit tomorrow...
>
> -Toke
> ___
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] [PATCH 1/3] pkt_sched.h: add support for sch_cake API

2017-11-24 Thread Marcelo Ricardo Leitner
On Tue, Nov 21, 2017 at 04:58:54PM -0800, Dave Taht wrote:
> Stephen Hemminger  writes:
...
> > Also, when I see multiple arrays of same size. It seems the API should
> > be:
> > struct tc_cake_tin_stats {
> > __u32 threshold_rate;
> > __u32 target_us;
> > struct tc_cake_traffic_stats sent;
> > ...
> >
> > What if you want to change number of TINS, the ABI shouldn't have to change.
> 
> It is hard to imagine ever wanting more than 8 tins. Using up 3, sanely,
> has proven hard.

Which means each stats dump will almost always dump 5 empty TINS,
while if it were dynamic, it could dump only the 1, 2 or 3 in use and
probably help with the stats dump performance. Smaller slab, less
bytes to copy, ...

  Marcelo
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake


Re: [Cake] lan keyword affects host fairness

2017-11-24 Thread Marcelo Ricardo Leitner
On Fri, Nov 24, 2017 at 01:06:12PM +0100, Sebastian Moeller wrote:
> 
> > On Nov 24, 2017, at 12:21, Toke Høiland-Jørgensen  wrote:
> > 
> > Dave Taht  writes:
> > 
> >> Pete Heist  writes:
> >> 
> >>>On Nov 23, 2017, at 10:44 AM, Jonathan Morton  
> >>> wrote:
> >>> 
> >>>This is most likely an interaction of the AQM with Linux' scheduling
> >>>latency.
> >>> 
> >>>At the 'lan' setting, the time comstants are similar in magnitude to 
> >>> the
> >>>delays induced by Linux itself, so congestion might be signalled
> >>>prematurely. The flows will then become sparse and total throughput 
> >>> reduced,
> >>>leaving little or no back-pressure for the fairness logic to work 
> >>> against.
> >> 
> >> Agreed. 
> >> 
> >> man page add:
> >> 
> >> At the 'lan' setting(1ms), the time constants are similar in magnitude
> >> to the jitter in the Linux kernel itself, so congestion might be
> >> signalled prematurely. The flows will then become sparse and total
> >> throughput reduced, leaving little or no back-pressure for the fairness
> >> logic to work against. Use the "metro" setting for local lans unless you
> >> have a custom kernel.
> > 
> > Erm, doesn't this make the 'lan' keyword pretty much useless? So why not
> > just remove it? Or redefine it to something that actually works? 3ms?
> 
> The same applies for datacentre (0.1 ms), no? But I agree, let's not expose 
> these as explicit keywords, one can always use "rtt [100us|1ms]" I assume...

Which should also contain such disclaimer with it.
"Values smaller than 10ms requires special handling.", for example.

  Marcelo
___
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake