> On Aug 1, 2018, at 3:48 PM, [email protected] wrote:
>
> Send Cake mailing list submissions to
>     [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://lists.bufferbloat.net/listinfo/cake
> or, via email, send a message with subject or body 'help' to
>     [email protected]
>
> You can reach the person managing the list at
>     [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Cake digest..."
>
>
> Today's Topics:
>
> 1. passing args to bpf programs (Dave Taht)
> 2. Re: passing args to bpf programs (Stephen Hemminger)
> 3. Re: passing args to bpf programs (Dave Taht)
> 4. Re: passing args to bpf programs (Jonathan Morton)
> 5. Re: passing args to bpf programs (Dave Taht)
> 6. Re: passing args to bpf programs (Dave Taht)
> 7. codel in ebpf? (Dave Taht)
> 8. fq_codel on netronome's NICs? (Dave Taht)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 1 Aug 2018 09:22:41 -0700
> From: Dave Taht <[email protected]>
> To: Cake List <[email protected]>
> Subject: [Cake] passing args to bpf programs
> Message-ID:
>     <CAA93jw4YyAfgyFX-6_HTMaCdhfsWVt=v3eq5uuzh78wuunv...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> this really isn't the right list for this... but I wanted to build on
> the ack_filter bpf code I had, to create impairments, like dropping
> acks every X packets, or randomly, or when a specific pattern is seen
> (like timestamps or sack). This was sort of the reverse complement to
> getting the cake ack-filter right, now that I know everything that can
> go wrong...
>
> I see I can return TC_ACT_SHOT, so I can drop packets.
>
> But what I can't quite figure out is how to pass args to a tc ebpf
> program. Do I have to pass those via a file descriptor? A map
> generated elsewhere? what? Sure as heck don't want to compile one
> program per opt....
>
> Simplest args would be:
>
> max 16 - drop every 16th ack packet
> random 24 - drop randomly between 0 and 24
> match only certain flags
>
> followed by more gnarly ones like:
>
> miscalculate if I have a payload or not
> drop sack
> mangle timestamps
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 1 Aug 2018 09:35:22 -0700
> From: Stephen Hemminger <[email protected]>
> To: Dave Taht <[email protected]>
> Cc: Cake List <[email protected]>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID: <20180801093522.22c1f043@xeon-e3>
> Content-Type: text/plain; charset=US-ASCII
>
> On Wed, 1 Aug 2018 09:22:41 -0700
> Dave Taht <[email protected]> wrote:
>
>> this really isn't the right list for this... but I wanted to build on
>> the ack_filter bpf code I had, to create impairments, like dropping
>> acks every X packets, or randomly, or when a specific pattern is seen
>> (like timestamps or sack). This was sort of the reverse complement to
>> getting the cake ack-filter right, now that I know everything that can
>> go wrong...
>>
>> I see I can return TC_ACT_SHOT, so I can drop packets.
>>
>> But what I can't quite figure out is how to pass args to a tc ebpf
>> program. Do I have to pass those via a file descriptor? A map
>> generated elsewhere? what? Sure as heck don't want to compile one
>> program per opt....
>>
>> Simplest args would be:
>>
>> max 16 - drop every 16th ack packet
>> random 24 - drop randomly between 0 and 24
>> match only certain flags
>>
>> followed by more gnarly ones like:
>>
>> miscalculate if I have a payload or not
>> drop sack
>> mangle timestamps
>>
>
> With Xnetem, I ended up creating a map of config options.
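The map-of-options idea can be tried out away from the kernel first. What follows is only a rough sketch in plain Python, not eBPF: a dict stands in for the BPF map that a loader would fill in, and the per-packet code only reads from it. The key names `max` and `random`, the helper `make_filter`, and the reading of "random N" as a 1-in-N chance of dropping an ack are all assumptions made for illustration; none of this comes from Xnetem or the ack_filter code.

```python
import random

# Hypothetical stand-in for a BPF map of config options.
# Key names are illustrative only; 0 means "option disabled".
config = {"max": 16, "random": 0}


def make_filter(config, rng=None):
    """Return a per-packet decision function that reads its options from config."""
    if rng is None:
        rng = random.Random()
    state = {"acks": 0}  # running ack counter, analogous to a second map entry

    def decide(is_ack):
        if not is_ack:
            return "pass"  # only acks are ever impaired
        state["acks"] += 1

        every = config.get("max", 0)
        if every and state["acks"] % every == 0:
            return "drop"  # deterministic: every Nth ack

        span = config.get("random", 0)
        if span and rng.randrange(span) == 0:
            return "drop"  # assumed meaning: 1-in-N random drop

        return "pass"

    return decide
```

Since `decide` re-reads the table on every packet, changing `config["max"]` while the filter is in use changes its behaviour without rebuilding anything, which is the property wanted from a map in the real tc case instead of compiling one program per option.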
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 1 Aug 2018 09:36:32 -0700
> From: Dave Taht <[email protected]>
> To: Cake List <[email protected]>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID:
>     <CAA93jw6yZ3=Nrt6uRVp=c94TfXWt4q5bwaUM=eq1k4dongg...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> A somewhat related goal would be to apply the codel algorithm via bpf.
> We'd take advantage of hardware multiqueue for the fq part, ensure a
> good timestamp always existed on all ingress ports, check it on egress.
>
> The one major loop in codel we could unroll to be a fixed unroll (and
> just give up), and we're done there.
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 1 Aug 2018 19:42:02 +0300
> From: Jonathan Morton <[email protected]>
> To: Dave Taht <[email protected]>
> Cc: Cake List <[email protected]>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=us-ascii
>
>> On 1 Aug, 2018, at 7:36 pm, Dave Taht <[email protected]> wrote:
>>
>> The one major loop in codel we could unroll to be a fixed unroll (and
>> just give up), and we're done there.
>
> The COBALT version only has a loop in the recovery phase, and that mainly to
> handle long pauses immediately following heavy congestion. The idle and
> marking phases do not loop.
>
> - Jonathan Morton
>
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 1 Aug 2018 09:54:02 -0700
> From: Dave Taht <[email protected]>
> To: Jonathan Morton <[email protected]>
> Cc: Cake List <[email protected]>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID:
>     <CAA93jw5M3VeL0Q3NeDg7YphrcTp7e=zruf1h5yh4lhrlawe...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> the other thing I noticed while fiddling with bql and cake unshaped is
> that bql, too, had gained the ability to limit rates at mbit
> granularity, when I wasn't looking.
> I am not sure if additional hardware support is required, but:
>
> https://patchwork.ozlabs.org/patch/449002/
>
>
>> On Wed, Aug 1, 2018 at 9:42 AM Jonathan Morton <[email protected]> wrote:
>>
>>> On 1 Aug, 2018, at 7:36 pm, Dave Taht <[email protected]> wrote:
>>>
>>> The one major loop in codel we could unroll to be a fixed unroll (and
>>> just give up), and we're done there.
>>
>> The COBALT version only has a loop in the recovery phase, and that mainly to
>> handle long pauses immediately following heavy congestion. The idle and
>> marking phases do not loop.
>>
>> - Jonathan Morton
>>
>
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 1 Aug 2018 10:25:52 -0700
> From: Dave Taht <[email protected]>
> To: Jonathan Morton <[email protected]>
> Cc: Cake List <[email protected]>
> Subject: Re: [Cake] passing args to bpf programs
> Message-ID:
>     <caa93jw4_jmggeuukfylbdylf14vp5hfcotmhltvhwtf98mn...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> I wonder if ebpf has opcode space for an invsqrt?
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 1 Aug 2018 12:20:46 -0700
> From: Dave Taht <[email protected]>
> To: Jonathan Morton <[email protected]>
> Cc: Cake List <[email protected]>
> Subject: [Cake] codel in ebpf?
> Message-ID:
>     <CAA93jw5RvZSj3JBE9pS=se29vsznp9zonqzlbre07xa3qaq...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
>> On Wed, Aug 1, 2018 at 10:25 AM Dave Taht <[email protected]> wrote:
>>
>> I wonder if ebpf has opcode space for an invsqrt?
>
> bpf_ktime_get_ns() exists...
>
> one thing that I don't know if bpf can do is read/write the
> skb->tstamp field. The plan would be to rigorously write it (if not
> supplied by hw) on all ingress ports and check it on all egress ports.
>
> That said, every time I've tried to do something in ebpf I hit a
> limitation I'd not thunk of yet.
> For example, where can you attach the egress filter?
>
> My thought would be to use a bfifo -> bpf -> bql, but from what little I
> understand, it's bpf -> bfifo -> bql
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
>
>
> ------------------------------
>
> Message: 8
> Date: Wed, 1 Aug 2018 12:48:58 -0700
> From: Dave Taht <[email protected]>
> To: [email protected], Cake List
>     <[email protected]>, [email protected]
> Cc: Jakub Kicinski <[email protected]>
> Subject: [Cake] fq_codel on netronome's NICs?
> Message-ID:
>     <caa93jw6l6f19raemnyz5yxl7q_3vqoipzr-0uqurqjsfsef...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Being kind of inspired by all the tricks
> https://homes.cs.washington.edu/~arvind/papers/afq.pdf used on the
> cavium, I went looking for other smart nics to play with.
> https://open-nfp.org/resources/ looked interesting so I pinged them...
>
> from netronome:
>
> "I think it would be feasible to implement fq_codel on the NFP.
>
> The hardware schedulers do not support fq_codel, so the schedulers
> would have to be implemented in one of the NFP firmware languages
> (e.g. micro-C or micro-code); the NFP hardware rings could be used for
> the queueing mechanism. Practically, this may be one way of making it
> work:
>
> The main worker threads could calculate the flow hash in order to
> select which ring should be used, and then issue the packet to a
> re-ordering thread.
> I believe the re-ordering thread can push the packets to the internal
> NFP rings instead of the wire.
> The scheduler thread could then make the scheduling decision, pop the
> packet from the corresponding ring, then send the packet to the
> hardware packet schedulers (or drop the packet if performing a
> head-drop), and also check the timestamp for the CoDel portion of the
> algorithm.
> The hardware packet schedulers should then transmit the packet.
>
>
> In terms of handling any rate-mismatch on the outgoing interface, you
> could have another thread monitor the NFP hardware packet scheduler
> queue levels. The scheduler thread can then throttle the packet rate
> being sent to the hardware packet schedulers (unless of course it is
> okay to tail-drop at the hardware packet scheduler queues).
>
> Finally, if the outgoing interface is not the natural point of
> congestion/rate mis-match (e.g. if the outgoing Ethernet interface is
> attached to a cable/DSL modem), the NFP hardware does have some
> support for rate-limiting the outgoing interface (e.g. limiting a 10
> Gigabit Ethernet interface down to 600 Mbps outbound), so as to move
> the congestion/rate mis-match point to the NFP, so that fq_codel can
> take effect in terms of handling the buffer bloat."
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> Cake mailing list
> [email protected]
> https://lists.bufferbloat.net/listinfo/cake
>
>
> ------------------------------
>
> End of Cake Digest, Vol 41, Issue 3
> ***********************************
