On Mon, Nov 12, 2018 at 12:49 PM Dave Taht <[email protected]> wrote:
>
> Juliusz Chroboczek <[email protected]> writes:
>
> >>>> +        rc = setsockopt(s, SOL_SOCKET, SO_MAX_PACING_RATE, &rate,
> >>>>                         sizeof(rate));
>
> >>> It's only effective on TCP sockets, and only when using the FQ scheduler.
>
> >> I am under the impression that since linux 4.12 it works on udp, and I
> >> forget when it started working outside the fq scheduler...
>
> > Ah.
> >
> > Still, I think that we should be able to do pacing in userspace.  At least
> > in the no-churn case, we should be able to predict how many updates per
> > unit of time we want to send, and spread them out across the update
> > interval.
>
> Yes. I still kind of like finally leveraging the announcement interval
> in the route update message to also spread things out... you can
> announce earlier than the interval on a metric change, but otherwise idle
> along with routes persisting for, say, half to 1/3 of the max (11 minutes?)
> - this is an update interval roughly equivalent to modern BGP.
>
> One patch I was fiddling with limits passes through the resend
> routines, much as MAX_BUFFERED_UPDATES does, so the main babel loop
> gets a chance to do other things, like read new packets.
>
> What happens in "churn mode" reminds me a lot of classic dns/ntp
> amplification attacks, and trying to keep the ratio of packets in to
> packets out lower is a good idea.
>
> So I quit after 64 send_multicast_multihop or send_update requests
> and keep processing the loop forward from there, incrementally deferring GC...
>
> Haven't benchmarked it yet... not happy with it... been busy doing other
> things.
>
> Another patch basically adds something fq_codel-like to recv, so that
> each call to bab_recv calls recv(args..., MSG_DONTWAIT) 42 times or until it
> gets nothing back, sorts the input by src/dst IP hash, and serves things
> back to recv. Theoretically this keeps the RCV_BUF as drained as possible.
>
> It does bulk drops from the fattest queue when the internal packet limit
> is exceeded, rather than codel, because babel is not tcp-friendly as
> yet. It serves up short (e.g. hello) packets from each speaker first, and
> short flows faster than long ones... so a big fat speaker can't drown
> out the others.
>
> The codel-ly bit is "just drop anything > 512 bytes older than 1 sec".
>
> (This is where I wanted to reach for clang-format, and for purity's sake
> I should go reuse the BSD version or rewrite from scratch - and it
> totally doesn't work right now.)
>
> The edgerouter can't push a gbit, but 1 MByte/sec is totally doable from a
> pps standpoint.
>
> I think it will work, but it's more complicated than I'd like,
> reinventing a ton of stuff in userspace. I'm watching the QUIC-related
> bind-connect work closely.
>
> An alternative idea is to put a skb->hash probabilistic dropper into ebpf
> for reads when we're in trouble - same rough idea, no FQ.
>
> > I'd need to read up on data structures, as I don't currently understand
> > the tradeoffs between binary heaps and timer wheels.  (Same goes for
> > dealing with resends. And Christof suggested that we should modify the
> > main event loop in babeld to use a proper data structure.)
>
> Yes, I kind of think that whatever happens to resend.* might end up
> being a scheduling technique for more of babel itself.
>
> The timer wheel in the linux kernel is the best implementation I know
> of, handling the thundering herd problem, sloppy timings (where you
> really don't care if something happens 1 ms or 2 ms in the future - just
> do everything in the range), and much else. There are plenty of others
> worth looking at... but the discussions around that code have been
> endless for a decade, and informative.
>
> You'd schedule a hello and keep it updated until it went out, then
> reschedule. Etc.
>
> When I was flailing at rabeld I added a good old-fashioned "alarm" call
> to make sure hellos went out, and broke up major loops to check it
> periodically.
>
> > -- Juliusz

-- 
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

_______________________________________________
Babel-users mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
