On Tue, Sep 25, 2018 at 12:30 AM Christof Schulze
<[email protected]> wrote:
>
> >> Hi Babel community,
> >>
> >> As I mentioned in another thread, I am curious about whether Babeld
> >> can be adapted to work with a global full table.
> >
> > No.
> >
> > There is one long-standing issue with merging from the kernel table
> > that would benefit from a qsort.
> >
> > But you are going to
> >
> > A) run out of bandwidth - 785k routes = ~14,000 babel packets. I
> > think that rounds to 1280 bytes/packet - and babel will want to
> > announce these every 4 seconds - so call it 44 Mbit/s? (feel free
> > to check my math, it's friday). That's well above what I've ever
> > seen wifi mcast, in particular, achieve. And that's *per router*.
> >
> > To get there, the announcement interval would have to be increased
> > to at least a typical bgp interval (2 minutes), and even then...
> >
> > B) you run out of cpu - babeld uses linked lists, and tries to
> > recalc bellman-ford every 4 seconds also. There's a need for a
> > faster, safer kernel interface.
>
> I do not see a reason why we could not change the data structures to
> consume less CPU under the given scenario.
Well, I took a stab at some of that. In particular, babeld has to
compare a lot of bytes, and while profiling it under these loads it
was totally bottlenecked on memcmp. So I attempted to add SSE2 and
ARM NEON support to xor 128 bits at a time, and nearly eliminated
memcmp with inline xor in general, only to find that the real cause
of the bottleneck in that version of the benchmark was a dumb sort
routine... Juliusz then rightly pointed out that a better algorithm
for merging routes would be good... I'd rather index them...

I'm also of the opinion we should announce routes over an
increasingly larger interval, based on the available bandwidth and
the number of speakers.

And - setting a goal of 64k routes - I thought that switching to a
normalized table structure internally would be useful. Instead of
storing the nexthop as a full address, store a 16-bit index pointing
into an array of those nexthops. And so on.

In the end I decided that making wifi into a suitable network
substrate again, with good latency under load and vastly improved
multi-station support, was the most worthwhile thing to do before
tackling mesh again. The first version of that is done and shipping
for ath9k, ath10k, and mt76 since OpenWrt 17.01. Ath10k with the CT
firmware does adhoc. Toke and the other make-wifi-fast folk are busy
trying to make it generic (for the iwl now, also), fq_codel's the
default now on OS X... so wifi is looking vastly improved.

There are still quite a few things left on the make-wifi-fast roadmap
to tackle. I haven't looked at this document since the last funding
round collapsed:

https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEyat_sAGghB3kE285LElJBW4/edit

Then 802.11s started to work... There were all sorts of other issues;
getting into a dogfight with either odhcpd or network manager has
always been a problem, and address assignment for ipv6 is still a
problem. We could write a roadmap for "better wifi mesh networking",
but it starts with "more bodies".
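For the curious, the two data-structure ideas above look roughly like this. A sketch of the idea only, not babeld's actual code - the field and function names are mine:

```c
#include <stdint.h>
#include <string.h>

/* Compare two 16-byte (IPv6-sized) addresses by XORing 64 bits at a
 * time rather than calling memcmp().  Compilers turn the memcpy()s
 * into plain loads; SSE2/NEON variants can do all 128 bits in a
 * single register. */
static int v6_equal(const unsigned char *a, const unsigned char *b)
{
    uint64_t a0, a1, b0, b1;
    memcpy(&a0, a, 8); memcpy(&a1, a + 8, 8);
    memcpy(&b0, b, 8); memcpy(&b1, b + 8, 8);
    return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}

/* Normalized route entry: instead of embedding the 16-byte nexthop
 * in every route, store a 16-bit index into a shared nexthop table.
 * Each entry shrinks, and nexthop comparisons become a single
 * integer compare. */
struct nexthop { unsigned char addr[16]; };

struct route_entry {
    unsigned char prefix[16];
    uint8_t       plen;
    uint16_t      nexthop_idx;  /* index into nexthops[] below */
    uint16_t      metric;
};

static struct nexthop nexthops[65536]; /* at most 64k distinct nexthops */
```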
If we could just siphon off 1/10000th of the effort going into 5G, we
could get somewhere. My own network is decrepit, wlan-slovenia has
mostly switched to 4G uplinks, sudomesh died (I think), and I didn't
know freifunk was still alive. Is guifi still alive?

> >
> > my rtod tests showed babeld typically falling over for any one of
> > these four reasons in well under 4k routes on low-end mips and arm
> > hardware. Even the low-end apu2 eats a whole cpu with about that
> > many (ipv6) routes.
>
> What are the other two reasons?

* walking linked lists
* running out of bandwidth
* having to process a ton of packets per router
* the need for a faster, safer kernel interface
* recalcing bellman-ford

I note, incidentally, that my memory of that experiment was getting
fuzzy. I'd had 8+ devices on that network, and was also regularly
having odhcpd or network manager duke it out with babeld for cpu due
to all the messages on the netlink bus. I just did the same experiment
with only 5 devices in that scenario and kept the local network
running ok (with both bird and babeld) on up to 16k routes, crashing a
64k box locally... and crashing 3 nanostation/picostation devices a
hop away that ran out of memory, or cpu, or both. (Thankfully procd
restarted two of 'em; the third was so old it didn't, and I just had
to climb on the roof at 3 in the morning to reboot it.)

One of the huge advantages of vlans or 802.11s is that the number of
routes is *bounded* to a fairly low figure. A general-purpose routing
protocol has to somehow sanely start rejecting routes at some bound,
for some definition of sane.

> > I made a few sloppy computational improvements and so on while
> > developing the rtod test.
> > Tried to upstream a few, but my then-current employer wasn't happy
> > with me working under anything but the apache license and didn't
> > care, and I ran out of time and energy and have to admit I was
> > hacking far more than programming -
> >
> > I think making some version of babel (be it bird or frr) scale
> > well to at least 64k routes would be a very good idea,
>
> I agree.

It's a nice defining goal for a project. 64k routes with 32 stations
or bust!

> > and once things now entering it, like unicast and crypto, are
> > stable, it would be a GREAT thing to have a version that did that,
> > but I fear it will involve parallelizing hellos and bellman-ford
> > and per-interface threads, changes to the protocol to adapt the
> > interval to the bandwidth and cpu available, tcp-friendly rate
> > control (or swapping routes via tcp), etc, etc.
>
> I agree; as routers become more powerful and even low-end devices
> are emerging that feature multiple CPU cores, there might be a
> benefit. On another note - just parallelizing any algorithm (not
> specific to babeld) will only get you so far. The algorithms and
> data structures should be optimized first.

Heh. One of my targets for that attempt a few years back was a
1024-core parallella chip. When you have cores to burn like that you
can come up with all sorts of crazy ideas. The 1024-core version
never shipped.

> >
> > and a whole suite of other cool things that nobody has the time,
> > energy, or sufficient programmers for. And it wouldn't be babeld
> > anymore.
> >
> > Bird's version of babel should perform mildly better, as it has
> > tighter code (xor rather than memcmp in one case I tried to
> > upstream), and a few other better algorithms overall, but I
> > suspect few besides me and john
> > ( http://the-edge.taht.net/post/gilmores_list/ ) care enough
> > about city-scale routing to get anywhere.
>
> You are forgetting the Freifunk communities in Germany. This is what
> they do: building city-wide wifi mesh networks.
> Currently mostly with segmented batman. Now that my patchset for
> babeld integration has been merged in gluon (the framework which
> most communities use to build their networks), the babeld technology
> is available to a wider audience for their meshes. To get an
> impression of the size of the community,
> https://www.freifunk-karte.de/ might be an interesting start.
>
> Agreed, the development community is much smaller - still I can see
> a dozen or so people contributing to gluon. It certainly could be
> worse.

Yep. Personally I'd really like to get my hands into some 5G stuff,
get my own base station, make private 5G possible, but I know that
isn't going to happen...

> >
> > I should probably try to extract more patches from my misguided
> > efforts, like this:
> >
> > https://github.com/dtaht/rabeld/commit/b74b4a6f9b532717ee93346963efd894e94615b3
> >
> > and I had a bpf filter that helped a lot, and I sunk time into
> > enabling sse and neon ins...

The bpf filter was helpful in lowering the noise and cpu usage from
odhcpd.

> > but I was mostly hoping the unicast/crypto/etc stuff would land in
> > one piece I could do all-up testing on before tackling the scaling
> > problems, on someone else's time. I ended up deciding that I
> > wanted to rewrite it all from scratch, hit licensing and employer
> > problems... and time.....
>
> I would appreciate that. We are just starting another test network
> for a city-wide mesh which will be based on babeld. Links that do
> not have

Cool. Enable ecn. :)

> wifi connectivity will be using wireguard as vpn. There is a
> significant speed improvement of that tech stack over batman+fastd.
> Let's see how

Wireguard is nice. It can, however, chew up memory with its default
queue size.

> much pull this gets. In any case: 64K routes should be working on
> current cheap routers in a network like that.

Can you define a "current cheap router"?
For example, I am sad to report the uap-lite-mesh APs have really
terrible range compared to the nanostations and picostations, and
only have 8k of flash. Finding a decent outdoor, mesh-capable,
dual-radio AP has been on my mind for years. Arm multicores are
cheap, but I haven't seen any...

> >> One of my environments uses BGP full-table from 3 upstream ISPs
> >> (each with 785k routes currently).
> >>
> >> +----------+  +----------+  +----------+
> >> |Customer A|  |Customer B|  |Customer C|
> >> +----+-----+  +----+-----+  +----+-----+
> >>
> >> +----+----+   +---+---+   +-----+-----+
> >> |Edge Asia|---|Edge US|---|Edge Europe|
> >> ++-------++   +---+---+   +-----------+
> >>
> >>   +--+--+      +--+--+      +--+--+
> >>   |ISP A|      |ISP B|      |ISP C|
> >>   +-----+      +-----+      +-----+
> >>
> >> Babeld would simply refuse to run on this environment, blocking
> >> the whole network without converging, with 100% CPU utilization.
> >
> > Don't do that. It hurts when you do that. :) rtod is a way to get
> > to overload more gently.
>
> What would have to be done to get confirmation of Juliusz's theory
> wrt the source of the load?
>
> Christof

--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619

_______________________________________________
Babel-users mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
