Hi Pim,

On Sun, Mar 10, 2024 at 10:40:00PM +0100, Pim van Pelt wrote:
> AS8298 has a ring from Zurich, Frankfurt, Amsterdam, Lille, Paris and
> Geneva. Our links are carrier ethernet from a telco. Sometimes, when the
> underlying links fail, the underlay MPLS network will recover and route
> around the failed telco link, and I'll see latency on my own service go from
> say 6ms ZRH-FRA to 40ms ZRH-FRA; I thought this would be an excellent use
> case for Babel.
> 1) Is there any advice you could offer for rtt cost/min/max/decay values
> when using Bird2 ?
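For readers unfamiliar with the four knobs in the question, here is a rough sketch of how an RTT-based cost is typically derived from them: the measured RTT is smoothed exponentially (decay, in units of 1/256), and the extra cost ramps linearly from zero at "min" to the full "cost" at "max". The function names are made up for illustration, and the default values (cost 96, min 10 ms, max 120 ms, decay 42) are assumptions based on my reading of the documentation, so check them against the version you run.

```python
# Illustrative sketch only: function names are hypothetical, and the default
# values below are assumptions, not taken from this thread.

def smooth_rtt(previous_us: int, sample_us: int, decay: int = 42) -> int:
    """Exponentially smooth an RTT sample; decay is in units of 1/256."""
    return (decay * sample_us + (256 - decay) * previous_us) // 256


def rtt_cost(rtt_us: int, cost: int = 96,
             rtt_min_us: int = 10_000, rtt_max_us: int = 120_000) -> int:
    """Extra metric added to routes via a neighbour with the given RTT."""
    if rtt_us <= rtt_min_us:
        return 0            # below "min": no penalty at all
    if rtt_us >= rtt_max_us:
        return cost         # at or above "max": full penalty
    # in between: linear ramp from 0 up to the full cost
    return cost * (rtt_us - rtt_min_us) // (rtt_max_us - rtt_min_us)


if __name__ == "__main__":
    # The ZRH-FRA example from the question: 6 ms normally, 40 ms when rerouted.
    print(rtt_cost(6_000))            # 0
    print(rtt_cost(40_000))           # 26
    print(smooth_rtt(6_000, 40_000))  # 11578 (first smoothed value after the jump)
```

Under these assumed values the 6 ms path carries no penalty and the 40 ms path adds 26 to the metric, and the smoothing means the penalty builds up over several samples rather than at once.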
The defaults should be fine honestly. While I have recommended changing them
in the past, I fear I don't (yet) fully understand the impact of that change
on stability, so best to leave them as is for now.

> 2) any words of wisdom before I move from an OSPF v2/v3 IGP to Babel in a
> running network ? Does anybody on this list have operational experience to
> share ?

I run babel as IGP in my experimental network (AS212704) and I just want to
make sure you're aware of some of the quirks and challenges when using babel
outside its original design domain (wireless/mesh networks).

I've used both BIRD and babeld in my network. Right now I'm (grudgingly)
running babeld.

1) BIRD's proto/babel doesn't have good policy controls right now. If you
need any sort of control over your IGP announcements for TE or what have you,
things might get tricky. I do have a patch ready to begin fixing that, but
unfortunately it's in limbo until BIRD v3 shakes out or we find some
funding/motivation to push a port to v3 forward.

Babeld does have (most of) the knobs I think you'd need, but it's just not
suitable for 24/7 operation outside of toy networks without major rework
(sorry Juliusz!).

For illustration: until one of my patches
(https://github.com/jech/babeld/pull/102), restarting babeld on a node would
cause packet loss, since it would retract the kernel routes without waiting
for the rest of the network to be notified that the node was going away. That
nobody noticed this before me tells you in what kind of environment babeld
has been deployed/tested so far.

This wouldn't be a problem if we had hot reload ofc., but babeld is just not
built for that. Without hot reload there's also no good way to put a node in
maintenance mode without internal traffic disruption, the filters are not
expressive enough to make policy writing easy, and there are quirks I haven't
even had the energy to write up or fix yet.

2) When a prefix is no longer reachable, babel will insert an unreachable
route for it until some timeout expires.
I don't recall the details offhand, but I'm sure Juliusz will jump in here ;)
I'm not sure this is really a problem as such; it was just jarring when I saw
this unexpected behaviour in my network, and it's something to be aware of.

My conclusion: if it ain't a BIRD I ain't flyin'. While BIRD's babel
implementation is just dandy from a quality perspective AFAICT, it just needs
a bit more effort put in to get it ready for serious network operations.

> By means of introduction, I'm Pim and in my spare time I work on fd.io
> <http://fd.io>'s Vector Packet Processor.

I've had my eyes on your VPP work for a while now. Very exciting, especially
with babel support on the horizon :)

--Daniel
_______________________________________________
Babel-users mailing list
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
