On Fri, Feb 24, 2023 at 07:41:03PM +0100, Juliusz Chroboczek wrote:
> > I think I figured out what's going on: babeld immediately flushes the
> > kernel routes it installed when shutting down, without waiting for
> > neighbours to switch to a different path.
>
> Right.  How long is the disruption?

It's not so much how long it lasts as that it's there at all. I don't
want my network to drop packets on the floor unnecessarily.

> > I figure this has to be a configurable option since full propagation
> > of the retractions depends on the network diameter and there's no way
> > in the protocol we can get acknowledgments from the entire network
> > (AFAIK?), not just our immediate neighbours.
>
> We only signal the neighbours: this is distance-vector, so the neighbours
> will start searching for a different route without any need for end-to-end
> signalling.

You're right, of course: as soon as we signal a neighbour they will
divert traffic somewhere else, but only if they have a feasible route at
hand. I'm also concerned about blackholing in the "no feasible route"
case.

> Of course, if there are no feasible routes to a given destination, then
> the neighbours will perform an end-to-end search for a loop-free route,
> but that's the neighbours' problem, not ours.

I can't say I agree with the "their problem" mentality. The way I see
it, during graceful shutdown we're still responsible for in-flight
traffic anyway. Until our retraction propagates, neighbours keep routing
traffic to us on the assumption that we're still alive and forwarding.
We're in a reasonable position to avoid dropping that traffic, so why
shouldn't we take advantage of that?

In my mind it doesn't matter if babeld takes 500 ms or 15 s to shut down
if that buys me a rock solid network.

So my thinking is I'd like to know when everything has converged. Since
that isn't really a thing in DV, as you note, an ad-hoc delay is the
next best thing I could think of. The note about the ACKs was simply
meant to explain why I went with an ad-hoc delay rather than having
neighbours ACK the retractions.

> > https://github.com/jech/babeld/pull/102
>
> Looks good to me.  Just two comments:
>
>   - should the granularity be lower?
>     A second for local signalling is
>     a lot, I'd expect 300ms to be enough in most cases;

I have no problem changing it to millisecond granularity if that suits
you?

>   - why a goto rather than a loop?

Oh, you know how it goes: you make a decision, then another, change the
first one, and then unbeknownst to you the second one doesn't really make
sense anymore ;)

Will fix.

--Daniel

_______________________________________________
Babel-users mailing list
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
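
For illustration, the ordering discussed in this message (retract, wait a configurable delay with millisecond granularity, and only then flush the kernel routes) could look roughly like the sketch below. This is not the code from the pull request. The names `send_retractions`, `flush_kernel_routes`, `sleep_ms` and `graceful_shutdown` are hypothetical stand-ins, and the two stubs only record the order in which they were called.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Stand-ins for the real work; hypothetical names, not babeld's API.
   They only record the order in which they were called. */
static int step = 0, retracted_at = 0, flushed_at = 0;
static void send_retractions(void)    { retracted_at = ++step; }
static void flush_kernel_routes(void) { flushed_at = ++step; }

/* Split a millisecond count into the timespec nanosleep() expects. */
static struct timespec ms_to_timespec(long ms)
{
    struct timespec ts;
    assert(ms >= 0);
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    return ts;
}

/* Sleep for ms milliseconds.  The loop resumes the sleep with the
   remaining time if a signal interrupts it. */
static int sleep_ms(long ms)
{
    struct timespec ts = ms_to_timespec(ms);
    while(nanosleep(&ts, &ts) < 0) {
        if(errno != EINTR)
            return -1;
    }
    return 0;
}

/* Retract first, keep the kernel routes installed for delay_ms so that
   traffic still sent our way is forwarded, and flush only afterwards. */
static int graceful_shutdown(long delay_ms)
{
    int rc = 0;
    send_retractions();
    if(delay_ms > 0)
        rc = sleep_ms(delay_ms);
    flush_kernel_routes();
    return rc;
}
```

Calling `graceful_shutdown(300)` would use the 300 ms figure suggested in the quoted review. A delay of 0 skips the wait, so the flush follows the retraction immediately.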
