This series tries to clean up the FSM/collision-detection logic somewhat, and make it bit more robust.
The primary functional changes of the series are: - Send Open immediately on newly accepted inbound connections. We should have been doing that, and not doing so increases the window for collisions to occur. Speed things up and hopefully make things more robust. - Ensure Established sessions are favoured, in collision detection (except for GR which for some reason doesn't want to just rely on KEEPALIVE). I'm not 100% sure, but it seems the old code would shut down the Established connection if a new connection sent an OPEN. Which seems unstable. - Ensure the collision detection logic shuts down the correct 'inbound' or 'outbound' connection. Previously, it assumed the 'new' argument was the 'inbound' and the looked-up 'peer' the 'outbound' but that doesn't seem 100% safe. If/when that isn't the case, bgpd could be inconsistent with other implementations and result in both shutting down the same connection. Should make things more robust. For the rest, I've tried to tidy and untangle the logic a bit. There is also a patch from Cumulus, from Dinesh, on the same subject: 'bgpd, tests, vtysh: Fix FSM to handle active/passive connections better' Which deletes the existing connection-transfer logic and recreates it elsewhere. It also adds peer state flags, some of which seem to overlap with or duplicate existing peer states - which then need further code to manage. I found it hard to follow some parts of it, hence this more mimimal attempt. All the items above, when I went back to look, the Cumulus patch seems to do too. Testing: I've tested this in a ring topology with a mix of the new code and previous Quagga release. It works. However, torture testing in a more mixed implementation environment would be good. In an ideal world, peer configuration and connection state would be cleanly separated. I tried hacking at that, but it'd require huge changes across the code - whether it's done by adding a 'connection state' object to (struct peer), or done by moving 'configuration state' out of (struct peer) somehow. Way too much churn to be worth it I think. _______________________________________________ Quagga-dev mailing list [email protected] https://lists.quagga.net/mailman/listinfo/quagga-dev
