Hello,

re6stnet creates a resilient IPv6 network, using OpenVPN to link nodes that
are not in the same LAN, and babeld for routing:
  http://git.erp5.org/gitweb/re6stnet.git

re6stnet limits the number of OpenVPN tunnels so that we don't have a full
mesh, and from time to time it deletes one to create a new one to a random
node. However, since the beginning (mid-2012), we have had an issue with the
destruction of tunnels, because we couldn't make sure there was no route
through the tunnel being destroyed.

I didn't find any reliable way to fix this without modifying babeld, so here 
are 2 patches for review:
  http://git.erp5.org/gitweb/babeld.git/shortlog/refs/heads/ctl
  commits d37e373 & bd1cf65


The first one (d37e373) implements a new way to communicate with babeld.
- It is done via a unix socket for security, but uses network byte ordering so 
it can easily be transformed into TCP with socat.
- It is fully asynchronous, i.e. the processing of a packet does not do any IO.
There is no hang and no data loss if either side is slow.
- It is a binary protocol, for reliability and simplicity. I didn't want to
format and parse strings with regex.

The protocol uses simple TLV packets, like the ones exchanged between babeld
nodes. L is 4 bytes, though, so that a packet can contain large amounts of
data, which happens easily when dumping the routing table.
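
To make the framing concrete, here is a minimal Python sketch of such a TLV stream. Only the 4-byte length and the network byte order come from the description above; the 1-byte type field, the function names and the buffer handling are assumptions for illustration, not the layout used in the patch.

```python
import struct

# Assumed header: 1-byte type, then a 4-byte length in network byte order.
HEADER = struct.Struct("!BI")


def encode_tlv(tlv_type, payload):
    """Frame one packet as type, length, value."""
    return HEADER.pack(tlv_type, len(payload)) + payload


def decode_tlvs(buffer):
    """Split a byte buffer into complete (type, payload) packets.

    Returns the packets and the unconsumed tail, so that a caller doing
    non-blocking reads can keep the tail until more bytes arrive.
    """
    packets = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        tlv_type, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # incomplete packet: wait for more data
        packets.append((tlv_type, buffer[offset + HEADER.size:end]))
        offset = end
    return packets, buffer[offset:]
```

Returning the tail instead of blocking for the rest of a packet is what keeps the reader asynchronous: a large routing-table dump can arrive in several reads without stalling either side.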

This first commit implements 1 command, to get the same information as
dump_tables(). The request packet contains a few parameters to specify which
information is wanted.

In fact, I even think this new interface should replace local.c.


The second commit (bd1cf65) implements a second packet to alter the result of 
neighbour_cost: its returned value is multiplied by a new 'cost_multiplier' 
field in struct neighbour. cost_multiplier=0 forces neighbour_cost to return
INFINITY.

For the moment, we only use it to avoid routing via a specified neighbour (the
other side of the tunnel to destroy). The algorithm is:
1. Node C (as client) decides to destroy an OpenVPN client tunnel that is
connected to node S
2. C sends a packet to its babeld to increase the cost to S
3. C requests dumps until no route goes via S
4. C sends a second packet to its babeld to set cost_multiplier=0 for S, to 
make sure no route comes back. Note that the processing of such packet is 
atomic: the packet is ignored if there are still installed routes via S.
5. If there's still no route via S and if cost_multiplier could be set to 0, C
tells S to do the same thing on its side.
6. Same as steps 2,3,4 for S
7. If there's still no route via C and if cost_multiplier could be set to 0, S
replies to C that the tunnel can be deleted.
8. C deletes the tunnel

At any point the whole process is aborted if a step fails or takes too long. 
There's no point insisting: we can try deleting another tunnel.

Whether the tunnel could be destroyed or not, the process is ended as follows:
1. Wait some time.
2. Restore original cost_multiplier.
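
Steps 2 to 4 and the final restore can be sketched in Python as follows. This is only an illustration of the control flow: FakeBabeld and its methods has_route_via() and set_cost_multiplier() are hypothetical stand-ins for the dump request and the cost_multiplier packet, and the raised multiplier and timeout values are arbitrary.

```python
import time

DEFAULT_MULTIPLIER = 256  # default value, i.e. a factor of 1


class FakeBabeld:
    """In-memory stand-in for the control socket (hypothetical API)."""

    def __init__(self, polls_until_clear):
        self.polls_left = polls_until_clear
        self.multiplier = DEFAULT_MULTIPLIER

    def has_route_via(self, neighbour):
        # Pretend that routes move away after a few dumps.
        if self.polls_left > 0:
            self.polls_left -= 1
            return True
        return False

    def set_cost_multiplier(self, neighbour, value):
        # Mirrors the atomic rule: 0 is refused while routes remain.
        if value == 0 and self.polls_left > 0:
            return False
        self.multiplier = value
        return True


def drain_neighbour(babeld, neighbour, raised=DEFAULT_MULTIPLIER * 16,
                    timeout=30.0, poll=0.0):
    """Steps 2-4 for one side; True if no route uses the neighbour."""
    deadline = time.monotonic() + timeout
    # Step 2: make routes via the neighbour unattractive.
    if not babeld.set_cost_multiplier(neighbour, raised):
        return False
    # Step 3: request dumps until no route goes via the neighbour.
    while babeld.has_route_via(neighbour):
        if time.monotonic() >= deadline:
            return False  # took too long: abort, try another tunnel
        time.sleep(poll)
    # Step 4: forbid the neighbour; refused if a route came back.
    return babeld.set_cost_multiplier(neighbour, 0)


def restore(babeld, neighbour):
    """Final step, run whether or not the tunnel was destroyed."""
    babeld.set_cost_multiplier(neighbour, DEFAULT_MULTIPLIER)
```

In this sketch, C would call drain_neighbour() for S, ask S to do the same for C, delete the tunnel only if both calls succeed, and call restore() after some delay in every case.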

In the future, we may use this new packet to define different classes of nodes. 
RTT-based metric is great but not always enough. A node with low latency may be 
unreliable (for example crashing all the time). So we consider having a set of 
core trustworthy nodes.

In any case, neighbour costs could then vary a lot and I was a little annoyed
by their limited precision (2 bytes). Because most of the time, values start at
96 or 256, you can see that the result of neighbour_cost is also divided by 256
(and the default value for cost_multiplier is 256). In other words, the
cost_multiplier field codes a value between 1/256 and ~256 (+ inf).
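
As a worked example of this fixed-point scaling, here is a small Python sketch. The function name, the integer rounding and the clamping are assumptions, not a copy of the patched neighbour_cost; INFINITY is taken to be the largest 16-bit value, as in babeld.

```python
INFINITY = 0xFFFF          # 16-bit cost meaning "unreachable"
DEFAULT_MULTIPLIER = 256   # represents a factor of 1


def scaled_cost(base_cost, cost_multiplier=DEFAULT_MULTIPLIER):
    """Apply cost_multiplier as a fixed-point factor with 8 fractional bits."""
    if cost_multiplier == 0 or base_cost >= INFINITY:
        return INFINITY
    # Multiply, divide by 256, and clamp so the result fits in 2 bytes.
    return min(base_cost * cost_multiplier // 256, INFINITY)
```

With the default multiplier a cost of 96 stays 96, a multiplier of 512 doubles it to 192, and a multiplier of 0 yields INFINITY.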


re6stnet includes a demo using network namespaces. It simulates 9 nodes in a 
somewhat accelerated mode. Hello interval is 4 seconds. Each node creates at 
most 2 client tunnels and tries to delete a tunnel every 100 seconds.
We finally have code that does not lose a single packet after several days.


However, there's still a limitation. We are not always able to identify a
neighbour. This happens when the direct route to a neighbour is not the best
route (yes, we found such cases in China, or between China and Japan): without
keep-unfeasible, there's not always a route to the neighbour with refmetric=0
from which we can take the neighbour's address/ifindex.

In re6st, a node is only identified by the prefix it exports. The only place we 
use link-local IPv6 is for the SET_COST_MULTIPLIER packet and we get the 
information from babeld dumps. It would be quite heavy to get the 
address/ifindex by other means.
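
The lookup described above amounts to something like the following Python sketch. The Route tuple and its field names are hypothetical; they only stand for what a parsed dump would have to provide.

```python
from collections import namedtuple

# Hypothetical shape of one route entry taken from a babeld dump.
Route = namedtuple("Route", "prefix refmetric neigh_address ifindex")


def find_neighbour(routes, peer_prefix):
    """Return (address, ifindex) of the neighbour exporting peer_prefix.

    Only a route with refmetric=0 is announced by the peer itself, so
    only such a route identifies the direct neighbour. Returns None when
    the dump contains no such route, which is the limitation at hand.
    """
    for route in routes:
        if route.prefix == peer_prefix and route.refmetric == 0:
            return route.neigh_address, route.ifindex
    return None
```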

On the other hand, it looks trivial and efficient to solve this in babeld, by
adding an 'id' field in neighbour, and using the id instead of address/ifindex
in the SET_COST_MULTIPLIER packet.


Regards,
Julien

_______________________________________________
Babel-users mailing list
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users