Hi there,

I run Babel over both Wi-Fi and a tunneled mesh VPN. I would like to share some ideas on cost calculation based on hop count, loss and RTT.

=== The Problem ===

I would like to find a feasible low-RTT route across my network, because the direct route is usually not the best. My configuration is:

> interface zt0 type tunnel link-quality true rxcost 16 hello-interval 10
> rtt-min 16 rtt-max 1024 max-rtt-penalty 1008

Even after tweaking these parameters, Babel still often picks a route with low latency but high loss. Such a link is impossible with Wi-Fi but very common with tunnels.

From reading the code, I understand that Babel uses a formula like this:

> cost = hop/(1-loss)^2 + RTT

(Constants omitted. Please correct me if I am mistaken.)

Here "hop" is related to the sum of (txcost * rxcost), and "loss" is the rate of one-way packet loss.

With my parameters, this formula makes Babel choose the route with the lowest RTT: it adds a hop if that saves 16 ms, but it accepts a route that is only 1.73 ms slower to avoid 5% packet loss, or 3.75 ms slower to avoid 10% (16/0.95^2 - 16 = 1.73 and 16/0.90^2 - 16 = 3.75). The influence of packet loss is clearly underweighted. To compensate, I have to increase "rxcost" and "rtt-min", but that gives me routes with fewer hops and higher RTT, which is not what I want.

=== My Model ===

Therefore I want to propose a new cost model, suitable for long-range tunneled networks. Typically we want to maximize TCP throughput while keeping the hop count from growing too high. Mathis et al. gave a formula that estimates the theoretical maximum TCP throughput:

> throughput <= 1/(rtt * sqrt(loss))

(Again, constants omitted.)

We want the cost to be linear in RTT and hop count, so we define it as the inverse of the theoretical throughput:

> my_new_cost = (hop+RTT) * sqrt(loss)

where loss = 1 - (alpha+beta) / 2 or loss = 1 - sqrt(alpha*beta), with alpha and beta being the delivery ratios in the two directions.

Now the derivative of the cost with respect to loss decreases as loss grows. I think this better reflects reality: adding 1% of packet loss to a link with 5% loss makes things much worse than adding it to a link that already has 50% loss.

=== Issues ===

There are several issues regarding my model.
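
Before going through them, here is a rough Python sketch of the two formulas, in case it helps the discussion. It is only an illustration under the simplifications used above (constants omitted); the function names are made up and are not babeld code, and the 1/64 floor on loss is just one of the lower bounds suggested in this section.

```python
import math


def babel_like_cost(hop_cost, rtt_ms, loss):
    """Simplified current formula: hop/(1-loss)^2 + RTT (constants omitted)."""
    return hop_cost / (1.0 - loss) ** 2 + rtt_ms


def loss_from_ratios(alpha, beta, geometric=False):
    """Loss from the two delivery ratios (hypothetical helper).

    Arithmetic mean: 1 - (alpha+beta)/2; geometric mean: 1 - sqrt(alpha*beta).
    """
    if geometric:
        return 1.0 - math.sqrt(alpha * beta)
    return 1.0 - (alpha + beta) / 2.0


def proposed_cost(hop_cost, rtt_ms, loss, min_loss=1.0 / 64):
    """Proposed formula: (hop+RTT) * sqrt(loss), with loss clamped from below.

    min_loss is an assumed floor so that a loss-free link does not get cost 0.
    """
    return (hop_cost + rtt_ms) * math.sqrt(max(loss, min_loss))


if __name__ == "__main__":
    # Extra cost that the current formula assigns to 5% and 10% loss (hop = 16).
    for loss in (0.05, 0.10):
        extra = babel_like_cost(16, 0, loss) - babel_like_cost(16, 0, 0.0)
        print(f"current formula, loss {loss:.0%}: +{extra:.2f}")

    # Marginal effect of one more percent of loss under the proposed formula,
    # for an example link with hop = 16 and RTT = 50 ms.
    for base in (0.05, 0.50):
        delta = proposed_cost(16, 50, base + 0.01) - proposed_cost(16, 50, base)
        print(f"proposed formula, {base:.0%} -> {base + 0.01:.0%}: +{delta:.2f}")
```

The first loop corresponds to the 1.73 and 3.75 figures from the first section; the second loop shows the extra 1% costing more at 5% loss than at 50%.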

First, no real experiments have been done with this formula. It needs more thought and testing.

Second, cross-version compatibility would be a problem. It's lucky that Babel is still a draft, so we can discuss this without breaking things.

Third, "loss" is in the denominator of the throughput formula, so we need to set a lower bound on it (e.g. 100%/32 or 100%/64) so that it does not overflow.

Last, no research shows whether Mathis et al.'s formula works on Wi-Fi.

Thank you for listening. I hope we can discuss this topic further.

_______________________________________________
Babel-users mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
