I’ve just published a new paper on selecting the node selection probabilities 
(consensus weights) in Tor. It takes a queuing-theory approach and shows that 
what Tor used to do (distributing traffic to nodes in proportion to their 
contribution to network capacity) is not the best approach.

Counter-intuitively the paper shows that some of the slowest nodes should not 
be used at all, because if they are used they will slow down the average 
performance for all users. The proportion of nodes which shouldn’t be used 
depends on the relationship between network usage and network capacity, so will 
vary over time.

It’s not clear that there is a closed-form solution to the problem of 
calculating node selection probabilities (I couldn’t find one), but this paper 
shows that the optimisation surface is convex and so gradient-based 
optimisation methods will find the global optimum (rather than some local 
optimum which depends on the starting position of the optimisation process).

Although the process outlined in the paper requires knowing the relationship 
between network capacity and usage, it isn’t highly sensitive to minor 
inaccuracies in measuring this value. For example if it is assumed the network 
is loaded at 50% then the solution will outperform Tor’s old approach provided 
the true network load is between 0% and 60%.

After this work was done, Tor moved to actively measuring the network 
performance and manipulating the consensus weights in response to changes. This 
seems to have ended up with roughly the same outcome. The advantage of Tor’s 
new approach is that it doesn’t require knowing network usage and node 
capacity; however the disadvantage is that it can only react slowly to changes 
in network characteristics.

For more details, see the paper:
  http://www.cl.cam.ac.uk/~sjm217/papers/#pub-el14optimising

Note that this is published in IET Electronics Letters, which is a bit 
different to the usual Computer Science publication venues. It jumps straight 
into the maths and leaves it to the reader to understand the background and 
implications. The advantage is that it’s 2 pages long; the disadvantage is that 
to understand it you need to know a reasonable amount about Tor and queuing 
theory to make much sense of it.

Best wishes,
Steven
 
_______________________________________________
tor-dev mailing list
[email protected]
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

Reply via email to