[tor-relays] Measuring the Accuracy of Tor Relays' Advertised Bandwidths

Rob Jansen Fri, 26 Jul 2019 07:41:46 -0700

Hello relay operators!

I am planning on performing an experiment on the Tor network to try to gauge 
the accuracy of the advertised bandwidths that relays report in their server 
descriptors. Briefly, the experiment involves running a speed test on every 
relay for a short time (about 20 seconds). Details follow.


I plan to run the experiment in about 1 week. Relay operators can opt out of 
the speed test by replying on this thread, and we will remove you from the list 
of relays to scan.

Peace, love, and positivity,
Rob

---
Measuring the Accuracy of Tor Relays' Advertised Bandwidths

Motivation
----------
The capacity of Tor relays (maximum available goodput) is an important metric. 
Combined with mean goodput, it allows us to compute the bandwidth utilization 
of individual relays as well as the entire network in aggregate. Generally, 
capacity is used to help balance client load across relays, and relay 
utilization rates help Tor make informed decisions about how to allocate 
resources and prioritize performance and scalability improvements.

Problem
-------
Currently, Tor uses a heuristic measure of unknown accuracy to estimate Tor 
relay capacity. Each relay keeps track of the maximum goodput it has achieved 
over any 10 second window in a 24 hour period. This is called the "observed 
bandwidth". Each relay takes the minimum of its "observed bandwidth" and its 
bandwidth rate-limiting configuration and reports the result as the "advertised 
bandwidth" in its server descriptor. We do not know how well the advertised 
bandwidth estimates the true relay capacity, but we do know that it represents 
a lower bound on capacity.
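
To make the heuristic concrete, here is a minimal sketch of the calculation as 
described above. It is not Tor's actual code: the function names, the 
per-second sample list, and the example numbers are all assumptions made for 
illustration.

```python
def observed_bandwidth(bytes_per_second, window=10):
    """Return the highest mean goodput (bytes/s) over any `window`-second span.

    `bytes_per_second` is a hypothetical list of per-second byte counts
    covering the reporting period (24 hours in the description above).
    """
    if len(bytes_per_second) < window:
        return 0.0
    # Sliding-window sum: add the newest second, drop the oldest one.
    window_sum = sum(bytes_per_second[:window])
    best = window_sum
    for i in range(window, len(bytes_per_second)):
        window_sum += bytes_per_second[i] - bytes_per_second[i - window]
        best = max(best, window_sum)
    return best / window


def advertised_bandwidth(observed, rate_limit):
    """Advertised bandwidth is the minimum of observed bandwidth and rate limit."""
    return min(observed, rate_limit)


if __name__ == "__main__":
    # Made-up traffic: a quiet relay that bursts for only 5 seconds.
    short_burst = [100] * 30 + [1000] * 5 + [100] * 30
    # The same relay saturated for 20 seconds.
    long_burst = [100] * 30 + [1000] * 20 + [100] * 30
    print(observed_bandwidth(short_burst))
    print(observed_bandwidth(long_burst))
```

With these made-up samples, the 5 second burst yields an observed bandwidth of 
550 even though the relay can sustain 1000, while the 20 second burst yields 
1000. This illustrates why the advertised bandwidth is only a lower bound on 
capacity.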

Hypothesis
----------
The advertised bandwidth significantly underestimates the true capacity of Tor 
relays. On average, the underestimation will be larger for relays with higher 
true capacities (because fast relays are less likely to have sustained their 
full capacity over a 10 second period).

Experiment
----------
A relay reports its advertised bandwidth in its server descriptor. To test how 
well these reported numbers represent the true capacity of a relay, we can 
manually perform a speed test on the relay by initiating the simultaneous 
download of several large data streams for a period that exceeds 10 seconds. 
In the first server descriptor it publishes after our test, the relay will 
report its updated advertised bandwidth, and the results will be collected and 
reported by metrics.torproject.org.

The experiment involves two steps: running the speed test on a relay under our 
control, and running the speed test on all relays in the Tor network.

We will first run the speed test on at least one relay that we control, in 
order to test that the method is effective and that we can in fact observe a 
change in the advertised bandwidth reported on metrics.torproject.org. Once we 
have confidence that our speed test is functioning correctly, and that the 
metrics pipeline will allow us to gather the results, we will repeat it on all 
relays in the network.

We will conduct the speed tests while minimizing network overhead. We will use 
a custom client that builds 2-relay circuits. The first relay will be the 
target relay we are speed testing, and the second relay will be a fast exit 
relay that we control. We will initiate data streams between a speedtest client 
and server running on the same machine as our exit relay.

The setup will look like:

speedtest-client <--> tor-client <--> target-relay <--> exit-relay <--> speedtest-server

All components will run on the same machine that we control except for the 
target-relay, which will rotate as we test different relays in the network. For 
each target relay, we plan to run the speedtest for 20 seconds in order to 
increase the probability that the 10 second mean goodput will reach the true 
capacity. We will measure each relay over a few days to ensure that our 
speedtest effects are reported by every relay.

Although we believe that the overhead of this speed test is in line with 
regular usage, relay operators can opt out of the speed test by replying on 
this thread. Those that opt out will be removed from our list of relays to scan.

Analysis
--------
Following our speedtest, we will analyze the data collected and reported by Tor 
metrics. We will compare the advertised bandwidth that each relay reports 
before our experiment to the one it reports during our experiment. This will help 
us test our hypothesis that relays' advertised bandwidth underestimates the 
true capacity of relays. We will run a statistical correlation analysis on the 
data to test the strength of the correlation between the previously reported 
(estimated) relay capacity and relay capacity underestimation. We will report 
our results to the Tor community.
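
The correlation analysis could be sketched roughly as follows. This is only an 
illustration under stated assumptions: it uses a plain Pearson correlation 
(the proposal does not say which statistic will be used), it defines 
underestimation as the bandwidth reported during the experiment minus the 
bandwidth reported before it, and the function names and sample values are 
made up.

```python
import math


def underestimation(before, during):
    """Per-relay gap between bandwidth reported during and before the test."""
    return [d - b for b, d in zip(before, during)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical advertised bandwidths (arbitrary units), not real data.
    before = [1.0, 2.0, 3.0, 4.0]
    during = [2.0, 4.0, 6.0, 8.0]
    gaps = underestimation(before, during)
    print(pearson(before, gaps))
```

A coefficient near 1 on real data would be consistent with the hypothesis 
that relays reporting higher capacity are underestimated by a larger amount; 
a rank-based statistic may be a better fit if the relationship is not linear.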

We expect that the results of our experiment will help Tor decide how to 
allocate resources and will help them plan and prioritize performance 
improvements. It will also provide insight into the operation of the current 
load balancing system, which uses advertised bandwidth to produce consensus 
weights.

_______________________________________________
tor-relays mailing list
[email protected]
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
