[tor-dev] Improving Tor network models [was: Update to Proposal 316: FlashFlow]

Jansen, Robert G CIV USN NRL (5543) Washington DC (USA) Sat, 17 Oct 2020 20:44:51 -0700

> 
> On Oct 8, 2020, at 2:50 PM, Mike Perry <mikepe...@torproject.org> wrote:
> 
> I do not yet have confidence that these issues are solved simply because
> they did not appear in Shadow. Shadow does not simulate multi-instance
> relays, CPU bound relays, or structural load imbalances in the network.


Hi Mike and others!

I'd like to better understand your criticisms here so that we can work to make 
Shadow more useful (work that fits squarely under sponsor 38).

> multi-instance relays

Nothing prevents Shadow from running multiple tor relay processes on the same 
virtual host. We could add this to the Tor models that are created by our model 
generation tool[0].

One issue is that we don't have ground truth about:
- which relays are co-resident with one another; and
- the capacity of the machine hosting the co-resident relays.

A short-term fix could be to look at relays in the same family and randomly 
choose some of them to run on the same machine (setting the capacity of the 
machine to the sum of the max observed bandwidths of the co-resident relays). 
A longer-term solution would be to add a new parameter similar to MyFamily and 
ask operators to identify which relays are co-resident, or to add a 
self-measurement of co-residency to tor. Either would provide the ground truth 
we need for accurate modeling.
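
To make the short-term idea concrete, here is a rough Python sketch. It is not 
tornetgen code: the record fields (`fingerprint`, `family`, `max_observed_bw`), 
the function name, and the `p_share` probability are all assumptions made up for 
illustration.

```python
import random
from collections import defaultdict


def assign_coresident_hosts(relays, p_share=0.5, seed=0):
    """Place relays on virtual hosts (hypothetical helper, not tornetgen API).

    Each relay in a multi-relay family is chosen with probability p_share to
    share one host with the other chosen members of that family. A shared
    host's capacity is the sum of its relays' max observed bandwidths.
    """
    rng = random.Random(seed)

    # Group relays by their (assumed) family identifier.
    by_family = defaultdict(list)
    for relay in relays:
        by_family[relay["family"]].append(relay)

    hosts = []
    for family, members in by_family.items():
        shared, solo = [], []
        for relay in members:
            eligible = family is not None and len(members) > 1
            if eligible and rng.random() < p_share:
                shared.append(relay)
            else:
                solo.append(relay)

        # A single chosen relay has nobody to share with; keep it on its own.
        if len(shared) == 1:
            solo.append(shared.pop())

        if shared:
            hosts.append({
                "relays": [r["fingerprint"] for r in shared],
                "capacity": sum(r["max_observed_bw"] for r in shared),
            })
        for relay in solo:
            hosts.append({
                "relays": [relay["fingerprint"]],
                "capacity": relay["max_observed_bw"],
            })
    return hosts
```

Without ground truth, `p_share` is a guess; the sketch only shows where such a 
parameter would enter the model.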

Thoughts? Any other ideas?

> CPU-bound relays

There are two issues here:
- we need to improve/rewrite our virtual CPU module in Shadow that accounts for 
CPU load; and
- we need ground truth about the number of CPUs and CPU speeds for each relay.

The first one is relatively straightforward to resolve; the second one again 
requires some form of self-reporting or automated self-measurement in tor.

> structural load imbalances

Could you please explain this one in a couple more sentences?

By 'structural' I think you might mean imbalances across relay positions (i.e., 
more guard bandwidth and less exit bandwidth). If so, then Shadow does already 
properly account for this by statically assigning flags using the 
TestingDirAuthVoteExit and TestingDirAuthVoteGuard torrc options.
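
As an illustration of what static flag assignment looks like, here is a small 
Python sketch that builds the two torrc lines from lists of relay names. The 
function name and the relay names are hypothetical, and how the model generation 
tool actually writes these options is not shown here; the sketch only assumes 
that both options take a comma-separated node list.

```python
def dirauth_flag_lines(exit_nodes, guard_nodes):
    """Build torrc lines that pin Exit/Guard votes (hypothetical helper)."""
    lines = []
    if exit_nodes:
        # Nodes the directory authority should always vote Exit for.
        lines.append("TestingDirAuthVoteExit " + ",".join(sorted(exit_nodes)))
    if guard_nodes:
        # Nodes the directory authority should always vote Guard for.
        lines.append("TestingDirAuthVoteGuard " + ",".join(sorted(guard_nodes)))
    return lines


# Hypothetical relay nicknames, for illustration only.
for line in dirauth_flag_lines(["relay2exit", "relay1exit"], ["relay3guard"]):
    print(line)
```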

Here are some bonus ones for you:

> capacity of relays

We currently use the maximum observed bandwidth that we've seen for a relay and 
set that value as the network link capacity of the (virtual) host machine that 
runs each relay. Again, we don't have any ground truth of how much capacity is 
available to each relay, though maybe someday FlashFlow will collect it for us.
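
In code, the rule amounts to something like the following Python sketch. The 
input format (an iterable of `(fingerprint, observed_bw)` pairs) and the 
function name are assumptions for illustration, not the tool's actual interface.

```python
def link_capacity(observed_history):
    """Map each relay to the largest observed bandwidth seen for it.

    observed_history: iterable of (fingerprint, observed_bw) pairs, e.g. one
    pair per descriptor in the modeling window (an assumed input format).
    """
    capacity = {}
    for fingerprint, observed_bw in observed_history:
        # Keep the running maximum per relay.
        capacity[fingerprint] = max(capacity.get(fingerprint, 0), observed_bw)
    return capacity
```

Note that this is a lower bound on true capacity: a relay that was never fully 
loaded will have an observed maximum below what its link can carry.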

> diversity of Tor versions

We should make sure our modeling tool includes relays across different versions 
of Tor, since not all relays in the public network run the same version. This 
one is pretty simple to fix (it just requires us to build Tor plugins from 
multiple different Tor source versions), but research that tests how a new idea 
performs across the network by modifying Tor source will obviously need to use 
its own custom research version of Tor.
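
One way the version mix could be chosen, sketched in Python: sample a version 
for each modeled relay in proportion to how many public relays run it. The 
function name and the `version_counts` input are hypothetical; real counts would 
have to come from consensus data.

```python
import random


def sample_versions(version_counts, n_relays, seed=0):
    """Pick a Tor version per modeled relay, weighted by public relay counts.

    version_counts: dict mapping a version string to the number of public
    relays running it (assumed input, e.g. tallied from a consensus).
    """
    rng = random.Random(seed)
    versions = sorted(version_counts)
    weights = [version_counts[v] for v in versions]
    # Weighted sampling with replacement, one draw per modeled relay.
    return rng.choices(versions, weights=weights, k=n_relays)
```

Each distinct version returned would then need its own Tor plugin build.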

Peace, love, and positivity,
Rob

[0] https://github.com/shadow/tornetgen