tl;dr: We propose collecting data from exit nodes to improve the Tor
network, using differential privacy and secure multiparty computation to
do it in a privacy-sensitive manner.


Hi tor-dev,

In the ongoing effort to make Tor faster, more secure, and more resilient,
network data plays an important role. If we know how the network is being used,
what its clients' needs are, and the threats that it faces, we can deal with
these in an intelligent manner. While the Tor Project does collect some
statistics from
guards, it does not currently collect and share potentially sensitive exit 
statistics. This data includes destination statistics and client timing
behaviour, among many other potentially interesting but privacy-sensitive
data points.

This reluctance to collect data is due to the (well-founded) risk that this
data could pose to clients and OR operators, such as correlation and coercion
attacks. This is unfortunate since, as we observe above, in order to make
improvements to the Tor network and its feature set, it would be beneficial to
know what is going on inside it and with its users.

To that end, it would be great if we were able to learn about network and 
client trend data. Some concrete examples include circuit-level data volumes, 
guard traffic usage, lengths of internal buffers, and latencies at relays. 
Indeed, if it can be counted then we should be able to collect and report it in 
a privacy-preserving manner.

This brings me to the reason for this email: I have had the good fortune to
work with George Danezis at UCL and my supervisor Ian Goldberg at the 
University of Waterloo on coming up with a solution to this private data 
collection problem. We have created a system, PrivEx, that uses modern 
privacy-preserving techniques such as differential privacy and secure 
multiparty computation to address this thorny set of challenges; we have 
written up the details in a tech report that can be found here:
http://cacr.uwaterloo.ca/techreports/2014/cacr2014-08.pdf
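
As a rough illustration of how these two techniques can fit together, here is
a minimal Python sketch. It is a generic example and not the PrivEx protocol
from the tech report: the modulus, the choice of Laplace noise, the assumed
sensitivity of 1 per counter, and all function names are assumptions made for
illustration only. Each exit adds noise to its own counter (differential
privacy) and splits the noisy value into additive shares, one per tally server
(secret sharing), so that only the noisy network-wide total is ever
reconstructed.

```python
import random
import secrets

# Hypothetical modulus for share arithmetic; any value much larger than the
# largest possible total would serve for this sketch.
MODULUS = 2**61 - 1


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def split_into_shares(value, n_shares):
    """Additively secret-share an integer: the shares sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine shares and map the result back to a signed integer."""
    total = sum(shares) % MODULUS
    return total - MODULUS if total > MODULUS // 2 else total


def exit_report(true_count, epsilon, n_servers, rng):
    """An exit's report: noisy counter (assumed sensitivity 1), then shares."""
    noisy = true_count + round(laplace_noise(1.0 / epsilon, rng))
    return split_into_shares(noisy, n_servers)


def aggregate(reports):
    """Each server sums the shares it received; only the sums are combined."""
    per_server = [sum(column) % MODULUS for column in zip(*reports)]
    return reconstruct(per_server)


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical per-exit visit counts for one destination.
    counts = [120, 75, 310]
    reports = [exit_report(c, epsilon=0.5, n_servers=3, rng=rng) for c in counts]
    print("true total:", sum(counts), "noisy total:", aggregate(reports))
```

No single tally server learns anything about an individual exit's counter from
its share alone, and the published total carries noise from every exit. A real
deployment would also need carefully chosen noise parameters and a secure
channel to each server, which this sketch leaves out.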

We have also implemented the two variants of PrivEx described in the tech
report. We are currently putting the finishing touches on them and will be
releasing them soon as open source in a git repo.

We would like to start by rolling out our own PrivEx-enabled exits in the Tor 
network and begin collecting destination visit statistics. We expect that
PrivEx will be useful to all exit operators and to the Tor network as a whole,
but there is no requirement to deploy it everywhere. We hope to deploy
PrivEx on a handful of exits during the June-August timeframe.

What we would really like, in order of importance, is 1) a design review of our
proposal and 2) an implementation review (once we release it). We hope that
these reviews will address the main concerns of the community at large, as well
as give it, and us, a measure of confidence that collecting data with PrivEx is
inherently good and is being done in a responsible and intelligent manner. We
anticipate that this would make PrivEx an attractive addition for the Tor
Project and its data collection needs.

Please don't hesitate to give us your feedback, either to the list or to me via 
email.

Cheers,

Tariq
