Re: [tor-dev] Relay diversity master thesis

2018-01-07 Thread grarpamp
On Sun, Jan 7, 2018 at 8:29 AM, teor  wrote:
>> On 22 Dec 2017, at 11:23, Robin Descamps  wrote:
>> May I ask for your advice/feedback on this master thesis plan?
>> The master thesis plan: 
>> https://drive.google.com/open?id=1XEOSS29owavKJ_cJJAVaPiJe34Ez6XXx
>> The poster: 
>> https://drive.google.com/open?id=1BlF2U-Kexyz6ihVSqvsVHv4PUsvXATc4

> In particular, operators that could perform end-to-end correlation?

> Have you considered the relay's Operating System?

If you are considering things that the tor daemon does not yet track,
measure, or vote on in the consensus, such as operators and OS, then you
should extend the research to similar meta parameters about the relays
themselves: datacenter hosted vs cable/dsl/fiber "home" relays, country
locations, opposing legal jurisdictions, operation by "known" or "trusted"
operators / entities or not, working / fake / no contact info, any PKI
Web Of Trust asserted among operators, funding sources, employer /
corporate / political / other affiliations, statistical analysis of
historical relay "presence" on the network (add/drop/uptime, nicknames,
movement, versions, bulk turnups, correlation groups, etc), and many more
possible metas that people should think up and add to this list.

That research could then be followed by development of third party
subscription lists of categorized / ranked relays, which the user or tor
daemon could pluggably select from when choosing nodes to build paths through.

There have been posts on tor-relays@ and tor-talk@ that mention more
about these sorts of meta parameters. AFAIK, no one has done any
research into them or their potential impact / benefits, whether for
particularly affected users, for users with a plain preference, or for the
network as a whole. So the chance of a first good paper in the area awaits
whoever does that meta analysis project.

[xpost for open project opportunity]
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] Marker branch for current tor release(s)

2018-01-07 Thread teor
Hi,

Tor branches are a question for tor-dev@, so I am directing all responses there.
Also, I fixed the top-post.

> On Jan 5, 2018 00:48, "Andreas Krey"  wrote:
> 
> https://www.torproject.org/download/download.html.en in the source code 'tab'
> states the current stable and alpha version of tor.
> 
> Would it be possible to publish the current states as branches 'stable' and
> 'alpha' (or 'testing', or 'unstable') in the git repo?

What do you mean by "alpha" and "stable"?

When tor 0.3.2.9 is released next week, there will be no alpha version.
When this happens, do you want master, or the latest stable?

When there are multiple supported tor versions, which one should be stable?
At the moment, we support 0.2.5 and 0.2.9 as long-term support, and 0.3.0 and
0.3.1 as regular releases.

Should stable be 0.3.1 (and change to 0.3.2 next week)?

Do you want a long-term support branch as well?
Should it be 0.2.5 or 0.2.9?

> That would help us tor-from-source builders to just fetch the repo, and
> if the respective branch changes, to rebuild and redeploy. Looking for a
> new release tag or screen-scraping said web page is a bit hairy, and feels
> unnecessary.

If you want something that's easier to scrape, and signed, check for
new source releases at:

https://dist.torproject.org/

We provide the latest Tor Browser version through a URL (which I can't
remember right now). Maybe we could do the same thing with Tor.

> On 5 Jan 2018, at 23:17, Chad MILLER  wrote:
> 
> I second this.
> 
> There's a recommended-versions list in the consensus, but you have to already 
> have Tor available and running to get it.

No, you don't need Tor:

$ curl http://197.231.221.211:9030/tor/status-vote/current/consensus-microdesc |
    grep server-versions | tr "," "\n" | tail -1
0.3.2.8-rc

Or you can do this far more reliably in Python using stem:

https://stem.torproject.org/
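
As a rough illustration of what the shell pipeline above extracts, here is a
standard-library-only Python sketch. It does not use stem. It assumes the
consensus text has already been downloaded and that the recommended versions
sit on a single "server-versions" line. It orders versions by their numeric
components only and ignores status tags such as "-rc", which is a
simplification. The function names and the sample text are made up for the
example.

```python
import re

def recommended_server_versions(consensus_text):
    """Return the versions listed on the 'server-versions' line, if any."""
    for line in consensus_text.splitlines():
        if line.startswith("server-versions "):
            return [v for v in line.split(" ", 1)[1].strip().split(",") if v]
    return []

def numeric_key(version):
    """Sort key built from the leading dotted integers; status tags are ignored."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()

def newest_recommended(consensus_text):
    """Pick the recommended version with the largest numeric components."""
    versions = recommended_server_versions(consensus_text)
    return max(versions, key=numeric_key) if versions else None

# Illustrative sample only, not a real consensus document.
SAMPLE = (
    "network-status-version 3 microdesc\n"
    "client-versions 0.2.5.16,0.2.9.14,0.3.1.9,0.3.2.8-rc\n"
    "server-versions 0.2.5.16,0.2.9.14,0.3.1.9,0.3.2.8-rc\n"
)

newest = newest_recommended(SAMPLE)
print(newest)
```

Unlike the shell version, this does not rely on the list being sorted, but
it still trusts an unauthenticated download, so checking signatures (as stem
can) remains the more reliable route.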

> Maybe also publish in a DNS TXT record or something?

Is that secure?
Can you sign a TXT record?

T

--
Tim Wilson-Brown (teor)

teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org








Re: [tor-dev] Relay diversity master thesis

2018-01-07 Thread teor
Hi Robin,

Sorry it's taken a while for someone to respond to your email.
Many of us have been on leave from the start of December until this week.

Please see my response below:

> On 22 Dec 2017, at 11:23, Robin Descamps  wrote:
> 
> Hello,
> 
> I already sent this message to the metrics team, but they advised me to 
> address it to the dev team, which seems to be more relevant.
> 
> I am writing a master's thesis this year at the Université catholique de 
> Louvain in Belgium, about measuring the utility that adding a new relay 
> brings to the diversity of the Tor network, depending on its configuration. 
> I have included my master thesis plan with this message, as well as a poster 
> that summarises the key elements.
> 
> May I ask for your advice/feedback on this master thesis plan? Since I would 
> like this project to make a real contribution to Tor development, I want 
> to make sure that all the steps I perform are useful and worthwhile.
> 
> The master thesis plan: 
> https://drive.google.com/open?id=1XEOSS29owavKJ_cJJAVaPiJe34Ez6XXx
> The poster: https://drive.google.com/open?id=1BlF2U-Kexyz6ihVSqvsVHv4PUsvXATc4

Have you considered relay bandwidth capacity, measured bandwidth,
consensus weight, or bandwidth authorities in your plan?

When using the Tor path selection algorithm, relay consensus weight has
a big impact on the paths selected by clients.

At the moment, relay consensus weight is a function of relay bandwidth
capacity and geographic location. For a map of consensus weights, see
"Consensus Weight versus Bandwidth" on:

https://atlas.torproject.org/#map
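
To make the effect of consensus weight concrete, here is a simplified Python
sketch that treats a relay's chance of being picked as its share of the total
consensus weight. This is an assumption made for illustration: the real path
selection algorithm also applies position-dependent weights and flag
requirements, which are left out here. The relay names and weights are
hypothetical.

```python
def selection_probabilities(consensus_weights):
    """Map each relay to its fraction of the total consensus weight."""
    total = sum(consensus_weights.values())
    if total <= 0:
        raise ValueError("total consensus weight must be positive")
    return {name: weight / total for name, weight in consensus_weights.items()}

# Hypothetical relays and weights, chosen only to show the skew.
weights = {"relayA": 6000, "relayB": 3000, "relayC": 1000}
probs = selection_probabilities(weights)

for name, p in sorted(probs.items()):
    print(f"{name}: {p:.2f}")
```

Under this simplification, one high-weight relay dominates selection, which
is why a diversity measure that ignores consensus weight can be misleading.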


Have you considered relay operators or relay families?
In particular, operators that could perform end-to-end correlation?

https://nusenu.github.io/OrNetStats/


Have you considered the relay's Operating System?
Are you aware that the Tor network has historically been a Linux
monoculture, and 90% of relays still run Linux?

https://nusenu.github.io/OrNetStats/
https://torbsd.github.io/blog.html


Have you considered the Tor version that the relay is running?

https://nusenu.github.io/OrNetStats/


Recently, someone created a website that gave badges for different
kinds of relay diversity. But I can't remember what it was called.


I've also cc'd nusenu, who has done some work in this area.

T

--
Tim Wilson-Brown (teor)

teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org








Re: [tor-dev] A suggestion for differential private statistics gathering

2018-01-07 Thread teor
Hi Eyal,

Thank you for letting us know about your research.

You've emailed an internal discussion list, so I'm going to direct all
responses to the public tor-dev@ list. That is the list where we discuss
changes to Tor.

Please see my response below:

> On 7 Jan 2018, at 20:31, Eyal Ronen  wrote:
> 
> I am a PhD student and have just published a paper online that presents a 
> protocol that I think might be relevant to the Tor network.
> The protocol allows a server to privately learn information from a client, 
> and is resilient to a malicious adversary who wants to tamper 
> with the statistics.
> The main use case in our paper is learning popular passwords,

Are you aware of this password list?
(It has no privacy apart from hashing the passwords, because the
passwords are already publicly available in data breaches.)

https://www.troyhunt.com/introducing-306-million-freely-downloadable-pwned-passwords/

> but we believe that it might also be usable for other cases in the Tor 
> network. As we do not know the needs and challenges in the Tor network, we 
> would greatly appreciate any feedback from the Tor metrics community.

I would be interested in an explanation of the tradeoffs between the
16 and 32 bit versions of the protocol. In particular, we may not be
able to provide servers with 4 GB of RAM for the 32 bit protocol.

What would we lose with a 16 bit version? Is a 24 bit version possible?
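
As a back-of-the-envelope check on the memory question, the sketch below
scales the 4 GB figure for the 32 bit version down to other widths. It
assumes that memory grows in proportion to the number of table entries
(2^bits) and reads "4 GB" as 4 GiB. Both are assumptions for illustration,
not statements from the paper, and the function name is made up.

```python
def table_bytes(bits, baseline_bits=32, baseline_bytes=4 * 2**30):
    """Scale the quoted 32 bit memory figure to another width.

    Assumes memory is proportional to 2**bits (an assumption, see above).
    """
    return baseline_bytes * 2**bits // 2**baseline_bits

estimates = {bits: table_bytes(bits) for bits in (16, 24, 32)}

for bits, size in estimates.items():
    print(f"{bits} bit: {size} bytes")
```

Under these assumptions a 24 bit version would need about 16 MiB and a
16 bit version about 64 KiB.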

> The paper is titled "How to (not) share a password: Privacy preserving 
> protocols for finding heavy hitters with adversarial behavior" and can be 
> found at https://eprint.iacr.org/2018/003
> 
> The abstract is:
> "Bad choices of passwords were and are a pervasive problem. Most password 
> alternatives (such as two-factor authentication) may increase cost and 
> arguably hurt the usability of the system. This is of special significance 
> for low cost IoT devices.
> 
> Users choosing weak passwords do not only compromise themselves, but the 
> whole ecosystem. For example, common and default passwords in IoT devices 
> were exploited by hackers to create botnets and mount severe attacks on large 
> Internet services, such as the Mirai botnet DDoS attack.
> 
> We present a method to help protect the Internet from such large scale 
> attacks. We enable a server to identify popular passwords (heavy hitters), 
> and publish a list of over-popular passwords that must be avoided. This 
> filter ensures that no single password can be used to compromise a large 
> percentage of the users. The list is dynamic and can be changed as new users 
> are added or when current users change their passwords. We apply maliciously 
> secure two-party computation and differential privacy to protect the users' 
> password privacy. Our solution does not require extra hardware or cost, and 
> is transparent to the user.
> 
> The construction is secure even against a malicious coalition of devices 
> which tries to manipulate the protocol in order to hide the popularity of 
> some password that the attacker is exploiting. We show a proof-of-concept 
> implementation and analyze its performance.
> 
> Our construction can also be used in any setting where one would desire to 
> privately learn heavy hitters in the presence of an active malicious 
> adversary. For example, learning the most popular sites accessed by the TOR 
> network."

Tor Proposals

There is a current proposal to add privacy-preserving statistics to Tor
using a scheme that's based on PrivCount:

https://gitweb.torproject.org/torspec.git/tree/proposals/288-privcount-with-shamir.txt

PrivCount is limited to counters with integer increments, so we can't
easily find the most popular site (the mode), or calculate the median.
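
To illustrate why a counter scheme of this kind yields sums but not a mode
or median, here is a toy Python sketch of additively blinded counters. It is
not the PrivCount protocol or the Shamir-based scheme in the proposal: the
modulus, function names, and counts are made up, and noise for differential
privacy is left out.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical modulus, chosen only for the example

def blind(counter_value, num_share_keepers):
    """Hide a counter behind one random blinding share per share keeper."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_share_keepers)]
    blinded = (counter_value + sum(shares)) % MODULUS
    return blinded, shares

def aggregate(blinded_values, all_shares):
    """Remove every blinding share; only the sum of the counters remains."""
    total_blinded = sum(blinded_values) % MODULUS
    total_shares = sum(sum(s) for s in all_shares) % MODULUS
    return (total_blinded - total_shares) % MODULUS

relay_counts = [12, 7, 30]  # made-up per-relay counter values
blinded, shares = zip(*(blind(count, 3) for count in relay_counts))
total = aggregate(blinded, shares)
print(total)  # 49
```

Each blinded value on its own looks random, and unblinding only works on
the combined total, so per-relay values (and anything that needs them, like
a median) are never available to the aggregator.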

We discussed the proposal on tor-dev@, and decided that we could add
more sophisticated statistics later. We would appreciate your feedback
on the proposal. In particular, please let us know if there is anything
we should change in the proposal to make it easier to extend with schemes
like yours in future.

Here is the proposal thread:

https://lists.torproject.org/pipermail/tor-dev/2017-December/012699.html


Anyone is welcome to submit Tor proposals. The process is described here:

https://gitweb.torproject.org/torspec.git/tree/proposals/001-process.txt


Related Research

There was a paper at CCS 2017 that sounds very similar to your scheme:

Ellis Fenske, Akshaya Mani, Aaron Johnson, and Micah Sherr. Distributed
Measurement with Private Set-Union Cardinality. In ACM Conference on
Computer and Communications Security (CCS), November 2017.

Source: http://safecounting.com/

Can you explain how your scheme differs from PSC?


There's a forthcoming paper at NDSS 2018 that measures a single onion
site's popularity with differential privacy using PrivCount:

Inside Job: Applying Traffic Analysis to Measure Tor from Within
25th Symposium on Network and Distributed System Security (NDSS 2018)
Rob Jansen, Marc