On Thursday 01 Nov 2012 15:04:10 Matthew Toseland wrote:
> JARGON:
> Correlation attack: If you are very close to the originator, and can identify 
> a stream of requests, you may be able to identify the originator simply from 
> the fact that a large proportion of the known download is coming from a single 
> peer. This works a few hops away too IMHO. So the threat is both malicious 
> peers and attackers controlling an (as yet unquantified) proportion of the 
> network.
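
Purely as an illustration of the attacker-side bookkeeping this implies, here is a
Java sketch; every class, method and threshold below is invented, none of it is
real Freenet code:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    /** Illustrative only: counts, per peer, how many blocks of a known file
     *  were requested through that peer, and flags suspiciously large shares. */
    public class CorrelationSketch {
        private final Set<String> knownFileBlocks;          // CHKs of the published file
        private final Map<String, Integer> seenPerPeer = new HashMap<>();
        private int totalSeen = 0;

        public CorrelationSketch(Set<String> knownFileBlocks) {
            this.knownFileBlocks = knownFileBlocks;
        }

        /** Called for every request the (malicious) node routes. */
        public void onRequest(String peerId, String key) {
            if (!knownFileBlocks.contains(key)) return;     // not part of the target file
            seenPerPeer.merge(peerId, 1, Integer::sum);
            totalSeen++;
        }

        /** A peer sending far more than its "fair" 1/degree share of the file
         *  is probably the originator or very close to it. */
        public boolean looksLikeOriginator(String peerId, int degree) {
            if (totalSeen < 50) return false;               // need enough samples first
            double share = seenPerPeer.getOrDefault(peerId, 0) / (double) totalSeen;
            return share > 5.0 / degree;                    // arbitrary threshold
        }
    }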
> 
> MAST: Mobile attacker source tracing. Requires an identifiable stream of 
> requests. Uses the location of the target of a request combined with the fact 
> that it was routed "here" to eliminate a swathe of the keyspace. Use this 
> data to guess the location of the originator. Announce to that location, and 
> hopefully get more requests from the request stream, so the attack speeds up.
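
A very crude sketch of the keyspace-elimination idea, just to make the mechanics
concrete; the names, the exact elimination rule and the weights are all made up:

    /** Very crude illustration of the keyspace-elimination idea behind MAST.
     *  Keyspace locations are in [0.0, 1.0) with wrap-around distance. */
    public class MastSketch {
        private static final int BUCKETS = 1000;
        private final double[] weight = new double[BUCKETS]; // plausibility of each region

        public MastSketch() {
            java.util.Arrays.fill(weight, 1.0);
        }

        private static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);                    // circular keyspace
        }

        /** One sample: a request for a key at 'target' was routed through us at 'here'.
         *  Greedy routing suggests the originator is unlikely to sit closer to the
         *  target than we do, so down-weight that swathe of the keyspace. */
        public void addSample(double target, double here) {
            double radius = distance(target, here);
            for (int i = 0; i < BUCKETS; i++) {
                double loc = (i + 0.5) / BUCKETS;
                if (distance(loc, target) < radius)
                    weight[i] *= 0.5;                       // arbitrary down-weighting
            }
        }

        /** Best guess for the originator's location: the most plausible bucket. */
        public double guessOriginatorLocation() {
            int best = 0;
            for (int i = 1; i < BUCKETS; i++)
                if (weight[i] > weight[best]) best = i;
            return (best + 0.5) / BUCKETS;
        }
    }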
> 
> Random route: Route to a random peer for some limited number of hops at the 
> beginning of a request/insert. For inserts the performance cost is 
> negligible; for requests it costs significant latency and bandwidth but may 
> improve data persistence (because e.g. currently we don't cache on the first 
> few hops).
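
At the hop-selection level, random routing amounts to something like the following
minimal sketch (hypothetical API, illustrative hop count):

    import java.util.List;
    import java.util.Random;

    /** Minimal sketch of "random route": pick peers at random for the first few
     *  hops, then fall back to normal closest-to-target routing. */
    public class RandomRouteSketch {
        private static final int RANDOM_HOPS = 3;           // illustrative value
        private final Random random = new Random();

        public Peer selectNextHop(List<Peer> peers, double target, int hopsSoFar) {
            if (hopsSoFar < RANDOM_HOPS)
                return peers.get(random.nextInt(peers.size()));
            Peer best = peers.get(0);
            for (Peer p : peers)
                if (distance(p.location, target) < distance(best.location, target))
                    best = p;
            return best;
        }

        private static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);                     // circular keyspace
        }

        public static class Peer { public double location; }
    }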
> 
> Age-based routing restrictions: One possible countermeasure for MAST, 
> suggested long ago by somebody whose identity is on the bug tracker, is to 
> not route to nodes which were added more recently than when we started the 
> request.
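
The restriction itself is trivial to express; a sketch with invented field and
class names, not Freenet's actual data structures:

    import java.util.ArrayList;
    import java.util.List;

    /** Sketch of the age-based restriction: during a request, only consider peers
     *  that were already in our routing table before the request started. */
    public class AgeRestrictedRouting {
        public static class Peer {
            public final long addedTimeMillis;              // when we first added this peer
            public Peer(long addedTimeMillis) { this.addedTimeMillis = addedTimeMillis; }
        }

        /** Return only peers older than the request itself, so an attacker who
         *  announces to us *after* seeing part of the stream never gets routed to. */
        public static List<Peer> eligiblePeers(List<Peer> peers, long requestStartMillis) {
            List<Peer> out = new ArrayList<>();
            for (Peer p : peers)
                if (p.addedTimeMillis < requestStartMillis)
                    out.add(p);
            return out;
        }
    }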
> 
> Rendezvous tunnels: Send two or more "anchor" requests out, via different 
> nodes, use a shared secret scheme to split a secret key between them. These 
> are either routed entirely randomly, or routed randomly and then to a target 
> location. When they meet, combine the secrets and create an encrypted tunnel, 
> which goes down the most direct path (one of them might be routed directly to 
> the target location); this should be no longer than the typical path for 
> random routing. The main drawbacks are that 1) nodes prior to the tunnel exit 
> can't cache anything, which costs some performance, and 2) there is additional 
> latency. #1 is not important if the data being fetched or inserted is not 
> popular anyway, or is very small.
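
The "shared secret scheme" isn't pinned down above; a trivial 2-of-2 XOR split is
enough to illustrate the shape of it (a real deployment might want a proper
threshold scheme):

    import java.security.SecureRandom;

    /** Trivial 2-of-2 XOR secret split, just to illustrate "split a secret key
     *  between the anchor requests". Only the shape of the idea. */
    public class SecretSplitSketch {
        private static final SecureRandom RNG = new SecureRandom();

        /** Split 'secret' into two shares: shareA is random, shareB = secret XOR shareA. */
        public static byte[][] split(byte[] secret) {
            byte[] shareA = new byte[secret.length];
            RNG.nextBytes(shareA);
            byte[] shareB = new byte[secret.length];
            for (int i = 0; i < secret.length; i++)
                shareB[i] = (byte) (secret[i] ^ shareA[i]);
            return new byte[][] { shareA, shareB };
        }

        /** Only the node where the two anchor requests meet sees both shares
         *  and can recombine them into the tunnel key. */
        public static byte[] combine(byte[] shareA, byte[] shareB) {
            byte[] secret = new byte[shareA.length];
            for (int i = 0; i < shareA.length; i++)
                secret[i] = (byte) (shareA[i] ^ shareB[i]);
            return secret;
        }
    }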
> 
> Mixnet tunnels: For small but important data we can actually afford onion 
> routing, across the entire network, even though it will involve an absurd 
> number of hops. The most obvious problem is how to obtain nodes to put on the 
> chain. Non-telescoping tunnel construction is apparently used widely in other 
> mixnets, so it's not necessarily "bad" to have *seen* the node somehow...
> 
> Both kinds of tunnels could support deliberate delays and telescoping stages 
> for inserts, however reliability becomes a problem.
> 
> ATTACKS:
> 
> Firstly, we need to sort out announcement. See the other email. MAST is much 
> slower if you can't announce directly to the target, although sadly it is 
> still possible. Announcement spam is also a serious problem, and it's VITAL 
> for performance that we maintain the seednodes list automatically and have a 
> lot more seeds; and we may be able to avoid shipping a full list to everyone!
> 
> Inserts of unpredictable files are not vulnerable to MAST at run-time. An 
> attacker can however gain some location range samples from his logs once the 
> file is identified (published).
> => Initial random route on inserts undermines this. Allows for some samples 
> in the random routing phase, but these are less valuable.
> => No need for age-based routing restrictions on unpredictable inserts.
> (Because the attacker can't move until the file is published anyway)
> 
> Small inserts to predictable locations, especially frequent ones such as forum 
> posts and freesite inserts, provide location range samples.
> => Initial random route on inserts reduces the value of these samples 
> dramatically.
> => Tunnels are the obvious solution. This prevents both MAST and local 
> correlation attacks.
> (Big inserts to predictable locations should not exist, we are talking about 
> individual blocks here)
> 
> Downloads of known files are vulnerable to MAST at run-time. An attacker can 
> quickly approach the originator.
> => Initial random route greatly reduces the value of samples, but a similar, 
> though weaker, attack is possible against the random routing stage, so it 
> does not eliminate the threat completely.
> => Bundling requests together reduces the number of samples for initial 
> random routing, although may be tricky to get right.
> => Age-based routing restrictions, during the random routing phase, could be 
> very effective against MAST. The main problem is that big downloads may take 
> so long that such restrictions become unrealistic. Solutions:
> 1) Improving performance improves security.
> 2) If a request is failing, the cost of using rendezvous tunnels is marginal; 
> clearly the data isn't cached locally. So requests for old content that isn't 
> readily available can just use rendezvous tunnels. We can detect this in the 
> client layer and switch modes.
> 3) The minimum peer age can be specified for requests, based on the size of 
> the file, provided that the age limit has only a limited number of possible 
> values and all of them are popular (a rough sketch follows after this list). 
> It will be dropped once we exit the random routing stage. Age limits should 
> depend on when the node was originally seen and how often it's been online 
> recently, as well as its actual connected time. I don't think using the 
> Bootstrapping Certificate will work, since an attacker doesn't need to keep 
> bootstrapped nodes online, although it could at least cost him a pool of 
> largely idle IP addresses, so maybe...
> 4) As soon as we reach the end of the random routing stage, we switch to 
> rendezvous tunnels regardless of success rates.
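
Re point 3: a rough sketch of what quantised age limits might look like. All the
thresholds here are invented for illustration:

    /** Sketch of quantised age limits: derive the minimum peer age from the file
     *  size, but only ever use a handful of standard values so the limit itself
     *  doesn't fingerprint the request. */
    public class AgeLimitSketch {
        private static final long HOUR = 60L * 60L * 1000L;

        /** Minimum connected-age (ms) a peer must have to be routed to during
         *  the random routing phase of a request for a file of this size. */
        public static long minimumPeerAge(long fileSizeBytes) {
            if (fileSizeBytes <= 1L << 20)  return 0;        // small: no restriction
            if (fileSizeBytes <= 64L << 20) return 1 * HOUR; // medium
            if (fileSizeBytes <= 1L << 30)  return 6 * HOUR; // large
            return 24 * HOUR;                                // huge
        }
    }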
> 
> Various crazy bursting schemes have been proposed, but likely only work on 
> darknet.
> 
> After we exit the random routing or rendezvous tunnel phase, we can safely do 
> a number of risky optimisations. One of the most frequently suggested is 
> returning data directly. Clearly we do want the data to be cached on enough 
> nodes, so we need to be careful here; we may want to transfer it just to 
> cache it, but probably not on ALL nodes on the path. 
> 
> Let's say we've decided we don't need to cache it on every node on the path. 
> Path folding also occurs in the routing stage, so we could efficiently 
> combine the two: If we want the node, we could connect to it and transfer the 
> data, and upgrade it to a full connection if the transfer succeeds and 
> validates. If we don't want the node, direct transfer would leak more 
> information than we do already, which may be an issue depending on e.g. the 
> number of end-points.
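
The decision being described reduces to roughly the following shape (hypothetical
names, only the policy, not an implementation):

    /** Sketch of combining path folding with direct return: if we would
     *  path-fold to the data source anyway, the direct transfer doubles as the
     *  new connection's trial period; otherwise return the data along the path. */
    public class DirectReturnSketch {
        public enum ReturnMode { DIRECT_AND_MAYBE_FOLD, ALONG_PATH }

        public static ReturnMode chooseReturnMode(boolean wantAsPeer, boolean pastRandomPhase) {
            // Only consider direct return once we've left the random-routing /
            // tunnel phase, i.e. once the "risky optimisations" are safe.
            if (!pastRandomPhase) return ReturnMode.ALONG_PATH;
            // Direct transfer to a node we don't want as a peer leaks extra
            // information about the endpoints, so keep it on the path instead.
            return wantAsPeer ? ReturnMode.DIRECT_AND_MAYBE_FOLD : ReturnMode.ALONG_PATH;
        }
    }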
> 
> Downside: Encourages people to use opennet. But if we can make opennet 
> sufficiently secure, is this a problem? Darknet is a fallback, and we need it, 
> but it's not going to happen unless we have a lot of users first. We can 
> further encourage use of darknet by FOAF connections, and possibly by making 
> opennet connections additional to darknet connections. Also there are other 
> things we can do on darknet: Not only FOAF connections, but also some form of 
> bursting; bloom filter sharing is easier because connections are longer lived 
> (although maybe not dramatically); and if we trust our direct peers, there 
> may be other possible optimisations e.g. broadcast to trusted peers only 
> before requesting. 
> 
With the above, our threat model is as follows:
- No advanced traffic analysis. (Tor makes the same assumption). We are well 
placed to limit the impact of traffic analysis in future though IMHO, since a 
lot of our traffic isn't realtime and we're a *storage* network.
- No Sybil!
-- User doesn't care about correlation attacks (i.e. attacker has a significant 
proportion of the network, or is nearby already). If they do, they enable 
rendezvous tunnels for everything.
-- Attacker does not have the resources (or time) to take over the routing 
table of each node one by one and see what happens (or to just connect to 
everyone).
-- Attacker does not have the resources to totally dominate routing, so that 
routing to a random location will end up on an attacker's node with high 
probability.

Most of these are going to be common assumptions for anything scalable.

Darknet does make violating them very difficult. Darknet is the long term 
future.

All the stuff above recommends rendezvous tunnels rather than mixnet tunnels.

Mixnet tunnels *might* be more secure than rendezvous tunnels, but we'd need to 
solve the peer selection problem, which likely involves routing; so if an 
attacker can control the keyspace he can likely compromise most tunnels. Even 
Tor has had route selection attacks, and Tor *has a published list of routers*! 
On opennet, direct connection between onion participants would be possible, 
which would improve performance, but make traffic analysis much easier. On 
darknet, we'd have to route between nodes internally, which is still usable for 
Really Important Small Stuff. So I don't think that a mixnet layer solves our 
other problems convincingly. Plus it might make harvesting even easier, 
although I don't know whether the tunnels would be long-lived.

Also IMHO protecting inserters is more important than protecting requesters, 
and we're clearly better at protecting inserters than downloaders.

HOWEVER... initial random route may have costs similar to mixnets with direct 
connections ... Rendezvous tunnels certainly would; their main advantage is 
that they are better against traffic analysis, and that they solve the peer 
selection problem in a different way (one which isn't vulnerable to routing 
exploits, if they are routed purely randomly).

On insert, this doesn't really matter - but only because the number of hops is 
probably too high already, and because random initial routing helps us escape 
any dead-end we might be in. Plus inserts are inherently easier to protect. And 
all sorts of elaborate schemes are possible, e.g. timing delays, or even 
inserting under random keys and asking another node to convert the data to its 
final location, possibly with several layers (people have discussed 
implementing this at the client layer as a plugin).

On requests, random route will cost us significant performance. The cost of 
rendezvous tunnels is only slightly higher, because of not caching; for popular 
content, random route is better, if less secure; for unpopular content it 
doesn't matter, so we should always use rendezvous tunnels. However a short 
mixnet layer, with direct connections, would be cheaper than either of these 
options, at least for unpopular content. The caveats are similar to those for 
rendezvous tunnels: You can't use just one tunnel if you want good performance. 
The other problems are peer selection and traffic analysis (short direct 
connections are highly vulnerable; tunnel padding is an unsolved problem; etc). 
Rendezvous tunnels avoid the peer selection problem, and are less obvious on 
traffic analysis, but are still vulnerable to hop-by-hop 
DoS-all-possible-predecessors-one-at-a-time attacks. Worse, what happens when 
we lose a connection, even in the absence of active attacks? Rendezvous tunnels 
are longer, so they are more likely to have such problems, as well as being 
more wasteful.

Argh! How do we separate things properly?

First, HTL is probably too big.

Inserts of unpredictable files:
- Data is transferred over all HTL hops.
- No MAST, small number of samples from logs once the file is published.
- Extra security is possible, but not always required.

Inserts of predictable files:
- Data is transferred over all HTL hops.
- MAST-like samples.
- Essential that it be protected somehow.

Caching starts when we reach a threshold HTL. Tunnels would eliminate the need 
for this. We don't necessarily need to transfer the data through, and cache it 
on, all the nodes prior to the "ideal" area.
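
For reference, the current rule has roughly the following shape; the constants
here are illustrative, not the real values:

    /** Sketch of the cache-above-threshold rule: don't cache on the first few
     *  hops (high HTL), so the nodes nearest the originator don't incriminate
     *  themselves by caching what they request. */
    public class CachingRuleSketch {
        private static final int MAX_HTL = 18;                   // illustrative
        private static final int CACHE_THRESHOLD_HTL = MAX_HTL - 2;

        /** Cache only once HTL has dropped to the threshold, i.e. once we are
         *  plausibly a few hops away from the originator. */
        public static boolean shouldCache(int htl) {
            return htl <= CACHE_THRESHOLD_HTL;
        }
    }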

Requests:
- Immediately identifiable.
- Some ambiguity re other requesters.
- Popular files: Satisfied quickly by caches.
- Rare files: Need to go to the "ideal" area.
- Still have the caching rules, which means the cost of tunnels would be offset 
by improved caching.
- MAST works.
- Random initial routing would blur MAST, but the attacker still gets samples in 
the random phase. We want the random phase to reach the whole network, so this 
is significant.
- Random routed bundles would give rather fewer samples, with a lot of the same 
complications and security/performance tradeoffs as tunnels.
- Random initial routing is likely costly, in spite of the caching improvement.
- Rendezvous tunnels are not much more expensive, and give better security, for 
rare content. We can switch to them automatically.
- Direct mix tunnels are cheaper and safer, IF we ignore traffic analysis, peer 
selection, etc.

Traffic analysis:
- We could easily provide enough CBR bitrate to cover control messages.
- We could easily do what Mixmaster does (for bulk transfer requests): wait 
until enough blocks have arrived, and then send several at once (rough sketch 
below). Direct mixnet connections make this a lot less useful, unless those 
connections are used for a lot of different tunnels, in which case we have a 
classic mixnet, which might be exactly what we want for Freenet: slightly 
higher latency than an onion network, but better security in practice.
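
A minimal sketch of that batching step; the threshold and the shuffle policy are
arbitrary choices for illustration:

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Queue;

    /** Sketch of Mixmaster-style batching for bulk transfers: queue outgoing
     *  blocks and only release a shuffled batch once enough have accumulated,
     *  so an observer can't trivially match blocks in to blocks out. */
    public class BatchingSketch {
        private static final int BATCH_SIZE = 8;             // illustrative
        private final Queue<byte[]> pending = new ArrayDeque<>();

        /** Queue a block; returns a shuffled batch to send now, or an empty list
         *  if we should keep waiting. */
        public synchronized List<byte[]> offer(byte[] block) {
            pending.add(block);
            if (pending.size() < BATCH_SIZE) return Collections.emptyList();
            List<byte[]> batch = new ArrayList<>(pending);
            pending.clear();
            Collections.shuffle(batch);                       // break arrival ordering
            return batch;
        }
    }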

Hence IMHO in principle, *for regular requests alone* (no mixing), we could 
have a reasonable degree of resistance to passive traffic analysis, on the busy 
nodes, simply by batching. This does not protect the original requester, who 
will clearly have more data coming in than going out. But as soon as their 
requests reach busy nodes, they won't be easily followed. A bolted-on onion 
network would not have this property, but a true mixnet might. Clearly Freenet 
as-is has bigger problems...

The standard answer to this is "traffic analysis isn't part of your threat 
model, so ignore it". But is that the right answer for Freenet?

Peer selection:
- Can be compromised by a powerful attacker.
- In which case they can also break the network in various ways (but they 
wouldn't want to); and they're likely powerful enough to do hop-by-hop attacks 
on rendezvous tunnels or random routed bundles.

Active attacks: (e.g. take out the possible predecessors and work backwards one 
hop at a time)
- Are possible on any of bundles, rendezvous tunnels or mix tunnels.
- If tunnels only go through reliable peers, we could drop tunnels when we 
unexpectedly lose a peer. But this would likely result in the originator 
retrying elsewhere, which gives the attacker more opportunities. Failover at 
each hop would probably be detectable too.
- And we have all the usual more subtle attacks e.g. traffic labelling (tunnel 
padding is an unsolved problem last I heard). There is a lot of literature on 
mixnets and somebody would need to look into it.
- It appears to be possible to solve this problem, at least partially; see e.g. 
the mix rings / onion rings paper, and others... AFAICS this is classified as a 
blending attack?

Partitioning is an issue if we can exit anywhere...?
