On Thursday 02 April 2009 20:18:32 Matthew Toseland wrote:
> There are lots of ways we can improve Freenet's performance, and we will 
> implement some of the more interesting ones in 0.9: For example, sharing 
> Bloom filters of our datastore with our peers will gain us a lot, although 
> to what degree it can work on opennet is an open question, and encrypted 
> tunnels may eat up most of the hops we gain from Bloom filters. And new 
> load management will help too when we eventually get there. However, at 
> least for popular data, we can never achieve the high, transient download 
> rates that bursty filesharing networks can. How does that affect our 
> target audience and our strategy for getting people to use Freenet in 
> general? Does it affect it?

I am not sure that all of the assumptions behind the above conclusions are 
valid. We may be able to get away with a certain amount of burstiness, 
especially if users have set a low security level.

Specifically:

I assumed that traffic levels are observable in real time, or sufficiently 
close to real time, on all links:

Some links are inherently unobservable:
- CBR (constant-bitrate) links.
- Steganographic links where the protocol being emulated dictates traffic 
patterns.
- Sneakernet links: daily exchange of big USB sticks is high bandwidth and 
practically undetectable, but short-range, inconvenient and high latency.
- Private unobservable networks.

I assumed that bursts start on one node and fan out to the whole network:

The risk of an attacker compromising a burst is related to several factors:
- The number of nodes on which the burst is observable (i.e. constitutes some 
fraction of traffic).
- The distance of the outer nodes from the center.
- Whether the user cares about the burst being compromised.
- Whether the user cares about the burst being *visible*.
- Whether the protocol is real-time and end-to-end.

If Alice's node has a darknet connection to Bob's node, and Bob's node has the 
whole of the file Alice wants, clearly Alice can burst the data from Bob's 
node at link speed, with the only risks being 
1) that a passive adversary can tell she is doing a big download from Bob, and 
maybe infiltrate Bob to guess at what it is, 
2) that the size of the burst itself is revealed, and 
3) the possibility that Bob himself is the bad guy.

The same argument applies if Alice bursts data from all her darknet peers. Of 
course this reveals to Alice that Bob has all this data in his datastore, 
which may be an issue, but less so on darknet. Also, tunneling breaks this, 
unless we let our trusted darknet peers have access to our client cache; but 
it would still work for really popular data. An interesting point here is 
that allowing trusted darknet peers access to our local downloads (our client 
cache, or however we implement it) actually improves security considerably, 
as well as performance, provided they are in fact trustworthy...

What about opennet? Revealing your store to your opennet peers is probably a 
bad idea, unless we have tunneling...

More broadly, beyond what point does a burst become invisible? It depends on 
the number of peers and the background traffic level. If all requests are 
bursts, and they are satisfied quickly, an attacker will be able to trace 
most routes. Number the hops: the originator is hop 0, his peers are hop 1, 
their peers are hop 2. If the originator trusts his peers, the attacker we 
should worry about sits in hop 2. Half of the requests coming in from the hop 
1 nodes will be the result of this spike, and the attacker knows from traffic 
analysis which of the hop 1 nodes is responsible, so traffic analysis plus 
being one of the ~400 hop 2 nodes (fanout of 20, assuming 40 peers and 50% 
overlap) gives the attacker a hit (correlation attacks help here). Hop 3 is 
probably relatively safe, as it is only 1 in 40 and there will be more 
possible request sources, although correlation attacks are a worry; hop 4 
should be safe...
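
As a back-of-envelope check on the numbers above, here is a sketch (in Java, 
since that is what the node is written in; the class is purely illustrative, 
not Freenet code) of how fast the candidate originator set grows, assuming 40 
peers per node with 50% overlap between peer sets, i.e. an effective fanout 
of 20:

public class BurstFanout {
    public static void main(String[] args) {
        final int fanout = 20; // 40 peers per node, 50% overlap
        long nodesAtHop = 1;   // hop 0: the originator alone
        for (int hop = 1; hop <= 4; hop++) {
            nodesAtHop *= fanout;
            // An attacker at this hop, before traffic analysis, is
            // guessing among roughly this many candidate originators.
            System.out.printf("hop %d: ~%d nodes%n", hop, nodesAtHop);
        }
    }
}

This prints ~20 nodes at hop 1, ~400 at hop 2 (hence "1 in 400" above), ~8000 
at hop 3 and ~160000 at hop 4; traffic analysis then narrows the guess within 
each hop.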

If we tunnel requests that can't be answered by our trusted peers, this 
changes things somewhat. Download speed is limited by the capacity and number 
of the tunnels. The capacity of a tunnel is limited by security concerns: 
there must be ambiguity over which tunnel going into a node matches which 
tunnel coming out of it, so there must be multiple tunnels going through a 
node, and enough background traffic to make it difficult to match them by 
comparing incoming and outgoing traffic patterns. And the more tunnels you 
use, the more predecessor samples you give to an attacker, so roughly p ~= t 
* c^2 / n^2, since the attacker needs the endpoint and probably something in 
hop 2. Also, tunnels in Freenet will need to be quite long, to make sure the 
possible originator set for an endpoint is large. So this is tunable per 
security level: at LOW, don't use tunnels; at MEDIUM, use lots of tunnels 
(roughly BitTorrent over Tor); at HIGH, use few tunnels; at MAXIMUM, use one 
tunnel. If using lots of tunnels, or no tunnels, bursts are feasible while 
not being very secure.
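
To make the tunnel-count tradeoff concrete, here is a minimal sketch 
evaluating the rough formula above, assuming (as seems intended) that t is 
the number of tunnels, c the number of attacker-controlled nodes and n the 
network size; all the numbers are hypothetical:

public class TunnelRisk {
    // p ~= t * (c/n)^2: every extra tunnel hands the attacker another
    // predecessor sample, and he needs roughly the endpoint plus a node
    // near the originator.
    static double compromiseProbability(int tunnels, long c, long n) {
        double hostileFraction = (double) c / n;
        return tunnels * hostileFraction * hostileFraction;
    }

    public static void main(String[] args) {
        long n = 10000, c = 100; // hypothetical: 1% of nodes are hostile
        int[] tunnelCounts = { 1, 3, 20 }; // roughly MAXIMUM, HIGH, MEDIUM
        for (int t : tunnelCounts)
            System.out.printf("t=%d tunnels: p ~= %.4f%n",
                    t, compromiseProbability(t, c, n));
    }
}

With 1% of nodes hostile this gives p ~= 0.0001 for one tunnel and p ~= 0.002 
for twenty: each extra tunnel buys throughput at a linear cost in exposure, 
which is the tradeoff the security levels encode.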

What if the user doesn't care about the burst being compromised?

What if the protocol is no longer real-time and end-to-end? Plans for 
persistent requests in 0.10 allow for bursting data back to a node when it 
comes back online, this data having trickled through the network; this may 
help to obscure data flows, but it is by definition about long-term requests, 
not instant gratification.


How to achieve such flexibility? Various threads:

Bloom filters and tunnels:
1. Share Bloom filters of our datastore with our trusted peers (see the 
sketch after this list).
2. Seriously consider whether to make Bloom-filter-derived fetches from our 
trusted peers exempt from load limiting: they will only go one hop; the only 
question is whether we want our download burst to be visible. Probably this 
is another HIGH vs MAXIMUM setting. The same question can be asked of 
fetching offered keys from peers.
3. Implement encrypted tunnels. Make the number of tunnels configurable, with 
defaults based on the security level.
4. Share Bloom filters of our datastore with our non-trusted opennet peers.
5. Implement client cache, or something similar, and share our client cache 
with our trusted peers, provided that trust is HIGH / friends trust level is 
LOW. This is a clear tradeoff between our friends knowing what we are doing 
versus giving our enemies more shots at it, hence a useful option for some 
users.
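
For item 1, a minimal sketch of what the shared datastore summary could look 
like (names and parameters here are hypothetical illustrations, not Freenet's 
actual implementation): each node adds its store keys to a Bloom filter, 
ships the bit set to trusted peers, and a peer probes it before deciding to 
make a one-hop fetch.

import java.util.BitSet;

// Minimal Bloom-filter sketch for summarizing a datastore to trusted
// peers. Hypothetical illustration only.
public class StoreBloomFilter {
    private final BitSet bits;
    private final int size;   // number of bits in the filter
    private final int hashes; // hash functions per key

    StoreBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive the i-th bit index from the key bytes. Real routing keys are
    // already uniformly distributed, so slicing the key directly would do.
    private int index(byte[] key, int i) {
        int h = i * 0x9e3779b9;
        for (byte b : key) h = h * 31 + b;
        int idx = h % size;
        return idx < 0 ? idx + size : idx;
    }

    void add(byte[] key) {
        for (int i = 0; i < hashes; i++) bits.set(index(key, i));
    }

    // False positives are possible, false negatives are not: "true" only
    // justifies a cheap one-hop probe to the peer, never a guarantee.
    boolean mightContain(byte[] key) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(index(key, i))) return false;
        return true;
    }
}

Note that a false positive only wastes a one-hop probe, which is cheap; the 
real cost of the filter is what it reveals about the store, which is why 
items 4 and 5 are security-level decisions.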

Load management:
Better load management (e.g. token passing) would likely improve performance 
and flexibility, e.g. on fast darknet pockets. If a peer does not care about 
bursts being visible, and it is a trusted peer or has proven its usefulness, 
we may want to allow it reasonable bursts, based on its prior reputation. 
Burstiness of our own requests should be a configurable parameter to load 
management:
- How much do we limit ubernodes? This is a routing/capacity tradeoff as well 
as a security/capacity tradeoff, but there should probably still be different 
compromises for different security levels. By definition the misrouting cost 
of routing to a fast node is fairly low.
- How much do we favour our own requests? If we favour our own requests too 
much we will be punished by other nodes, especially on opennet, but we can 
probably get away with short-term bursts. However, for security we would 
probably want to treat our own requests as just another node vying for 
capacity.
- How much burstiness do we allow in incoming requests from any given peer? 
(E.g. the capacity of the token bucket used to enforce fairness when 
accepting requests; see the sketch after this list.) This would be affected 
by whether they are a darknet peer, what their track record is on serving 
requests, and tunable parameters.
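
And a sketch of the per-peer token bucket from the last bullet, where 
capacity and refill rate are hypothetical tunables to be set from trust and 
track record:

// Per-peer token bucket: capacity bounds the burst a peer may send,
// refillRate bounds its long-term share. Hypothetical sketch, not
// Freenet's actual load-management code.
public class PeerTokenBucket {
    private final double capacity;   // max burst size, in requests
    private final double refillRate; // steady-state requests per second
    private double tokens;
    private long lastNanos;

    PeerTokenBucket(double capacity, double refillRate) {
        this.capacity = capacity;
        this.refillRate = refillRate;
        this.tokens = capacity;
        this.lastNanos = System.nanoTime();
    }

    // Accept a request if a token is available. A trusted or well-behaved
    // peer would simply be given a larger capacity.
    synchronized boolean tryAcceptRequest() {
        long now = System.nanoTime();
        tokens = Math.min(capacity,
                tokens + (now - lastNanos) * 1e-9 * refillRate);
        lastNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}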

Persistent requests:
Current plans do not seem to present any problems with regards to bursting, 
apart from those addressed already.