On Thursday 02 April 2009 20:18:32 Matthew Toseland wrote:
> There are lots of ways we can improve Freenet's performance, and we will
> implement some of the more interesting ones in 0.9. For example, sharing
> Bloom filters of our datastore with our peers will gain us a lot, although
> to what degree it can work on opennet is an open question, and encrypted
> tunnels may eat up most of the hops we gain from Bloom filters. And new
> load management will help too when we eventually get there. However, at
> least for popular data, we can never achieve the high, transient download
> rates that bursty filesharing networks can. How does that affect our
> target audience and our strategy for getting people to use Freenet in
> general? Does it affect it?
I am not sure that all of the assumptions behind the above conclusions are valid. We may be able to get away with a certain amount of burstiness, especially if users have set a low security level. Specifically:

I assumed that traffic levels are observable in real time, or sufficiently close to real time, on all links. Some links are inherently unobservable:
- CBR links.
- Stego where the protocol being emulated dictates traffic patterns.
- Sneakernet links: daily exchange of big USB sticks is high bandwidth and practically undetectable, but short-range, inconvenient and high latency.
- Private unobservable networks.

I assumed that bursts start on one node and fan out to the whole network. The risk of an attacker compromising a burst is related to several factors:
- The number of nodes on which the burst is observable (i.e. constitutes some noticeable fraction of traffic).
- The distance of the outer nodes from the center.
- Whether the user cares about the burst being compromised.
- Whether the user cares about the burst being *visible*.
- Whether the protocol is real-time and end-to-end.

If Alice's node has a darknet connection to Bob's node, and Bob's node has the whole of the file Alice wants, Alice can clearly burst the data from Bob's node at link speed. The only risks are 1) that a passive adversary can tell she is doing a big download from Bob, and may infiltrate Bob to guess at what it is, 2) that the size of the burst is visible, and 3) the possibility that Bob is the bad guy. The same argument applies if Alice bursts data from all her darknet peers. Of course this reveals to Alice that Bob has all this data in his datastore, which may be an issue, but less so on darknet. Also, tunneling breaks this, unless we let our trusted darknet peers have access to our client cache; but it would still work for really popular data.
An interesting point here is that allowing trusted darknet peers access to our local downloads (our client cache, or however we implement it) actually improves security considerably, as well as performance, provided they are in fact trustworthy...

What about opennet? Revealing your store to your opennet peers is probably a bad idea, unless we have tunneling...

More broadly, at what point does a burst become invisible? It depends on the number of peers and the background traffic level. If all requests are bursts, and they are satisfied quickly, an attacker will be able to trace most routes. If the originator trusts his/her peers, we have the originator (hop 0), then his peers (hop 1), then their peers (hop 2), and we should consider an attacker in hop 2. Half of the requests coming in from the hop 1 nodes will be the result of this spike, and the attacker knows from traffic analysis which of the hop 1 nodes is responsible; so traffic analysis plus 1 in 400 nodes (20 fanout per hop, assuming 40 peers and 50% overlap) gives the attacker a hit (correlation attacks help here). Hop 3 is probably relatively safe as it is only 1 in 40 and there will be more possible request sources, although correlation attacks are a worry; hop 4 should be safe...

If we tunnel requests that can't be answered by our trusted peers, this changes things somewhat. Download speed is limited by the capacity and number of the tunnels. The capacity of tunnels is limited by security concerns: there must be ambiguity over which tunnel going into a node matches which tunnel coming out of it, so there must be multiple tunnels going through a node, and enough background traffic to make it difficult to match them by correlating incoming and outgoing traffic patterns. And the more tunnels you use, the more predecessor samples you give to an attacker, so roughly p ~= t * c^2/n^2, since the attacker needs the endpoint and probably something in hop 2.
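To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch in Python. The variable names, and my reading of the symbols in p ~= t * c^2/n^2 (t tunnels used, c compromised nodes out of n total), are my own assumptions for illustration, not anything from the Freenet code:

```python
# Fanout arithmetic for the hop-2 attacker: 40 peers per node with
# 50% overlap between peer sets gives ~20 new nodes per hop.
peers = 40
overlap = 0.5
fanout = int(peers * (1 - overlap))   # ~20 new nodes per hop

hop1 = fanout                         # nodes one hop out: 20
hop2 = hop1 * fanout                  # nodes two hops out: ~400
p_hop2_hit = 1.0 / hop2               # one compromised node's chance of
                                      # sitting in hop 2 of a given burst

def predecessor_risk(t, c, n):
    """Rough predecessor-sample estimate p ~= t * c^2 / n^2:
    with t tunnels, the attacker needs a compromised node at the
    endpoint AND (roughly) one more in the right place, each with
    probability ~c/n (my interpretation of the symbols)."""
    return t * c**2 / n**2
```

The point of the c^2/n^2 factor is that using more tunnels (larger t) linearly increases the attacker's number of chances, so burst capacity bought with extra tunnels is paid for directly in predecessor exposure.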
Also, tunnels in Freenet will need to be quite long, to make sure the possible originator set for an endpoint is large. So this is tunable by security level: at LOW, don't use tunnels; at MEDIUM, use lots of tunnels (~= BitTorrent over Tor); at HIGH, use few tunnels; at MAXIMUM, use one tunnel. If using lots of tunnels, or no tunnels, bursts are feasible while not being very secure.

What if the user doesn't care about the burst being compromised? What if the protocol is no longer real-time and end-to-end? Plans for persistent requests in 0.10 allow for bursting data back to a node when it comes back online, this data having trickled through the network; this may help to obscure data flows, but it is by definition about long-term requests, not instant gratification.

How to achieve such flexibility? Various threads:

Bloom filters and tunnels:
1. Share Bloom filters of our datastore with our trusted peers.
2. Seriously consider whether to make Bloom-filter-derived fetches from our trusted peers exempt from load limiting: they will only go one hop, so the only question is whether we want our download burst to be visible. Probably this is another HIGH vs MAXIMUM setting. The same question can be asked of fetching offered keys from peers.
3. Implement encrypted tunnels. Make the number of tunnels configurable, with defaults based on the security level.
4. Share Bloom filters of our datastore with our non-trusted opennet peers.
5. Implement the client cache, or something similar, and share our client cache with our trusted peers, provided that trust is HIGH / friends trust level is LOW. This is a clear tradeoff between our friends knowing what we are doing versus giving our enemies more shots at it, hence a useful option for some users.

Load management:
Better load management (e.g. token passing) would likely improve performance and flexibility, e.g. on fast darknet pockets.
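For readers unfamiliar with the data structure behind items 1 and 4: a Bloom filter lets a node advertise (approximately) which keys its datastore holds without sending the key list itself. A minimal sketch, with illustrative parameters that are my own and not Freenet's actual filter sizing:

```python
import hashlib

class BloomFilter:
    """Compact set summary: membership tests can give false positives
    but never false negatives, so a peer can safely skip us for keys
    the filter rejects, and try us directly for keys it accepts."""

    def __init__(self, size_bits=8192, hashes=5):
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: bytes):
        # Derive k bit positions per key by salting a hash (illustrative).
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:4], 'big') % self.size

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: bytes):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

# A node would periodically send self.bits to each trusted peer; the
# peer tests candidate keys against it before routing a request.
store = BloomFilter()
store.add(b"example-routing-key")
assert b"example-routing-key" in store
```

The one-hop fetch in item 2 follows directly: if a peer's filter accepts the key, we can ask that peer first and skip normal routing entirely, which is exactly why exempting such fetches from load limiting is plausible.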
If a peer does not care about bursts being visible, and it is a trusted peer or has proven its usefulness, we may want to allow it reasonable bursts, based on its prior reputation. The burstiness of our own requests should be a configurable parameter to load management:
- How much do we limit ubernodes? This is a routing/capacity tradeoff as well as a security/capacity tradeoff, but there should probably still be different compromises for different security levels. By definition the misrouting cost of routing to a fast node is fairly low.
- How much do we favour our own requests? If we favour our own requests too much we will be punished by other nodes, especially on opennet, but we can probably get away with short-term bursts. However, for security we would probably want to treat our own requests as just another node vying for capacity.
- How much burstiness do we allow incoming requests from any given peer? (E.g. the capacity of the token bucket used to enforce fairness when accepting requests.) This would be affected by whether they are a darknet peer, their track record on serving requests, and tunable parameters.

Persistent requests:
Current plans do not seem to present any problems with regard to bursting, apart from those addressed already.
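The per-peer token bucket mentioned in the last point can be sketched briefly: the bucket capacity is the tunable burst allowance, and the refill rate is the peer's steady-state fair share. The class name and numbers below are illustrative, not Freenet's actual implementation:

```python
import time

class TokenBucket:
    """Per-peer request limiter: capacity bounds the burst a peer can
    send at once; refill_per_sec bounds its long-term average rate."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill = refill_per_sec
        self.tokens = capacity          # start with a full burst allowance
        self.last = time.monotonic()

    def try_accept(self, cost: float = 1.0) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False    # reject: peer has exhausted its burst allowance

# A trusted darknet peer with a good track record might get a larger
# capacity (bigger permitted bursts) at the same long-term rate:
stranger = TokenBucket(capacity=10, refill_per_sec=2)
friend = TokenBucket(capacity=100, refill_per_sec=2)
```

Tuning capacity per security level and per track record, while holding refill_per_sec to the fair share, is one concrete way to "allow reasonable bursts based on prior reputation" without changing long-term fairness.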