On Thursday 24 June 2010 19:56:58 Matthew Toseland wrote:
> On Thursday 24 June 2010 18:28:42 Matthew Toseland wrote:
> > On Thursday 24 June 2010 04:05:48 Evan Daniel wrote:
> > > On Wed, Jun 23, 2010 at 5:43 PM, Matthew Toseland
> > > <t...@amphibian.dyndns.org> wrote:
> > > > On Wednesday 23 June 2010 20:33:50 Sich wrote:
> > > >> On 23/06/2010 21:01, Matthew Toseland wrote:
> > > >> > Insert a random, safe key
> > > >> > This is much safer than the first option, but the key will be 
> > > >> > different every time you or somebody else inserts the key. Use this 
> > > >> > if you are the original source of some sensitive data.
> > > >> >
> > > >> >
> > > >> Very interesting for filesharing if we split the file.
> > > >> When some chunks are lost, you only have to reinsert the ones that
> > > >> are lost... But then we use much more datastore... But it's more
> > > >> secure... Losing datastore space is a big problem, no?
> > > >
> > > > If some people use the new key and some use the old then it's a 
> > > > problem. If everyone uses one or the other it isn't. I guess this is 
> > > > another reason to use par files etc (ugh).
> > > >
> > > > The next round of major changes (probably in 1255) will introduce 
> > > > cross-segment redundancy which should improve the reliability of  
> > > > really big files.
> > > >
> > > > Long term we may have selective reinsert support, but of course that 
> > > > would be nearly as unsafe as reinserting the whole file to the same key 
> > > > ...
> > > >
> > > > If you're building a reinsert-on-demand based filesharing system let me 
> > > > know if you need any specific functionality...
> > > 
> > > The obvious intermediate is to reinsert a small portion of a file.
> > > The normal case is (and will continue to be) that when a file becomes
> > > unretrievable, it's because one or more segments are only a couple of
> > > blocks short of being retrievable.  If you reinsert say 8 blocks out
> > > of each segment (1/32 of the file), you'll be reinserting on average 4
> > > unretrievable blocks from each segment.  That should be enough in a
> > > lot of cases.  This is probably better than selective reinsert (the
> > > attacker doesn't get to choose which blocks you reinsert as easily),
> > > though it does mean reinserting more blocks (8 per segment when merely
> > > reinserting the correct 3 blocks might suffice).
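> > > 
> > > Something like this minimal sketch, in Java (Segment and insertBlock
> > > are hypothetical stand-ins, and the block counts are illustrative):
> > > 
> > >     import java.util.ArrayList;
> > >     import java.util.Collections;
> > >     import java.util.List;
> > >     import java.util.Random;
> > > 
> > >     interface Segment {
> > >         void insertBlock(int blockNumber);
> > >     }
> > > 
> > >     class PartialReinserter {
> > >         private static final int BLOCKS_PER_SEGMENT = 256;  // data + check
> > >         private static final int REINSERT_PER_SEGMENT = 8;  // 8/256 = 1/32
> > > 
> > >         private final Random random = new Random(); // strong RNG in practice
> > > 
> > >         void reinsert(List<Segment> segments) {
> > >             for (Segment segment : segments) {
> > >                 List<Integer> indices = new ArrayList<Integer>();
> > >                 for (int i = 0; i < BLOCKS_PER_SEGMENT; i++) indices.add(i);
> > >                 Collections.shuffle(indices, random);
> > >                 // We choose the blocks, not the attacker; near the failure
> > >                 // threshold roughly half the blocks are missing, so 8
> > >                 // random picks hit ~4 unretrievable blocks per segment.
> > >                 for (int i = 0; i < REINSERT_PER_SEGMENT; i++) {
> > >                     segment.insertBlock(indices.get(i));
> > >                 }
> > >             }
> > >         }
> > >     }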
> > > 
> > > The simple defense against a mobile opennet attacker that has been
> > > proposed before would be particularly well suited to partial
> > > randomized reinserts.  The insert comes with a time (randomized per
> > > block, to some time a bit before the reinsert started), and is only
> > > routed along connections that were established before that time, until
> > > it reaches some relatively low HTL (10?).  This prevents the attacker
> > > from moving during the insert.  On a large file that takes a long time
> > > to insert, this is problematic, because there aren't enough
> > > connections that are old enough to route along.  For a partial
> > > reinsert, this is less of a concern, simply because it doesn't take as
> > > long.
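> > > 
> > > A minimal sketch of that routing restriction (PeerNode and its
> > > methods are simplified stand-ins):
> > > 
> > >     import java.util.ArrayList;
> > >     import java.util.List;
> > > 
> > >     interface PeerNode {
> > >         long connectedSince(); // when this connection was established
> > >     }
> > > 
> > >     class TimeRestrictedRouter {
> > >         private static final int UNRESTRICTED_HTL = 10; // "low HTL (10?)"
> > > 
> > >         // Above the threshold HTL, only connections older than the
> > >         // (per-block randomized) insert timestamp are candidates, so
> > >         // the attacker cannot move into the path during the insert.
> > >         List<PeerNode> candidates(List<PeerNode> peers, long insertTime,
> > >                                   int htl) {
> > >             List<PeerNode> result = new ArrayList<PeerNode>();
> > >             for (PeerNode peer : peers) {
> > >                 if (htl <= UNRESTRICTED_HTL
> > >                         || peer.connectedSince() < insertTime) {
> > >                     result.add(peer);
> > >                 }
> > >             }
> > >             return result;
> > >         }
> > >     }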
> > 
> > This is a very good point. What if we can improve on this further?
> > 
> > By implementing long-term requests, we could have *all* the requests for a 
> > splitfile go out *at once*, be routed immediately, and then return the data 
> > over a long period. This means that:
> > 1) The data needs to be trickled back even if nodes go offline - either via 
> > rerouting (but consider carefully how to make this safe, e.g. establishing 
> > backup routes at the time, or using a node identifier for the next hop so 
> > we can reroute via FOAFs without involving the originator so not giving a 
> > data point away), or by waiting for the nodes to come back online.
> > 2) Load management needs to be able to deal with the fact that we have 
> > thousands of requests in flight. This means it may not work on opennet, 
> > because there is no underlying trust; although we could maybe have a 
> > reputation system to build up some amount of trust. Trust can be translated 
> > into capacity limits.
> > 3) The mobile attacker defence holds: If all the requests get routed inside 
> > a few minutes, and then return data along fixed paths, the attacker has no 
> > chance of moving towards the originator. And this works even for fairly big 
> > files, without the overhead of tunnels, for requests and inserts of 
> > predictable data.
> > 4) Overheads should be reasonable, because we can bundle a large number of 
> > requests together efficiently.
> > 5) We get "burst" behaviour. If we have a fast connection, the data will be 
> > returned fast.
> > 6) We get slow-return behaviour. In many cases it will take a long time for 
> > the data to trickle back. At each hop it will make sense to send one key at 
> > a time, if we happen to have multiple keys fully available.
> > 7) The node needs to be able to cope with a large number of requests 
> > pending: We can keep the current code for routing them, but once we have a 
> > route, the requests as well as the transfers need to be threadless (see the 
> > sketch after this list).
> > 8) We need to be able to specify that we want a fast response on a request 
> > and that it should not get queued and trickled through/around offline nodes.
> > 9) We need a different way to handle timeouts. At the moment we detect when 
> > a node is busted by not getting the data within X period. If the node has a 
> > backlog of thousands of blocks then this is clearly not going to work. 
> > However we do expect a response eventually. So we need some way to take 
> > this into account and identify problems.
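> > 
> > A rough sketch of how a bundle might route everything at once and then
> > go threadless (points 3 and 7; all types hypothetical):
> > 
> >     import java.util.ArrayList;
> >     import java.util.List;
> > 
> >     interface BlockRequest {}
> > 
> >     interface Router {
> >         // Route now, refusing connections newer than the timestamp.
> >         void routeRestricted(BlockRequest request, long notNewerThan);
> >     }
> > 
> >     class RequestBundle {
> >         // One starting timestamp shared by the whole bundle, so we only
> >         // route along connections established before it.
> >         final long startTimestamp = System.currentTimeMillis();
> >         private final List<BlockRequest> requests =
> >             new ArrayList<BlockRequest>();
> > 
> >         void add(BlockRequest request) { requests.add(request); }
> > 
> >         // The whole bundle is routed at once (point 3); no thread is
> >         // held afterwards, data trickles back asynchronously (point 7).
> >         void routeAll(Router router) {
> >             for (BlockRequest request : requests) {
> >                 router.routeRestricted(request, startTimestamp);
> >             }
> >         }
> >     }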
> > 
> > This is an elaboration on a chain of thought that goes way back, thanks 
> > particularly to the person who came up with the mobile attacker 
> > countermeasure, but we've been pondering passive requests for a long time 
> > ... It also combines ideas from CCN's hands-off load management and various 
> > people who have suggested load management changes over the years e.g. on 
> > Frost...
> > 
> > This looks like a major rewrite, maybe 0.10 era, but it would be worth it, 
> > the potential benefits are rather enormous...
> > 
> > Of course we'd still need tunnels for e.g. Frost posts, stuff where you 
> > have multiple completed tasks over a period where the attacker could get a 
> > sample from each one.
> > 
> > PRINCIPLES IN DETAIL:
> > 
> > 1. We send requests when we have requests. We queue them as little as 
> > possible. Of course we have a queue of stuff we want to fetch - we 
> > recognise blocks that we want - but in terms of the actual queues of 
> > requests to send, we minimise this - stuff is sent immediately or near to 
> > immediately.
> > 2. We send all the requests for a single bundle at once. We queue other 
> > bundles if necessary to fit into our capacity limit. If two bundles are 
> > related (e.g. different levels of a splitfile or frost posts), we try to 
> > send them close together.
> > 3. Any request in a bundle has a starting timestamp, which is the same or 
> > very close for all the requests in the bundle. We don't route to any node 
> > that has connected since the starting timestamp.
> > 4. Any given link (connection between a pair of nodes) has a capacity in 
> > outstanding requests. On darknet this is fixed, or depends on the trust 
> > level for the connection. On opennet this is fixed but much lower, or 
> > depends on some sort of trust/history/reputation algorithm (see the sketch 
> > after this list).
> > 5. Requests (and inserts) are routed reasonably quickly, but may be queued 
> > briefly if necessary to achieve good routing.
> > 6. Once a request has reached its destination, where the data is available, 
> > it is converted into a pending transfer. We exit the request thread and 
> > handle it asynchronously.
> > 7. On any link there may be a large number of pending transfers, up to a 
> > given capacity. If a request fails, we free up the capacity; hence we keep 
> > the number of pending transfers under the limit. As long as we have data to 
> > transfer, we transfer it. If we have a full block (e.g. if we are the data 
> > source), we transfer that first, then look for another one, then look for 
> > other blocks etc.
> > 8. As long as data is being transferred over a link, and requests are being 
> > completed regularly (this is why we try to transfer one block at a time 
> > rather than multiplexing everything), we don't timeout any of the 
> > transfers. We may have to incorporate age into the error detection somehow.
> > 9. As a request reaches each hop, we give it the identifier for a node for 
> > an alternative path. This can either be the previous hop or another node 
> > which has agreed to be a fallback path, depending on security requirements. 
> > The node identifier must be accessible within two hops of the node being 
> > routed to. If the predecessor node goes offline, we will try to route to 
> > the fallback path. If the fallback path is offline, we check whether we 
> > have the persistent flag (which may be disallowed on opennet): If we have 
> > the persistent flag we suspend the request until either the predecessor or 
> > the fallback comes online, and write it to disk. If we don't have the 
> > persistent flag, we store the returning data locally only, and penalise the 
> > predecessor if it comes back online. Once we have sent a request we are 
> > committed to accepting the returned data; this is essential to prevent 
> > probing-based censorship attacks. This is why opennet is a problem with 
> > this scheme: If you get any nontrivial trust from announcing, you can 
> > announce, connect to nodes, use that capacity to DoS them and then 
> > disconnect and talk to another node. However, if everyone has fast links, 
> > it could return data very quickly on opennet...
> > 10. If our capacity limits are low then we will have to send requests (or 
> > inserts) for big files in a series of separate bundles. We warn the user 
> > that we cannot guarantee full security, and get a confirmation. Unless 
> > seclevel = LOW, in which case we just proceed anyway without asking. One 
> > interesting point is that this allows us to quantify the loss of security 
> > on opennet, although arguably that will just result in more people getting 
> > connections to random strangers out of band, many of whom will turn out to 
> > be NSA...
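> > 
> > A sketch of the capacity accounting in points 4 and 7 (all figures
> > illustrative, not proposals):
> > 
> >     class LinkCapacity {
> >         private final int limit; // in outstanding requests (point 4)
> >         private int outstanding = 0;
> > 
> >         LinkCapacity(boolean darknet, int trustLevel) {
> >             // Hypothetical figures: higher on darknet, scaled by trust;
> >             // fixed and much lower on opennet.
> >             this.limit = darknet ? 512 + 128 * trustLevel : 64;
> >         }
> > 
> >         // Called before routing a request over this link; if it fails,
> >         // the bundle is queued rather than partially sent.
> >         synchronized boolean tryReserve() {
> >             if (outstanding >= limit) return false;
> >             outstanding++;
> >             return true;
> >         }
> > 
> >         // Point 7: failures free up capacity just as completions do.
> >         synchronized void release() {
> >             if (outstanding > 0) outstanding--;
> >         }
> >     }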
> > 
> > Of course there are security risks with bursting, in that if an attacker 
> > has access to traffic analysis as well as nodes close to the originator, he 
> > may be able to tie the two together. But the above is not limited to 
> > bursting: The principle is we burst the routing phase and then return the 
> > data as fast or as slow as possible. It is compatible with sneakernet, it 
> > is compatible with CBR links.
> > 
> > Ideas? Challenges? Suggestions for how to deal with opennet in this 
> > framework?
> > 
> Backup routes should be established by the node being relayed through, not by 
> its predecessor, so as not to give anything away. They should also be 
> established lazily, only after some time has elapsed and the data hasn't all 
> been transferred, or only if the data is over a certain size.
> 
> A -> B -> C -> D
> 
> After a while, B chooses a backup node E. It asks it whether it is willing to 
> be a backup route for a given amount of data. If E says yes, B gives it the 
> bundle ID, and tells A about E. By the small world property there will be a 
> quick route from A to E and from C to E. We can probably pre-select to avoid 
> cases where this doesn't hold (e.g. based on counting common neighbours then 
> ranking and taking the upper half of the table and then choosing randomly 
> from that), without giving away too much information by our choice of E.
> 
> When B goes down, C broadcasts to all B's neighbours, via the FOAF fabric, a 
> one-way function of the bundle ID, as a challenge. E responds with a 
> different one-way function of the bundle ID, establishing a connection. This 
> could be encrypted, but IMHO that isn't necessary. A also notices that B has 
> gone down and sends a message to E, again authenticated by its knowledge of 
> the bundle ID. So we now have a short route for the data return.
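> 
> A sketch of the two one-way functions, assuming SHA-256 with simple
> domain-separation tags (the tag strings are placeholders):
> 
>     import java.nio.charset.StandardCharsets;
>     import java.security.MessageDigest;
>     import java.security.NoSuchAlgorithmException;
> 
>     class BackupRouteAuth {
>         // C's challenge, broadcast to B's neighbours via the FOAF fabric.
>         static byte[] challenge(byte[] bundleId) {
>             return hash("backup-challenge", bundleId);
>         }
> 
>         // E's response: same secret, different tag, so seeing the
>         // challenge does not let a bystander forge the response.
>         static byte[] response(byte[] bundleId) {
>             return hash("backup-response", bundleId);
>         }
> 
>         private static byte[] hash(String tag, byte[] bundleId) {
>             try {
>                 MessageDigest md = MessageDigest.getInstance("SHA-256");
>                 md.update(tag.getBytes(StandardCharsets.UTF_8));
>                 md.update(bundleId);
>                 return md.digest();
>             } catch (NoSuchAlgorithmException e) {
>                 throw new AssertionError(e); // SHA-256 is always available
>             }
>         }
>     }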
> 
> Another option would be to do it completely on-demand: Both A and C broadcast 
> to some depth, say 2 hops (giving a maximum relayed route length of 4 hops). 
> Given the overheads of such a broadcast, it would have to be for all the 
> bundles going between A and C, but this is okay as timing-wise any attacker 
> will discover the correlation anyway; and this can be implemented without C 
> discovering A. A computes H_a(ID) and H_b(ID), C computes H_a(ID) and 
> H_c(ID), they broadcast their respective hashes. When they meet they are 
> relayed, along with a hop count. The shortest hop count wins. Hmmm, we might 
> want to encrypt it after all...
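> 
> A sketch of the meet-in-the-middle bookkeeping each relaying node would
> need (names hypothetical, hashes hex-encoded for use as map keys):
> 
>     import java.util.HashMap;
>     import java.util.Map;
> 
>     class RendezvousTable {
>         private final Map<String, Sighting> sightings =
>             new HashMap<String, Sighting>();
> 
>         // If A's and C's broadcasts of H_a(ID) meet here (same hash from
>         // a different direction), return the earlier sighting: combined
>         // hop count = earlier.hops + hops, and the shortest match wins.
>         // Otherwise record this sighting for relaying and return null.
>         Sighting offer(String hashHex, String fromPeer, int hops) {
>             Sighting earlier = sightings.get(hashHex);
>             if (earlier != null && !earlier.fromPeer.equals(fromPeer)) {
>                 return earlier;
>             }
>             if (earlier == null || hops < earlier.hops) {
>                 sightings.put(hashHex, new Sighting(fromPeer, hops));
>             }
>             return null;
>         }
> 
>         static final class Sighting {
>             final String fromPeer;
>             final int hops;
>             Sighting(String fromPeer, int hops) {
>                 this.fromPeer = fromPeer;
>                 this.hops = hops;
>             }
>         }
>     }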
> 
> If there are lots of backup routes in a route, it may make sense to have some 
> voluntary path shortening. I'm not sure how to do this securely, although if, 
> in the above, A is within range of D things might get interesting... 
> 
> Anything that involves A broadcasting results in a predecessor sample 
> however, so security-wise it is probably better to go with a prearranged 
> backup route, which minimises this: If the fallback path is not activated, E 
> does not know anything, and even if it is, there is no possibility of mobile 
> attacker source tracing because E was chosen relatively early on, or B chose 
> it late on but restricted its choice to nodes online when the request was 
> originally relayed. Of course this means that if the online nodes change 
> significantly we can end up losing the route completely, in which case we can 
> either wait until nodes come back online, or we can compromise security to 
> reroute from A to a node that wasn't online when the original request went 
> out.
> 
> How does rerouting from A affect mobile attacker source tracing exactly? If 
> an attacker is distant, and he receives a sample by being in the path of the 
> request, he can try to move towards the origin of the data, but he won't 
> receive any more data so cannot accelerate his attack; his success is 
> determined simply by the number of nodes he has at the time of the original 
> request. However, if we reroute to nodes that were added since the original 
> request, he has a nonzero chance of intercepting more data and so being able 
> to move towards the originator. So we have 3 options:
> 1) Don't reroute at all. Use the original path or nothing and hope the 
> redundancy is enough.
> 2) Establish alternate routes in advance. Use them but if they fail give up.
> 3) Reroute on demand, possibly after using pre-established alternate routes.
> 
> We can specify behaviour in the bundle (sketched as an enum below):
> FAST: Reroute on demand using limited broadcasts.
> SECURE: Establish multiple alternate routes in advance. Use them but if they 
> fail give up (or wait for nodes to come back online).
> MIXED: Secure then fast.
> TRANSIENT: Do not establish alternate routes. Fail or wait when our original 
> routes go down.
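> 
> The four policies as an enum (names from the list above):
> 
>     enum ReroutePolicy {
>         FAST,      // reroute on demand using limited broadcasts
>         SECURE,    // pre-established alternate routes only; fail or wait
>         MIXED,     // SECURE first, then fall back to FAST
>         TRANSIENT  // no alternate routes; fail or wait on route loss
>     }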
> 
> The catch with backup routes is of course that they tend to be longer than 
> the original data return route. It would be nice if we could avoid triggering 
> the backup routes until we actually need them: In a popular splitfile for 
> example, the redundancy in the returned data may allow us to reconstruct even 
> if many of the pathways downstream fail. One way to implement this would be 
> for A to wait a while before contacting C. However, C is committed to 
> receiving the data, because there is no abort. If an alternative route to A 
> is found, then C will have less reason to penalise B if (when!) it reconnects.
> 
> Should there be an abort option? The problem is that being able to start a 
> transfer and then abort it facilitates censorship attacks. Classically 
> Freenet resists censorship because if you fetch a key and take out the node 
> that returned it, the data will have been cached downstream.
> 
> We should check fairly urgently whether the current code allows for aborting 
> transfers all the way down; I think it did, but I fixed it.
> 
> Another interesting point with all this is that it requires quite a lot of 
> disk space - the maximum in-flight data for each peer.
> 
Should there be an abort option? I ponder this again because it is a fundamental 
question:

If there is no abort, we start a fetch for a splitfile and, assuming there is 
sufficient capacity, all the requests are routed at once.

If we declare it as a bundle, then each node we route to will stream back the 
data as slowly or quickly as possible, and there is only a single transfer 
timeout.

Whether we declare it as a bundle or not, we should get feedback more or less 
immediately on whether the data has been found. If we declare it as a bundle 
this can be in aggregate and therefore more efficient.

The benefit is a burst which doesn't give a single attacker time to approach 
us. The cost is that we must either:
1) Not provide an abort function, meaning that all the data will be transferred 
to us; if we disconnect, it will be buffered and sent on when we reconnect; and 
until the transfer is complete, that chunk of our capacity is locked in. OR
2) Provide an abort function, meaning that we can effectively probe for the 
existence of data without spending any downstream bandwidth receiving it, and 
without propagating it downstream.

Of course, we can abort incoming CHK block transfers now:
- If we propagate the cancellation, we can probe for the existence of data 
without causing it to be propagated. This is very bad, and IIRC we don't do it. 
CHECK THIS!
- If we don't propagate the cancellation, we can DoS to some degree by just 
requesting lots of data and cancelling all the transfers. However, this is 
limited to DoS'ing the nodes we are directly connected to, because they will 
complete the requests regardless.

We can take the same view with bursting and capacity-based load limiting: Even 
if we abort the incoming transfer, the upstream node will still fetch the data 
and propagate it, and while that is happening, that chunk of our capacity to 
that node will remain occupied.
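
A sketch of the non-propagating abort argued for above (hypothetical hooks,
not the current code):

    class IncomingTransfer {
        private volatile boolean locallyAborted = false;

        void abortLocally() {
            // Stop forwarding blocks to the requester, but deliberately
            // do NOT relay the cancellation upstream: the upstream node
            // completes the request regardless, so the data is still
            // fetched and propagated, and probing cannot censor it.
            locallyAborted = true;
        }

        boolean shouldForwardData() {
            return !locallyAborted;
        }
    }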

Of course, the other side of this is that if we did offer an abort function, 
some interesting possibilities would open up, notably the 
only-download-if-fully-available function common on other p2p networks. Similar 
issues 
come up with network-level keyword searching (should we allow users to 
de-propagate a request, or delay propagation until the user sends positive 
feedback?). And really in usability terms I'm not sure that no-abort is 
feasible. In fact arguably it's a serious security risk in itself.

So can we make abort safe? The reason abort (or delayed propagation) is 
regarded as unsafe is that on opennet, you could do a request, see what the 
DataSource node is, and then attack it. So all we have to do is ensure that 
path folding does not take place on any given request (i.e. a block transfer 
for an individual key) until after we have finished the data transfer for that 
block, *even if* it is part of a bundle. For keyword searching, we just turn 
off path folding; for darknet, we don't path fold; for opennet, path folding on 
secure bursts is not something I have yet considered...
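
A sketch of that deferral (hypothetical hooks):

    interface PathFoldOffer {
        void send();
    }

    class DeferredPathFold {
        private PathFoldOffer parked;

        void onTransferStart(PathFoldOffer offer) {
            parked = offer; // held back even if part of a bundle
        }

        void onTransferComplete(boolean success) {
            // Only a completed block transfer releases the fold offer;
            // an aborted transfer never reveals the data source.
            if (success && parked != null) parked.send();
            parked = null;
        }
    }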

This means that we can safely implement bursting without requiring 
trickle-back-across-restarts (and without necessarily implementing it either, 
but it IS a gain for security and in many cases for usability). And if we have 
fine-grained cancellation we could cancel those keys that are not needed 
because FEC will allow us to complete without them - or even pause them 
speculatively, and then cancel if the expected transfers happen, and un-pause 
if they don't. Of course we need to be careful with anything where there is 
feedback from the originator back out to the data fetching nodes - it may allow 
for timing attacks - depending on seclevel. For NORMAL we should probably allow 
everything to transfer even if we don't need it.

HOW SHOULD PATH FOLDING WORK WITH SECURE BURSTS??

If a burst is small, and all the data returns immediately, then this is easy. 
We deal with each individual key request separately. We can compress by having 
a cache on each link of recently sent node references (TODO: WE SHOULD DO THIS 
ANYWAY), but basically we just run the algorithm.
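
A sketch of such a per-link noderef cache (size illustrative): a
LinkedHashMap in access order gives a simple LRU.

    import java.util.LinkedHashMap;
    import java.util.Map;

    class NodeRefCache {
        private static final int MAX_ENTRIES = 128;

        private final Map<String, byte[]> cache =
            new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
                protected boolean removeEldestEntry(
                        Map.Entry<String, byte[]> eldest) {
                    return size() > MAX_ENTRIES;
                }
            };

        // If the peer has seen this noderef recently, we can send a short
        // reference instead of the full noderef.
        boolean alreadySent(String nodeIdentity) {
            return cache.containsKey(nodeIdentity);
        }

        void markSent(String nodeIdentity, byte[] noderef) {
            cache.put(nodeIdentity, noderef);
        }
    }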

What about long-term trickle-back-over-restart? The easy option for dealing 
with this is just to turn it off on opennet. 

So the remaining case: Opennet with a large, long-term request, with very many 
blocks, that completes while the node is still up:
- Getting the data from the original data source may have been quite some time 
ago, by the time we get the block it might not be up.
- This is based on assumptions such as no flow control ... are these safe on 
opennet? Clearly capacity is limited on opennet, but it may still be big enough 
to be a problem...

The obvious answer is to path fold only if the original node is still up. IIRC 
in current path folding the data source node sends a noderef and then we reply 
with our own. That might need to be reversed - but we can't delay the exchange 
and buffer it, because we might not want the node by that time. So we need to 
wait until the data transfer has completed at the originator, and then use the 
existing path if possible. This may give them a timing attack to see how far 
away we are, but it will be less accurate than what they get now with the 
current path folding mechanism. It also means maintaining state well after the 
data source's transfer, or giving out a one-use token that allows path folding; 
the latter is probably preferable, but the downstream nodes (between the 
originator and the data source) will need to maintain state anyway, or use 
encrypted tokens which encode the path.

Concretely: when a transfer finishes, we create an ID, which is relayed after 
the transfer finishes; each node along the path remembers the ID for a while, 
allowing the originator node to send a connect-to-destination message with its 
own noderef after it receives the last of the data. This is relayed forwards 
(currently we go backwards, but IMHO it is equivalent), quickly, using the ID, 
so if the originator node gets a noderef back from the data source, it can 
confidently attempt a connection. It is possible for the originator to then 
reject the connection despite having sent an offer, but that is a problem with 
the current code and doesn't change with this proposal; a possible solution 
would be an extra pass on rejection. Obviously it involves a round trip, just 
as the current path folding code does, which could allow timing attacks; we 
probably can't get away from that on opennet.
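
A sketch of the ID table each node on the path would keep (message and
type names hypothetical):

    import java.util.HashMap;
    import java.util.Map;

    class FoldIdTable {
        private static final long LIFETIME_MS = 10 * 60 * 1000L; // illustrative

        private final Map<Long, Entry> entries = new HashMap<Long, Entry>();

        // Called as the ID is relayed back after the transfer finishes.
        void remember(long foldId, PeerHandle towardsDataSource) {
            entries.put(foldId,
                new Entry(towardsDataSource, System.currentTimeMillis()));
        }

        // Relay the originator's connect-to-destination message forwards
        // along the remembered path, if the ID is still live.
        boolean relayConnectToDestination(long foldId, byte[] originatorRef) {
            Entry entry = entries.get(foldId);
            if (entry == null
                    || System.currentTimeMillis() - entry.created > LIFETIME_MS) {
                return false;
            }
            entry.next.send(originatorRef);
            return true;
        }

        static final class Entry {
            final PeerHandle next;
            final long created;
            Entry(PeerHandle next, long created) {
                this.next = next;
                this.created = created;
            }
        }
    }

    interface PeerHandle {
        void send(byte[] message);
    }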

How does this interact with churn? Clearly if a transfer is going to take 
longer than the average node uptime (or the average uptime of a node-to-node 
link), path folding is going to fail most of the time. Many p2p networks are 
designed around 1-2 hour average uptimes; I doubt Freenet can handle this with or without secure 
bursting. On the other hand, it may be hoped that most transfers will be 
relatively short, especially if upstream bandwidths and load management improve 
significantly. And it is essential to the design that all the routing completes 
immediately, so it is inevitable that some of the later transfers will have a 
long delay from the initial routing to the path folding attempt...

Is this yet another pointer to the whole secure bursting thing being infeasible 
on opennet? I dunno. We don't want to have two separate load management systems 
in parallel. But we could limit bursts on opennet to be very small / 
short-lived. To prevent DoS from Sybil nodes on opennet we need either to limit 
bursts to being very small or we need to have some sort of reputation system to 
give long-lived, valuable nodes that have answered many queries and had high 
uptime more capacity; the latter may not be fully compatible with path 
folding?
