On Monday 07 Jan 2013 19:39:40 Matthew Toseland wrote:
> On Monday 07 Jan 2013 18:11:14 Matthew Toseland wrote:
> > On Saturday 29 Dec 2012 14:53:51 Arne Babenhauserheide wrote:
> > > On Saturday 22 December 2012 02:17:24 Matthew Toseland wrote:
> > > > Hmmm, what about DoS attacks against the pre-insert cache? Maybe we should use the store; we do want to free up the slots when we've inserted successfully, though (in spite of random replacement, this still makes the store more efficient).
> > >
> > > Wouldn't the pre-insert keys never be requested, so they would drop out anyway after 14 days on average? (Following digger3's stats.)
> >
> > Yes. I didn't think it was that bad though. :|
> >
> > > > > I insert many chunks which go to the same part of the keyspace. When I send the reveal, I can start a distributed replacement attack.
> > > > >
> > > > > Any chance to avoid that?
> > > >
> > > > Ahhh, you mean you can bombard one keyspace location with lots of spam inserts, in the hope of driving a single document out, while avoiding the usual load-management limits between you and the target?
> > >
> > > Yes, exactly that.
> > >
> > > It allows attacking a given location without being detectable.
> > >
> > > Though that's essentially a condition of being anonymous, so I don't know if it is fixable...
> > >
> > > Essentially you can pretend to be many different people at the same time. If you can orchestrate the reveal, you could create very many inserts to some location arriving in one fast burst. And that could make it possible to kill a document with known content before anyone can download it.
> >
> > The attack isn't instant. You have to prepare it in advance and cannot retarget it. All that you gain from the random routing is avoiding the bottlenecks between one node and the target location.
> >
> > And yes, we will probably need random routing, for a whole set of reasons. Even with classic mixnet tunnels there will be multiple tunnels for an insert, and nothing to stop an attacker using more tunnels if he wants to.
> >
> > > > I guess it comes down to how effective the attack will be, given the structure of our datastores. How many keys on average do we need to send to a node to displace a target document, given that they are close to the target location so will be routed to the target node, and given that we don't know the salt value for that store? Won't that be a significant fraction of the datastore?
> > >
> > > That's a very good point.
> > >
> > > If we want to be reasonably sure that we can kill a given file, we might have to displace 90% of the store or so.
> >
> > Right. So we can simply impose a minimum store size of, say, 5GB, and this attack gets pretty hard.
> >
> > > I think normal insert behaviour is distributed pretty randomly across the keyspace, so a sudden surge of inserts to a given location could be detected as an attack.
> >
> > I'm not sure how easy it would be to detect.
> >
> > > And since that would very likely only happen during an attack, adding limits to displacement in the store should kill those attacks but only marginally affect regular inserts.
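As a sanity check on the 5GB figure, here is my back-of-envelope, assuming 32KiB CHK data blocks, uniformly random replacement, and that every attack insert actually lands on the target store. Displacing a given key with 95% confidence costs roughly 3N inserts, since ln(0.05) is about -3:

    from math import log

    BLOCK_SIZE = 32 * 1024      # bytes per CHK data block (assumed)
    STORE_SIZE = 5 * 1024 ** 3  # the suggested 5GB minimum store

    slots = STORE_SIZE // BLOCK_SIZE
    # P(key survives W uniformly random writes) = (1 - 1/N)^W, so 95%
    # displacement needs W = ln(0.05) / ln(1 - 1/N), roughly 3N writes.
    writes = log(0.05) / log(1.0 - 1.0 / slots)
    print("slots:", slots)                               # 163840
    print("inserts for 95% displacement:", int(writes))  # ~491000
    print("attack traffic: %.1f GiB" % (writes * BLOCK_SIZE / 1024.0 ** 3))

So even before any throttling, a single targeted displacement is on the order of 15GiB of successful inserts, all of which have to reach the victim's store.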
> > > In my 1GiB store, I have a write rate of 0.08/s and 13,414 keys, so if I interpret the data correctly, it takes about 47 hours until the number of write requests equals the number of keys.
> >
> > We have both the store and the cache. Both of these are in practice mostly specialised around our location.
> >
> > The store only gets written on inserts, and only when we're the sink. So remotely overwriting it from far away on the network may be rather difficult, unless they already know our location and peers - which they might. Conceivably we could detect an attempt to overwrite the *store* simply because the attacker would have to customise the keys' locations to ensure they are stored. It's not so easy with the cache, but I guess it doesn't matter as much with the cache? Or they could keep a plausible distribution of locations / a low hit rate, and require far more keys to overwrite everything...
> >
> > > To estimate the number of keys displaced after those ~47 hours:
> > >
> > > N: number of keys in the store.
> > > First key: 100% chance of displacing an old key.
> > > Second: (1 - 1/N) chance of displacing an old key.
> > > ...
> > > Since I stumbled over the math, I implemented it as a function (iterative, so it doesn't hit Python's recursion limit for big stores):
> > >
> > > def displacedold(n, steps):
> > >     # Chance that a given old key is still present after `steps`
> > >     # uniformly random writes: (1 - 1/n)**steps.
> > >     nth = 1.0
> > >     for step in range(steps):
> > >         nth *= 1 - 1.0 / n
> > >     return nth
> > >
> > > def displacedourkey(n, steps):
> > >     # Chance that our key has been displaced after `steps` writes;
> > >     # sums the per-write hit probability, and is equivalent to the
> > >     # closed form 1 - (1 - 1/n)**steps.
> > >     dis = 1.0 / n
> > >     for step in range(steps - 1):
> > >         dis += displacedold(n, step + 1) / n
> > >     return dis
> > >
> > > print(displacedourkey(100, 100))  # -> 0.63
> > > print(displacedourkey(100, 200))  # -> 0.86
> > > print(displacedourkey(100, 300))  # -> 0.95
> > > print(displacedourkey(100, 400))  # -> 0.98
> > >
> > > This stays similar for bigger numbers of keys.
> > >
> > > So to be 95% sure that you replaced a key, you have to create inserts of 3 times the size of the store, which in the case of my store would take about 6 days.
> > >
> > > If you could triple the rate of writes to my store with the distributed insert attack, you could decrease the lifetime to 2 days.
> > >
> > > But that could be stopped by throttling the rate of change of write traffic to the node (if that's possible). Then an attacker would have to keep inserting constantly and the attack would be thwarted.
> > >
> > > The attack draws its strength from being able to insert first and then use the reveal step to force a huge number of writes to a given location in a very short time. Essentially it allows storing up bandwidth and then releasing it in a much shorter timespan.
> >
> > Yep.
> >
> > > And that is not possible if you have to keep up the insert traffic over a longer timespan.
> > >
> > > And that can be stopped by throttling the rate of change of write traffic to the store. For example, it would be possible to limit the writes per minute to 3x the daily rate.
> >
> > Good point. If you have to maintain traffic over a long period, you are better off doing regular inserts. Of course, the other thing to take into account is that the capacity of the node is limited, so many of the inserts will be redirected away from the originally targeted node.
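Something like Arne's "3x the daily rate" throttle could look like the sketch below. This is purely illustrative: the class and all its names are mine, nothing like it exists in fred, and rejected inserts would be routed onwards rather than stored (see the discussion just below):

    import time
    from collections import deque

    class WriteThrottle(object):
        """Allow at most `burst` times the average per-minute write
        rate observed over the last day. Hypothetical sketch."""

        def __init__(self, burst=3.0, window=24 * 3600.0):
            self.burst = burst
            self.window = window
            self.writes = deque()  # timestamps of accepted store writes

        def allow(self, now=None):
            now = time.time() if now is None else now
            # Drop timestamps that have left the one-day window.
            while self.writes and self.writes[0] < now - self.window:
                self.writes.popleft()
            per_minute_avg = len(self.writes) * 60.0 / self.window
            last_minute = sum(1 for t in self.writes if t > now - 60.0)
            # max(..., 1.0) lets an idle store warm up again.
            if last_minute >= max(self.burst * per_minute_avg, 1.0):
                return False  # route the insert on, but don't store it
            self.writes.append(now)
            return True

At the measured 0.08 writes/s (~4.8/minute) this tolerates bursts of ~14 writes/minute, while the reveal-burst attack needs orders of magnitude more than that to displace anything quickly.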
> > If you are not targeting a specific, single node, you will need a lot more bandwidth, because it's hard to target the node exactly if you don't even know what its location is; so most of your inserts won't be stored. On the other hand, it doesn't matter as much if your inserts are redirected, since you're targeting a group of nodes - a location. But it will require a *lot* more bandwidth.
> >
> > If you are targeting a specific, single node, on opennet you can surround it (maybe indirectly, e.g. with a layer of innocent nodes), although we're working on ways to reduce that.
> >
> > > Then displacing a key in my store with high probability (95%) would take at least 2 days. And that is much longer than the time needed to complete an upload, announce it on Sone, and have someone download it.
> > >
> > > And as soon as the upload has been downloaded, it is in the cache and the attack is useless, because downloaders can heal the inserted key.
> >
> > I'm not sure I follow your scenario here. It takes longer to displace a key than to reinsert it under a different key?
> >
> > > (Please double-check my logic, though!)
> > >
> > > > If we can quantify it, e.g. say that it's possible but you'd need to continually commit a significant amount of bandwidth per targeted key, depending on its popularity, that's enough for me ...
> > >
> > > Throttling the write rate gives that.
> > >
> > > The attack tries to make all nodes around you send you inserts all of a sudden, so the write rate has a massive spike. Smooth that spike and the attack is mostly useless.
> >
> > So if our store rate suddenly spikes, we route the inserts anyway, but randomly refuse to store? We can't selectively reject inserts, because backoff doesn't care whether it's an insert or a request... I'm not sure whether exploiting such a countermeasure would be a cheap way to DoS requests by causing us to back off?
> >
> > > > > > chain are still connected it will prove to them that the insert has been
> > > > >
> > > > > How do they prove it?
> > > >
> > > > Forwarding a portion of the reveal request? Not all of it - we don't want them to be able to decode the block; X_n is probably enough?
> > >
> > > I don't know it well enough to say anything about that. But I trust your judgement here.
> > >
> > > > > http://en.wikipedia.org/wiki/Dining_cryptographers_problem
> > > >
> > > > Right, we can use provably anonymous broadcast groups, provided the message is very small and provided that it can be announced all at once, before there is any churn in the membership of the group. The problem comes if we can't do the reveal for the whole file in one broadcast, or, if we do two stages, if a node receives both the original and the reveal and so can connect the two broadcast groups. These sorts of things look very hard to execute, though, IMHO?
> > >
> > > I don't know enough to answer that.
> > >
> > > > previous, so there should be very few such levels. And we can use anonymous broadcast for the bottom layer.
> > >
> > > I'm a bit wary of broadcast, since that creates very many messages. Is it a real broadcast (reaching all nodes in the network) or limited somehow?
> >
> > I said "broadcast groups". We'd set up groups of nodes and use DC anonymous broadcast within a group. The groups would have a limited, fixed number of participants. This is very inefficient, so it could only be used for broadcasting K and X, or the top level of a multi-level reveal structure. Also it has a smaller anonymity set than using tunnels.
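For reference, here is a toy round of the dining-cryptographers broadcast linked above. This is just my own illustration, not anything implemented; a real group would also need pairwise key agreement over the encrypted links and collision/disruption handling:

    import os

    def dc_round(n_members, sender=None, message=b""):
        """One round of XOR-based DC-net broadcast. Every pair of
        members shares a random pad; each member announces the XOR of
        its pads, and the sender additionally XORs in the message. The
        XOR of all announcements equals the message, but nothing
        reveals which member sent it."""
        size = len(message)
        pads = {}  # pairwise shared pads (really agreed out of band)
        for i in range(n_members):
            for j in range(i + 1, n_members):
                pads[(i, j)] = os.urandom(size)

        def announce(member):
            out = bytearray(size)
            for (i, j), pad in pads.items():
                if member in (i, j):
                    for k in range(size):
                        out[k] ^= pad[k]
            if member == sender:
                for k in range(size):
                    out[k] ^= message[k]
            return bytes(out)

        combined = bytearray(size)
        for m in range(n_members):
            for k, byte in enumerate(announce(m)):
                combined[k] ^= byte
        return bytes(combined)

    print(dc_round(5, sender=2, message=b"K and X go here"))
    # -> b'K and X go here', with no way to tell it came from member 2.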
> > > But all in all, the scheme sounds pretty good to me. I don't have sufficient knowledge to really judge it completely, though.
> >
> > Thanks!
> >
> > My main concern is how this compares to a more traditional tunneling approach, given the reliability issues on Freenet and the fact that on darknet one hop of a tunnel will be many actual hops, and therefore will often be very slow (bottlenecked by the slowest node). On opennet we could conceivably use direct connections for tunnels, like I2P and Tor do, but bear in mind that opennet has other problems and we don't want to encourage its usage too much; an attacker can dominate the keyspace/topology and hence control most tunnels, for example. I.e. we can kill Sybil on darknet, but not on opennet.
> >
> > We have several options:
> > 1. Rendezvous tunnels. Effectively only a single-hop tunnel. Probably the cheapest, simplest option, but security is mediocre.
> > 2. Tunnels like I2P or Tor. Essentially real-time. Use many tunnels and create new ones constantly throughout an insert; an attacker can try to observe tunnel creation after he's seen the first one.
> > 3. Tunnels like Mixmaster. Store-and-forward, but not reliable, because of node uptime issues. Limit unreliability by having many backup routes at each point (which means the attacker is more likely to be one of them; these would probably have to be around a single location to limit extra routing?). And we'd still need to use multiple tunnels, though not as many. Chunk size may be limited by node uptime. We may need additional redundancy.
> > 4. Preinsert/reveal. Expensive, real-time tunnels for the reveals. Preinserts travel twice (or three times) the usual (storage) number of hops. May need additional redundancy, probably not as much as #3.
> >
> > #3 and #4 only work for inserts. #1 and #2 could work for downloads too. It's probably worth implementing more than one strategy, so that we have really good protection for bulk inserts and adequate protection for downloads. In general, though, securing downloads is harder, and tolerance of slowness is likely lower for downloads than for inserts.
> >
> > #3 is probably safer than #2.
> >
> > #4 is original and not well studied, but it looks to me like it is comparable to #3 for security.
> >
> > UI issues:
> > - #1, #2: no UI issues.
> > - #3: significant UI issues. The insert takes time to finish on store-and-forward, and it's probably best not to have any feedback?
> > - #4: Upload progress, then the reveal is very fast. There may or may not be feedback, but the reveal should happen quickly after the preinsert upload phase finishes, so it's not a big deal.
> >
> > PERFORMANCE DEPENDS ON NUMBER OF HOPS:
> >
> > Preinsert/reveal needs to use twice the number of hops used for normal storage. This may well be too big; currently it's ~20, which is more than it needs to be, partly because we want to store data in more than one place.
> >
> > Other forms of tunnels use, approximately, multiples of the network diameter, which could be less than the number of hops for storage. Let's say 8 or 10.
> >
> > So conceivably #3 could be faster than #4, because the data travels fewer hops.
> >
> > In an ideal world we'd combine #3 and #4, so we'd have many tunnels, all of which are totally useless to an attacker (unless they are happy to trace every tunnel!), and then one reveal. But that'd be rather slow! (The other problem is how we know when to post the reveal; hence it's not really fire-and-forget, which ideally we'd like to be true of #3, and anything that distributes the reveal will allow an attacker to identify the tunnels.)
>
> #3 is looking complex, actually.
>
> Reliability issues: What happens if we are transferring an onion and the node we are sending to goes down? We send it elsewhere. What happens if we are receiving an onion and the sender goes down? We assume they will send it elsewhere. What if both go down? Clearly this is not going to be 100% reliable. Multi-location keys may help, but they make the tunnels easily identified; see below.
>
> How do we notify the original inserter that the insert is completed? Ideally, we'd like to preserve the current form of inserts: insert the unpredictable data blocks, then insert the top block; the original inserter subscribes to the top block, along with everyone else (e.g. for a bookmarked freesite). But that means we need a way to ensure that the top block (or the top block plus announcement - a small number of predictable keys) is inserted *last*.
>
> We can do it on the network and make #3 true fire-and-forget (a sketch of the share-combining step follows the advantages below):
> - Choose a group of nodes X for the completion.
> - Insert through N tunnels, as many as needed for performance and reliability. Each one contains a separate onion route to X. When each tunneled insert group finishes, it uses the onion to send a message, including its own identity and a secret provided by the original inserter.
> - When all (or enough) of the inserts have reached X, they combine, using a shared secret scheme as well as encryption to X.
> - X (whichever one of the group is handling the completion) does something:
> -- Insert the top block (or a small set of blocks).
> -- Start a reveal protocol. If the lower blocks are predictable, it would be nice to have the option to do preinsert/reveal. But we still want fire-and-forget.
> - ORIGINAL INSERTER:
> -- Subscribe to the key, just like everyone else who is waiting for it.
> -- Fetch the data, just like everyone else waiting for it. (Possibly timing issues, depending on what the key is; e.g. is manual intervention needed?) Maybe through a tunnel, again depending on popularity and protocol.
> -- If the data is incomplete, identify which sub-onion-insert is responsible. Maybe we can avoid them in future. Then either:
> --- Partial reinsert, tunneled, possibly with blind/reveal, OR
> --- Reinsert from scratch.
> -- The second option is likely significantly safer. But if the fetch is tunneled, the first option may be acceptable.
> -- We could have the original inserter insert the top block, after testing; then the rendezvous is just to notify them that they can. But it's probably safer to avoid this. Tradeoff?
>
> ADVANTAGES:
> - True fire-and-forget.
> - Tunnels aren't perfect, and we're using many of them, so we take advantage of our existing security: nobody can identify the tunnels until the insert has finished.
> - If blocks below the top are unpredictable, we insert the top block only after the rest are in.
> - If blocks are predictable, we can still do preinsert/reveal, in spite of having tunneled everything.
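For concreteness, here is the share-combining step as n-of-n XOR secret sharing. "All (or enough)" would really want k-of-n (e.g. Shamir) so the completion can fire even if a tunnel fails; all names here are mine:

    import os

    def split_token(token, n):
        """n-of-n XOR secret sharing: every share is required."""
        shares = [os.urandom(len(token)) for _ in range(n - 1)]
        last = bytearray(token)
        for share in shares:
            for i, byte in enumerate(share):
                last[i] ^= byte
        return shares + [bytes(last)]

    def combine(shares):
        out = bytearray(len(shares[0]))
        for share in shares:
            for i, byte in enumerate(share):
                out[i] ^= byte
        return bytes(out)

    # The inserter hands each of the N tunneled insert groups one share
    # (encrypted to X, per the scheme above). X can reconstruct the
    # token - and so insert the top block or start the reveal - only
    # once every tunnel has finished and forwarded its share.
    token = os.urandom(32)
    shares = split_token(token, 5)
    assert combine(shares) == token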
> EXPENSIVE POSSIBILITIES: (SPECULATIVE!)
> - Healing. FEC should happen before decryption IMHO, e.g. for backups. So it could conceivably be done on the network.
> - Test the lower-level inserts by fetching a few random blocks (from a list provided by the originator, not known to the sub-inserters) before completing the insert.
> - Interesting, but probably not realistic, at least not without some means of payment. On darknet, that might be possible with some simple Ripple-like hop-by-hop accounting. Using a true currency such as Bitcoin is IMHO too politically/legally risky, as well as impractical. Also, IMHO, making such things cost-based is risky security-wise: who's gonna run stuff at a loss just to see the traffic? The CIA, of course! Never pay for anonymising services unless they are provably secure, e.g. coin blinding.
> - Plus, how do we know we can trust the group rendezvous point? Less of a problem if it's only doing a single insert.
>
> Something closely related:
>
> Dead man's switch protocol (the holding node's side is sketched below):
> - Generate dead man certificates.
> - Onion route them to various nodes on the network.
> - The nodes receiving them will, after a specified delay, probe a specific set of keys (different each time), and if there is no outcome, will either insert a key or start an onion route.
> - The latter triggers a combination like the #3 feedback above: if enough keys arrive, and if whatever conditions are correct on the recipient (e.g. time), it does something.
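As I read the protocol above, the holding node's side would look roughly like this. Entirely hypothetical; `fetch` and `fire` stand in for the real network operations (probing a key; inserting a key or starting an onion route):

    import time

    def hold_certificate(probe_sets, fetch, fire, delay):
        """Sketch of a node holding a dead man certificate. Each round
        it waits, then probes a fresh set of keys; if the owner has
        stopped checking in, it fires (which feeds the #3-style
        combination described above)."""
        for keys in probe_sets:       # a different key set each round
            time.sleep(delay)
            if not any(fetch(k) for k in keys):
                fire()                # no sign of life: trigger
                return
        # The owner stayed alive through every round; the certificate
        # simply expires.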
> General programmable solution? Maybe. We don't want anything that is over-reliant on a single node - so no ssh, no SVN, etc. But a lot of stuff can be built from small, strictly limited programs running on a handful of nodes.
>
> What distinguishes Freenet from Tor and I2P:
> - Freenet is a storage network.
> - Most of its traffic is non-realtime. (Not strictly true, but that's the intention.)
> - Freenet does not rely on any single node.
> - Freenet does not try to rely on, or support relying on, any single node.
>
> Preinsert tweaks:
> - Do we want to be able to probe and partially renew a preinsert? Possibly without fetching it, i.e. only the originator can do so.
> - Or should a preinsert be usable *only* by revealing it?
> - Do we want to be able to explicitly delete a preinsert, since it's not a real insert anyway? Of course this is unreliable ...
> - And/or give it an expiration date? This at least is probably a good idea, if we can implement it efficiently.
>
> CONCLUSIONS?
>
> We probably want all four options. And that's a lot of work! If that's true, the question is which is most urgent. I guess the answer is #1: rendezvous tunnels???

AFAICS the priority order is:
- Basic: random routing at high HTL on inserts.
- Fast but poor: rendezvous tunnels? Use for inserts by default?
- More advanced: some sort of expensive source-routed tunneling for top blocks / chat posts. Only used to insert the top block, one or two keys at a time. Does not work (well/at all/safely) for whole big inserts or reinserts.
- Reusable realtime tunneling for requests and realtime inserts.
- Offline fire-and-forget, store-and-forward tunneling for secure bulk inserts/reinserts.
- Distributed completion for bulk inserts.
- Possibly preinsert/reveal as well, with or without tunneling.
_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
