On Saturday 22 Dec 2012 00:37:39 Arne Babenhauserheide wrote:
> A few comments:
>
> On Friday, 21 December 2012, 00:31:25, Matthew Toseland wrote:
> > - Pad the small keys to be the same size as big keys, since there are likely
> > mostly big keys.
> > - Pad the keys with extra redundancy until you get to a standard number of
> > keys. This will be some standard formula like 1-7 * a power of 2.
> > - Use K and X to derive K_n and X_n for each block.
>
> → anonymize the keys?
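(Aside: the scheme above doesn't pin down how K and X derive the per-block K_n and X_n; a keyed hash over the block index is one standard way. A sketch only - the HMAC-SHA256 choice and the label strings are my assumptions, not part of the proposal:)

```python
import hmac, hashlib

def derive_block_keys(K: bytes, X: bytes, n: int):
    """Derive per-block (K_i, X_i) pairs from the file-wide K and X.

    Assumption: HMAC-SHA256 keyed by K (resp. X) over a label plus the
    block index. Any PRF with domain separation would do; the proposal
    only says K and X are "used to derive" K_n and X_n.
    """
    keys = []
    for i in range(n):
        idx = i.to_bytes(4, "big")
        K_i = hmac.new(K, b"block-key" + idx, hashlib.sha256).digest()
        X_i = hmac.new(X, b"block-reveal" + idx, hashlib.sha256).digest()
        keys.append((K_i, X_i))
    return keys
```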
Exactly.

> > > into their pre-insert cache, including X_n.
>
> Why a special cache? Why not normal insert?

We don't want the pre-inserted keys to stick around. Hmmm, what about DoS attacks against the pre-insert cache? Maybe we should use the store; we do want to free up the slots when we've inserted successfully, though (in spite of random replacement, this still makes the store more efficient).

> > The first node on the chain which has the pre-insert in its pre-insert cache
> > decrypts the block and does a normal insert.
>
> I see an attack here:
>
> I insert many chunks which go to the same part of the keyspace. When I send
> the reveal, I can start a distributed replacement attack.
>
> Any chance to avoid that?

Ahhh, you mean you can bombard one keyspace location with lots of spam inserts, in the hope of driving a single document out, while avoiding the usual load-management limits between you and the target? Hmmm... No immediately obvious solutions...

This is also possible even with "random route for the first N hops of an insert", a very simple solution that is planned for inserts and also has major security benefits. And it will even be true for any sort of tunneling, though that depends on how cheap it is to make new tunnels... Of course it's easy on opennet, as you can just keep announcing new nodes (another reason why we should introduce some limits on announcement, although ultimately this is futile).

How important is it? Multi-location, self-healing SSKs are planned (using PSKs of course), which may help, but they only multiply the number of targets by a small factor. Also, we can't have these mechanisms be specific to CHKs - SSKs have a greater need for protection. I guess it comes down to how effective the attack will be, given the structure of our datastores?
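One way to get a feel for the cost: with a salted, random-replacement store the attacker can't target a specific slot, so each accepted spam insert evicts the target only by chance. A back-of-envelope sketch - the uniform-random-eviction model is an assumption, not a measurement of real datastores:

```python
def survival_probability(store_slots: int, spam_inserts: int) -> float:
    """Chance a target document is still present in a random-replacement
    store after `spam_inserts` hostile inserts, each of which evicts one
    slot chosen uniformly at random (the store salt is assumed to prevent
    targeted eviction). Illustrative model only."""
    return (1.0 - 1.0 / store_slots) ** spam_inserts

# To evict the target with probability ~63%, the attacker must push
# roughly as many accepted keys as the store has slots - i.e. a
# significant fraction of the datastore, per insert target.
```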
How many keys on average do we need to send to a node to displace a target document, given that they are close to the target location (so they will be routed to the target node) and given that we don't know the salt value for that store? Won't that be a significant fraction of the datastore? Also, the most targeted keys are likely to be popular keys, in which case we only need a mechanism for them to be reinserted when found in somebody's cache. We have such mechanisms, although they may be imperfect.

This is a censorship attack, so we need a robust answer for it... If we can quantify it - e.g. say that it's possible but you'd need to continually commit a significant amount of bandwidth per targeted key, depending on its popularity - that's enough for me... Also, anything that makes finding keys more efficient will help, e.g. bloom filter sharing.

>
> > chain are still connected it will prove to them that the insert has been
>
> How do they prove it? Forwarding a portion of the reveal request?

Not all of it - we don't want them to be able to decode the block; X_n is probably enough?

>
> > There are various increasingly complex solutions to starting the reveal
> > somewhere other than the originator. If we do #1 above, we can exploit the
> > fact that MassReveal is just K, X, and n (the number of blocks), i.e. it is
> > tiny. However, even if we do #2 above, it's still small - just maybe not
> > small enough for Dining Cryptographers.
>
> Oh, that's the how.
>
> Nice!
>
> http://en.wikipedia.org/wiki/Dining_cryptographers_problem

Right, we can use provably anonymous broadcast groups, provided the message is very small and provided it can be announced all at once, before there is any churn in the membership of the group. The problem comes if we can't do the reveal for the whole file in one broadcast, or, if we do two stages, if a node receives both the original and the reveal and so can connect the two broadcast groups.
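For concreteness, the Dining Cryptographers construction linked above fits in a few lines - a toy ring-topology round with one shared pad per adjacent pair, nothing Freenet-specific; the topology and participant count are illustrative assumptions:

```python
import os
from functools import reduce

def dc_net_round(n: int, sender: int, message: bytes) -> bytes:
    """One round of a toy DC-net: n participants anonymously broadcast a
    small fixed-size message. Each participant i shares a one-time pad
    with its ring neighbours and publishes the XOR of its two pads; the
    sender additionally XORs in the message."""
    size = len(message)
    pads = [os.urandom(size) for _ in range(n)]  # pads[i] shared by i and (i+1) % n
    outputs = []
    for i in range(n):
        share = bytes(a ^ b for a, b in zip(pads[i], pads[(i - 1) % n]))
        if i == sender:
            share = bytes(a ^ b for a, b in zip(share, message))
        outputs.append(share)
    # Every pad appears in exactly two shares, so XORing all outputs
    # cancels the pads and yields the message - without revealing which
    # participant contributed it.
    return reduce(lambda x, y: bytes(a ^ b for a, b in zip(x, y)), outputs)
```

This is why the message size matters so much: every participant transmits a share as large as the message in every round, so broadcasting just (K, X, n) is cheap, while broadcasting per-block material is not.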
These sorts of things look very hard to execute, though, IMHO?

Reveal could be infinitely scalable if we didn't need the preinsert targets to return encryption keys, i.e. we could just reveal K and X and then have it fan out to enough nodes; but it looks to me like we need to encrypt the reveal keys individually for good security. If the attacker has a node in the group that receives K and X (the DC broadcast group, perhaps, but also the set of nodes which translate K, X into individual reveal messages), he can identify *ALL* the blocks which went past him, even where he was not the replier. If only the first hop reached can decrypt it, he needs a significantly higher density: he can decrypt the block if 1) he receives the reveal for the block before it reaches the first node which had sent a key back to the originator, and 2) he has sent a key back to the originator (regardless of whether he is the node reached first).

We could just send one at a time, but that means more round-trips, and round-trips are *bad*... So just giving K, X implies we probably want to partition it into smaller groups - tradeoffs? Encrypting properly allows any of the preinsert repliers to decrypt the block, but no other blocks (assuming there aren't any problems with HTL etc.; we might consider changing that for preinserts).

The thing is, the preinsert repliers won't be on the path for the reveal unless they are close to the target location. So there is a density-of-attackers requirement here: first he must be on the preinsert path (cm/n), then he must be on the reveal path (cm/n again, but not independent; it comes down to having a node near the target location). Here c = attacker nodes, m = total blocks inserted, n = total nodes.

The limiting factor then is that for big inserts, we can't anonymize the reveal as a single unit. One solution is to divide preinserts into a limited number of groups, each inserted separately, with the groups getting bigger for bigger inserts.
Each group only needs two keys, so we can keep the broadcasted data small. It means it will take longer for the reveal to complete; big inserts take longer, this is normal, but it might affect reliability. One key per node guarantees that the total time to reveal is only going to be the time for a slow insert.

Another possibility would be to use a pyramid, like in splitfiles - pre-insert and then reveal the list of blocks to reveal! The pyramid would have a good fan-out factor, each layer becoming much bigger than the previous, so there should be very few such levels. And we can use anonymous broadcast for the bottom layer:
- Pre-insert the actual blocks 0...100,000. Each returns a pubkey.
- Create a "reveal file" consisting of the instructions for revealing the blocks.
- Pre-insert the reveal file to blocks 0...100.
- Anonymously broadcast instructions to reveal the reveal file, via DC broadcast, among a randomly (but robustly) constructed cell you happen to be momentarily a member of.
- The instruction to reveal the reveal file is executed by some nodes in the cell, in accordance with the standard policy.
- The nodes that receive the reveal of the reveal file - the ones which originally got the pre-inserts of the reveal file - decrypt their chunks of it. They then have their instructions for posting reveals.
- They post the reveals, which match up with the original preinserts.
- The original preinsert targets execute the final inserts of the actual blocks.

Attacks: If you control the node receiving a chunk of the reveal file and the node receiving a chunk of the original file, you can connect all the pre-inserts in that chunk, and therefore get a predecessor sample on any nodes you control which receive actual blocks from that file.

Performance: Total reveal time is equal to a few slow inserts. This minimises the chance of nodes going offline etc. On a broader view of performance, the number of blocks lost before revealing may be an issue; redundancy helps, though.
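The fan-out argument can be checked numerically - a sketch assuming a fixed fan-out per layer (the fan-out value is my assumption; the text above only says it is "good"):

```python
import math

def pyramid_layers(total_blocks: int, fanout: int) -> int:
    """Number of reveal-file layers needed above the data blocks,
    assuming each reveal-file block carries the reveal instructions for
    `fanout` blocks of the layer below. Equals ceil(log_fanout(blocks)),
    so the layer count stays tiny even for very large files."""
    layers = 0
    blocks = total_blocks
    while blocks > 1:
        blocks = math.ceil(blocks / fanout)
        layers += 1
    return layers
```

With 100,000 data blocks and a fan-out of 1000, there are two layers: a ~100-block reveal file plus the single top-level broadcast unit, matching the 0...100,000 / 0...100 walkthrough above.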
Timing might also give stuff away: we can identify two reveals as belonging to the same group because they were started at the same time. Delays and thresholds may help with this.

>
> Best wishes,
> Arne
_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
