On Sunday 23 March 2008 02:46, Ian Clarke wrote:
> Zooko thinks this may be relevant to Freenet - thoughts?
> 
> http://lists.zooko.com/pipermail/p2p-hackers/2008-March/001617.html
> 
> Ian.

Okay, so basically this attack is:

We can guess a likely form plaintext. The interesting data has very low 
entropy - say 8 bytes of entropy, which might be over more than 8 bytes of 
data. The rest is predictable. So we try each possible value, and compute its 
CHK. We then match that table (which can be precomputed, of course), against 
the CHKs we know about. Another case, for Freenet: If an attacker knows a 
splitfile will begin with a specific file, which is longer than one block, he 
can confirm this.

It's not IMHO a major attack on Freenet, unless you use Freenet as a 
distributed backup mechanism, which has been proposed, but not implemented; 
the main disadvantage is that Freeet is lossy (there are ways to reduce this 
but not eliminate it). There are certainly a few use cases where this might 
be a problem. However, there are much bigger problems with CHK-based 
splitfiles, as I've pointed out in the last few months: 

If an attacker is trying to trace the source of a document on an easily 
navigable network, and he is only able to connect to a limited number of 
nodes at a time, he is limited by the number of back-bearings he can take at 
any given point. So for example if we inserted the top block of a splitfile 
first, and he could recognise that (because a user posted it in advance, or 
because it's the target's SSK), he can then use *each block* to move towards 
the target. This is why we now insert the top block and provide the CHK to 
the user only after we've inserted all the below layers. 

Given that the attacker can only identify the insertor after the last block is 
inserted, the attacker can still check logs on nodes he controls, after he 
has the top block. These may give him a reasonable fix on the originator 
keyspace wise, however the amount of information is limited by the distance 
between the attacker and the insertor. But if he knows immediately that each 
block is part of the splitfile, he can connect to closer nodes using the data 
he gains, and rapidly increase the amount of data he collects.

Even if we insert the top block last, if people insert predictable files (as 
an insert on demand system such as Frost filesharing does, or something 
similar implemented manually using forums), the attacker can do exactly the 
same thing (if he can connect the identity to the likely content(s) in 
advance, or if he has so much capacity that he can trace every insert for 
predictable content going on at the time). This is the first, well known 
attack mentioned in the paper, applied to Freenet.

Of course, for requests of publicly known files, an attacker can do this sort 
of attack immediately. But preserving security for insertors is IMHO more 
important, and as I've shown we have significantly better security - IF the 
content is unpredictable. If the attacker can predict the first few blocks he 
can identify them, and *maybe* use that to latch onto the rest of the stream 
(e.g. by HTL; we've recently reduced our dependancy on HTL a lot). If he can 
predict the whole file he can do Really Bad Things.

Anyway, there are solutions to make inserts even more secure, but they come at 
a significant cost (although they can be implemented back compatibly):

1. Easy partial solution with slight cost:
Encrypt the whole splitfile with a key derived from the hash of the whole 
plaintext file. This eliminates prefix attacks. However, if the entropy of 
the whole file is too low, the attacker can still predict it before it is 
published. The tradeoff is that splitfiles which share a prefix no longer 
share the space savings for the first few blocks.

2. Easy full solution with large cost:
Encrypt the whole splitfile with a random key, or salt the splitfile 
encryption key(s) with a random key. This should make life very much harder 
for an attacker attempting to trace an insert. The problem is it eliminates 
all space savings from CHK-based splitfiles. This proposal is feasible 
immediately.

3. Hard full solution with low cost:
Commit/reveal. We insert the blocks encrypted to random locations. When the 
insert is done, we broadcast the key, and the nodes will then decrypt the 
blocks and insert the data to its proper final location. How to implement 
this exactly is unclear though. There are various parameters. It is very 
tempting to make the connectedness of the inserts explicit, and have them all 
subscribe to something that will deliver the decryption key. But we can't do 
this because then an attacker would just trace each group early on, then move 
on to the next, and wait for the reveal to show who the group belongs to. 
Whereas if we don't connect them, each will have to wait for a different key, 
which means re-running the insert with the decryption key, or having it 
subscribe to a key and wait for a second insert. We need to choose the node 
to do the insert, it needs to be as far away from the originator as possible. 
And we ideally want to burn all traces of the original insert *before* we 
release the decrypt key (so only those nodes eligible to do the insert 
remember anything). And ideally we want to have enough data to do some sort 
of accountability for inserts. These are conflicting goals, we may have to 
give up on accountability and accept that inserts will suck if too high a 
proportion of the network are attackers. This proposal requires a good deal 
more research!
Suggested proposal:
- Each block is encoded to a CHK, then encrypted with a different key.
- We do a stage 1 insert for each block. This includes the encrypted block, 
and a pointer to an SSK (preferably with a network global pubkey) where we 
will release the decryption key.
- This has a random drop phase, followed by an HTL countdown. Any node with 
HTL <= 2 stores the encrypted block into temporary space, and subscribes to 
the specified key.
- When all the inserts are done, we start to insert the SSKs.
- When the SSKs are found, the nodes that stored the encrypted keys decrypt 
them and then do a normal insert for the decrypted data.

Obviously this requires full passive requests, and furthermore requires them 
to persist across reboots and support a very large volume of SSKs. IMHO it 
would be possible to implement this eventually in 0.8/0.9. There may be 
easier possibilities, but most likely they wouldn't work for big inserts.

It might be possible to implement a separate registry of files by hash. This 
could be combined with solution 1 or 2 to get back our space savings. It 
would also enable freenet to integrate with other tools more easily. However, 
making it secure would be difficult. For example if we used the FMS web of 
trust as a basis, we'd have to ensure it didn't give away any more info than 
was already known by the identity publishing the file. But it might be 
possible: for example, if an identity publishes a file, the clients running 
for those nodes that trust him have a low trust-related probability of 
downloading it and verifying the hash and file size; if they do they would 
propagate it, and then their peers again have a low probability of verifying. 
A user could interfere in this process if they don't like the file. Hence the 
announcement might propagate across the web of trust, with verification, far 
enough to be seen without global polling, just as a new poster's identity 
might.

Another interesting attack (but only for requests): if the attacker can 
compromise some nodes, he can read the FailureTable. This will tell him for 
recently requested keys the nodes that have requested the key and that we 
have forwarded it to. He can also read the recently used UIDs table, which 
will also enable him to trace requestors (but he would have to compromise 
more nodes). And then there are request structures that haven't been GC'ed 
yet... But even without compromising nodes, requestors are much easier to 
trace than insertors, as shown above. Premix routing and some form of tunnels 
will help, but will not provide perfect security, in 0.8. Arguably #3 above 
is a form of tunneling.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20080324/44662eec/attachment.pgp>

Reply via email to