Re: [p2p-hackers] convergent encryption reconsidered

zooko Wed, 26 Mar 2008 10:23:06 -0700

Jim:

Thanks for your detailed response on the convergent encryption issue.

In this post, I'll just focus on one very interesting question that you raise: "When do either of these attacks on convergent encryption apply?".

In my original note I was thinking about the allmydata.org "Tahoe" Least Authority Filesystem. In this post I will attempt to follow your lead in widening the scope. In particular GNUnet and Freenet are currently active projects that use convergent encryption. The learn-partial-information attack would apply to either system if a user were using it with files that she intended not to divulge, but that were susceptible to being brute-forced in this way by an attacker.



On Mar 20, 2008, at 10:56 PM, Jim McCoy wrote:


On Mar 20, 2008, at 12:42 PM, zooko wrote:

  Security engineers have always appreciated that convergent
  encryption allows an attacker to perform a
  confirmation-of-a-file attack -- if the attacker already knows
  the full plaintext of a file, then they can check whether a
  given user has a copy of that file.


The truth of this depends on implementation details, and is an
assertion that cannot be said to cover all or even most of the
potential use-cases for this technique.

You're right. I was writing the above in the context of Tahoe, where, as Brian Warner explained, we do not attempt to hide the linkage between users and ciphertexts. What I wrote above doesn't apply in the general case.

However, there is a very general argument about the applicability of these attacks, which is: "Why encrypt?".

If your system has strong anonymity properties, preventing people from learning which files are associated with which users, then you can just store the files in plaintext.

Ah, but of course you don't want to do that, because even without being linked to users, files may contain sensitive information that the users didn't intend to disclose. But if the files contain such information, then it might be acquired by the learn-partial-information attack.

When designing such a system, you should ask yourself "Why encrypt?". You encrypt in order to conceal the plaintext from someone, but if you use convergent encryption, and they can use the learn-partial-information attack, then you fail to conceal the plaintext from them.
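
To make the learn-partial-information attack concrete, here is a minimal sketch in Python. It is not the code of Tahoe, GNUnet, or Freenet: the key derivation (plain SHA-256 of the plaintext), the `toy_encrypt` stand-in cipher, and the form-letter template are all assumptions chosen so the example runs with only the standard library. The point is only that when the key depends on nothing but the plaintext, anyone who knows most of the file can enumerate the rest and test each guess against the stored ciphertext.

```python
import hashlib


def convergent_key(plaintext: bytes) -> bytes:
    # Assumed derivation: the key is a hash of the plaintext and nothing else,
    # so the same file always yields the same key.
    return hashlib.sha256(plaintext).digest()


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Stand-in stream cipher (SHAKE-256 keystream XOR plaintext), used only
    # to keep the sketch dependency-free. A real system would use a vetted
    # cipher such as AES.
    keystream = hashlib.shake_256(key).digest(len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


def guess_secret_field(ciphertext: bytes, template: bytes, candidates):
    # Fill the unknown field with each candidate, re-encrypt, and compare.
    for candidate in candidates:
        guess = template.replace(b"{PIN}", candidate)
        if toy_encrypt(convergent_key(guess), guess) == ciphertext:
            return candidate
    return None


# Hypothetical form letter: everything except the 4-digit PIN is known.
template = b"Dear customer, your PIN is {PIN}. Keep it safe."
victim_file = template.replace(b"{PIN}", b"4821")
stored = toy_encrypt(convergent_key(victim_file), victim_file)

all_pins = (f"{n:04d}".encode() for n in range(10000))
recovered = guess_secret_field(stored, template, all_pins)
```

At most 10,000 trial encryptions are needed here, which is why the encryption conceals nothing about this particular file from such an attacker.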

You should use traditional convergent encryption (without an added secret) if:

1.  You want to encrypt the plaintext, and
2.  You want convergence, and
3.  You don't mind exposing the existence of that file (ignoring the confirmation-of-a-file attack), and
4.  You are willing to bet that the file has entropy from the attacker's perspective which is greater than his computational capacity (defeating the learn-partial-information attack).
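
The bet in the last condition can be put in rough numbers. The sketch below is back-of-the-envelope arithmetic only; the guessing rate `ASSUMED_RATE` and the example fields are made-up assumptions, not measurements of any real attacker or system.

```python
import math


def entropy_bits(num_equally_likely_values: int) -> float:
    # Entropy of a field the attacker considers uniformly distributed.
    return math.log2(num_equally_likely_values)


def seconds_to_exhaust(bits: float, guesses_per_second: float) -> float:
    # Worst-case time to try every possibility.
    return 2.0 ** bits / guesses_per_second


ASSUMED_RATE = 1e9  # hypothetical: one billion trial encryptions per second

examples = [
    ("4-digit PIN in a known form letter", 10 ** 4),
    ("9-digit account number", 10 ** 9),
    ("128-bit random value", 2 ** 128),
]
for label, count in examples:
    bits = entropy_bits(count)
    seconds = seconds_to_exhaust(bits, ASSUMED_RATE)
    print(f"{label}: {bits:.1f} bits, {seconds:.3g} seconds to exhaust")
```

What matters is the entropy from the attacker's perspective: a large file that is mostly a known template has only as many bits as its unknown fields.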

You should use convergent encryption with an added secret (as recently implemented for the Tahoe Least Authority Filesystem) if:

1.  You want to encrypt the plaintext, and
2.  You want convergence within the set of people who know the added secret, and
3.  You don't mind exposing the existence of that file to people in that set, and
4.  You are willing to disclose the file to everyone in that set, or else you think that people in that set to whom you do not wish to disclose the file will not try the learn-partial-information attack, or if they do that the file has entropy from their perspective which is greater than their computational capacity.
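
As a sketch of what "convergence within the set of people who know the added secret" means, the following Python mixes a shared secret into the key derivation. Using HMAC-SHA256 for this is an assumption made for illustration; it is not Tahoe's actual derivation, and the names `family_secret` and `office_secret` are hypothetical.

```python
import hashlib
import hmac
import secrets


def convergent_key(added_secret: bytes, plaintext: bytes) -> bytes:
    # Assumed construction: the key depends on both the plaintext and a
    # secret shared by the group that should get convergence.
    return hmac.new(added_secret, plaintext, hashlib.sha256).digest()


family_secret = secrets.token_bytes(32)  # known to Alice and Bob
office_secret = secrets.token_bytes(32)  # known to a different group

document = b"The same file, stored by several people."

key_alice = convergent_key(family_secret, document)
key_bob = convergent_key(family_secret, document)
key_office = convergent_key(office_secret, document)

# What an outsider would compute under plain convergent encryption.
outsider_guess = hashlib.sha256(document).digest()
```

Alice and Bob derive the same key and so still deduplicate against each other. The other group and the outsider derive unrelated keys, so even with the full plaintext in hand they cannot confirm the file or test guesses against it.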

I guess the property of unlinkability between user and file addresses issue 3 in the above list -- the existence of a file is a much less sensitive bit of information than the existence of a file in a particular user's collection.

It could also affect issue 4 by increasing the entropy the file has from an attacker's perspective. If he knows that the ciphertext belongs to you then he can try filling in the fields with information that he knows about you. Without that linkage, he has to try filling in the fields with information selected from what he knows about all users. But hiding this linkage doesn't actually help in the case where the attacker is already using everything he knows about all users to attack all files in parallel.

Note that using an added secret does help in the parallel attack case, because (just like salting passwords) it breaks the space of targets up into separate spaces which can't all be attacked with the same computation.
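
The salting analogy can be illustrated by counting key derivations. In this sketch the attacker is assumed, pessimistically, to already know every group's added secret, just as password salts are usually public; the group names, the file contents, and the use of the key itself as the thing being matched are all simplifying assumptions.

```python
import hashlib
import hmac


def plain_key(plaintext: bytes) -> bytes:
    return hashlib.sha256(plaintext).digest()


def salted_key(added_secret: bytes, plaintext: bytes) -> bytes:
    return hmac.new(added_secret, plaintext, hashlib.sha256).digest()


def attack_plain(target_keys, guesses):
    # One derivation per guess is checked against every target at once.
    hits, derivations = [], 0
    for guess in guesses:
        derivations += 1
        if plain_key(guess) in target_keys:
            hits.append(guess)
    return hits, derivations


def attack_salted(targets_by_secret, guesses):
    # Each guess must be re-derived separately for every added secret.
    hits, derivations = [], 0
    for guess in guesses:
        for added_secret, target_keys in targets_by_secret.items():
            derivations += 1
            if salted_key(added_secret, guess) in target_keys:
                hits.append((added_secret, guess))
    return hits, derivations


guesses = [f"PIN {n:04d}".encode() for n in range(1000)]
files = [b"PIN 0042", b"PIN 0420", b"PIN 0777"]
group_secrets = [b"group-a", b"group-b", b"group-c"]

plain_targets = {plain_key(f) for f in files}
salted_targets = {s: {salted_key(s, f)} for s, f in zip(group_secrets, files)}

plain_hits, plain_work = attack_plain(plain_targets, guesses)
salted_hits, salted_work = attack_salted(salted_targets, guesses)
```

Both attacks find all three files, but the salted one needs one derivation per guess per group instead of one per guess. An attacker who does not know a group's secret cannot run the attack against that group at all.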

The first problem is isolating the original
ciphertext in the pool of storage.  If a file is encrypted using
convergent encryption and then run through an error-correction
mechanism to generate a number of shares that make up the file an
attacker first needs to be able to isolate these shares to generate
the original ciphertext.  FEC decoding speeds may be reasonably fast,
but they are not without some cost.  If the storage pool is
sufficiently large and you are doing your job to limit the ability of
an attacker to see which blocks are linked to the same FEC operation
then the computational complexity of this attack is significantly
higher than you suggest.


The attacker can do this job more easily, in two ways:

1. He doesn't need to erasure-decode in order to check whether a given erasure-coded share was generated from a given plaintext. He can work forward from guessed-plaintext to encryption key to partial ciphertext to partial erasure coded share, and check that. (Brian Warner already explained that this is actually even easier for an attacker in Tahoe, because he can then go from encryption key to storage index and check that, but in this post I'm trying to address the more general case.)

2. In the "parallel attack", he doesn't need to figure out which erasure-coded shares correspond to which files. For example, suppose he collects the first 32 bytes of many "blocks", where a block is the output of erasure coding after encryption. Then he tries plausible plaintexts, generates the encryption key, generates the first few bytes of ciphertext, and then generates the first 32 bytes of erasure coded share. (The details vary here depending on the encryption and erasure coding, but you see that for typical encryption and typical erasure coding, this is much less work than you might think.) Now he checks if the 32 bytes that he generated appear in the set of 32-byte block headers that he collected. If he gets a match, then he has learned the full contents of a file in the system, although he doesn't know which file or which user.
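
Here is a rough Python sketch of that parallel attack, under loudly stated assumptions: the key is SHA-256 of the plaintext, the cipher is a toy SHAKE-256 keystream, and the erasure code is assumed to be systematic so that the first share begins with the first bytes of ciphertext. None of this is the actual construction of any named system, and the payroll records are invented. It only shows the forward computation from guessed plaintext to a 32-byte block header, followed by a set lookup.

```python
import hashlib
import os

HEADER_LEN = 32


def share_header(plaintext: bytes) -> bytes:
    # Guessed plaintext -> key -> first bytes of ciphertext -> share header.
    key = hashlib.sha256(plaintext).digest()
    # Only the first HEADER_LEN keystream bytes are generated.
    keystream = hashlib.shake_256(key).digest(HEADER_LEN)
    prefix = plaintext[:HEADER_LEN]
    ciphertext_prefix = bytes(p ^ k for p, k in zip(prefix, keystream))
    # Assumed systematic erasure code: the first share starts with the
    # ciphertext prefix unchanged, so no decoding is ever required.
    return ciphertext_prefix


# Headers the attacker scraped from the storage pool, with no idea which
# file or which user each one belongs to.
stored_files = [
    b"Payroll record: alice earns 52000 USD per year.",
    b"Payroll record: bob earns 61000 USD per year.",
    os.urandom(64),  # a high-entropy file, which the attack cannot reach
]
collected_headers = {share_header(f) for f in stored_files}

# Plausible plaintexts built from what he knows about all users.
names = ["alice", "bob", "carol"]
salaries = range(50000, 70000, 1000)
recovered = []
for name in names:
    for salary in salaries:
        guess = f"Payroll record: {name} earns {salary} USD per year.".encode()
        if share_header(guess) in collected_headers:
            recovered.append(guess)
```

Each guess costs one hash, 32 bytes of keystream, and one set lookup, no matter how many headers were collected. The random file stays safe only because its entropy exceeds what the guess list covers.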



Regards,

Zooko

