convergent encryption reconsidered -- salting and key-strengthening

zooko Mon, 31 Mar 2008 08:07:19 -0700

[This conversation is spanning three mailing lists --cryptography@metzdowd.com, [EMAIL PROTECTED], and tahoe-[EMAIL PROTECTED] . Some of the posts have not reached all three ofthose lists. I've manually added Jerry Leichter and Ivan Krstić tothe approved-senders set for p2p-hackers and tahoe-dev, so thatfurther posts by them will appear on those lists.]

On Mar 30, 2008, at 3:13 PM, Ivan Krstić wrote:


Unless I'm misunderstanding Zooko's writeup, he's worried about an
attacker going from a partially-known plaintext (e.g. a form bank
letter) to a completely-known plaintext by repeating the following
process:

1. take partially known plaintext
2. make a guess, randomly or more intelligently where possible,
    about the unknown parts
3. take the current integrated partial+guessed plaintext, hash
    to obtain convergence key
4. verify whether that key exists in the storage index
5. if yes, you've found the full plaintext. if not, repeat from '2'.

That's a brute force search.

That's right. Your comparison to normal old brute-force/dictionaryattack on passwords is a good one, and one that Jim McCoy and JerryLeichter have alluded to as well.


Think of it like this:

Passwords are susceptible to brute-force and/or dictionary attack.We can't, in general, prevent attackers from trying guesses at ourpasswords without also preventing users from using them, so insteadwe employ various techniques:

* salts (to break up the space of targets into subspaces, of whichat most one can be targeted by a given brute-force attack)* key strengthening (to increase by a constant factor the cost ofchecking a password)* rate-limits for on-line tries (i.e., you get only a small fixednumber of wrong guesses in a row before you are locked out for a time-out period)

However, secrets other than passwords are not usually susceptible tosuch attacks. You can store your True Name, credit card number, bankaccount number, mother's maiden name, and so forth, on the sameserver as your password, but you don't have to worry about usingsalts or key strengthening on those latter secrets, because theserver doesn't run a service that allows unauthenticated remotepeople to connect, submit a guess as to their value, and receiveconfirmation, the way it does for your password. (In other words,for such data we generally use an extreme form of the third defense,rate-limiting tries -- the number of guesses that an attacker gets islimited to none!)

Likewise, if you are going to store or transmit those kinds ofsecrets in encrypted form using a traditional randomly-generatedencryption key, then you don't have to worry about brute-force/dictionary attacks because your strong randomly-selected symmetricencryption key defies them.


The Key Point:

*** Convergent encryption exposes whatever data is put into it tothe sorts of attacks that already apply to passwords.

Convergent encryption had been invented, analyzed and used for manyyears, but to the best of my knowledge the first time that anyonenoticed this issue was March 16 of this year (at 3 AM Chicago Time),when Drew Perttula and Brian Warner made that observation. (Althoughto be fair some of the best-known uses of convergent encryptionduring these years have been sharing publicly-known files withstrangers, in which case I suppose it is assumed that the cleartextdoes not contain secrets.)

Now PBKDF2 is a combination of the first two defenses -- salting andkey strengthening. When you first suggested PBKDF2, I -- andapparently Jerry Leichter -- thought that you were suggesting itssalting feature as a solution. The solution that we've come up withfor Tahoe (described in my original note) is much like salting,except that the added value that gets mixed in is secret andunguessable, where I normally think of salt as non-secret.

Now I see that you are also emphasizing the key strengthening featureof PBKDF2.

"k" denotes symmetric encryption key, "p" denotes plaintext, "c"denotes ciphertext, "s" denotes salt, "E(key, plaintext)" isencryption, "H()" is secure hashing, "H^1000()" is secure hashing athousand times over, i.e."H(H(H(H(...))))" a thousand times.


Traditional encryption:

k = random()
c = E(k, p)

Traditional convergent encryption:

k = H(p)
c = E(k, p)

Tahoe-style convergent encryption with added secret ("s"); "s" canbe re-used for any number of files, but it is kept secret fromeveryone except those with whom you wish to converge storage.


s = random()
k = H(s, p)
c = E(k, p)

PBKDF2 (simplified); "s" can be re-used but is generally not, and itis not secret.


s = random()
k = H^1000(s, password)
c = E(k, p)

Now, one could imagine a variant of traditional convergent encryptionwhich added key strengthening:


k = H^1000(p)
c = E(k, p)

This would have a performance impact on normal everyday use of Tahoewithout, in my current estimation, making a brute-force/dictionaryattack infeasible. Key strengthening allows you to choose an amountof wasted CPU that you are willing to impose on your users duringnormal use, and multiply the attacker's costs by exactly thatamount. If the attacker has 2^64 computational capacity, and theusers are willing to waste 2^10 extra computrons on each file access,then the attacker's effective capacity is reduced to 2^54.

The trade-off is actually worse than it appears since the attacker isattacking multiple users at once (in traditional convergentencryption, he is attacking *all* users at once), so he gains aneconomy of scale, and can profitably invest in specialized tools,even specialized hardware such as a COPACOBANA [1]. At the veryleast he can profitably devote many CPU cores to churning out newguesses 24/7, while for normal users it is not profitable to allocatea 24/7 CPU load to strengthening their keys. The reason for thisdisparity is that the attacker gets to attack everyone at once forthe same cost as attacking only one target, where the defenders haveto pay each for his own defense.

Next, one could imagine a variant of Tahoe's convergent encryptionwith added secret which adds key strengthening:


s = random()
k = H^1000(s, p)
c = E(k, p)

This would likewise be costly to normal users, but moreover it is notneeded because the "s = random()" part of the algorithm locks out allattackers except those with whom s is shared from mounting such anattack at all.

Thank you for your comments on this issue. If you have furtherideas, especially as would be relevant to the Tahoe Least-AuthorityFilesystem, I would love to hear them.


Regards,

Zooko O'Whielacronx

[1] http://copacobana.org/
---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]

convergent encryption reconsidered -- salting and key-strengthening

Reply via email to