Re: [p2p-hackers] convergent encryption reconsidered -- salting and key-strengthening

2008-04-02 Thread zooko

On Mar 31, 2008, at 4:47 AM, Ivan Krstić wrote:

> Tahoe doesn't run this service either. I can't use it to make guesses
> at any of the values you mentioned. I can use it to make guesses at
> whole documents incorporating such values, which is in most cases a
> highly non-trivial distinction.

The way that I would phrase this is that convergent encryption  
exposes whatever data is put into it, in whatever batch-size is put  
into it, to brute-force/dictionary attacks.
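
A minimal sketch of such a dictionary attack, assuming a simplified
scheme in which the encryption key is the SHA-256 hash of the plaintext
and the storage index is a hash of that key. The function names, the
letter template, and the 4-digit PIN are hypothetical illustrations,
not Tahoe's actual construction.

```python
import hashlib

def convergent_key(plaintext: bytes) -> bytes:
    # Assumption: the key is simply SHA-256 of the plaintext.
    return hashlib.sha256(plaintext).digest()

def storage_index(key: bytes) -> bytes:
    # Hypothetical public identifier that the storage server can see.
    return hashlib.sha256(b"index:" + key).digest()

def dictionary_attack(observed_index, template, candidates):
    # Try every candidate value in the known template and compare
    # the resulting index with the one observed on the server.
    for guess in candidates:
        doc = template.format(pin=guess).encode()
        if storage_index(convergent_key(doc)) == observed_index:
            return guess
    return None

# The victim stores a form letter whose only unknown is a 4-digit PIN.
template = "Dear customer, your PIN is {pin}.\n"
victim_doc = template.format(pin="4821").encode()
observed = storage_index(convergent_key(victim_doc))

candidates = (f"{n:04d}" for n in range(10_000))
recovered = dictionary_attack(observed, template, candidates)
print(recovered)
```

The attack succeeds only because everything in the batch except the
PIN is known to the attacker; the batch as a whole is guessable.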

If the data that you put in is unguessable, then you needn't worry  
about these attacks.  (Likewise, as Ben Laurie reminds us, using  
strong passwords is a sufficient defense against these attacks on  

You correctly emphasize that typical convergent encryption services  
(which operate on files, or, in the case of GNUnet, on 32 KiB  
blocks), and typical uses of those services (which typically store  
files as produced by apps written for traditional filesystems),  
batch together data in such a way that the aggregate is more likely  
to be unguessable than if each field were stored separately.  I don't  
disagree with this observation.

I am often reminded of Niels Ferguson's and Bruce Schneier's dictum,  
in the excellent _Practical_Cryptography_, that security needs to be  
a *local* property.  They argue that one should be able to tell  
whether a component is secure by inspecting that component itself,  
rather than by reasoning about interactions between that component  
and other components.

Concretely, convergent encryption with a per-user added secret, as  
currently implemented in Tahoe, can be shown to guarantee  
confidentiality of the data, regardless of what the data is.
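
A minimal sketch of the idea, assuming the added secret is mixed in by
using it as an HMAC-SHA256 key over the plaintext. This is an
illustrative construction with hypothetical names, not necessarily how
Tahoe derives its keys.

```python
import hashlib
import hmac
import os

def convergent_key(plaintext: bytes, added_secret: bytes = b"") -> bytes:
    # Assumption: key = HMAC-SHA256(added_secret, plaintext).
    # An empty secret models traditional (universal) convergence.
    return hmac.new(added_secret, plaintext, hashlib.sha256).digest()

alice_secret = os.urandom(32)  # hypothetical per-user secrets
bob_secret = os.urandom(32)
doc = b"Dear customer, your PIN is 4821.\n"

# Same user, same file: same key, so that user's copies still converge.
same_user = convergent_key(doc, alice_secret) == convergent_key(doc, alice_secret)

# Different users: different keys, so convergence across users is lost.
cross_user = convergent_key(doc, alice_secret) == convergent_key(doc, bob_secret)

# An attacker who guesses the document but lacks the secret derives
# a key that does not match, so the guess cannot be confirmed.
attacker_match = convergent_key(doc) == convergent_key(doc, alice_secret)

print(same_user, cross_user, attacker_match)
```

Under these assumptions the confidentiality argument depends only on
the secret being unguessable, not on the content of the data.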

Traditional convergent encryption can be shown to offer  
confidentiality only with the proviso that the data put into it  
conform to certain criteria -- criteria that cannot be verified by a  
computer nor by a user who is not a skilled security expert.

You may argue that the chance that a user would put non-conformant  
data into it is small.  I don't necessarily disagree, although before  
I became willing to bet on it I would require more quantitative  

However, arguing that component A is secure as long as component B  
behaves a certain way, and that component B is very likely to behave  
that way, is a different sort of argument than arguing that component  
A is secure regardless of the behavior of component B.

For one thing, the behavior of component B may change in the future.   
Concretely, people may write apps that store data in Tahoe in a way  
that previous apps didn't.  Those people will almost certainly be  
completely unaware of the nature of convergent encryption and brute- 
force/dictionary attacks.

Now obviously making the security properties of a system modular in  
this way might impose a performance cost.  In the case of Tahoe, that  
cost is the loss of universal convergence.  An analysis of  
the space savings due to convergence among our current customers  
found savings of around 1%.  We intend to  
monitor the potential savings of universal convergence in an on-going  
way, and if it turns out that there are substantial benefits to be  
gained then I will revisit this issue and perhaps I will be forced to  
rely on an argument of the other form -- that users are unlikely to  
use it in an unsafe way.

Thank you again for your thoughtful comments on this issue.


Zooko O'Whielacronx

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: [p2p-hackers] convergent encryption reconsidered -- salting and key-strengthening

2008-03-31 Thread Ivan Krstić

On Mar 30, 2008, at 9:37 PM, zooko wrote:

> You can store your True Name, credit card number, bank
> account number, mother's maiden name, and so forth, on the same
> server as your password, but you don't have to worry about using
> salts or key strengthening on those latter secrets, because the
> server doesn't run a service that allows unauthenticated remote
> people to connect, submit a guess as to their value, and receive
> confirmation, the way it does for your password.

Tahoe doesn't run this service either. I can't use it to make guesses  
at any of the values you mentioned. I can use it to make guesses at  
whole documents incorporating such values, which is in most cases a  
highly non-trivial distinction.

To make such guesses, I need to account for at least:

- file formats, since an e-mail message has a different on-disk
  representation depending on the recipient's e-mail client,

- temporal and transport variance, as PDF documents generally
  incorporate a generation timestamp, and e-mail messages include
  routing headers (with timestamps!),

- document modifications due to variables other than the one(s) being
  guessed, e.g. names, e-mail addresses, customized unsubscribe links.
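
The effect of these factors on the attacker's search space can be
sketched with some arithmetic. All of the sizes below are made-up
illustrative assumptions, not measurements; the point is only that
independent unknowns multiply.

```python
import math

# Hypothetical number of possibilities for each independent unknown.
unknowns = {
    "secret being guessed (4-digit PIN)": 10_000,
    "file format variants": 5,
    "generation timestamp (1 s resolution over one day)": 86_400,
    "recipient-specific fields": 1_000,
}

# Independent unknowns multiply the number of whole-document guesses.
total_guesses = math.prod(unknowns.values())
total_bits = math.log2(total_guesses)

print(f"{total_guesses:,} guesses, about {total_bits:.1f} bits")
```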

I would be interested to see an actual real-world example of how a  
document would fall to this attack. It strikes me as a cute threat in  
theory, but uninteresting in practice.

> *** Convergent encryption exposes whatever data is put into it to
> the sorts of attacks that already apply to passwords.

Sometimes, under highly peculiar circumstances, etc.

> Convergent encryption had been invented, analyzed and used for many
> years, but to the best of my knowledge the first time that anyone
> noticed this issue was March 16 of this year

FWIW, I have discussed this threat verbally with colleagues when I was  
asked for possible designs for OLPC's server-based automatic backup  
system. I dismissed it at the time as 'not a real-world concern'. I  
might even have it in my notes, but those weren't published, so it's  

> Now PBKDF2 is a combination of the first two defenses -- salting and
> key strengthening.  When you first suggested PBKDF2, I -- and
> apparently Jerry Leichter -- thought that you were suggesting its
> salting feature as a solution.

Yeah, sorry, I wasn't being clear. I should've just said a key  
strengthening function rather than naming anything in particular.

> This would have a performance impact on normal everyday use of Tahoe
> without, in my current estimation, making a brute-force/dictionary
> attack infeasible.

Adding, say, 5 seconds of computation to the time it takes to store a  
file is likely to be lost as noise in comparison with the actual  
network upload time, while still making an attacker's life  
_dramatically_ harder than now.
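
A minimal sketch of key strengthening applied to a convergent key,
assuming PBKDF2-HMAC-SHA256 from Python's hashlib with a fixed, public
salt so that identical files still yield identical keys. The salt
value, the iteration count, and the function names are hypothetical;
the 5-second figure is the one from the paragraph above.

```python
import hashlib
import time

# Hypothetical fixed salt: the same for everyone, so convergence is kept.
FIXED_SALT = b"convergent-kdf-sketch"

def strengthened_key(plaintext: bytes, iterations: int) -> bytes:
    # Hash the file once, then stretch the digest with PBKDF2.
    digest = hashlib.sha256(plaintext).digest()
    return hashlib.pbkdf2_hmac("sha256", digest, FIXED_SALT, iterations)

def attack_cost_days(num_guesses: int, seconds_per_guess: float) -> float:
    # Total sequential CPU time for a dictionary of the given size.
    return num_guesses * seconds_per_guess / 86_400

doc = b"example document"
start = time.perf_counter()
key = strengthened_key(doc, iterations=100_000)  # kept small for the demo
elapsed = time.perf_counter() - start
print(f"one key derivation took {elapsed:.3f} s")

# A one-million-entry dictionary at 5 seconds per guess.
days = attack_cost_days(1_000_000, 5.0)
print(f"about {days:.1f} CPU-days")
```

The honest user pays the per-file cost once, while an attacker on
comparable hardware pays it once per guess: roughly 58 CPU-days for a
million guesses at 5 seconds each.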

> The trade-off is actually worse than it appears since the attacker is
> attacking multiple users at once (in traditional convergent
> encryption, he is attacking *all* users at once)

Again, is there a real-world example of the kind of data or documents  
that would show this to be a serious problem? While it's technically  
true that you're targeting all the users in parallel when brute  
forcing, note that if you're not actually hyper-targeting your attack,  
you need to brute force _all_ the variables I mention above in  
parallel, except in pathological cases -- and those, if you know of  
some, would be interesting for the discussion.

> economy of scale, and can profitably invest in specialized tools,
> even specialized hardware such as a COPACOBANA [1].

The OpenBSD eksblowfish/bcrypt design can't be bitsliced and generally  
doesn't lend itself well to large speedups in hardware, by design.


